Production Deploy Procedure - Akira

Pre-deploy checklist

  • Master branch CI green: lint, tests, ZAP, and Trivy pass.
  • SIP-only smoke test green on staging.
  • Recent DR backup available, less than 24h old.
  • On-call notification sent 24h before the maintenance window.
  • No active SEV1/SEV2 incident.
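The backup-age item in the checklist can be checked mechanically. The sketch below assumes the DR backup is a single file on disk; the function name backup_is_fresh and the idea of passing the backup path as an argument are illustrative and not part of the Akira tooling.

```shell
# Hypothetical helper: succeeds only if the file at $1 exists and was
# modified less than 24 hours (1440 minutes) ago.
backup_is_fresh() {
  backup_file="$1"
  # A missing backup is never fresh.
  [ -f "$backup_file" ] || return 1
  # find prints the path only when the mtime is within the last 1440 minutes.
  [ -n "$(find "$backup_file" -mmin -1440 -print)" ]
}
```

Usage: call backup_is_fresh with the path to the latest backup and stop the deploy if it returns non-zero.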

Operator context

The standard operator entrypoint is a Mac connected to the Akira tailnet.

cd /Users/massimobagnoli/Documents/Claude/Projects/Akira
git pull origin master
git log --oneline -5

When running from the VPS or the task runner, use the repo path on that host instead:

cd /home/devcomm/akira
git pull origin master
git log --oneline -5

Verify vault secrets

ansible-vault view infra/group_vars/all/vault.yml \
--vault-password-file ~/.akira-vault-pass.txt | grep -c "_"

Expected result: the line count printed by grep -c matches the secrets listed in the current infra/group_vars/all/_secrets_manifest.yml.

Dry-run playbook

ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt \
--check --diff

Review the diff. Stop the deploy if the diff includes unexpected service, network, vault, or inventory changes.

Deploy order

Run tiers in this order.

1. Stateful

Database, cache, NATS, and stateful dependencies:

ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/deploy_stateful.yml \
--vault-password-file ~/.akira-vault-pass.txt

2. Signaling

Kamailio, RTPengine, FreeSWITCH, and signaling sidecars:

ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt

3. Management

Backend, frontend, AgentCore-facing MCP endpoint, Prometheus, Grafana, Loki, Tempo, and related management services:

ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/deploy_management.yml \
--vault-password-file ~/.akira-vault-pass.txt
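The three tier commands above can be wrapped in one loop that preserves the order. This is a minimal sketch, not the project's tooling: the function name deploy_tiers and its optional runner argument are hypothetical, and stopping at the first failed tier is an assumption about the desired behaviour (the runbook only fixes the order).

```shell
# Sketch: run the tier playbooks in the documented order
# (stateful, signaling, management) and stop at the first failure.
# $1 is a hypothetical override for the command to run, so the loop
# can be rehearsed with "echo" instead of ansible-playbook.
deploy_tiers() {
  runner="${1:-ansible-playbook}"
  for tier in stateful signaling management; do
    "$runner" -i infra/inventory/staging.yml \
      "infra/playbooks/deploy_${tier}.yml" \
      --vault-password-file ~/.akira-vault-pass.txt || {
        echo "Tier ${tier} failed; later tiers were not run." >&2
        return 1
      }
  done
}
```

Usage: deploy_tiers echo prints the three commands without running them; deploy_tiers with no argument runs them through ansible-playbook.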

Post-deploy smoke test

bash infra/load-test/run-pilot-profile.sh --duration 60

Expected result:

  • ASR greater than 95%.
  • FAS lower than 2%.
  • PDD p95 lower than 500ms.
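The thresholds above can be expressed as a single pass/fail check. The sketch below assumes the three metrics have already been read from the load-test output by hand or by another script; how run-pilot-profile.sh reports them is not specified here, and the function name smoke_ok is hypothetical.

```shell
# Hypothetical helper: smoke_ok ASR_PERCENT FAS_PERCENT PDD_P95_MS
# Succeeds only when all three runbook thresholds hold:
# ASR > 95, FAS < 2, PDD p95 < 500 ms.
smoke_ok() {
  awk -v asr="$1" -v fas="$2" -v pdd="$3" 'BEGIN {
    # awk compares the values numerically, so decimals such as 97.2 work.
    exit !(asr > 95 && fas < 2 && pdd < 500)
  }'
}
```

Usage: smoke_ok 97.2 1.1 420 succeeds, while smoke_ok 95 1.1 420 fails because ASR must be strictly greater than 95%.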

Also run the standard Ansible smoke test when the change touches multiple tiers:

ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/smoketest.yml \
--vault-password-file ~/.akira-vault-pass.txt

Verify Sentry

Open the Akira Sentry project and verify that no new critical or error events appear during the first 5 minutes after the deploy.

Rollback triggers

Roll back if any of these conditions is true:

  • Smoke test fails.
  • Sentry critical or error events spike to more than 5x baseline.
  • Customer-affecting issue is reported within 10 minutes.
  • p95 latency is more than 2x baseline.
  • CDR pipeline lag continues increasing for 10 minutes.
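The two baseline-relative triggers (Sentry events above 5x baseline, p95 latency above 2x baseline) reduce to the same comparison. A minimal sketch follows, assuming the current and baseline values are obtained separately; the function name exceeds_baseline is hypothetical.

```shell
# Hypothetical helper: exceeds_baseline CURRENT BASELINE FACTOR
# Succeeds when CURRENT is strictly greater than BASELINE * FACTOR,
# which is the condition that triggers a rollback.
exceeds_baseline() {
  awk -v cur="$1" -v base="$2" -v factor="$3" \
    'BEGIN { exit !(cur > base * factor) }'
}
```

Usage: exceeds_baseline 60 10 5 succeeds (60 events against a baseline of 10 is above 5x), while exceeds_baseline 800 400 2 fails because 800 ms is exactly 2x and not more.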

Rollback procedure

Select the previous stable commit:

git log --oneline -10
git checkout <previous-stable-commit>

Re-run only the affected tier:

ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/<affected-playbook>.yml \
--vault-password-file ~/.akira-vault-pass.txt

If the issue is signaling-specific, also follow deploy-signaling.md.

Post-deploy

  • Update CHANGELOG.md.
  • Notify the akira-deploys Telegram channel.
  • Schedule the next deploy at least 7 days out.
  • Avoid Friday deploys unless approved as an emergency fix.
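The Friday rule can be enforced with a small guard before a deploy starts. This is a sketch under stated assumptions: the function name deploy_day_allowed and its "yes" emergency flag are hypothetical, and the approval itself still has to happen outside the script.

```shell
# Hypothetical guard: deploy_day_allowed [ISO_WEEKDAY] [EMERGENCY]
# ISO_WEEKDAY defaults to today (date +%u: 1 = Monday ... 7 = Sunday).
# Fails on Friday (5) unless EMERGENCY is "yes".
deploy_day_allowed() {
  dow="${1:-$(date +%u)}"
  emergency="${2:-no}"
  if [ "$dow" -eq 5 ] && [ "$emergency" != "yes" ]; then
    echo "Friday deploys need emergency approval." >&2
    return 1
  fi
  return 0
}
```

Usage: deploy_day_allowed with no arguments checks today; deploy_day_allowed 5 yes models an approved emergency fix on a Friday.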