Production Deploy Procedure - Akira
Pre-deploy checklist
- Master branch CI green: lint, tests, ZAP, and Trivy pass.
- SIP-only smoke test green on staging.
- Recent DR backup available, less than 24h old.
- On-call notification sent 24h before the maintenance window.
- No active SEV1/SEV2 incident.
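The backup freshness item can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the latest DR backup is a regular file whose modification time reflects when the backup was taken. The function name and the path in the usage comment are hypothetical.

```shell
# Succeeds when FILE exists and was modified less than MAX_HOURS ago
# (default 24). Assumption: the file's mtime is the backup time.
backup_is_fresh() {
    file=$1
    max_hours=${2:-24}
    [ -f "$file" ] || return 1
    # find -mmin -N prints the file only if it changed < N minutes ago.
    [ -n "$(find "$file" -mmin "-$((max_hours * 60))" 2>/dev/null)" ]
}

# Usage (hypothetical path):
#   backup_is_fresh /path/to/latest-dr-backup.tar.gz || echo "backup too old"
```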
Operator context
The standard operator entrypoint is a Mac connected to the Akira tailnet.
cd /Users/massimobagnoli/Documents/Claude/Projects/Akira
git pull origin master
git log --oneline -5
When running from the VPS or task runner, use the local repo path instead:
cd /home/devcomm/akira
git pull origin master
git log --oneline -5
Verify vault secrets
ansible-vault view infra/group_vars/all/vault.yml \
--vault-password-file ~/.akira-vault-pass.txt | grep -c "_"
Expected result: the count (the number of lines in the decrypted vault that contain an underscore) matches the current infra/group_vars/all/_secrets_manifest.yml inventory.
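The comparison can be scripted so that a mismatch stops the operator. This is a sketch only: the manifest format is not described here, so the expected count is passed in by hand, and the function name is hypothetical.

```shell
# Reads decrypted vault YAML on stdin and compares the number of lines
# containing an underscore (the same rule as `grep -c "_"`) with the
# expected count given as the first argument.
vault_count_matches() {
    expected=$1
    # grep -c exits non-zero when the count is 0, so ignore its status.
    actual=$(grep -c "_" || true)
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $actual entries"
    else
        echo "MISMATCH: vault has $actual, expected $expected" >&2
        return 1
    fi
}

# Usage (EXPECTED is read off the manifest by the operator):
#   ansible-vault view infra/group_vars/all/vault.yml \
#     --vault-password-file ~/.akira-vault-pass.txt | vault_count_matches "$EXPECTED"
```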
Dry-run playbook
ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt \
--check --diff
Review the diff. Stop the deploy if the diff includes unexpected service, network, vault, or inventory changes.
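As an aid to the review, the hosts that would change can be listed from a saved dry-run log. The sketch below assumes the output was captured to a file (for example by appending `| tee dry-run.log`, a hypothetical filename) and relies on the `changed=N` counters that ansible-playbook prints in its PLAY RECAP. It does not replace reading the diff itself.

```shell
# Prints each host whose PLAY RECAP line reports changed > 0.
# Only lines after the "PLAY RECAP" banner are inspected, so diff
# content that happens to contain "changed=" is ignored.
hosts_with_changes() {
    awk '
        /^PLAY RECAP/ { in_recap = 1; next }
        in_recap && /changed=/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^changed=/) {
                    split($i, kv, "=")
                    if (kv[2] + 0 > 0) print $1
                }
            }
        }
    ' "$1"
}
```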
Deploy order
Run tiers in this order.
1. Stateful
Database, cache, NATS, and stateful dependencies:
ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/deploy_stateful.yml \
--vault-password-file ~/.akira-vault-pass.txt
2. Signaling
Kamailio, RTPengine, FreeSWITCH, and signaling sidecars:
ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt
3. Management
Backend, frontend, AgentCore-facing MCP endpoint, Prometheus, Grafana, Loki, Tempo, and related management services:
ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/deploy_management.yml \
--vault-password-file ~/.akira-vault-pass.txt
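The three tier commands above can be wrapped so that a failure in one tier prevents the next from starting. This is a sketch, not the project's tooling: the function name and the ANSIBLE_PLAYBOOK override variable are hypothetical, and the override exists only so the loop can be rehearsed with `echo`.

```shell
# Runs the tier playbooks in the documented order (stateful, signaling,
# management) and stops at the first failure.
deploy_all_tiers() {
    for tier in stateful signaling management; do
        echo "==> deploying tier: $tier"
        "${ANSIBLE_PLAYBOOK:-ansible-playbook}" \
            -i infra/inventory/staging.yml \
            "infra/playbooks/deploy_${tier}.yml" \
            --vault-password-file ~/.akira-vault-pass.txt || {
            echo "tier $tier failed; stopping" >&2
            return 1
        }
    done
}

# Rehearsal without touching any host:
#   ANSIBLE_PLAYBOOK=echo deploy_all_tiers
```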
Post-deploy smoke test
bash infra/load-test/run-pilot-profile.sh --duration 60
Expected result:
- ASR greater than 95%.
- FAS lower than 2%.
- PDD p95 lower than 500ms.
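The three thresholds can be evaluated together once the values are known. The sketch below assumes the operator reads ASR, FAS, and PDD p95 off the pilot-profile output and passes them in by hand, since the script's output format is not described here. The function name is hypothetical.

```shell
# Succeeds only when all three thresholds hold.
# Arguments: ASR percent, FAS percent, PDD p95 in milliseconds.
smoke_metrics_ok() {
    awk -v asr="$1" -v fas="$2" -v pdd="$3" 'BEGIN {
        ok = (asr + 0 > 95) && (fas + 0 < 2) && (pdd + 0 < 500)
        exit (ok ? 0 : 1)
    }'
}

# Usage:
#   smoke_metrics_ok 97.2 1.1 420 && echo "smoke test within thresholds"
```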
Also run the standard Ansible smoke test when the change touches multiple tiers:
ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/smoketest.yml \
--vault-password-file ~/.akira-vault-pass.txt
Verify Sentry
Open the Akira Sentry project and watch it for 5 minutes after the deploy. Verify that no new critical or error events appear in that window.
Rollback triggers
Rollback if any condition is true:
- Smoke test fails.
- Sentry critical or error events spike to more than 5x baseline.
- A customer-affecting issue is reported within 10 minutes of the deploy.
- p95 latency is more than 2x baseline.
- CDR pipeline lag continues increasing for 10 minutes.
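The two baseline-relative triggers (Sentry events above 5x, p95 latency above 2x) reduce to the same comparison. A minimal sketch, assuming the current and baseline values are collected by hand from Sentry and the latency dashboard; the function name is hypothetical.

```shell
# Succeeds (trigger fired) when CURRENT is strictly greater than
# FACTOR times BASELINE. Arguments: current, baseline, factor.
exceeds_baseline() {
    awk -v cur="$1" -v base="$2" -v factor="$3" 'BEGIN {
        exit ((cur + 0 > factor * base) ? 0 : 1)
    }'
}

# Usage:
#   exceeds_baseline 60 10 5   && echo "rollback: Sentry events above 5x baseline"
#   exceeds_baseline 900 400 2 && echo "rollback: p95 latency above 2x baseline"
```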
Rollback procedure
Select the previous stable commit:
git log --oneline -10
git checkout <previous-stable-commit>
Re-run only the affected tier:
ansible-playbook -i infra/inventory/staging.yml \
infra/playbooks/<affected-playbook>.yml \
--vault-password-file ~/.akira-vault-pass.txt
If the issue is signaling-specific, also follow deploy-signaling.md.
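To avoid typing the wrong playbook path under pressure, the affected tier can be mapped to its playbook. The sketch below uses only the three playbook names listed under Deploy order; the function name is hypothetical.

```shell
# Prints the playbook path for a tier name, or fails for an unknown tier.
playbook_for_tier() {
    case $1 in
        stateful|signaling|management)
            echo "infra/playbooks/deploy_$1.yml" ;;
        *)
            echo "unknown tier: $1" >&2
            return 1 ;;
    esac
}

# Usage:
#   ansible-playbook -i infra/inventory/staging.yml \
#     "$(playbook_for_tier signaling)" \
#     --vault-password-file ~/.akira-vault-pass.txt
```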
Post-deploy
- Update CHANGELOG.md.
- Notify the Telegram channel akira-deploys.
- Schedule the next deploy at least 7 days out.
- Avoid Friday deploys unless approved as an emergency fix.
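The Friday rule can be enforced with a small guard at the top of a deploy wrapper. This is a sketch under the assumption that the operator's local timezone is the one that matters; the function name is hypothetical.

```shell
# Succeeds when the given ISO weekday number is Friday (5).
# With no argument it uses today's value from `date +%u`
# (1 = Monday ... 7 = Sunday).
is_friday() {
    [ "${1:-$(date +%u)}" -eq 5 ]
}

# Usage:
#   is_friday && echo "Friday: deploy only as an approved emergency fix"
```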