Deploy Akira Signaling Layer
Pre-flight checks
- Verify all Trigger #2 VMs are online:
hcloud server list | grep -c staging
- Verify Tailscale/SSH reachability:
ansible trigger2 -i infra/inventory/staging.yml -m ping
- Verify Hetzner firewall rules are attached in the Hetzner console.
- Verify MagicDNS names resolve on Trigger #2 hosts:
ansible trigger2 -i infra/inventory/staging.yml -m ansible.builtin.command -a "hostname -f"
- Build the Kamailio htable sync wheelhouse used by Step 4:
scripts/build_kam_sync_wheelhouse.sh
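The VM check above only counts matching lines; it does not confirm that each VM is actually running. A minimal sketch of a stricter check follows. It assumes, as the grep above implies, that "staging" appears in each VM's listed name, and that `hcloud server list -o noheader -o columns=name,status` prints one name and status per line. The function name `count_not_running` is hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "name status" lines on stdin and prints how many
# lines mention "staging" but do not end with the status "running".
count_not_running() {
  awk '$0 ~ /staging/ && $NF != "running" { n++ } END { print n + 0 }'
}

# Usage (assumes the hcloud CLI is installed and a context is active):
#   hcloud server list -o noheader -o columns=name,status | count_not_running
```

A result of 0 means every listed staging VM reports `running`; any other value is the number of VMs to investigate before deploying.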
Deploy
Run the full signaling orchestration from the repo root:
ansible-playbook -i infra/inventory/staging.yml infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt
deploy_signaling.yml loads infra/group_vars/all/main.yml and
infra/group_vars/all/vault.yml through vars_files; do not pass those files
again with --extra-vars.
Expect the run to take about 15-25 minutes, most of it spent on package installation, service restarts, and health checks.
Step-by-step debug
Each orchestration phase has a tag:
ansible-playbook -i infra/inventory/staging.yml infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt --tags step1
ansible-playbook -i infra/inventory/staging.yml infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt --tags step3,step4
ansible-playbook -i infra/inventory/staging.yml infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt --tags smoke
Use --limit for a single host while debugging:
ansible-playbook -i infra/inventory/staging.yml infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt --tags step4 --limit akira-sip-01
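The tagged invocations above differ only in the tag list and the optional --limit host. A small sketch of a wrapper that prints the command for review before it is run might look like this. The function name `signaling_cmd` is hypothetical; the paths and flags are copied from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: prints the ansible-playbook command for a tag list
# (first argument) and an optional --limit host (second argument).
# It only prints; nothing is executed.
signaling_cmd() {
  local tags="$1" limit="${2:-}"
  local cmd="ansible-playbook -i infra/inventory/staging.yml infra/playbooks/deploy_signaling.yml"
  cmd="$cmd --vault-password-file ~/.akira-vault-pass.txt --tags $tags"
  if [ -n "$limit" ]; then
    cmd="$cmd --limit $limit"
  fi
  printf '%s\n' "$cmd"
}

# Example: signaling_cmd step4 akira-sip-01
```

Printing rather than executing keeps the helper safe to use while debugging; paste the printed line into the shell from the repo root once it looks right.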
Rollback
For one problematic node, stop and disable the affected service:
ansible <node> -i infra/inventory/staging.yml -b -m ansible.builtin.systemd -a "name=<service> state=stopped enabled=false"
Service names used by this playbook:
- fail2ban
- rtpengine
- freeswitch
- fs-esl-gateway
- kamailio
- kamailio-htable-sync.timer
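When a node runs more than one of these services, the single-node rollback command has to be repeated per service. The sketch below prints one stop-and-disable command per service for a given node. The function name `rollback_cmds` is hypothetical, and which services run on which node is left to the caller, since that mapping is defined by the inventory rather than by this helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: first argument is the node, remaining arguments are
# the services to stop and disable on it. It prints the commands only.
rollback_cmds() {
  local node="$1" svc
  shift
  for svc in "$@"; do
    printf 'ansible %s -i infra/inventory/staging.yml -b -m ansible.builtin.systemd -a "name=%s state=stopped enabled=false"\n' "$node" "$svc"
  done
}

# Example: rollback_cmds akira-fs-01 freeswitch fs-esl-gateway
```

Pass only services that are actually installed on the node; the systemd module fails for units it cannot find.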
For a full Trigger #2 staging rollback, power off the Trigger #2 VMs in Hetzner. Firewall and tailnet configuration can remain in place for retry.
Smoke test post-deploy
Step 7 runs one SIPp UAC success call:
ansible-playbook -i infra/inventory/staging.yml infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt --tags step7
Step 6 normally verifies backend health without restarting it. ADR-0011 moved
FreeSWITCH ESL handling to fs-esl-gateway, deployed on the FreeSWITCH nodes in
step 3. If a deploy also changes backend environment variables, explicitly
request a Docker Compose backend reload:
ansible-playbook -i infra/inventory/staging.yml infra/playbooks/deploy_signaling.yml \
--vault-password-file ~/.akira-vault-pass.txt --tags step6 \
-e deploy_signaling_requires_backend_reload=true
If the smoke test fails, check these first:
ansible akira-sip-01 -i infra/inventory/staging.yml -b -m ansible.builtin.command -a "journalctl -u kamailio -n 50 --no-pager"
ansible akira-sip-01 -i infra/inventory/staging.yml -b -m ansible.builtin.command -a "kamcmd htable.dump destinations"
ansible akira-fs-01 -i infra/inventory/staging.yml -b -m ansible.builtin.command -a "journalctl -u freeswitch -n 50 --no-pager"
ansible akira-rtp-01 -i infra/inventory/staging.yml -b -m ansible.builtin.command -a "journalctl -u rtpengine -n 50 --no-pager"
For packet-level SIP debugging during a re-run:
ansible akira-sip-01 -i infra/inventory/staging.yml -b -m ansible.builtin.command -a "tcpdump -i any port 5060 -nn -c 50"
Alert post-deploy
After the full deploy, verify Grafana has the signaling dashboard imported for Kamailio CPS, FreeSWITCH sessions, and RTPengine media port usage.