Capacity Scaling Runbook
Use this runbook when a capacity warning fires or when weekly baseline reports show sustained growth.
Triage
- Open the Grafana dashboard Akira Capacity Sizing.
- Confirm the alert is sustained over the configured `for` window.
- Compare current load with docs/architecture/capacity-planning.md.
- Check whether the same node has CPU, RAM, disk, and application-level growth.
- Record the decision in the weekly capacity report under Recommendations.
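
The "sustained over the window" check in the triage list can be sketched as follows. This is only an illustration, assuming the metric has been exported as timestamped samples; the function name `is_sustained` and the sample format are hypothetical and not part of the Akira tooling.

```python
from datetime import datetime, timedelta


def is_sustained(samples, threshold, window):
    """Return True if every sample in the trailing window exceeds threshold.

    samples: list of (timestamp, value) tuples, sorted oldest first.
    window:  timedelta standing in for the alert's `for` duration (assumed).
    """
    if not samples:
        return False
    latest = samples[-1][0]
    cutoff = latest - window
    # Without data reaching back to the start of the window,
    # the breach cannot be confirmed as sustained.
    if samples[0][0] > cutoff:
        return False
    recent = [value for ts, value in samples if ts >= cutoff]
    return all(value > threshold for value in recent)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0)
    cpu = [(start + timedelta(minutes=i), 75.0) for i in range(16)]
    print(is_sustained(cpu, threshold=70, window=timedelta(minutes=10)))
```

A single dip below the threshold inside the window makes the check fail, which mirrors treating a brief spike as noise rather than a capacity signal.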
Scale actions
| Signal | First response | Scale action |
|---|---|---|
| CPU >70% on Kamailio | Check CPS, retransmits, and routing reloads | Resize sip nodes from cx33 to cx43 |
| RTP sessions >800 | Check active dialogs and media distribution | Add rtp-03 or resize existing rtp nodes |
| Postgres connections >80% | Check pgbouncer pools and backend pool settings | Tune pools before resizing DB |
| Root disk >80% | Check Prometheus, Loki, logs, and Timescale size | Expand volume or reduce retention |
| CDR growth >50GB hot | Check compression policy and retention plan | Review Timescale compression/offload |
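
The table above can also be expressed as data, which makes it easy to check a set of readings against every threshold at once. The sketch below is a minimal illustration: the metric keys (`kamailio_cpu_pct`, `rtp_sessions`, and so on) are hypothetical names chosen here, not identifiers from the dashboards or alert rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    signal: str
    threshold: float
    first_response: str
    scale_action: str


# Thresholds copied from the table; the dictionary keys are assumed names.
RULES = {
    "kamailio_cpu_pct": Rule(
        "CPU >70% on Kamailio", 70,
        "Check CPS, retransmits, and routing reloads",
        "Resize sip nodes from cx33 to cx43"),
    "rtp_sessions": Rule(
        "RTP sessions >800", 800,
        "Check active dialogs and media distribution",
        "Add rtp-03 or resize existing rtp nodes"),
    "postgres_conn_pct": Rule(
        "Postgres connections >80%", 80,
        "Check pgbouncer pools and backend pool settings",
        "Tune pools before resizing DB"),
    "root_disk_pct": Rule(
        "Root disk >80%", 80,
        "Check Prometheus, Loki, logs, and Timescale size",
        "Expand volume or reduce retention"),
    "cdr_hot_gb": Rule(
        "CDR growth >50GB hot", 50,
        "Check compression policy and retention plan",
        "Review Timescale compression/offload"),
}


def breached(metrics):
    """Return the rules whose threshold is strictly exceeded."""
    return [
        RULES[name]
        for name, value in metrics.items()
        if name in RULES and value > RULES[name].threshold
    ]


if __name__ == "__main__":
    readings = {"kamailio_cpu_pct": 82, "rtp_sessions": 640, "root_disk_pct": 80}
    for rule in breached(readings):
        print(f"{rule.signal}: {rule.first_response} -> {rule.scale_action}")
```

The comparison is strict, matching the `>` in the table, so a reading of exactly 80% on the root disk does not count as a breach.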
Weekly baseline
Run manually from the management node if the timer is not enabled:
/usr/local/bin/akira-capacity-baseline
The report is written to /opt/akira/reports by default when deployed through
Ansible, or to reports/ when run from a repository checkout.
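
To find the most recent report after a manual run, a small helper can look in both locations. This is a sketch only: it assumes each report is a regular file directly inside the report directory and picks the newest by modification time, since the report file naming is not specified here. The function name `latest_report` is hypothetical.

```python
from pathlib import Path

# The two locations named above; the relative path resolves against the
# current working directory, so run this from the repository checkout.
DEFAULT_DIRS = (Path("/opt/akira/reports"), Path("reports"))


def latest_report(dirs=DEFAULT_DIRS):
    """Return the most recently modified file across dirs, or None."""
    files = [
        path
        for directory in dirs
        if directory.is_dir()
        for path in directory.iterdir()
        if path.is_file()
    ]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(latest_report())
```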