On-Call Rotation - Akira Pilot
Schedule
- Pilot Phase 1: Massimo solo, 24/7, no rotation.
- Pilot Phase 2: two-engineer weekly rotation, business hours plus best-effort weekend.
- GA: three-engineer rotation, 24/7 PagerDuty-like coverage.
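A weekly round-robin like the Phase 2 rotation could be computed as in the sketch below. The engineer names and the rotation start date are hypothetical placeholders. Applying the same weekly cadence to the three-engineer GA rotation is an assumption, since the schedule above does not state the GA shift length.

```python
from datetime import date


def on_call_for(day: date, engineers: list[str], rotation_start: date) -> str:
    """Return the engineer on call for `day` in a weekly round-robin.

    Assumption: shifts last exactly 7 days and the first shift begins on
    `rotation_start`. Neither detail is specified in this runbook.
    """
    if not engineers:
        raise ValueError("rotation needs at least one engineer")
    if day < rotation_start:
        raise ValueError("day is before the rotation start")
    weeks_elapsed = (day - rotation_start).days // 7
    return engineers[weeks_elapsed % len(engineers)]


# Hypothetical Phase 2 roster and start date, for illustration only.
PHASE2_ENGINEERS = ["engineer-a", "engineer-b"]
PHASE2_START = date(2025, 1, 6)
```

Calling `on_call_for(date(2025, 1, 13), PHASE2_ENGINEERS, PHASE2_START)` selects the second engineer, because one full week has elapsed since the start date.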
Tools
- Pager: Telegram bot @akira_oncall.
- Dashboard: Grafana at http://grafana.akira.local.
- Status: internal /statuspage.
- Comms: Telegram channel akira-oncall.
- Runbooks: this directory, starting from README.md.
On-call duties
- Acknowledge alerts within 15 minutes.
- Triage and respond through incident-response.md.
- Escalate through the matrix in README.md.
- Update the status page for customer-affecting incidents.
- Log incidents in docs/incidents/YYYY-MM-DD-summary.md.
- Convert post-incident action items into tracked tasks.
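The incident log filename could be built as in the sketch below. It assumes that "summary" in the pattern stands for a short, lowercase, hyphen-separated description of the incident. If the team instead uses the literal word "summary", the slug step can be dropped. The function name is hypothetical.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def incident_log_path(day: date, summary: str) -> PurePosixPath:
    """Build docs/incidents/YYYY-MM-DD-<slug>.md for an incident.

    Assumption: the slug is the summary lowercased, with every run of
    non-alphanumeric characters collapsed to a single hyphen.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    if not slug:
        raise ValueError("summary must contain at least one letter or digit")
    return PurePosixPath("docs/incidents") / f"{day.isoformat()}-{slug}.md"
```

For example, `incident_log_path(date(2025, 3, 4), "API latency spike")` yields `docs/incidents/2025-03-04-api-latency-spike.md`.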
Handoff
At shift start:
- Review incidents from the last 7 days.
- Review open NOC tickets.
- Check deploy calendar and upcoming maintenance windows.
- Verify alerting and Telegram reachability.
At shift end:
- Send a summary to the next on-call engineer.
- Include open incidents, risky alerts, pending deploys, and customer-impacting tickets.
- Confirm the next on-call engineer has acknowledged the handoff.
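The end-of-shift summary could follow a fixed template so that none of the four categories is skipped. The sketch below is one possible shape. The class name, field names, and output layout are hypothetical, and sending the text to Telegram is left out.

```python
from dataclasses import dataclass, field


@dataclass
class HandoffSummary:
    """Items the outgoing engineer passes to the next on-call engineer."""

    open_incidents: list[str] = field(default_factory=list)
    risky_alerts: list[str] = field(default_factory=list)
    pending_deploys: list[str] = field(default_factory=list)
    customer_tickets: list[str] = field(default_factory=list)
    acknowledged: bool = False  # set once the next engineer confirms receipt

    def render(self) -> str:
        """Return a plain-text summary; empty sections are listed as 'none'."""
        sections = [
            ("Open incidents", self.open_incidents),
            ("Risky alerts", self.risky_alerts),
            ("Pending deploys", self.pending_deploys),
            ("Customer-impacting tickets", self.customer_tickets),
        ]
        lines: list[str] = []
        for title, items in sections:
            lines.append(f"{title}:")
            if items:
                lines.extend(f"- {item}" for item in items)
            else:
                lines.append("- none")
        return "\n".join(lines)
```

Listing empty sections explicitly as "none" lets the incoming engineer tell "nothing to report" apart from "forgot to check".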
Escalation
Use the escalation matrix in README.md. For SEV1, page the secondary on-call engineer and Massimo if mitigation is not already in progress 15 minutes after the incident starts.
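The SEV1 rule can be expressed as a small predicate, sketched below. It assumes the 15 minutes are measured from the time the incident was opened. The function name and the severity string format are hypothetical, and the full matrix in README.md still takes precedence.

```python
from datetime import datetime, timedelta

SEV1_ESCALATION_DELAY = timedelta(minutes=15)


def should_page_secondary(
    severity: str,
    opened_at: datetime,
    now: datetime,
    mitigation_in_progress: bool,
) -> bool:
    """Return True when the SEV1 rule calls for paging secondary and Massimo.

    Assumption: the clock starts at `opened_at`, and other severities are
    handled by the escalation matrix rather than by this check.
    """
    if severity != "SEV1":
        return False
    if mitigation_in_progress:
        return False
    return now - opened_at >= SEV1_ESCALATION_DELAY
```

A SEV1 opened at 10:00 with no mitigation under way would trigger the page from 10:15 onward. The same incident with mitigation already in progress would not.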