ADR-0019 - NATS placement: cache-01 stateful tier
- Status: Accepted (2026-05-21)
- Deciders: Massimo Bagnoli, Claude
- Implementation tasks: TASK-111, TASK-202
- Supersedes: ADR-0007 staging deployment note that placed NATS on mgmt-01
- Superseded by: none
Context
NATS broker deployment had two candidate hosts for the pilot:
- akira-mgmt-01-staging, the app tier that runs backend FastAPI, Caddy, Vault and Prometheus.
- akira-cache-01-staging, the stateful tier that already hosts Redis.
NATS JetStream is not a purely stateless broker in this design because it uses file storage for durable streams.
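To make the statefulness concrete, here is a minimal sketch of a file-backed stream definition. The field names follow the JetStream stream configuration JSON as commonly documented and should be checked against the NATS documentation. The stream name CDR and the subject cdr.> are hypothetical placeholders, not values taken from this ADR.

```python
import json


def file_backed_stream_config(name, subjects, replicas=1):
    """Build a JetStream-style stream config dict that persists to disk."""
    return {
        "name": name,
        "subjects": list(subjects),
        # "file" storage writes messages to disk, which is what makes the
        # broker a state-bearing service rather than a stateless relay.
        "storage": "file",
        # A single replica matches a single-node pilot deployment.
        "num_replicas": replicas,
    }


# Hypothetical stream name and subject, for illustration only.
config = file_backed_stream_config("CDR", ["cdr.>"])
print(json.dumps(config, indent=2))
```

Because the storage type is file, the host's disk holds the stream data, so host placement, disk capacity and backups all matter for this service.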
Decision
Place NATS on akira-cache-01-staging.
The pilot uses a single-node NATS deployment with a connection string of the
form nats://akira:pass@100.x.x.x:4222. TASK-202 prepares the GA migration to
a 3-node NATS cluster on distinct VMs. cache-01 can later evolve into a
dedicated nats-01 node if the workload justifies it.
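The move from one pilot node to three GA nodes can be sketched as a change in the server list that clients receive. This is an illustrative helper under stated assumptions, not the project's actual code: the function name nats_servers and every hostname below are hypothetical, and the literal password stands in for a value that would normally come from a secret store.

```python
from urllib.parse import quote


def nats_servers(user, password, hosts, port=4222):
    """Return one nats:// URL per host, with URL-safe credentials."""
    # Percent-encode the credentials so characters such as "@" or "/"
    # cannot break the URL structure.
    cred = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return [f"nats://{cred}@{host}:{port}" for host in hosts]


# Pilot: a single node (placeholder hostname, not a real address).
pilot = nats_servers("akira", "pass", ["cache-01.example"])

# GA: three nodes on distinct VMs (placeholder hostnames).
ga = nats_servers(
    "akira", "pass", ["nats-a.example", "nats-b.example", "nats-c.example"]
)

print(pilot)
print(ga)
```

Keeping the server list as configuration rather than a hard-coded string means clients should not need code changes when TASK-202 adds nodes.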
Rationale
Akira follows a 3-tier operational model:
- Management/app tier: backend FastAPI, Caddy, Vault and Prometheus.
- Signaling/media tier: Kamailio, RTPengine and FreeSWITCH.
- Stateful tier: Postgres, Redis and NATS JetStream.
NATS belongs to the stateful tier because JetStream persists messages on file
storage. Keeping it off mgmt-01 also preserves the app tier's future
horizontal scaling path and isolates app-tier failures from the event bus.
cache-01 already has Docker installed for Qdrant, so the marginal pilot
deployment cost is low.
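The placement rule implied by the 3-tier model can be written as a small check. This is a sketch only: the tier keys, the lowercase service identifiers and the placement_ok helper are hypothetical names chosen for illustration, and only the two hosts named in this ADR are mapped.

```python
# Services per tier, as listed in the rationale above.
TIERS = {
    "management": {"fastapi", "caddy", "vault", "prometheus"},
    "signaling": {"kamailio", "rtpengine", "freeswitch"},
    "stateful": {"postgres", "redis", "nats-jetstream"},
}

# Only the two candidate hosts from this ADR are mapped here.
HOST_TIER = {
    "akira-mgmt-01-staging": "management",
    "akira-cache-01-staging": "stateful",
}


def tier_of(service):
    """Return the tier a service belongs to, or raise KeyError if unknown."""
    for tier, services in TIERS.items():
        if service in services:
            return tier
    raise KeyError(f"unknown service: {service}")


def placement_ok(service, host):
    """True when the host's tier matches the service's tier."""
    return HOST_TIER[host] == tier_of(service)


for host in HOST_TIER:
    print(host, placement_ok("nats-jetstream", host))
```

Under this rule, NATS JetStream passes on akira-cache-01-staging and fails on akira-mgmt-01-staging, which mirrors the decision recorded here.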
Consequences
Positive
- State-bearing services stay grouped on the stateful tier.
- mgmt-01 remains closer to a stateless app and observability node.
- Failure of mgmt-01 does not necessarily take NATS down.
- Pilot cost stays lower than provisioning a dedicated NATS VM immediately.
Negative
- cache-01 carries another stateful service and must be monitored for disk, CPU and memory contention.
- Backup procedures for cache-01 must include JetStream state.
- The pilot remains single-node until TASK-202 introduces HA.
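The disk part of that monitoring requirement could start with a check like the one below. It is a rough sketch using only the Python standard library. The 80% threshold and the disk_pressure name are assumptions, and the path "." is a stand-in for whatever directory holds JetStream data on cache-01.

```python
import shutil


def disk_pressure(path, threshold=0.80):
    """Return (over_threshold, used_fraction) for the filesystem at path."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    return used_fraction >= threshold, used_fraction


# "." is a placeholder; point this at the JetStream storage directory.
over, fraction = disk_pressure(".")
print(f"used={fraction:.1%} over_threshold={over}")
```

In a real deployment this signal would more likely come from the existing Prometheus setup than from an ad hoc script; the sketch only shows what is being measured.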
Alternatives considered
Place NATS on mgmt-01
Rejected. It couples stateful event-bus storage to the app tier and conflicts with the intended scaling model.
Create a dedicated nats-01 VM immediately
Rejected for the pilot. The additional monthly cost is not justified before HA cluster work and production load require it.
References
- ADR-0007: original CDR pipeline NATS decision.
- ADR-0016: concrete CDR pipeline implementation.
- TASK-111: NATS role on cache-01.
- TASK-202: NATS cluster HA preparation.
- feedback_runner_executes_code_not_deploy.md: 3-tier deployment pattern.