
ADR-0016 - CDR pipeline implementation: NATS + sidecar + worker

  • Status: Accepted (2026-05-21)
  • Deciders: Massimo Bagnoli, Claude
  • Implementation tasks: TASK-111, TASK-112, TASK-113
  • Supersedes: none
  • Superseded by: none

Context

ADR-0007 and ADR-0010 define the asynchronous CDR pipeline pattern, but they do not fully document the concrete Wave 9-11 implementation. The smoke test on 2026-05-21 showed that the pattern existed as architectural guidance while the deployable pieces were still missing: no NATS role, no Kamailio bridge role and no CDR worker role.

This ADR records the implementation decision retroactively for audit trail and GA readiness.

Decision

Implement the CDR pipeline as three separate components.

NATS JetStream broker

infra/roles/nats deploys nats:2.10-alpine on akira-cache-01-staging, the stateful tier selected by ADR-0019.

The pilot deployment creates JetStream stream AKIRA_CDR with subjects cdr.raw and cdr.>, single-node topology and replicas=1. GA preparation is covered by TASK-202 with a 3-node NATS cluster.
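The stream definition above can be pictured as a small configuration plus NATS subject matching. The sketch below is illustrative only: the dictionary uses JetStream's stream configuration field names, but file storage is an assumption inferred from the stateful-tier placement, and subject_matches and captured_by_stream are hypothetical helpers, not part of the role. It also shows that cdr.> already matches cdr.raw, while call.events.<phase> subjects fall outside the stream.

```python
# Assumed shape of the pilot stream definition (field names follow the
# JetStream stream configuration; "storage": "file" is an assumption).
STREAM_CONFIG = {
    "name": "AKIRA_CDR",
    "subjects": ["cdr.raw", "cdr.>"],
    "storage": "file",
    "num_replicas": 1,
}


def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS subject matches a pattern.

    '*' matches exactly one token; '>' matches one or more trailing tokens.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            # '>' must be the last token and needs at least one token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if token != "*" and token != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


def captured_by_stream(subject: str) -> bool:
    """Hypothetical helper: would AKIRA_CDR store a message on this subject?"""
    return any(subject_matches(p, subject) for p in STREAM_CONFIG["subjects"])


if __name__ == "__main__":
    # "call.events.start" uses a hypothetical phase name for illustration.
    for subject in ("cdr.raw", "cdr.rated", "call.events.start"):
        print(subject, captured_by_stream(subject))
```

Because cdr.> covers every subject below cdr, any future cdr.* subject lands in the same stream without a configuration change.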

kam-cdr-bridge sidecar

infra/roles/kam_cdr_bridge deploys the bridge on sip-01 and sip-02, co-located with Kamailio.

The sidecar follows the offline wheelhouse install pattern already used by kam-sync in TASK-106. It tails /var/log/kamailio/acc/cdr.jsonl and publishes records to NATS subject cdr.raw.
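As a rough illustration of the tail-and-publish loop, the sketch below reads complete JSONL lines from a byte offset and hands each one to a publish callable. It is not the bridge's actual code: read_new_lines, forward_new_records and the publish signature are assumptions made for this example, and log rotation, offset persistence and the real NATS client are deliberately left out.

```python
import json
from typing import Callable

SUBJECT = "cdr.raw"


def read_new_lines(path: str, offset: int) -> tuple[list[bytes], int]:
    """Read complete JSONL lines starting at a byte offset.

    Returns the validated lines and the offset just past the last complete
    line, so a half-written trailing line is retried on the next call.
    """
    lines: list[bytes] = []
    with open(path, "rb") as fh:
        fh.seek(offset)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a line still being written
            offset += len(line)
            payload = line.strip()
            if payload:
                json.loads(payload)  # raises ValueError on malformed JSON
                lines.append(payload)
    return lines, offset


def forward_new_records(
    path: str, offset: int, publish: Callable[[str, bytes], None]
) -> int:
    """Publish every new complete record and return the new offset."""
    lines, new_offset = read_new_lines(path, offset)
    for payload in lines:
        publish(SUBJECT, payload)
    return new_offset
```

In a real sidecar the publish callable would wrap a JetStream publish, and the returned offset would only be persisted after the broker acknowledges the messages.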

TASK-192 extends the same bridge to publish live-call events on call.events.<phase> for the Live Calls SSE path.

cdr-worker

infra/roles/cdr_worker runs a durable JetStream consumer on stream AKIRA_CDR and subject cdr.raw.

The worker inserts records into the TimescaleDB cdr hypertable and is idempotent through the cdr.call_id unique constraint. TASK-190 extends this worker with rating-engine execution and balance debit, documented by ADR-0017.
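The idempotent insert can be sketched with a unique constraint and ON CONFLICT DO NOTHING. The example uses an in-memory SQLite database purely as a stand-in so that it runs without a server; PostgreSQL accepts the same ON CONFLICT clause. The column names other than call_id are hypothetical, and the real cdr hypertable schema is not shown in this ADR.

```python
import sqlite3

# Stand-in schema: only call_id and its uniqueness come from the ADR;
# started_at and duration_s are hypothetical columns.
SCHEMA = """
CREATE TABLE cdr (
    call_id    TEXT NOT NULL UNIQUE,
    started_at TEXT NOT NULL,
    duration_s INTEGER NOT NULL
)
"""

INSERT_SQL = """
INSERT INTO cdr (call_id, started_at, duration_s)
VALUES (?, ?, ?)
ON CONFLICT (call_id) DO NOTHING
"""


def insert_cdr(conn: sqlite3.Connection, record: dict) -> bool:
    """Insert one CDR; return False if the call_id was already stored."""
    cur = conn.execute(
        INSERT_SQL,
        (record["call_id"], record["started_at"], record["duration_s"]),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rec = {"call_id": "abc", "started_at": "2026-05-21T10:00:00Z", "duration_s": 42}
    print(insert_cdr(conn, rec))  # first delivery is stored
    print(insert_cdr(conn, rec))  # redelivery is ignored
```

Returning whether a row was actually written gives the caller a way to skip follow-up work, such as rating, for a redelivered record.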

Rationale

  • The design complies with ADR-0010: database writes stay out of the signaling path.
  • JetStream provides durable acknowledgement and replay, so worker restarts do not drop CDR records.
  • The sidecar pattern keeps the bridge close to Kamailio while preserving process isolation.
  • NATS belongs on the stateful tier because JetStream persists messages on disk; ADR-0019 formalizes that placement.
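The durability argument depends on acknowledging a message only after the record is safely stored. A minimal sketch of that ordering follows. It assumes a message object with a data attribute and async ack(), nak() and term() methods, which is the shape of a JetStream message in the nats-py client; handle_message and the store callable are hypothetical names, not the worker's real interface.

```python
import asyncio
import json
from typing import Awaitable, Callable


async def handle_message(
    msg, store: Callable[[dict], Awaitable[None]]
) -> str:
    """Process one message and acknowledge only after a successful write.

    Returns the action taken, which is convenient for logging and tests.
    """
    try:
        record = json.loads(msg.data)
    except ValueError:
        # Malformed payloads will never succeed, so stop redelivery.
        await msg.term()
        return "term"
    try:
        await store(record)
    except Exception:
        # The write failed: ask the broker to redeliver later.
        await msg.nak()
        return "nak"
    # If the worker crashes before this line, the broker redelivers the
    # message once the ack wait expires, so the record is not lost.
    await msg.ack()
    return "ack"
```

A crash between the write and the ack produces a duplicate delivery rather than a lost record, which is why the write itself has to be idempotent.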

Consequences

Positive

  • The pilot has a small operational surface: one stream, one consumer and a single-node NATS server.
  • CDR ingestion, live-call events and rating can share the same event boundary without putting work in Kamailio.
  • Worker redelivery plus database uniqueness gives at-least-once processing without double insertion.

Negative

  • The pipeline adds NATS, a bridge and a worker to the deployment surface.
  • The single-node pilot is not HA until TASK-202 completes.
  • Operational runbooks must cover NATS backlog, bridge lag and worker failures.

Alternatives considered

Kafka

Rejected. Kafka is operationally heavier than needed for the pilot scale and team footprint.

PostgreSQL NOTIFY/LISTEN

Rejected. It does not provide the durable replay and consumer backlog model required for billing-sensitive CDR processing.

Direct insert from a Kamailio plugin

Rejected. It would couple SIP signaling to database latency and violate ADR-0010.

Redis Streams

Rejected for this path. Redis remains useful elsewhere, but Akira already uses NATS as the durable event bus for CDR and future event flows.

References

  • ADR-0007: CDR pipeline with NATS JetStream.
  • ADR-0010: Kamailio CDR emit pattern.
  • ADR-0017: billing rating engine integration in cdr-worker.
  • ADR-0018: live calls SSE event stream.
  • ADR-0019: NATS placement on the stateful tier.
  • TASK-111: NATS role.
  • TASK-112: kam-cdr-bridge role and acc_json fix.
  • TASK-113: cdr-worker role.