Runbook - Vault Auto-Unseal Recovery

Normal Operation

The Akira primary Vault runs on akira-mgmt-01-staging and auto-unseals through the transit secrets engine of the transit Vault on akira-cache-01-staging.

RTO targets:

  • Auto-unseal after restart: under 5 seconds.
  • Manual transit unseal: 5 to 10 minutes, provided 1Password is reachable and a quorum of unseal keys is available.

Prerequisites

  • SSH access to akira-mgmt-01-staging and akira-cache-01-staging.
  • Access to the Akira Staging 1Password vault for unseal keys and break-glass material.
  • ~/.akira-vault-pass.txt available for redeploys.

Symptoms

  • Backend or worker crash-loops with errors mentioning a sealed Vault.
  • vault status reports Sealed: true.
  • Secret reads fail during deploy or service startup.

Diagnostics:

ssh root@akira-mgmt-01-staging '
VAULT_ADDR=https://127.0.0.1:8200 vault status
systemctl status vault --no-pager
journalctl -u vault --since "30 min ago" --no-pager | tail -120
'
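
The exit code of vault status can be read by a script instead of the printed table. The sketch below assumes the exit codes documented for the Vault CLI (0 unsealed, 1 error, 2 sealed) and that VAULT_ADDR is already exported; the function names are hypothetical helpers, not part of the Akira tooling.

```shell
# Map the exit code of `vault status` to a short verdict.
# Documented Vault CLI behaviour: 0 = unsealed, 1 = error, 2 = sealed.
vault_status_verdict() {
  case "$1" in
    0) echo "unsealed" ;;
    2) echo "sealed" ;;
    *) echo "error" ;;
  esac
}

# Hypothetical wrapper: run `vault status` quietly and print the verdict.
# Assumes VAULT_ADDR is exported in the calling shell.
check_vault() {
  rc=0
  vault status >/dev/null 2>&1 || rc=$?
  vault_status_verdict "$rc"
}
```

A verdict of "error" usually means the Vault process is down or unreachable rather than sealed, which points at systemctl status vault before any unseal attempt.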

Cause A: Transit Vault Down Or Sealed

Diagnostics:

ssh root@akira-cache-01-staging '
systemctl status vault --no-pager
VAULT_ADDR=https://127.0.0.1:8200 vault status
'

Recovery:

ssh root@akira-cache-01-staging '
systemctl start vault
VAULT_ADDR=https://127.0.0.1:8200 vault status
'

If transit Vault is sealed, unseal it with 3 Shamir keys from 1Password:

ssh root@akira-cache-01-staging '
export VAULT_ADDR=https://127.0.0.1:8200
vault operator unseal <key_1>
vault operator unseal <key_2>
vault operator unseal <key_3>
vault status
'

Then restart primary Vault:

ssh root@akira-mgmt-01-staging '
systemctl restart vault
sleep 3
VAULT_ADDR=https://127.0.0.1:8200 vault status
'

Expected: Sealed is false.
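
A fixed sleep 3 can be too short on a slow restart. As an alternative, the sketch below polls vault status until it reports unsealed or a retry budget runs out. It assumes VAULT_ADDR is exported and that it runs on the primary host; wait_until_unsealed is a hypothetical helper name.

```shell
# Poll `vault status` until it exits 0 (unsealed) or the attempts run out.
# $1 = maximum attempts (default 10), $2 = seconds between attempts (default 1).
wait_until_unsealed() {
  attempts="${1:-10}"
  delay="${2:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if vault status >/dev/null 2>&1; then
      echo "unsealed after $((i + 1)) attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "still sealed or unreachable after $attempts attempt(s)" >&2
  return 1
}
```

With the defaults the loop gives up after roughly 10 seconds, about twice the auto-unseal target, so a failure here suggests moving on to Cause B.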

Cause B: Transit Token Revoked Or Expired

Symptoms:

  • Transit Vault is unsealed and reachable.
  • Primary Vault remains sealed after restart.
  • Logs mention permission denied or token invalid for transit seal.
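
The log symptom can be checked mechanically. The sketch below reads log lines on standard input and succeeds if any line contains one of the phrases listed above. The patterns are assumptions derived from this symptom list, not exact Vault log messages, and looks_like_token_failure is a hypothetical helper name.

```shell
# Succeed if any log line on stdin contains a phrase this runbook associates
# with a revoked or expired transit token.
# The patterns are assumptions, not verbatim Vault log messages.
looks_like_token_failure() {
  grep -Eiq 'permission denied|invalid token|token invalid'
}
```

On the primary host it can be fed from the journal, for example: journalctl -u vault --since "30 min ago" --no-pager | looks_like_token_failure && echo "check Cause B".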

Recovery:

  1. Recreate the transit token using the existing setup workflow.
  2. Update vault_transit_token in encrypted Ansible vault.
  3. Redeploy Vault configuration to management.
  4. Restart primary Vault and validate.

cd /home/devcomm/akira
TRANSIT_HOST=akira-cache-01-staging.tail5f9c92.ts.net \
scripts/setup-vault-transit.sh

ansible-vault edit infra/group_vars/all/vault.yml \
--vault-password-file ~/.akira-vault-pass.txt

cd /home/devcomm/akira/infra
ansible-playbook -i inventory/staging.yml playbooks/deploy_management.yml \
--vault-password-file ~/.akira-vault-pass.txt \
--tags vault

ssh root@akira-mgmt-01-staging '
systemctl restart vault
sleep 3
VAULT_ADDR=https://127.0.0.1:8200 vault status
'
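
After the ansible-vault edit step, it is worth confirming that the file is still encrypted before anything is committed. The sketch below relies on the header line that ansible-vault writes at the top of encrypted files ($ANSIBLE_VAULT; followed by version and cipher); is_ansible_vault_encrypted is a hypothetical helper name.

```shell
# Succeed only if the first line of the file carries the ansible-vault header,
# for example "$ANSIBLE_VAULT;1.1;AES256".
# $1 = path to the file to check.
is_ansible_vault_encrypted() {
  head -n 1 "$1" 2>/dev/null | grep -q '^\$ANSIBLE_VAULT;'
}
```

From /home/devcomm/akira this could be run as: is_ansible_vault_encrypted infra/group_vars/all/vault.yml || echo "vault.yml is NOT encrypted".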

Cause C: Primary Manual Recovery

Use this only if transit recovery is not possible and the original primary unseal material is available.

ssh root@akira-mgmt-01-staging '
export VAULT_ADDR=https://127.0.0.1:8200
vault operator unseal <key_1>
vault operator unseal <key_2>
vault operator unseal <key_3>
vault status
'

Caveat: if the primary was initialized with the transit seal, the keys it issued are recovery keys, which cannot unseal Vault on their own, so a compatible transit key remains part of the recovery model. Do not rotate or recreate transit keys during an incident unless you are following the documented setup path.
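
Passing key shares as command-line arguments leaves them in shell history. When vault operator unseal is run without a key argument, it prompts for the key instead. The sketch below wraps that in a loop over SSH with a terminal allocated (ssh -t). unseal_with_prompt is a hypothetical helper, and the address matches the one used elsewhere in this runbook.

```shell
# Run `vault operator unseal` with no key argument so that Vault prompts for
# each key share on the terminal instead of reading it from the command line.
# $1 = SSH target, $2 = number of key shares to enter (default 3).
unseal_with_prompt() {
  target="$1"
  shares="${2:-3}"
  n=1
  while [ "$n" -le "$shares" ]; do
    echo "Enter key share $n of $shares when prompted"
    ssh -t "$target" 'VAULT_ADDR=https://127.0.0.1:8200 vault operator unseal' || return 1
    n=$((n + 1))
  done
}
```

For example: unseal_with_prompt root@akira-cache-01-staging 3, followed by vault status on that host to confirm the result.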

Validation

ssh root@akira-mgmt-01-staging '
VAULT_ADDR=https://127.0.0.1:8200 vault status | grep "Sealed.*false"
docker compose -f /opt/akira/docker-compose.yml restart backend cdr-worker
sleep 5
docker ps --filter "name=akira-backend" --filter "name=akira-cdr-worker" \
--format "{{.Names}} {{.Status}}"
'

Expected:

  • Primary Vault reports unsealed.
  • Backend and CDR worker stay running.
  • Deploys can decrypt and read required secrets.
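
The "stay running" check can be scripted against the docker ps output. The sketch below reads "name status" lines on standard input, reports any container whose status does not start with Up, and fails if fewer containers than expected are listed. containers_all_up is a hypothetical helper, and the container names in the usage line are illustrative.

```shell
# Read `docker ps --format "{{.Names}} {{.Status}}"` lines on stdin.
# Print every container whose status does not start with "Up".
# Fail if any such container exists or fewer than $1 lines were read.
# $1 = expected number of containers (default 1).
containers_all_up() {
  awk -v want="${1:-1}" '
    $2 != "Up" { print "not running: " $0; bad = 1 }
    END { exit ((bad || NR < want) ? 1 : 0) }
  '
}
```

Because docker ps without -a omits exited containers, the expected count is what catches a service that crashed and did not restart, for example: docker ps --format "{{.Names}} {{.Status}}" | containers_all_up 2.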

Escalation

Elapsed Time    Action
T+5 min         Escalate if transit is sealed and keys are unavailable.
T+10 min        Escalate to Massimo if primary Vault remains sealed.
T+20 min        Escalate to Francesco and stop token/key changes until reviewed.

Caveats

  • Do not paste unseal keys into tickets, logs, or chat.
  • Do not commit generated Vault tokens.
  • Do not initialize a replacement transit Vault unless the current transit data is confirmed lost.
  • Do not restart dependent services repeatedly while Vault remains sealed.