
Ledgering uptime keeps owners accountable
We treat SLOs like a balance sheet. Every dependency gets a budget, every breach gets a root-cause memo, and we expose the ledger to product so scope trades are explicit.
service: messaging-hub
slo:
  availability: 99.9
  latency_p95_ms: 600
dependencies:
  - name: sendgrid
    budget: 25%
  - name: auth0
    budget: 15%
alerts:
  burn_rate: 4h
  paging: squad-reliability
Ops rituals
- Golden-path checks live in CI and block merges when red
- Incident PR templates demand hypothesis + rollback path
- Blameless review within 48h with SLO debit/credit updates
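The debit/credit update from the last ritual can be sketched in a few lines. This is a minimal illustration, not the team's actual tooling: it assumes a rolling 30-day window, reads "budget: 25%" as a share of the service's error budget, and uses a hypothetical `DependencyAccount` class and incident id.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumption: a rolling 30-day SLO window; the source does not state one.
WINDOW_MINUTES = 30 * 24 * 60


def error_budget_minutes(availability_pct: float,
                         window_minutes: int = WINDOW_MINUTES) -> float:
    """Downtime (in minutes) the availability target tolerates over the window."""
    return window_minutes * (100 - availability_pct) / 100


@dataclass
class DependencyAccount:
    """One ledger line: a dependency and its share of the service's error budget."""
    name: str
    share_pct: float  # e.g. 25 for "budget: 25%" (assumed interpretation)
    entries: List[Tuple[str, float]] = field(default_factory=list)

    def allowance(self, service_budget_minutes: float) -> float:
        """Minutes of the service budget allotted to this dependency."""
        return service_budget_minutes * self.share_pct / 100

    def debit(self, incident_id: str, minutes: float) -> None:
        """Record downtime attributed to this dependency in a review."""
        self.entries.append((incident_id, -minutes))

    def credit(self, incident_id: str, minutes: float) -> None:
        """Give minutes back, e.g. when a review reassigns them elsewhere."""
        self.entries.append((incident_id, minutes))

    def balance(self, service_budget_minutes: float) -> float:
        """Allowance plus all signed entries recorded so far."""
        return self.allowance(service_budget_minutes) + sum(m for _, m in self.entries)


budget = error_budget_minutes(99.9)          # messaging-hub availability target
sendgrid = DependencyAccount("sendgrid", 25)
sendgrid.debit("INC-hypothetical-1", 6.0)    # hypothetical incident and duration
sendgrid.credit("INC-hypothetical-1", 2.0)   # review reassigned 2 minutes
remaining = sendgrid.balance(budget)
```

Keeping entries signed and append-only means the balance can always be recomputed from the review history, which is what makes the balance-sheet framing auditable.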
What makes a good ledger entry
Every dependency should have an owner, budget, and current burn rate. Capture noisy neighbors and vendor risk in the same view so product can trade scope with eyes open.
slo_ledger:
  service: api-gateway
  owner: platform
  budgets:
    auth0: 15%
    payments: 25%
    postgres: 35%
  alerts:
    burn_2h: page platform-oncall
    burn_24h: "open ticket + slack #reliability"
  runbook: https://runbooks.ndi/api-gateway-slo
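The "owner, budget, current burn rate" rule lends itself to an automated audit. The sketch below is one possible shape, not an existing tool: the field names (`owner`, `budget_pct`, `burn_rate`), the owner values, and the sample burn rates are all hypothetical, and burn rate is taken in the usual sense of observed error ratio divided by the error ratio the SLO allows.

```python
from typing import Dict, List

# Hypothetical field names for a per-dependency ledger record.
REQUIRED_FIELDS = ("owner", "budget_pct", "burn_rate")


def burn_rate(error_ratio: float, slo_pct: float) -> float:
    """Observed error ratio divided by the allowed one.

    1.0 means the budget is being spent exactly at the sustainable pace;
    higher values mean it will run out before the window ends.
    """
    allowed = 1 - slo_pct / 100
    return error_ratio / allowed


def audit_ledger(dependencies: Dict[str, dict]) -> List[str]:
    """Return human-readable problems; an empty list means the entry passes."""
    problems = []
    for name, dep in dependencies.items():
        for field_name in REQUIRED_FIELDS:
            if dep.get(field_name) is None:
                problems.append(f"{name}: missing {field_name}")
    # Budgets are shares of one service budget, so they cannot exceed 100%.
    total = sum(dep.get("budget_pct") or 0 for dep in dependencies.values())
    if total > 100:
        problems.append(f"budgets sum to {total}%, over 100%")
    return problems


# Budgets mirror the api-gateway entry; owners and burn rates are made up.
ledger = {
    "auth0": {"owner": "identity", "budget_pct": 15, "burn_rate": 0.4},
    "payments": {"owner": None, "budget_pct": 25, "burn_rate": 1.2},
    "postgres": {"owner": "data", "budget_pct": 35},
}
problems = audit_ledger(ledger)
current = burn_rate(error_ratio=0.004, slo_pct=99.9)
```

A check like this could run alongside the golden-path checks in CI, so an entry without an owner or a burn rate is caught before it reaches the shared view.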
Key takeaways
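- Keep SLOs in a ledger, with debits and credits updated after every review
- Give every dependency an owner and a budget
- Link each ledger entry to an actionable runbook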


