
Playbook

Shipping AI copilots with safety rails

Architecting LLM-powered assistants with eval loops, policy checks, and graceful fallbacks before you ever hit production.

Maya Park · 8 min read · February 20, 2026

Guardrails live closest to the user intent

We keep every copilot flow observable by default: prompts are versioned, evals run nightly against golden sets, and chat transcripts ship with redaction applied and policy outcomes attached, so product and security teams see the same ground truth.
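The nightly golden-set run can be sketched as a small harness. Every name here (`GoldenCase`, `evalPromptVersion`, the toy model) is illustrative, not a real API; the point is that each prompt version gets a pass rate and a list of failures to track regressions against.

```typescript
// Minimal sketch of a nightly golden-set eval loop. All names are
// illustrative assumptions, not a real library API.
type GoldenCase = { input: string; expected: string };

type EvalReport = { promptVersion: string; passRate: number; failures: GoldenCase[] };

// Score one prompt version against the golden set; `runModel` stands in
// for whatever versioned copilot call runs in production.
function evalPromptVersion(
  promptVersion: string,
  golden: GoldenCase[],
  runModel: (input: string) => string,
): EvalReport {
  const failures = golden.filter((c) => runModel(c.input) !== c.expected);
  return {
    promptVersion,
    passRate: (golden.length - failures.length) / golden.length,
    failures,
  };
}

// Toy stand-in model: uppercases the input, so the second case fails.
const golden: GoldenCase[] = [
  { input: "hi", expected: "HI" },
  { input: "refund", expected: "ESCALATE" },
];
const report = evalPromptVersion("v12", golden, (s) => s.toUpperCase());
console.log(report.passRate); // 0.5 with this toy model
```

Persisting one `EvalReport` per prompt version per night is what makes "regressions tracked per prompt version" more than a slogan.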

Policy gate with typed tool calls
type Escalation = { owner: string; reason: string };

const approveToolCall = definePolicy<Escalation>({
  id: "pii-blocker",
  onViolation: ({ tool, input, reasons }) => ({
    owner: "trust-and-safety@ndi.studio",
    reason: `PII caught before calling ${tool} => ${JSON.stringify(reasons)}`,
  }),
});

const result = await guardedAgent.run({
  prompt,
  tools: [crmSearch, createTicket],
  policies: [approveToolCall],
});

if (!result.allowed) audit.log(result.violation);

What ships with every copilot

  • Offline eval harness with regressions tracked per prompt version
  • Streaming traces + heatmaps so latency budgets stay honest
  • Fallback UX that keeps agents quiet when confidence drops
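The last bullet, the quiet-on-low-confidence fallback, can be sketched as a simple gate. The threshold value and the reply shape are assumptions for illustration:

```typescript
// Sketch of a confidence gate: below the threshold the copilot stays
// quiet and hands off rather than guessing. Shapes are assumptions.
type Draft = { text: string; confidence: number };
type CopilotReply =
  | { kind: "answer"; text: string }
  | { kind: "fallback"; handoff: "human" };

function gateReply(draft: Draft, threshold = 0.75): CopilotReply {
  if (draft.confidence >= threshold) {
    return { kind: "answer", text: draft.text };
  }
  // Low confidence: route to a human instead of a shaky answer.
  return { kind: "fallback", handoff: "human" };
}

const reply = gateReply({ text: "Your refund is approved.", confidence: 0.42 });
console.log(reply.kind); // "fallback"
```

The key design choice is that the fallback branch returns a distinct type, so the UI cannot accidentally render a low-confidence draft as an answer.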

The difference was the safety telemetry: we could show legal exactly where guardrails triggered before launch.

Director of Product, fintech client

Architecture sketch

Treat the copilot like a pipeline: intent → policy → plan → tools → answer. Observability wraps every hop with traces and redaction to keep legal and ops in the same loop.
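The hop sequence above can be sketched as a typed pipeline of stages. Every stage name and type here is illustrative; in practice each hop would also emit a trace span.

```typescript
// intent → policy → plan → tools → answer, as composed functions.
// All names and shapes are illustrative assumptions.
type Intent = { goal: string };
type Plan = { steps: string[] };

const classifyIntent = (prompt: string): Intent => ({ goal: prompt.trim().toLowerCase() });

// Policy hop: refuse before planning or tool use ever happens.
const checkPolicy = (intent: Intent): Intent => {
  if (intent.goal.includes("ssn")) throw new Error("policy: blocked PII request");
  return intent;
};

const plan = (intent: Intent): Plan => ({ steps: [`lookup:${intent.goal}`, "summarize"] });
const runTools = (p: Plan): string[] => p.steps.map((s) => `result(${s})`);
const answer = (results: string[]): string => results.join("; ");

const out = answer(runTools(plan(checkPolicy(classifyIntent("Order status")))));
console.log(out); // "result(lookup:order status); result(summarize)"
```

Because the policy hop sits before planning, a blocked request never reaches a tool, which is what makes the fallback and audit paths tractable.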

Debug checklist

  • Trace spans for tool calls with input/output masking
  • Golden set pass rate by prompt version
  • Fallback success rate when policy blocks execution
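The first checklist item, masked trace spans, can be sketched as a redacting recorder. The span shape and the email-only regex are assumptions; real masking would cover more PII classes.

```typescript
// Illustrative trace-span recording with input/output masking applied
// before anything is persisted. Shapes and regex are assumptions.
type Span = { tool: string; input: string; output: string };

const maskPII = (s: string): string =>
  s.replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[redacted-email]");

function recordSpan(tool: string, input: string, output: string): Span {
  return { tool, input: maskPII(input), output: maskPII(output) };
}

const span = recordSpan("crmSearch", "lookup jane@example.com", "found 1 match");
console.log(span.input); // "lookup [redacted-email]"
```

Masking at span-creation time, rather than at display time, means the raw PII never lands in the trace store at all.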

Launch criteria

Evals ≥ 0.86 accuracy on golden set, P95 latency < 1.2s, policy hit rate < 6%, and CSAT ≥ 4.4/5 in pilot.
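Those criteria can be encoded as a single launch-gate check. The metric names below are illustrative; the thresholds are the ones stated above.

```typescript
// Launch gate over pilot metrics. Field names are assumptions;
// thresholds mirror the launch criteria in the text.
type PilotMetrics = {
  goldenAccuracy: number; // eval accuracy on the golden set
  p95LatencyMs: number;   // P95 end-to-end latency
  policyHitRate: number;  // fraction of turns where a guardrail fired
  csat: number;           // pilot CSAT on a 1–5 scale
};

function readyToLaunch(m: PilotMetrics): boolean {
  return (
    m.goldenAccuracy >= 0.86 &&
    m.p95LatencyMs < 1200 &&
    m.policyHitRate < 0.06 &&
    m.csat >= 4.4
  );
}

const go = readyToLaunch({ goldenAccuracy: 0.9, p95LatencyMs: 980, policyHitRate: 0.03, csat: 4.5 });
console.log(go); // true
```

Running this gate in CI against the nightly eval report turns the launch criteria into an enforced check rather than a wiki page.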

Key takeaways

  • Guardrails before launch
  • Evals on every prompt
  • Human-in-the-loop defaults
LLM · Evaluation · Product
