Traces, spans, eval labels, annotations, and what happened.
Release authority · Phoenix evidence path
Block unsafe AI agent releases before production.
AgentGate reads candidate execution evidence, applies AgentPack policy, computes gate-bound metrics, and writes deterministic BLOCK/APPROVE decisions.
AgentPack thresholds, company-specific metrics, audit artifacts, and ship/no-ship decisions.
Reference workflow: Reference Ops AI
v2 BLOCKED → release controls generated → v2.1 verified against those controls → APPROVED.
View blocker evidence detail
Sensitive output violation
Role `developer` · tool `deep_investigate_alert`.
Sensitive output violation
Role `developer` · tool `deep_investigate_alert`.
Sensitive output violation
Role `developer` · tool `deep_investigate_alert`.
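The path from a blocked v2 to a verified v2.1 can be sketched roughly as follows. This is a minimal illustration, not AgentGate's actual control format: the `Violation` and `ReleaseControl` record shapes, the `sensitive_output` kind string, and both helper functions are assumptions introduced here. It shows repeated blocker violations collapsing into one release control, which the next candidate is then checked against.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    kind: str  # hypothetical label, e.g. "sensitive_output"
    role: str
    tool: str


@dataclass(frozen=True)
class ReleaseControl:
    kind: str
    role: str
    tool: str


def controls_from_blockers(violations):
    """Collapse repeated blocker violations into one control per pattern."""
    seen = {}
    for v in violations:
        key = (v.kind, v.role, v.tool)
        seen.setdefault(key, ReleaseControl(*key))
    return list(seen.values())


def verify(controls, candidate_violations):
    """Return the inherited controls that the candidate still violates."""
    hit = {(v.kind, v.role, v.tool) for v in candidate_violations}
    return [c for c in controls if (c.kind, c.role, c.tool) in hit]


# Three identical blocker findings from the blocked run (hypothetical records).
v2_blockers = [Violation("sensitive_output", "developer", "deep_investigate_alert")] * 3
controls = controls_from_blockers(v2_blockers)

# A later candidate with no repeat violations passes every inherited control.
failed = verify(controls, [])
```

An empty `failed` list corresponds to the "verified against those controls" step; any remaining entry would keep the candidate blocked.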
How it works
Phoenix records what happened. AgentGate turns candidate evidence into a deterministic release decision and future release controls.
Collect candidate evidence
Pull controlled candidate evidence from Phoenix MCP or bundled reference evidence — traces, spans, eval labels, policy preflights, and tool calls.
Apply AgentPack policy
Evaluate release-safety metrics and AgentPack custom metrics against effective policy thresholds.
Evaluate blocker and warning controls
Score gate-bound blocker metrics and non-blocking warning controls.
Verify inherited release controls
When available, verify the candidate against release controls generated from a prior blocked run.
Generate future controls if blocked
Convert blocked failure patterns into release controls the next candidate must pass.
Write ship / no-ship decision
Write a deterministic BLOCK or APPROVE decision from metrics, inherited controls, and AgentPack policy.
Render audit report
Write metric provenance, regression gates, verification results, and an exportable audit report.
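The decision step above can be sketched as a pure function of metrics, thresholds, and failed inherited controls. This is a hedged illustration under stated assumptions: the `Threshold` shape, the metric names, the example values, and the fail-closed handling of missing metrics are all hypothetical and not taken from AgentGate or AgentPack.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    max_value: float
    blocking: bool  # True = gate-bound blocker, False = non-blocking warning


def decide(metrics, thresholds, failed_controls):
    """Return a BLOCK/APPROVE record; same inputs always give the same output."""

    def exceeded(t):
        # Assumption: a missing metric counts as exceeded (fail closed).
        return metrics.get(t.metric, float("inf")) > t.max_value

    blockers = sorted(t.metric for t in thresholds if t.blocking and exceeded(t))
    warnings = sorted(t.metric for t in thresholds if not t.blocking and exceeded(t))
    decision = "BLOCK" if blockers or failed_controls else "APPROVE"
    return {
        "decision": decision,
        "blockers": blockers,
        "warnings": warnings,
        "failed_controls": sorted(failed_controls),
    }


# Hypothetical policy: zero tolerated sensitive-output violations, latency as a warning.
policy = [
    Threshold("sensitive_output_violations", 0, blocking=True),
    Threshold("p95_latency_s", 10.0, blocking=False),
]

# Invented example values for two candidates.
v2 = decide({"sensitive_output_violations": 3, "p95_latency_s": 12.5}, policy, [])
v2_1 = decide({"sensitive_output_violations": 0, "p95_latency_s": 8.0}, policy, [])

# Sorted keys keep the serialized audit record stable across runs.
audit_record = json.dumps(v2, sort_keys=True)
```

Keeping the function free of clocks, randomness, and model calls is what makes the result reproducible: rerunning it on the same evidence yields the same record.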
Where each piece sits
Arize Phoenix
Evidence backend — traces, spans, eval labels, annotations, and observability context.
AgentGate
Release authority — AgentPack-defined policy thresholds, custom metrics, BLOCK/APPROVE.
Gemini
Explains selected dangerous sessions only; does not decide release.
Cloud Run + release workflow
Hosts this dashboard and release workflow.