Phase 12 of 12 · Security Operator

Agent Governance & Security

Phase 12 is the work of defining what agents can do and under whose authority, where humans set threat models, least privilege, and audit trails at the tool boundary.

Define what agents can do, under whose authority, with which guardrails and audit trail.

Decision rules

Each rule connects a real situation to the skill or playbook that fits it. Linked terms open canonical sources.

Decision rules for Agent Governance & Security
Situation Missing skill Recommended playbook Alternatives Why
Agent-generated code is shipping to production without a security pass on it. Automated security scanning codex-security:security-scan Snyk / Semgrep Codex-security:security-scan is tuned for agent-authored diffs and blocks on high-severity findings; Snyk and Semgrep are general-purpose and need policy work to be as strict.
An agent can call any tool it has access to using the user's full identity. Threat modelling codex-security:threat-model STRIDE workshop Codex-security:threat-model defines agent identity, scope and policy at the tool boundary; a STRIDE workshop is the broader org-wide exercise when multiple systems are in scope.
Production agents have tool access in place but no runtime policy enforcing it. Runtime guardrails Kong Agent Gateway AWS Bedrock Guardrails Kong Agent Gateway sits between agent and tools and denies by default; Bedrock Guardrails is the right pick when the stack is already on AWS and Bedrock is the inference layer.
An agent reads external content as part of its job and could be hijacked through it. Prompt injection testing Prompt injection defense Lakera / HiddenLayer The prompt-injection-defense playbook tests indirect injection before launch as part of the eval suite; Lakera and HiddenLayer are runtime services that catch attacks in production but don't replace the pre-launch test.

Watch

Reality

Write-access agents and MCP-style tool use create confused-deputy and indirect prompt-injection risks that standard IAM does not fully solve.

Required skills

  • Agent threat modelling
  • Least-privilege tool design
  • Prompt injection testing
  • Policy-as-code review
  • Audit trail design

Failure modes

  • Confused deputy
  • Indirect prompt injection
  • Overprivileged agents
  • Missing audit trail

Next operating step

Set controls at the tool boundary: agent identity, least-privilege permissions, policy checks, sandboxing, prompt-injection tests, and audit trails.

Working through Agent Governance & Security?

I advise teams on this part of the lifecycle. Get in touch → if you want a direct, vendor-free conversation about what's worth doing next.