Inside Microsoft Agent Governance Toolkit: What It Does, And What Still Missing

When Microsoft open-sourced the Agent Governance Toolkit (AGT) in April 2026, it filled a gap the industry had been circling for two years. As enterprise AI moved from chat to action — agents calling tools, mutating data, talking to other agents — security teams kept asking the same question: who governs what these agents actually do? AGT is Microsoft’s answer, and it’s a serious one.

This post is a practitioner’s read of AGT. What it is, what it gets right, and — just as important — where it deliberately stops, so teams know what they still have to bring to complete the picture.

What AGT Is

AGT is an MIT-licensed, open-source toolkit for governing the runtime behavior of autonomous AI agents. It’s polyglot (Python, TypeScript, .NET, Rust, Go) and integrates with the major agent frameworks the industry actually uses — LangChain, CrewAI, AutoGen, OpenAI Agents, Google ADK, Semantic Kernel, AWS Bedrock, Microsoft Agent Framework, and 20+ more. Microsoft has stated its intent to move governance of the project into a neutral foundation over time.

The headline claim is bold and specific: AGT is the first toolkit to address all ten OWASP Agentic AI risks with deterministic, sub-millisecond policy enforcement. Every tool call, resource access, and inter-agent message is evaluated against policy before execution — auditable and reproducible.

That framing matters. AGT is not a guardrail wrapper around prompts. It’s a runtime governance layer that intercepts what the agent actually does.

The Modules That Define AGT

AGT ships as seven independently installable packages, each addressing a distinct concern in the agent lifecycle:

Agent OS

The stateless policy engine that intercepts every agent action before it executes. Supports YAML, OPA Rego, and Cedar policies. Sub-millisecond evaluation. This is the heart of AGT.

Agent Mesh

Cryptographic identity for agents and inter-agent trust scoring. One agent calling another? Mesh verifies who’s calling and applies a trust policy. Notably, this gives AGT its own agent-identity primitives without requiring an external IdP.

Agent Runtime

Execution sandboxing with privilege rings, saga orchestration for multi-step workflows, and emergency termination. Contains what a misbehaving agent can touch.

Agent SRE

Production reliability practices applied to autonomous agents: SLO monitoring, circuit breakers, cost caps, chaos engineering.

Agent Compliance

Tamper-evident audit logging via OpenTelemetry, with automated mapping to EU AI Act, SOC 2, HIPAA, and GDPR controls.

Agent Marketplace

Plugin and MCP server lifecycle management with supply-chain trust scoring. Stops an agent from loading unvetted extensions at runtime.

PromptInjectionDetector

A local, in-memory regex and heuristic scanner that runs at the input boundary and after every tool fetch. Sub-millisecond, fail-closed.

A few additional packages — Agent Lightning (RL policy enforcement) and Agent Hypervisor (hardware-level isolation) — extend the model further but address more specialized concerns.

The architecture is deliberately modular. You don’t have to take everything. The policy engine alone is useful; so is the runtime; so is Mesh. They compose.

What AGT Gets Right

It’s worth being specific about Microsoft’s contribution here, because it’s substantial.

The license and the neutrality

AGT is MIT-licensed and explicitly intended for a foundation home. Microsoft could have made this proprietary, Azure-locked, or Copilot-only. They did the opposite. This is a real act of standards-building, and the industry benefits.

The framework reach

Most “agent platform” tools assume you’re on one stack. AGT hooks the native extension points of LangChain, CrewAI, AutoGen, Google ADK, AWS Bedrock, and others. You don’t rewrite your agent code to adopt it. That portability is genuinely difficult engineering.

The policy engine choices

Supporting YAML, OPA Rego, and Cedar means teams aren’t forced to learn a proprietary DSL. OPA Rego in particular is a CNCF-graduated standard with mature tooling and broad familiarity.

Coverage of OWASP Agentic Top 10

This isn’t a marketing claim — the toolkit maps explicit controls to each risk, and the modular packages line up with real categories of harm (tool misuse, prompt injection, supply chain, runaway execution, inter-agent abuse).

Deterministic enforcement

The decision to keep the policy gate deterministic and sub-millisecond is the right call. Authorization decisions need to be auditable and reproducible. Probabilistic gates can’t be either.

This is a credible foundation. Teams building agent platforms should adopt AGT rather than reinvent it.

A Concrete Walkthrough — Wiring AGT Into an Existing Agent

Take a common case: a LangGraph-powered project management assistant for Jira. It looks up tickets by key to retrieve summaries, descriptions, statuses, comments, and attachments; runs JQL queries across many issues to aggregate status updates, find blockers, and generate reports; manages a long-term key-value memory; and sends notifications to Slack.

An assistant like this typically has content guardrails in place — scanners running over user inputs, agent responses, and retrieved Jira attachments to catch prompt injection and malicious payloads. Those handle the data layer: the contents the agent sees and emits. What they cannot handle is the action layer: the fact that the LLM might decide to call a tool — write to memory, post to Slack, run a broad JQL query — for reasons that look benign on the surface but violate policy. The agent needs a deterministic gate on every action.

That is exactly the gap AGT fills. Here is what wiring AGT into an agent like this looks like in practice.

Step 1 — Define a governance policy

AGT policies are declarative. Start with a YAML file that explicitly whitelists the tools the agent is allowed to invoke. Tighter parameter constraints, sequence rules, and conditional logic come next, but the first cut is just the allowlist.

name: jira-agent-policy
version: "1.0"
defaults:
  action: allow

rules:
  - name: allow-jira-fetch
    condition: "tool_name == 'fetch_jira_ticket'"
    action: allow
    priority: 100
  # Additional rules for JQL search, memory writes, notifications, etc.

The policy file is living code: versioned, reviewed, with a regression suite alongside.

Step 2 — Import the governance entry point

Pull govern from the agentmesh package — AGT’s policy-engine integration:

from agentmesh.governance import govern

Step 3 — Wrap `govern` in a framework-aware decorator

This is the step that does the real work. govern itself is generic; LangGraph passes tool calls with its own schema, and the policy engine needs richer context — the tool name, a structured action object — than the default wrapper provides. The clean answer is a small custom decorator that does three things: defines how denials are surfaced back to the agent, wraps the function with govern, and patches the context builder so every call presents the right metadata to the policy engine.

def governed(policy: str):
    def decorator(fn):
        def handle_deny(decision):
            return f"Action denied by policy: {decision.reason}"

        wrapper = govern(fn, policy=policy, on_deny=handle_deny)

        original_build_context = wrapper._build_context
        def custom_build_context(args: tuple, kwargs: dict) -> dict:
            ctx = original_build_context(args, kwargs)
            ctx["tool_name"] = fn.__name__
            ctx["action"] = {"type": "call", "tool_name": fn.__name__}
            return ctx

        wrapper._build_context = custom_build_context
        wrapper.__name__ = fn.__name__
        return wrapper
    return decorator

This is the kind of glue most teams write once and reuse across every tool. It is also where framework-specific quirks get absorbed so the rest of the agent code stays untouched.

Step 4 — Stack the decorator alongside LangGraph’s `@tool`

Apply both decorators to every tool function. LangGraph keeps doing tool discovery; AGT intercepts the call before it executes.

@tool
@governed("agent/policy.yaml")
def fetch_jira_ticket(ticket_key: str) -> str:
    """Fetch a Jira ticket's details..."""
    ...

The agent code itself does not change. Only the tool boundaries gain a policy gate.

Step 5 — Optional: post-execution output scanning

The custom @governed wrapper is a natural place to also invoke post-execution scanning — passing the tool’s raw output back through the existing content guardrails before returning it to the LLM’s context. This closes the perimeter: AGT governs the intent to act; the content scanners govern the results of the action.

Before and After

Without AGT, an agent like this relies on probabilistic LLM behavior plus content filtering — a model that usually does the right thing, with guardrails on the inputs and outputs. With AGT in place, every tool call is intercepted by a deterministic, sub-millisecond policy engine. The agent cannot invoke a tool the policy has not authorized, regardless of what the LLM decides. That is the difference between trusting the model and verifying the action — and it is exactly what the action layer is supposed to do.

The Gaps Worth Knowing

AGT is a foundation, not a complete solution — and Microsoft is open about that. The toolkit scopes itself deliberately, expecting teams to layer in complementary tools for several important concerns. Here are the gaps that matter most when planning a real deployment.

1. PromptInjectionDetector — pattern-based, by design

AGT’s built-in injection detector is a local, regex-and-heuristic scanner with a configurable blocklist and risk-score threshold. Its strengths are exactly what its design promises: it’s fast, deterministic, runs in-process, and adds essentially no latency. It catches the obvious — ignore previous instructions, markers, suspicious tool names, hidden Unicode.

Where it stops is also a function of that design. It cannot reason about meaning. A paraphrased jailbreak, a multi-turn semantic injection, a ROT13- or base64-encoded payload, or a novel attack pattern the rules haven’t seen — these can pass through. In practice, teams have demonstrated bypasses against the default ruleset with simple obfuscation. Microsoft’s own documentation is candid: detection effectiveness depends on threshold tuning against your specific threat model.

This is a known tradeoff: rules are deterministic and fast; semantic understanding requires a model. For threat-detection on adversarial content, that tradeoff matters. Teams running agents over untrusted retrieved content typically need a semantic, model-based detector in addition.

2. Azure Prompt Shields — capable, but cloud-bound

Microsoft does have a model-based prompt-injection detector: Prompt Shields, part of Azure AI Content Safety. It’s a genuinely capable detector that handles cases the regex layer can’t.

It is not part of AGT. It’s an Azure cloud API. That introduces three architectural realities worth planning for:

Latency

Prompt Shields is a remote call. Adding it to the hot path of every tool fetch and every retrieved chunk adds round-trip cost to a flow that AGT itself designed to be sub-millisecond.

Cost

Prompt Shields bills per text record (up to 1,000 Unicode characters per record). In autonomous agents, scan counts compound fast — a workflow that retrieves 20 chunks, loops 5 times, across 3 tools, can easily generate 300 shield scans per user interaction. At fleet scale, the bill is real.

Scope

Prompt Shields analyzes text. It doesn’t process binary artifacts directly — you parse and OCR first, then send the extracted text. Attacks hidden in malformed encodings, tiny fonts, or invisible characters that defeat OCR will defeat the shield. And Spotlighting (the indirect-injection defense) base64-tags untrusted content to lower its trust weight, which increases token usage and sometimes surfaces the tagging in model output.

None of these are flaws — they’re the natural cost of a centralized cloud classifier. For some deployments they’re fine; for high-throughput or air-gapped environments, they push teams toward local, on-prem alternatives.

3. Microsoft Purview — designed for the cold path

A frequent question: can Purview classify content for agents in real time? The honest answer is no, and the architecture documentation says so directly.

Purview is an excellent data-at-rest catalog. It scans SharePoint, OneDrive, SQL, and other repositories asynchronously, applies sensitivity labels, and exposes those labels as metadata. For agent governance, the right pattern is the hybrid hot-path / cold-path topology Microsoft recommends:

Cold path

Purview pre-classifies documents in the background and writes sensitivity labels to metadata.

Hot path

When an agent retrieves a document, AGT’s policy engine reads the pre-cached Purview label — not the file content — and enforces the policy in sub-millisecond.

Audit

AGT exports OpenTelemetry logs back to Purview for compliance reporting.

This works well. What it doesn’t do is classify content that isn’t yet in a Purview-scanned repository — ephemeral tool outputs, real-time user inputs, freshly scraped web content. Synchronously calling Purview APIs from the agent’s execution path is documented as an anti-pattern: high latency, rate-limit exhaustion, runaway cost. For hot-path classification, teams need a separate runtime classifier.

Purview also does not detect prompt injection or jailbreaks. It is a data classification and governance tool, not a threat detector. Active AI attacks require runtime tools — AGT’s local detector, Prompt Shields, or an alternative.

4. Authorization — where AGT deliberately stops at the agent

This is the area teams most often misjudge. AGT’s policy engine governs what an agent does — tool calls, inter-agent messages, plugin loads, resource access at the URL or endpoint level. It does this very well.

What AGT does not do is verify the human user behind the agent or make per-record data-access decisions. There is no IdP in AGT, no source-system ACL projection, no notion of “this user is the manager of the data subject.”

For the Microsoft estate, Microsoft Entra (and Entra Agent ID) closes much of this. Entra provides verified human identity via OIDC, Conditional Access, Identity Protection signals, lifecycle revocation, and — the headline — the Agent User Account / on-behalf-of (OBO) flow. With OBO, the agent acts as the user, presents the user’s token to the source, and the source enforces its own ACLs. For SharePoint, OneDrive, Exchange, Teams, and Graph-connected data, this is a complete answer.

The residual is what lives outside the Microsoft estate: per-record authorization for Workday, Salesforce, GitHub, SAP, ServiceNow, and on-prem databases where OBO doesn’t reach or isn’t faithfully supported. Domain business relationships beyond the basic org chart — hrbp_for, account_owner_for, territory_manager_for. Field-level access (e.g., suppressing a salary field on an otherwise-permitted record). Purpose-bound access and data residency at the resource level. These are layered on by complementary authorization platforms or built in-house.

There is also the service-account case. Many real agent deployments — multi-tenant agents, background workflows, agents calling legacy non-OAuth APIs — cannot use OBO. In those cases, the source sees only the application identity, and the user-level decision has to be made somewhere else, in the agent runtime. AGT provides the enforcement gate; the per-record decision needs an external Policy Decision Point.

5. Content classification — AGT is content-blind by design

A related point: AGT does not natively detect PII, PCI, PHI, secrets, or API keys in the content flowing through an agent. This is intentional — AGT is a policy enforcer, not a DLP scanner. It expects classification signals to come from elsewhere.

For data at rest, that elsewhere is Purview (cold path). For data in motion — the user’s chat prompt, an ephemeral tool output, a freshly scraped page — a runtime classifier is required. AGT will then enforce policy on the classification tag, but it won’t produce the tag.

6. Multimodal scope

The current AGT detection layer is text-oriented. Agents increasingly process documents, images, audio, and other binary artifacts. Attacks hiding in steganographic images, malicious PDFs, or audio prompts are an active threat surface. AGT’s policy engine is binary-agnostic — it’ll enforce whatever policy you write — but the signals needed to drive that policy across modalities require complementary detection layers.

How To Think About Using AGT

Adopt AGT as the runtime enforcement foundation. Don’t rebuild what it already gives you well — the policy engine, the sandbox, the kill switch, the supply-chain trust scoring, the inter-agent mesh, the audit logging. Reinventing any of these is engineering time spent on commodity.

Then plan the layers around it deliberately:

Pair with Entra (or your IdP) for identity. OBO is the cleanest answer for Microsoft-native data. Conditional Access and Identity Protection give you the device, location, and risk signals AGT’s policy can act on.
Adopt the hot-path / cold-path topology for classification. Purview (or equivalent) handles at-rest sensitivity labeling; a runtime classifier handles the in-flight stream. AGT enforces on both.
For authorization beyond the Microsoft estate, plan for a Policy Decision Point that handles per-record decisions for non-Microsoft sources, domain business relationships, purpose binding, residency, and the service-account case. This can be a complementary platform or an in-house build.
For threat detection on retrieved content — especially in regulated or air-gapped environments where latency and data-egress constraints matter — pair AGT’s local detector with a semantic, model-based scanner that can run where the content lives. On-prem content-integrity platforms are emerging specifically for this layer.
Treat AGT’s audit log as the spine of your agent compliance story. OpenTelemetry export to Purview, your SIEM, or your compliance portal gives you the trail regulators are going to ask about.

The key shift is mindset: AGT is the foundation, not the whole house. Microsoft built it that way on purpose — modular, neutral, and meant to be composed with other tools.

Summary

The Agent Governance Toolkit is a serious contribution to a problem that needed one. It gives the industry a credible, open-source, framework-neutral foundation for governing what AI agents do at runtime — policy enforcement, sandboxing, supply-chain trust, inter-agent identity, audit. Microsoft’s decision to release it under MIT and move it toward foundation governance is the right move for the ecosystem.

It is also, by deliberate scope, a foundation rather than a complete solution. Semantic prompt-injection detection beyond pattern-matching, real-time content classification, per-record authorization for non-Microsoft data, and air-gapped deployments all sit outside what AGT itself does — and Microsoft expects teams to layer those in.