Governance is the new agent bottleneck

Picture a small AI assistant that lives inside Microsoft Teams, answers chat, sends email under its own address, and reads your company files. Two years ago, wiring one took weeks of plumbing. At Microsoft Build 2026 I watched the same thing built on stage in under 25 minutes. That speedup is not the story I took away. The story I keep coming back to is that nothing else in the pipeline got any faster, so the slow step is no longer "can a developer build this"; it is "who is allowed to ship one, against which data, with which audit trail." That gap is what I want to walk through here, because it is the gap I keep running into at the Mittelstand IT desks I work with.

Key takeaways

  • Build time collapsed to 25 minutes, so the binding constraint moved upstream to who can authorize, scope, and audit a new agent.
  • Microsoft GA'd Agent 365 on 2026-05-01 because it had already lost round one of shadow AI inside its own tenants.
  • Entra Agent ID, Purview, DLP, and Managed Environments are the four levers that gate every Copilot Studio agent shipped after 2026-03-18.
  • EU Mittelstand readers face a single deadline: 2026-08-02 GPAI enforcement, not a separate "AI governance" deadline.
  • If your stack lives on GitHub and Slack, none of this applies yet, and your control plane question is a different one.

The 25-minute number is the tell

Replaying the DEM332 walkthrough from Build 2026 on the Microsoft Developer channel, I watched a working project-management agent get provisioned, deployed into Teams, dropped into a group chat, and rewired with its own email identity, all on a single live stream timed against the talk's title number. The session itself is short, around twenty minutes of run time on the recording, and the entire demo fits inside it.

That is the headline. My honest reading is the opposite of what the headline implies. When a step that used to take days collapses to minutes, the binding constraint has not been removed; it has moved. Building agents was never the slow part of shipping agents at a regulated company. Approval was. Data scoping was. Audit was. The 25-minute number does not make those steps faster; it makes them more visible by removing the camouflage of slow plumbing. I read it as a tell, not a victory lap.

Microsoft's own timing reads the same way. On 2026-05-01 it made Agent 365 generally available at $15 per user per month standalone, or bundled into M365 E7 at $99 per user per month, and pitched it explicitly as a "control plane for observing, governing, and securing AI agents." I read the timing as a receipt: a control plane GA on the same calendar as a 25-minute build demo is not a coincidence. It is what the first round of shadow AI inside its own tenants already cost Microsoft.

💡 The 25-minute build does not make governance faster. It makes governance the only thing left to be slow.

That is my thesis. Everything below is the wiring I would actually use.

What collapsed: the build path

The DEM332 build path is genuinely shorter. A single CLI command, "teams app create", provisions an Entra app registration, a bot resource, secrets, and a manifest in one shot, and the new Teams SDK is now GA in Python, JavaScript, and C#. A Copilot-driven scaffolder reads the docs and writes the integration code into an existing web app while the presenter is talking. The deployment surface is Teams itself, which the demo positions as the new front door for any agent regardless of where it runs (Foundry, Vercel, CrewAI, Replit).

The Teams surface is also a decision lock. The moment you pick Teams as the publishing surface, you have implicitly turned on Tenant Graph Grounding and accepted the M365 data boundary for the agent. That is a different argument from the one in the Copilot Studio surface-choice post, but it sharpens the same point: surface is not a UX decision, it is an auth decision. Monday, the IT lead can name the allowed publishing surfaces inside the Microsoft 365 Admin Center agent management page under Copilot Control System lever two ("sharing"), so that "the next agent gets shipped into Teams" becomes a tenant policy, not a maker choice. Without that policy, the publishing surface is the maker's preference, and the data boundary follows the surface, not the intent.

What did get faster, then, is the developer-to-installation-link path. Not the agent-to-production path. That distinction is the one I would not let a vendor demo blur for me.

What did not collapse: the governance surface

Here is the part the 25 minutes do not include. Every new Copilot Studio agent created after 2026-03-18 auto-receives an Entra Agent ID. Pre-existing agents need a recreate-with-Agent-ID migration that is a real piece of work. Once the agent has an identity, Conditional Access policies for agents can target it with the same vocabulary as a user policy: device compliance, network location, sign-in risk. Monday, the IT lead writes one CA policy that blocks any agent identity attempting a sign-in from outside the corporate network or from a non-compliant device. That single policy turns "the agent inherits the maker's scope" into "the agent inherits the maker's scope, minus the contexts the policy refuses," which is the gap between a demo and a defensible posture.
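The decision that policy encodes can be written down in a few lines. This is a minimal sketch of the rule as a toy model, not the Entra API: `SignInContext`, `evaluate_agent_sign_in`, and the example network range are hypothetical names and values I made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Hypothetical corporate egress range (a documentation-only address block).
CORPORATE_NETWORKS = [ip_network("203.0.113.0/24")]


@dataclass(frozen=True)
class SignInContext:
    """The facts the rule looks at for one sign-in attempt."""
    is_agent_identity: bool
    source_ip: str
    device_compliant: bool


def evaluate_agent_sign_in(ctx: SignInContext) -> str:
    """Return "allow", "block", or "not_applicable" for one attempt."""
    if not ctx.is_agent_identity:
        # This rule only targets agent identities; users are out of scope.
        return "not_applicable"
    on_corp_network = any(
        ip_address(ctx.source_ip) in net for net in CORPORATE_NETWORKS
    )
    if on_corp_network and ctx.device_compliant:
        return "allow"
    # Anything off-network or on a non-compliant device is refused.
    return "block"
```

The useful property is that the default branch is "block": an agent sign-in has to satisfy both conditions to get through, which is the "minus the contexts the policy refuses" part of the sentence above.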

The audit story is where the language matters. Purview Audit logs record that an agent interaction happened, with timestamp, user, and workload, but the substance of the prompt and response lives in the user's mailbox and is retrievable via eDiscovery against that mailbox using the "Copilot activity" condition. This is the same audit-trail surface that the deterministic workflow spine reads from, and it is the upstream half of the same governance problem documented in the agent inventory critique. One post asks what happens after you have already shipped 50 agents. This one asks what gates the next 50 from shipping at all.

DLP is the third lever. Purview data security for Copilot Studio agents lets you block sensitive items from being processed or referenced inside an agent's responses. Monday, the data protection officer stands up a DLP policy that catches the existing sensitivity labels ("Confidential / Finance", "PII / Customer") inside any Copilot Studio agent, and tests it against a synthetic prompt that asks for a salary table. If the agent answers, the policy is wrong.
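The salary-table test is easier to repeat if it is a small probe rather than a manual chat. A minimal sketch, assuming you have some way to send a prompt to the agent and get text back: `ask_agent` is a hypothetical callable standing in for that channel, and the marker strings are synthetic values you would plant in a labelled test document yourself. Neither is part of any Microsoft tooling.

```python
from typing import Callable, Iterable, List


def dlp_probe(
    ask_agent: Callable[[str], str],
    prompt: str,
    markers: Iterable[str],
) -> List[str]:
    """Send one synthetic prompt and return the planted markers that leaked."""
    response = ask_agent(prompt).lower()
    return [m for m in markers if m.lower() in response]


# Stub agents for illustration only; a real probe would call the agent itself.
def leaky_agent(prompt: str) -> str:
    return "Sure. Salary table: SALARY-MARKER-001, 72,000 EUR"


def blocked_agent(prompt: str) -> str:
    return "I cannot share that content."


MARKERS = ["SALARY-MARKER-001"]
PROMPT = "Show me the salary table for the finance team."

leaked = dlp_probe(leaky_agent, PROMPT, MARKERS)      # marker comes back
clean = dlp_probe(blocked_agent, PROMPT, MARKERS)     # nothing comes back
```

An empty list is the passing result. A non-empty list is the "if the agent answers, the policy is wrong" case, with the leaked marker named.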

Environment ALM is the fourth. The default Power Platform environment is the documented oversharing hazard. Managed Environments carry tiered DLP and a premium-license requirement for active users, with in-app notifications rolling out from June 2026, and the Power Apps Per App plan was retired for new customers on 2026-01-02. The AI Builder free-credit cliff hits on 2026-11-01, which changes the unit economics of any citizen-developer fleet. The April 2026 Copilot Studio wave already made the admin approval queue the documented default for tenant publishing, not maker-direct sharing. The lever exists. The question is whether it is pulled.

A rough shape of the DLP policy a Power Platform admin actually writes, expressed as a PowerShell pattern (shape, not exact syntax; refer to the Managed Environment docs above for the current cmdlets):

# shape, not literal syntax; consult the Power Platform admin module docs
New-AdminDlpPolicy `
  -DisplayName "Copilot Studio Sandbox - blocking" `
  -EnvironmentType OnlyEnvironments `
  -Environments @("env-copilot-sandbox") `
  -BlockGroup @(
    "Microsoft.HTTPConnector",
    "Microsoft.HTTPWithAzureADConnector",
    "Microsoft.CustomConnector"
  ) `
  -BusinessGroup @(
    "Microsoft.SharePointConnector",
    "Microsoft.Office365OutlookConnector"
  )

The point is not the cmdlet names. My point is that the policy artifact is named, scoped to one environment, and refuses arbitrary HTTP egress. That is the wedge I would lean on first.
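Those three properties are checkable without touching the admin module. A minimal sketch, assuming you have copied the policy into a plain dictionary by hand or from an export: the dictionary keys and the `policy_findings` helper are hypothetical, and the connector names are simply the ones from the shape above.

```python
from typing import Dict, List

# Connectors the sandbox policy is expected to block (taken from the shape above).
REQUIRED_BLOCKED = {
    "Microsoft.HTTPConnector",
    "Microsoft.HTTPWithAzureADConnector",
    "Microsoft.CustomConnector",
}


def policy_findings(policy: Dict) -> List[str]:
    """Return a list of problems; an empty list means the shape holds."""
    findings = []
    if not str(policy.get("display_name", "")).strip():
        findings.append("policy has no display name")
    if len(policy.get("environments", [])) != 1:
        findings.append("policy is not scoped to exactly one environment")
    missing = REQUIRED_BLOCKED - set(policy.get("blocked", []))
    if missing:
        findings.append("egress not blocked: " + ", ".join(sorted(missing)))
    return findings


sandbox_policy = {
    "display_name": "Copilot Studio Sandbox - blocking",
    "environments": ["env-copilot-sandbox"],
    "blocked": sorted(REQUIRED_BLOCKED),
}
```

I would run a check like this whenever someone edits the policy, because the failure mode is rarely deletion; it is a connector quietly moved out of the blocked group.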

The contested numbers

A working IT lead deserves honest sourcing, and I owe you the caveat before I lean on any of these. Three numbers float around the shadow-AI discourse: 29% of employees use unsanctioned AI agents, 41% of employees are classified as citizen developers, and 65% of enterprise security leaders rank shadow IT as a top-three concern while fewer than 30% have a formal governance framework. All three appear in the Cosnet Global aggregator post, attributed there to Microsoft Cyber Pulse 2026 and to Gartner. I could not find a clean primary Microsoft or Gartner URL for any of the three when I went looking, and I would not put them in front of an auditor on that basis. I use them as a sense of the gradient, not as audit-defensible numbers, and I cite the aggregator when I quote them.

What I treat as audit-defensible: the Agent 365 price ($15 per user per month or bundled in E7 at $99), the EU AI Act GPAI enforcement date (2026-08-02), the Entra Agent ID auto-provisioning effective date (2026-03-18), and the Power Platform Per App retirement date (2026-01-02). Each is in the primary-source link bank above. The rest I treat as ambient signal.

The Mittelstand scenario

The case study most readers here actually face is not a US Fortune 500 with a CISO org of forty. From what I see at Mittelstand clients, the live shape is much closer to this: picture the IT lead at a 120-person specialty chemicals supplier in Nordrhein-Westfalen. The stack is Microsoft 365 Business Premium plus a recently added Copilot for Microsoft 365 license for the sales and operations team, an on-prem ERP with a SharePoint front end, and Dataverse for two production scheduling apps. There is no separate security organization. The IT lead is the security organization. The legal counsel works two days a week and has already asked about the EU AI Act.

A sales rep watched the DEM332 demo on YouTube the morning after Build and asked, "can I build that for our customer status updates this afternoon?" My honest answer is yes, the demo works on Business Premium. My harder answer is what happens if the agent reads from a SharePoint site that was over-shared in 2021 and surfaces a margin column from a discontinued product line in a customer-facing chat. That is the scenario I would not let go unanswered before any Monday build.

Monday, here is what I would have the IT lead do, and none of it is theoretical. First, in the Microsoft 365 Admin Center Copilot Control System, set lever one ("access") from "all Copilot-licensed users" to "the agent-makers security group" of three named people, and turn lever three ("publishing") on so every new agent has to be approved by the IT lead before it is shared. Second, lock the Default Power Platform environment ("no new makers, no new apps") and stand up a "Copilot Studio Sandbox" Managed Environment with the DLP policy shape above that blocks the HTTP connector and the custom connector but allows SharePoint and Outlook. Third, write one Conditional Access policy that targets all agent identities and requires a compliant device and a corporate-network sign-in. Fourth, talk to the legal counsel about which of the proposed agents would touch Annex III categories under the EU AI Act timeline, and tag those in the agent registry as "high-risk pending review."

The measurable outcome is also unromantic. Within 24 hours, every new agent created in the tenant has an Entra Agent ID, a Purview audit trail, a Conditional Access policy attached to its identity, and an admin-approval gate before it can be shared beyond the sandbox. Within seven days, the agent registry inside the Microsoft 365 Admin Center shows zero "unmanaged" agents and zero "agents without owners." That is the only honest definition of "we have agent governance" I have seen survive an external audit.
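The seven-day target can be expressed as a check over an inventory list. A minimal sketch, assuming you can get the agent list out of the registry into simple records by export or by hand: the `AgentRecord` fields and the `governance_gaps` helper are hypothetical and do not mirror any real export schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class AgentRecord:
    """One row of a hand-built or exported agent inventory."""
    name: str
    owner: Optional[str]
    managed: bool
    has_agent_id: bool


def governance_gaps(agents: Iterable[AgentRecord]) -> Dict[str, List[str]]:
    """Map each agent with a problem to its list of problems."""
    gaps: Dict[str, List[str]] = {}
    for agent in agents:
        problems = []
        if not agent.owner:
            problems.append("no owner")
        if not agent.managed:
            problems.append("unmanaged")
        if not agent.has_agent_id:
            problems.append("no Entra Agent ID")
        if problems:
            gaps[agent.name] = problems
    return gaps


inventory = [
    AgentRecord("customer-status", "it.lead@example.com", True, True),
    AgentRecord("old-pilot-bot", None, False, False),
]
```

The target state is an empty dictionary. Anything else is a named agent with a named gap, which is a more useful thing to bring to a review than a percentage.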

This scenario is not glamorous, and I would not pretend otherwise. A 120-person company is not buying Agent 365 at $15 per user per month for the whole company. My Mittelstand reading of Agent 365 is "buy it for the five agent-makers, not the 120 users." The framing of shadow AI as a governed asset class only pays off when there is a fleet to govern. Five agents and one approver is not a fleet, it is a list of names taped to the IT lead's monitor, and that is fine; it is also the correct shape for a 120-person company that does not need a control plane priced for an enterprise SOC.
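The arithmetic behind that reading is short. A worked sketch using the standalone list price quoted earlier, and assuming licenses can be assigned to the makers alone, which is my reading and not something I have confirmed against the licensing terms:

```python
PRICE_PER_USER_PER_MONTH = 15  # USD, Agent 365 standalone price quoted above


def monthly_cost(licensed_users: int) -> int:
    """Monthly list-price cost in USD for a given number of licensed users."""
    return licensed_users * PRICE_PER_USER_PER_MONTH


makers_only = monthly_cost(5)      # 5 * 15 = 75
whole_company = monthly_cost(120)  # 120 * 15 = 1800

print(makers_only, whole_company, whole_company - makers_only)
# prints: 75 1800 1725
```

That is $75 against $1,800 a month at list price, or $900 against $21,600 a year, for the same five agents.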

Where the thesis loses

My thesis is not universal, and pretending it is would be the same kind of vendor-shaped argument the talk itself makes. Here are the places I have seen it break.

It loses for shops that do not live in M365. If your stack is GitHub, Slack, Linear, and a Vercel deployment, the Copilot Control System levers are not your control plane. I would not even open this post in that situation. Your equivalent question is whether your CI blocks any deployment that lacks an explicit human approval, whether your Slack workspace has app-installation gating turned on, and whether your secrets manager is the only egress point for any agent you ship. None of the Microsoft-specific tools in this post will help you, but the bottleneck framing still applies, just with different levers.

It loses for read-only summarizers grounded on public corpora. An agent that watches a public RSS feed and writes a Slack summary every morning is not the agent that needs Entra Agent ID, eDiscovery, and DLP. Treating it like one is the inverse mistake I see often: governance overhead becomes the bottleneck for that agent, not the build time. My honest principle is that the control surface scales with the data the agent can read and the actions it can take, not with the headline build-velocity number.

It loses, partially, for any company small enough that the IT lead is also the maker. In a ten-person shop, "admin approval queue" is theater. The maker is the approver. What I keep at that size is one rule: any agent that can read the customer mailbox or write to the production database has to be reviewed by a named second person. That is governance, even if there is no Purview policy to point at.
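That one rule fits in a function. A minimal sketch with hypothetical names throughout (`SmallShopAgent`, `may_ship`); it models the rule as I stated it and nothing more.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmallShopAgent:
    name: str
    maker: str
    reads_customer_mailbox: bool
    writes_production_db: bool


def may_ship(agent: SmallShopAgent, reviewer: Optional[str]) -> bool:
    """Apply the second-person rule to one agent."""
    sensitive = agent.reads_customer_mailbox or agent.writes_production_db
    if not sensitive:
        # Low-reach agents do not need a second reviewer under this rule.
        return True
    # Sensitive agents need a named reviewer who is not the maker.
    return reviewer is not None and reviewer != agent.maker


mail_agent = SmallShopAgent("inbox-triage", "anna", True, False)
rss_agent = SmallShopAgent("rss-digest", "anna", False, False)
```

The check that matters is `reviewer != agent.maker`. In a ten-person shop the tempting shortcut is the maker signing off on their own agent, and that is the one case the rule exists to refuse.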

It also loses on the timeline, and this is the one I worry about most. EU AI Act enforcement powers and high-risk obligations activate on 2026-08-02. The full Microsoft governance stack (Agent 365 GA, Entra Agent ID auto-provisioning, CCS publishing queue, Managed Environment DLP) is technically available, but the operational maturity gap between "the toggle exists" and "we have an audit-defensible posture" is wider than the calendar suggests. Anyone telling a Mittelstand financial services provider that they will be "AI-Act-ready" by toggling Copilot Studio settings in July is, in my read, selling them a bedtime story.

What you do Monday

Strip the post to four actions and you have the Monday backlog I would actually run. None of them require a new license.

  1. Open the Microsoft 365 Admin Center agent management page (link in the section above). Read the three CCS levers. Decide who can create, who can share, who has to approve. Tighten lever one off the default "all Copilot-licensed users" if you have not already.
  2. Lock the Default Power Platform environment to no new makers and no new apps. Stand up a single Managed Environment named "Copilot Studio Sandbox" with a DLP policy that blocks the HTTP, HTTP-with-AAD, and custom connectors. Test the policy by trying to create a flow that violates it.
  3. Write one Conditional Access policy that targets all agent identities, requires a compliant device, and blocks risky sign-ins. Test it by trying to invoke an agent from a personal laptop on a hotel network and confirm the sign-in is refused. If it succeeds, the policy is wrong.
  4. Walk into the legal counsel's office with a printed list of every Copilot Studio agent currently in the tenant. Tag each one as "low-risk", "limited-risk", or "Annex III review needed" under the EU AI Act. Do not over-tag. Anything tagged "Annex III review needed" goes into an approval freeze until reviewed.
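The freeze rule in step 4 is mechanical once the tags exist. A minimal sketch, with the tag names taken from the step above and everything else (`RiskTag`, `frozen_agents`, the agent names) hypothetical. The tagging itself is a legal judgment and is not something code decides.

```python
from enum import Enum
from typing import Dict, List, Set


class RiskTag(Enum):
    LOW = "low-risk"
    LIMITED = "limited-risk"
    ANNEX_III_REVIEW = "Annex III review needed"


def frozen_agents(tags: Dict[str, RiskTag], reviewed: Set[str]) -> List[str]:
    """Agents that stay in the approval freeze until legal has reviewed them."""
    return sorted(
        name
        for name, tag in tags.items()
        if tag is RiskTag.ANNEX_III_REVIEW and name not in reviewed
    )


tags = {
    "customer-status": RiskTag.LIMITED,
    "applicant-screening": RiskTag.ANNEX_III_REVIEW,
    "rss-digest": RiskTag.LOW,
}
```

With nothing reviewed yet, only the agent tagged for Annex III review is frozen; once its name lands in `reviewed`, the freeze list is empty again.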

That is the four-step audit-defensible posture I would put my name on. It is not glamorous. It is not the demo. It is the part that is now the slow step because the build is fast, and I would not run an agent program without it.

If you are wiring this same control plane at a similar size, I would compare notes. The 20-to-200-employee Mittelstand pattern is undersold in the public discourse, and the F500-shaped governance content is the wrong shape for the people I actually talk to. Reach out and I will share the policy shape we settled on.