From My Workbench
Notes from the build: data pipelines, agent orchestration, and the unglamorous economics of getting intelligent systems into production.
New to some of the jargon? The AI Glossary collects every acronym and technical term used across these posts, with a one-line definition and a list of articles where each shows up. Use it as a side-channel while reading.
- Jun 8, 2026 · 13 min read · mittelstand
Mittelstand AI runs on a Mac Studio, not someone else's cloud
The typical Mittelstand AI conversation hits two walls at once. The Data Protection Officer blocks cloud LLMs, the CFO blocks the monthly bill. One workstation answers both.
- Jun 7, 2026 · 12 min read · agent-skills
Agent Skills are the cheapest ALM your team will ever buy
Most small teams cannot afford an ALM platform or the process consultants that come with it. A folder of markdown skills now does the job for free.
- Jun 6, 2026 · 12 min read · durable-execution
Pick the durable runtime before your second agent ships
Your first agent survives on a Python loop and a cron job. The second one will not, and the rewrite afterwards is the bill you wanted to avoid.
- Jun 5, 2026 · 13 min read · microsoft-fabric
Boring ML on Fabric still beats LLM-everything for the Mittelstand
A decision tree, MLflow, and one Fabric REST endpoint undercut GPT-class scoring on price for SMB churn, demand, and lead-scoring jobs. The talk demo proves it.
- Jun 4, 2026 · 13 min read · dataverse
Dataverse is the agent backend most M365 shops already own
For SMBs already on Microsoft 365, the agent backend is a procurement question, not an engineering one. Dataverse ships row-level security a bolt-on vector DB cannot match.
- Jun 3, 2026 · 13 min read · ai-agents
Production RL is finally cheap enough to close the agent loop
Reinforcement learning on production agents used to be a research budget. At a hundred dollars an hour of training, it now sits next to your CI bill.
- Jun 2, 2026 · 12 min read · llm-evals
Production agents need runtime scorers, not just pre-ship evals
Your CI eval suite is a pre-flight checklist. The minute the agent meets real users, you need a different artifact: scorers on live traces. Without them, you ship blind.
- Jun 1, 2026 · 13 min read · copilot-studio
Stop buying Copilot seats, rebuild one process instead
Microsoft's own customer reel quietly shifted from assist wins to process rebuilds. The SMB read: cancel half the seats, rebuild one workflow.
- May 31, 2026 · 13 min read · ai-agents
Stop routing every agent step through a frontier LLM
Routing every agent step through a frontier LLM was the 2024 answer. The 2026 answer is a planner that hands work to a 4B specialist for cents.
- May 30, 2026 · 12 min read · copilot-studio
Citizen-dev Copilot agents still need pro-dev ALM
Copilot Studio agents look low-code until the first production incident. The supported 2026 path is solutions, Pipelines, and the new Agentic CoE, not the deprecated accelerators.
- May 29, 2026 · 13 min read · ai-agents
Memory tiers, not bigger models, cut your agent token bill
Token cost is a memory-architecture problem wearing a model-selection costume. Pick the right tier per use case before you swap any model or vendor.
- May 28, 2026 · 13 min read · copilot-studio
Agent Flows are the SMB onramp to agentic automation
A 60-person Mittelstand company cannot hire a platform team to ship an LLM. Power Automate plus Agent Flows is the first onramp priced like the bill it already pays.
- May 27, 2026 · 13 min read · copilot-studio
Governance is the new agent bottleneck
Building an AI agent now takes 25 minutes. Approving one to talk to your customers, your data, and your auditor still takes weeks. Guess which one is the bottleneck.
- May 26, 2026 · 12 min read · ai-agents
Four agent patterns recur regardless of vendor stack
Every vendor demo claims agent wins, but only four shapes survive the case studies across stacks. The pattern is portable; the platform is governance tax.
- May 25, 2026 · 13 min read · observability
Observability outlives your agent framework
Your agent framework is a six-month decision. The trace schema underneath it is a five-year one. Pick the spine before you pick the framework.
- May 24, 2026 · 13 min read · copilot-studio
Copilot Studio Workflows is the spine LLM agents needed
Microsoft just shipped, inside Power Platform, the same lesson Temporal and LangGraph users learned the hard way: LLM steps belong inside a deterministic harness.
- May 23, 2026 · 14 min read · sharepoint
SharePoint is the grounded RAG layer you already own
The SharePoint tenant your company already pays for is a permission-trimmed, already-indexed grounding layer. A bespoke vector pipeline is usually the wrong default for internal agents.
- May 22, 2026 · 14 min read · agent-identity
Stop deploying agents, start onboarding them
A production agent is an identity with an employment lifecycle: hired, scoped, watched, fired. The whole industry agreed on this in 2025, and most teams missed it.
- May 21, 2026 · 16 min read · copilot-studio
Copilot Studio: most agent problems are integration problems
The model rarely kills your Copilot Studio agent in production. The surface does. Auth, grounding, and the 100-second wall ride on one choice.
- May 20, 2026 · 15 min read · rag
Structured retrieval beats vector RAG for enterprise agents
Structured retrieval against Dataverse beats vector RAG for enterprise agents on accuracy, governance, and cost. The argument with numbers.
- May 19, 2026 · 16 min read · microsoft-365
Microsoft 365 ships agent inventory, not observability
Microsoft 365 Admin Center ships an agent registry, approval queue, and Azure billing. Per-request observability is still missing.
- May 18, 2026 · 12 min read · ai-agents
Temperature zero will not save you
Temperature zero is not deterministic. Score agent tests on a pass-rate distribution, not a string match. YAML and sample-size guidance inside.
- May 17, 2026 · 12 min read · postgres
Your agent's identity is a Postgres role
The pgvector-doesn't-scale meme is mostly outdated. Postgres is the agent data plane for 2026 — one database, one permission model.
- May 16, 2026 · 13 min read · evals
Models depreciate, eval suites compound
The model sets your agent's ceiling. The eval suite is the part that compounds — surviving every model swap and prompt rewrite.