
Your LLM is not a security boundary — Microsoft’s Semantic Kernel disclosure is the framework’s SQL-injection moment

On 7 May 2026, Microsoft’s Defender Security Research Team publicly disclosed two critical vulnerabilities in Semantic Kernel, Microsoft’s own open-source AI agent framework (27,000+ stars on GitHub). CVE-2026-26030 turns a Vector Store filter lambda into arbitrary code execution on the agent’s host from a single chat prompt. CVE-2026-25592 exposes an unguarded DownloadFileAsync tool that lets a prompt write attacker-controlled files into the Windows Startup folder, escaping the Azure Container Apps sandbox. Both are fixed in semantic-kernel Python 1.39.4 and .NET SDK 1.71.0. Microsoft’s own framing, buried near the bottom of the disclosure: your LLM is not a security boundary. The tools you expose define your attacker’s scope. The bug class lives in the framework, not the model. This is the SQL-injection era for AI agents, and the gap to close is the tool exposure surface, before the next disclosure lands.

SEMANTIC KERNEL CVES / TWO BUG CLASSES, ONE STRUCTURAL FAILURE
CVE-2026-26030 · Vector Store filter lambda → eval() → RCE on host · critical · chat prompt only
CVE-2026-25592 · [KernelFunction] on DownloadFileAsync → Startup-folder write · critical · sandbox escape
Patched: semantic-kernel Python 1.39.4 · .NET SDK 1.71.0 · patch this week
Structural class: framework trusts model output to drive code paths · generalises across frameworks
Microsoft KQL queries hunt 30-day window for cmd / powershell / certutil spawn · post-exploit hunt
27,000+ GitHub stars · Microsoft’s own framework · Microsoft’s own researchers found it
Next blogs in the series cover “structurally similar execution vulnerabilities” in third-party frameworks
Your LLM is not a security boundary. The tools you expose define your attacker’s scope.
Two critical CVEs in Microsoft’s own AI agent framework, disclosed by Microsoft’s own security team. The model is doing exactly what it was designed to do — parse intent into structured tool calls. The vulnerability lives in the framework’s trust model, not in the LLM.

Executive summary

On 7 May 2026, the Microsoft Defender Security Research Team published “When prompts become shells,” a coordinated disclosure of two critical-severity vulnerabilities in Semantic Kernel, Microsoft’s own open-source AI agent orchestration framework, with 27,000+ stars on GitHub and adoption across enterprise Copilot deployments, internal AI tooling, and Azure-hosted agent stacks. CVE-2026-26030 abuses the In-Memory Vector Store filter: a single chat prompt threads a Python AST-traversal payload through a lambda expression that the framework passes into eval(), bypasses the deny-list defence, and runs arbitrary code on the agent’s host (not the sandbox, the host). The proof-of-concept launches calc.exe; the realistic version exfiltrates credentials. CVE-2026-25592 abuses the SessionsPythonPlugin: a DownloadFileAsync helper was accidentally decorated with the [KernelFunction] attribute, which exposes it to the model as a callable tool with no path validation. A prompt-injection chain writes an attacker-controlled payload into the Windows Startup folder of the agent’s host, bypassing the Azure Container Apps sandbox and persisting across reboots. Both vulnerabilities are fixed in semantic-kernel Python 1.39.4 and .NET SDK 1.71.0. The structural takeaway Microsoft surfaces near the end of the post is the line that matters: your LLM is not a security boundary. The tools you expose define your attacker’s scope. The sections below cover the four-step attack mechanic, why the bug class generalises beyond Semantic Kernel, the three actions every team running AI agents in production should take this week, and the procurement implications for any vendor framework on your buy list.

Why this disclosure matters more than the average AI security write-up

Most prompt-injection research between 2023 and 2025 was about chatbots saying things they should not, or jailbreaks against consumer-facing assistants. The audience for those write-ups was researchers and content-moderation teams. The Semantic Kernel disclosure is different on three fronts, and each one matters to anyone shipping AI agents in production.

First, the source is Microsoft, against Microsoft. The Defender Security Research Team published a critical-severity disclosure against Microsoft’s own open-source AI agent framework. There is no “the vendor disagrees about severity” ambiguity to argue around. The framework author and the security researcher are the same organisation. The honesty is useful; it sets a public bar for what good security review of an AI agent framework looks like, and it forecloses the “our vendor says it’s safe” defence for every third-party framework still on the buy list.

Second, the bug class is structural. CVE-2026-26030 exists because a framework chose eval() on a lambda string for filter expressions, and tried to constrain it with a deny-list. CVE-2026-25592 exists because a helper got the [KernelFunction] attribute added by accident, exposing it to the model as a callable tool without path validation. Both are classic web-era issues (unsafe dynamic evaluation of untrusted input, over-broad capability exposure) repackaged at the AI-agent layer. The frameworks you are about to import for your own agent project carry the same latent class until someone has audited the tool surface.

Third, Microsoft is explicitly promising more. The disclosure ends with a note that the next blog posts in the series will cover “structurally similar execution vulnerabilities” in third-party frameworks. That is not a one-off; it is a category review by a major security team. Plan for a rolling stream of disclosures over the next two quarters. The teams that audit their tool exposure surface this month are the teams that will not feature in the next blog in the series.

CVE-2026-26030 — the eval() lambda

The first vulnerability lives in Semantic Kernel’s In-Memory Vector Store. The framework allows the caller to filter retrieved vectors using a lambda expression — for example, “return only documents whose type equals hotel.” That lambda is passed into eval() at runtime; the framework attempts to constrain what eval() will execute by maintaining a deny-list of dangerous patterns.

The exploit is mechanically a Python AST traversal. A single chat prompt that the agent translates into a Vector Store filter request can express a lambda that walks the abstract syntax tree to reach a callable not on the deny-list — in Microsoft’s proof-of-concept, the path lands on os.system and launches calc.exe on the agent’s host. The model has done nothing unusual. It parsed natural-language intent into a structured filter expression; the framework then evaluated that expression with the host’s privileges.

The lesson is older than AI. Flexibility in dynamic languages cannot be locked down with a deny-list. Every “we have a sanitiser” defence in a dynamic-evaluation context is a presumption, not a control. The durable fix is to replace eval() with a constrained expression language — an allowlist of permitted operations, parsed into a structured representation, executed by an interpreter that has no access to the host beyond the operations the allowlist names. That is more work than a deny-list. It is the only thing that survives the next AST-traversal payload.
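A minimal sketch of what an allowlist-based filter evaluator could look like, assuming filters only ever need field names, literals, comparisons, and and/or. The names matches, FilterError, and _evaluate are illustrative and are not Semantic Kernel’s API or its actual fix. The point is the shape: parse to a tree, then evaluate only the node types named in advance.

```python
import ast
import operator

# Allowlisted comparison operators; anything else is rejected.
_COMPARATORS = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
}


class FilterError(ValueError):
    """Raised when a filter uses syntax outside the allowlist."""


def _evaluate(node, record):
    # and / or over allowlisted sub-expressions
    if isinstance(node, ast.BoolOp) and isinstance(node.op, (ast.And, ast.Or)):
        values = [_evaluate(value, record) for value in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    # a single comparison such as: stars >= 4
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        compare = _COMPARATORS.get(type(node.ops[0]))
        if compare is None:
            raise FilterError("comparison operator not allowed")
        return compare(
            _evaluate(node.left, record),
            _evaluate(node.comparators[0], record),
        )
    # bare names resolve to record fields only, never to builtins or globals
    if isinstance(node, ast.Name):
        if node.id not in record:
            raise FilterError(f"unknown field: {node.id}")
        return record[node.id]
    if isinstance(node, ast.Constant) and isinstance(node.value, (str, int, float, bool)):
        return node.value
    # attribute access, calls, lambdas, subscripts and everything else land here
    raise FilterError(f"syntax not allowed: {type(node).__name__}")


def matches(filter_text, record):
    """Evaluate filter_text against one record without ever calling eval()."""
    try:
        tree = ast.parse(filter_text, mode="eval")
    except SyntaxError as exc:
        raise FilterError("filter is not a valid expression") from exc
    return bool(_evaluate(tree.body, record))
```

With a record such as {"type": "hotel", "stars": 4}, the filter "type == 'hotel' and stars >= 4" evaluates normally, while anything involving attribute access or a call raises FilterError before any of it runs. The default is rejection: a new syntax form is unreachable until someone adds it to the allowlist on purpose.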

CVE-2026-25592 — the exposed tool

The second vulnerability is in the SessionsPythonPlugin, an Azure-hosted plugin for running Python sessions from a Semantic Kernel agent. It contains a helper called DownloadFileAsync intended for internal use — download a file from a known URL, write it to a known path, both controlled by the plugin author. The helper was accidentally decorated with the [KernelFunction] attribute. That attribute is the framework’s tool-registration mechanism; once it is applied, the function is callable by the model as a tool, with the model controlling its parameters.

From the model’s perspective, the helper looked like any other tool. From the framework’s perspective, the function should never have been reachable from the model in the first place. The mismatch was the bug. A prompt-injection chain that reaches the model can call the now-public helper with attacker-controlled URL and path parameters; the path argument has no validation, so the payload writes to the Windows Startup folder of the agent host. Azure Container Apps sandboxes the agent’s runtime, but the Startup folder write escapes that sandbox, persists across reboots, and runs the payload the next time the host boots.

The lesson here is about the tool exposure surface. Every [KernelFunction] annotation, every @tool decorator, every entry in a tool registry is a privilege grant. The model can call it. The model’s parameters become the function’s parameters. Path validation, allowlists, capability separation between the runner that reads external text and the runner that holds production credentials — all of these are upstream controls the framework cannot enforce for you; the responsibility falls to the team building the agent. Most agent stacks I review treat tool registration as a developer-ergonomics decision. It is not. It is the procurement step in the security architecture.
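One way to make accidental exposure visible is a check that lists everything the model can call and compares it against a reviewed manifest. The sketch below assumes a hypothetical tool decorator that marks functions with an attribute; real frameworks register tools differently, and SessionsPlugin here is a toy class, not Semantic Kernel’s SessionsPythonPlugin.

```python
import inspect


def tool(func):
    """Hypothetical stand-in for a framework's tool-registration decorator."""
    func.__is_tool__ = True
    return func


class SessionsPlugin:
    """Toy plugin; method names are illustrative only."""

    @tool
    def run_code(self, code: str) -> str:
        return "ran"

    @tool  # the accidental decoration: a helper meant to stay internal
    def download_file(self, url: str, local_path: str) -> None:
        return None

    def _refresh_token(self) -> None:
        return None


def exposed_tools(plugin_cls):
    """Return the sorted names of every method the model could call."""
    return sorted(
        name
        for name, member in inspect.getmembers(plugin_cls, inspect.isfunction)
        if getattr(member, "__is_tool__", False)
    )


def unapproved_tools(plugin_cls, approved):
    """Names exposed to the model but missing from the reviewed manifest."""
    return sorted(set(exposed_tools(plugin_cls)) - set(approved))
```

Run as a CI assertion, unapproved_tools(SessionsPlugin, {"run_code"}) would surface download_file before it ships. The manifest is the privilege grant written down; the decorator alone is not.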

The four-step attack mechanic

Both CVEs follow the same shape, which is partly why this disclosure matters more than two isolated bug write-ups. Every step is a feature being used as designed; the vulnerability lives in the chain.

Step 1 — the entry point. An attacker sends a chat message, an email the agent ingests, a document the agent retrieves, or any other content that lands inside the model’s context. The text is plain English with embedded instructions. Standard moderation does not flag it because there is nothing technically wrong with the text in isolation.

Step 2 — the trust violation. The model parses intent and emits a tool call — a filter expression, a function invocation, a structured parameter object. The framework receives that tool call and treats it with the same priority as instructions from the application owner. The model has done nothing unusual; it is doing exactly what it was designed to do.

Step 3 — the framework executes. The framework hands the tool call to eval() or to a registered [KernelFunction], with whatever capabilities that function has. Where the function reaches the host filesystem, network, or process API, the model’s output is now executing in the host’s privilege context. The boundary the threat model assumed was between the user and the agent is in fact between the framework and the host.

Step 4 — the persistence or exfiltration. In CVE-2026-26030, the payload is immediate code execution: the attacker can do anything the host can do at that moment. In CVE-2026-25592, the payload is filesystem persistence: the attacker survives a reboot and gains a foothold outside the sandbox. Both outcomes are starting points for credential theft and lateral movement; the agent is the initial-access vector.

The structural pattern — this is the SQL-injection era for AI agents

The historical analogue is exact and worth holding onto. In 2003, web application frameworks were still maturing; the boundary between user input and SQL statements was implicit and easy to get wrong; entire categories of application were trivially exploitable via SQL injection. The fix was not a content filter on user input. The fix was structural: parameterised queries, prepared statements, an explicit contract between the framework and the database driver that separated “the data” from “the code.”

AI agent frameworks in 2026 are in the same place. The boundary between “text the model parsed” and “code the framework executes” is implicit and easy to get wrong. The structural fix is not better prompts and not better deny-lists; it is an explicit contract between the model’s output and the tool layer, where the tool layer cannot execute anything the contract did not name in advance, and where the runner that ingests untrusted text does not hold the credentials the tools would use.
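The parameterised-query analogy can be sketched as a dispatcher that refuses any tool call the contract did not name in advance. This is an illustration under assumptions, not any framework’s API: ToolContract, dispatch, and the search_hotels tool are hypothetical, and the tool call is assumed to arrive as a dict with "name" and "arguments" keys.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


class ContractViolation(Exception):
    """Raised when a model-emitted tool call falls outside the contract."""


@dataclass(frozen=True)
class ToolContract:
    handler: Callable[..., Any]
    validators: Dict[str, Callable[[Any], bool]]  # one validator per argument


def dispatch(registry: Dict[str, ToolContract], call: Dict[str, Any]) -> Any:
    """Run a tool call only if its name and every argument match the contract."""
    name = call.get("name")
    if not isinstance(name, str) or name not in registry:
        raise ContractViolation(f"tool not in contract: {name!r}")
    contract = registry[name]
    args = call.get("arguments")
    if not isinstance(args, dict) or set(args) != set(contract.validators):
        raise ContractViolation("arguments do not match the declared parameters")
    for key, is_valid in contract.validators.items():
        if not is_valid(args[key]):
            raise ContractViolation(f"argument rejected: {key}")
    return contract.handler(**args)


# Hypothetical tool: the handler and its limits are illustrative only.
def search_hotels(city: str, max_price: int) -> str:
    return f"searching {city} up to {max_price}"


REGISTRY = {
    "search_hotels": ToolContract(
        handler=search_hotels,
        validators={
            "city": lambda v: isinstance(v, str) and v in {"Lisbon", "Porto"},
            "max_price": lambda v: isinstance(v, int) and 0 < v <= 10_000,
        },
    )
}
```

The model’s output is treated as data that must fit a declared shape, in the same way a prepared statement treats user input as a bound value rather than as SQL. A call naming an unregistered tool, carrying an extra argument, or failing a validator never reaches a handler.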

This is why the “your LLM is not a security boundary” line in Microsoft’s disclosure is the durable insight. The model is doing parsing. It is not authentication, it is not authorisation, it is not capability containment. Any threat model that ends with “the LLM is the gatekeeper” is one prompt-injection away from being wrong.

The frameworks to audit on the same vector

Semantic Kernel is the disclosed instance; the bug class is general. The frameworks to put on the same audit list this month, in rough order of how much production agent traffic they carry:

LangChain — the most widely adopted Python agent framework. Audit every Tool registration; audit any code path that uses PythonREPL, ShellTool, or any tool that reaches a shell or a file path; audit any LangGraph node that executes string-templated code.

CrewAI and AutoGen — orchestration frameworks where tool definitions are first-class and often auto-discovered. Auto-discovery is exactly where CVE-2026-25592’s accidental attribute exposure pattern reappears.

OpenAI Assistants — the Assistants API exposes function-calling and code-interpreter capabilities. Function-calling is a tool registration. Audit which functions you exposed; review their parameters for path or URL controls; assume any function visible to the assistant is callable with attacker-influenced parameters.

MCP servers — the Model Context Protocol is now a tool-registration standard. Every MCP server you have stood up is a tool surface; every internal one your team wrote in the last six months almost certainly has not been red-teamed.

Internal orchestrators — the framework most teams forget is the home-grown one. Any internal Python, JavaScript, or .NET orchestrator that uses eval, exec, new Function(...) on a string, setTimeout with a string argument, or any other dynamic-evaluation pattern is a candidate for the same bug class. Most teams have at least one.
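For the home-grown Python case, a first-pass audit can be as simple as walking the syntax tree of each source file and flagging direct calls to the dynamic-evaluation built-ins. The sketch below uses only the standard ast module; the function name and the RISKY_CALLS set are illustrative choices, and it deliberately catches only the obvious form (a bare eval(...) or exec(...) call), not aliased or attribute-based access.

```python
import ast

# Built-ins that turn strings into running code.
RISKY_CALLS = {"eval", "exec", "compile", "__import__"}


def find_dynamic_evaluation(source, filename="<source>"):
    """Return sorted (line, name) pairs for direct calls to risky built-ins."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                findings.append((node.lineno, node.func.id))
    return sorted(findings)


SAMPLE = (
    "def apply_filter(expr, rows):\n"
    "    return [r for r in rows if eval(expr)]\n"
    "\n"
    "exec('x = 1')\n"
)
```

Calling find_dynamic_evaluation(SAMPLE) reports the eval on line 2 and the exec on line 4. Treat a clean result as a starting point rather than a pass: each hit still needs a human to trace whether model output can reach the string being evaluated.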

Three actions every AI agent team should take this week

The disclosure was 7 May; today is 19 May; the playbook has been public for nearly two weeks. The cost of opportunistic exploitation is paid by whoever moves slowest. Run the following three steps in order. Each is a half-day to a sprint of work for a team that already has the inventory; each is several sprints for a team that does not.

Action 1 — Inventory every framework version and patch. The minimum patched versions are semantic-kernel Python >= 1.39.4 and the .NET SDK >= 1.71.0. Inventory every deployment that uses either — including the proof-of-concept the data-science team pushed to the staging cluster last quarter, the internal documentation assistant, the customer-facing copilot, and any embedded use of Semantic Kernel inside a Microsoft product surface that the application team configured. Patch in the order of blast radius: customer-facing first, internal next, demos last. Then extend the inventory to every other agent framework on the list above, with version and deployment status against each.
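For the Python side of the inventory, a per-environment check might look like the sketch below. It assumes the package is installed under the distribution name semantic-kernel and compares only the numeric X.Y.Z prefix, so a pre-release such as 1.39.4rc1 would be misread as patched; a proper version parser (for example the packaging library) is the safer choice in practice. The helper names are illustrative.

```python
from importlib import metadata

MIN_PATCHED = (1, 39, 4)  # semantic-kernel Python, per the disclosure


def parse_version(text):
    """Parse the leading numeric 'X.Y.Z' of a version string into a tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version_text, minimum=MIN_PATCHED):
    """True when the numeric version is at or above the patched release."""
    return parse_version(version_text) >= minimum


def installed_status(package="semantic-kernel"):
    """Describe the installed version in the current environment, if any."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"
    return f"{version} ({'patched' if is_patched(version) else 'VULNERABLE'})"
```

Running installed_status() inside each deployment’s environment (container image, virtualenv, notebook kernel) gives one line per deployment for the inventory page. It says nothing about .NET deployments, which need their own check against SDK 1.71.0.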

Action 2 — Treat every tool parameter as attacker-controlled input. The LLM is not the boundary; the tool boundary is the boundary. For every tool annotated with [KernelFunction], @tool, or any equivalent decorator across your stack: validate every path argument against an allowlist, validate every URL against a host allowlist, separate the runner that reads external text from the runner that holds production credentials. The two-runner pattern is the canonical answer — a parser runner with no secrets summarises the input into a structured artefact, and an executor runner with no internet access acts on that artefact only. Annoying to architect. Survives Semantic Kernel. Survives the next disclosure. And the one after that.
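A minimal sketch of the two validations named above, using only the standard library. The allowlisted host and the download root are hypothetical placeholders, and the path check assumes the tool should only ever write strictly inside one directory.

```python
from pathlib import Path
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"files.example.com"}          # hypothetical allowlist
DOWNLOAD_ROOT = Path("/srv/agent/downloads")   # hypothetical write root


def validate_url(url: str) -> str:
    """Accept only https URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"URL not on the allowlist: {url!r}")
    return url


def validate_path(candidate: str, root: Path = DOWNLOAD_ROOT) -> Path:
    """Resolve candidate under root and reject anything that lands outside it."""
    root = root.resolve()
    # Joining an absolute candidate discards root, so it fails the check below.
    target = (root / candidate).resolve()
    if root not in target.parents:
        raise ValueError(f"path escapes the download root: {candidate!r}")
    return target
```

The tool body would call both validators before touching the network or the filesystem, so a model-supplied path such as ../../etc/passwd or an absolute path raises before any write happens. Resolving first and comparing afterwards matters: string-prefix checks on the raw argument are exactly the kind of deny-list thinking the eval() CVE punished.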

Action 3 — Hunt the 30-day post-exploitation window. Microsoft shipped the disclosure with KQL hunting queries to detect Semantic Kernel agent hosts that spawned cmd.exe, powershell.exe, or certutil.exe in the last 30 days. Run those queries against your SIEM. Review outbound network traffic from every agent host for unexpected egress — the credential-exfiltration step is hard to make silent. Snapshot agent host filesystems and look for Startup-folder persistence artefacts. Rotate any credentials that a vulnerable agent host could have accessed during the window. If you patched today but ran a vulnerable version for weeks before, you have a retrospective hunt to do, not just a forward-looking fix.
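Microsoft’s KQL queries are the authoritative hunt and are not reproduced here. For teams whose process telemetry sits outside a KQL-capable SIEM, the same idea can be sketched over exported process-creation events. The event schema below (timestamp, parent_image, image) and the agent process names are assumptions for illustration; map them to whatever your EDR export actually provides.

```python
from datetime import datetime, timedelta, timezone

# Child processes named in the disclosure's hunting guidance.
SUSPICIOUS_CHILDREN = {"cmd.exe", "powershell.exe", "certutil.exe"}


def suspicious_spawns(events, agent_images, now, window_days=30):
    """Return events inside the window where an agent process spawned a suspicious child."""
    cutoff = now - timedelta(days=window_days)
    agents = {name.lower() for name in agent_images}
    hits = []
    for event in events:
        # Assumed format: ISO 8601 timestamp with a UTC offset.
        when = datetime.fromisoformat(event["timestamp"])
        if when < cutoff:
            continue
        parent = event["parent_image"].lower()
        child = event["image"].lower()
        if parent in agents and child in SUSPICIOUS_CHILDREN:
            hits.append(event)
    return hits


EVENTS = [
    {"timestamp": "2026-05-10T09:00:00+00:00", "parent_image": "python.exe", "image": "certutil.exe"},
    {"timestamp": "2026-05-11T09:00:00+00:00", "parent_image": "explorer.exe", "image": "cmd.exe"},
    {"timestamp": "2026-03-01T09:00:00+00:00", "parent_image": "python.exe", "image": "powershell.exe"},
]
```

With now set to 19 May 2026 and python.exe as the agent process, only the first sample event is returned: the second has an unrelated parent and the third falls outside the 30-day window. Every hit is a lead to investigate, not a verdict; legitimate agents sometimes shell out by design.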

Risks and what to avoid

Don’t assume your vendor framework is safer because it is closed-source. Closed-source frameworks have the same bug class; they have fewer eyes auditing it. The structural pattern — framework trusts model output to drive code paths — is independent of whether the source is open or closed. Demand a tool-surface audit attestation from any closed-source framework on the buy list.

Don’t conflate “the model has guardrails” with “the framework has a security boundary.” Model-level guardrails are a content control. They do not prevent the model from emitting a structured tool call that the framework then executes. Anthropic, Google, OpenAI, and Microsoft all ship safety-trained models. None of them protect a Semantic Kernel deployment from CVE-2026-26030, because the model is doing exactly what it should.

Don’t skip the retrospective hunt. Patching forward is the cheap half. The expensive half is recognising that any agent host you ran on a vulnerable version is a candidate-compromised host until proven otherwise. A clean SIEM hunt and a credential rotation is one sprint. A breach postmortem is six months.

Don’t expect the vendor to fix the structural pattern alone. Microsoft has patched both specific vulnerabilities. The structural pattern — framework executes model output in a host context — cannot be patched by the framework author alone; it requires capability-separation work on your side. Vendor patches are necessary, complementary controls; they are not replacements for the architecture.

What good looks like — one quarter from now

Every agent framework deployed in production appears on a single inventory page with version, deployment status, owner, and the date of the last tool-surface audit. The Semantic Kernel deployments are on patched versions; every other framework on the list has been audited against the same bug class, with the findings written down. Every tool annotated with [KernelFunction] or equivalent has a written contract specifying the parameters the model is allowed to influence, the validation applied to each, and the privileges the function holds. The two-runner pattern is in production for at least one critical workflow — the parser runner with no secrets summarises the input into a structured artefact; the executor runner with no internet acts on the artefact only. The SIEM has hunting rules running continuously for the post-exploitation signatures, not just the historical 30-day window. The CISO can answer, in writing, the question “which agent frameworks in our stack expose tools to a model with paths or URLs the model can influence,” in under five minutes. Most CISOs cannot today. The ones who can are the ones who will not be writing their own version of the “When prompts become shells” postmortem this autumn.

Final thought

The Semantic Kernel disclosure is the moment AI agent security crosses from research curiosity into the same category as web application security in the early 2000s. The bug class is structural. The vendor patch helps; the architecture work is what survives. The teams that audit their tool exposure surface this quarter, separate the parser from the executor in at least one critical workflow, and run the retrospective hunt over the post-exploitation window will not feature in the next blog in Microsoft’s series. The teams that wait for the next CVE will be the next disclosure. Your LLM is not a security boundary. It never was. The tools you expose are.

Which agent frameworks are in your stack today?

Indica Tech’s two-week AI agent framework audit produces a framework + version inventory across every deployment, maps the tool exposure surface (every [KernelFunction], every @tool, every MCP server), reviews the capability-separation boundary between parser and executor runners, runs the retrospective post-exploitation hunt against the 30-day window, and gives you a 90-day remediation roadmap. Fixed price £3,500. Written report. Whether you hire us for the remediation or not.

