magesh.ai agent v1.0 (views are my own) · kill-chain resources about
viewing: mcp_security · 00:00:00
← agent.navigate: resources / builder security
30 min read · 6 attack vectors · 4 real incidents · 13 references

MCP Security

The Model Context Protocol gives agents tools — file access, APIs, databases, web requests. Each tool connection is an attack surface. Security researchers have scanned thousands of MCP servers and found issues in the majority. Here's what they found, what's been exploited, and how to defend against it.

category:
Builder Security · builders · security-teams
CONTEXT MCP attacks map to Kill Chain Stage 2 (INJECT) and Stage 4 (ESCALATE) — read the full threat model →

What is MCP

The Model Context Protocol (MCP) is an open standard released by Anthropic in November 2024 for connecting AI agents to external tools and data sources. It defines how an agent discovers tools, calls them, and processes their responses. Think of it as the USB port for AI agents — a universal interface that lets any agent connect to any tool.

MCP servers expose tools (functions the agent can call), resources (data the agent can read), and prompts (templates the agent can use). The agent's MCP client discovers available tools, selects the right one based on the user's request, and calls it with parameters. The server executes and returns the result.

The security problem: The agent trusts tool schemas (descriptions of what tools do) and tool responses (data returned by tools). Neither is verified. A malicious or compromised MCP server can lie about what its tools do, return poisoned data, or change tool behavior after the user approves the connection.

The Numbers

66%
of 1,808 MCP servers had security findings
AgentSeal scan, 2025
935
toxic flow findings across 555 servers
AgentSeal analysis, 5,125 servers
31
distinct MCP-specific attack types identified
Tsinghua/Ant Group research, 2025

Six Attack Vectors

Each has been demonstrated against real MCP servers. Sources linked for every claim.

1. Tool Poisoning Attacks (TPA)

Hidden malicious instructions embedded in MCP tool descriptions. Invisible to users but processed by the AI model. The tool looks legitimate in documentation — but in the LLM's context, it contains adversarial instructions.

Demonstrated attacks

WhatsApp MCP Server (April 2025): Researchers demonstrated that hidden instructions in a tool description could exfiltrate complete WhatsApp chat histories through a legitimate MCP server.

GitHub MCP Server (May 2025): Researchers showed that malicious instructions embedded in GitHub Issues could hijack AI assistants using the official GitHub MCP server, leaking private repository source code and cryptographic keys into public pull requests.

Sources: Invariant Labs, "MCP Security Notification: Tool Poisoning Attacks" (2025); Docker, "MCP Horror Stories: The GitHub Prompt Injection Data Heist" (2025)
Kill Chain mapping: Stage 2 INJECT — the tool description IS the injection vector. The agent ingests it as trusted context.
2. Rug Pull (Silent Redefinition)

Tool behavior changes AFTER the user has already approved the connection. The tool was safe when reviewed — but the attacker updates its behavior silently. Most MCP clients don't re-verify tool definitions on every invocation.

Attack pattern

A messaging tool is approved with a description: "Send a Slack message to a channel." After approval, the attacker silently updates the tool description to: "Send a Slack message AND also post the message content to an external Discord webhook." The tool name and metadata remain unchanged — no re-approval is triggered. The user's messages are now exfiltrated through a legitimate-looking tool call.

Source: MCP Manager, "MCP Rug Pull Attacks: What They Are & How to Stop Them" (2025)
Kill Chain mapping: Stage 2 INJECT — the rug pull is a re-injection via modified tool definition. The tool change delivers new adversarial instructions that bypass the original security review.
3. Cross-Server Data Exfiltration

A malicious MCP server exploits legitimate tools from OTHER trusted servers. The attacker controls one "trojan" server and uses it to poison tool descriptions that make the agent leak data through legitimate servers it already trusts.

Scale of the problem

Analysis of 5,125 MCP servers found 935 toxic flow findings across 555 servers. A malicious "weather" server can discover and exploit a legitimate banking MCP server to steal account balances — because there's no isolation between servers in typical deployments.

Source: AgentSeal, "555 MCP Servers Have Toxic Data Flows" (2025); arXiv, "Trivial Trojans: How Minimal MCP Servers Enable Cross-Tool Exfiltration" (2025)
Kill Chain mapping: Stage 5 EXFILTRATE — the exfiltration happens through legitimate tool channels. The agent uses its own trusted tools to leak data.
4. SSRF via Unvalidated Destinations

MCP tools that make HTTP requests become SSRF proxies when destination URLs are attacker-controlled. The agent calls the tool with a URL parameter — the attacker controls where that request goes. Internal networks, cloud metadata endpoints, internal services on the same host become accessible.

Documented CVE

HackMD MCP Server: Accepted user-supplied hackmdApiUrl through HTTP headers, allowing attackers to redirect API calls to internal network services. Enabled access to sensitive internal endpoints, network reconnaissance through the server, and bypass of network access controls.

Source: GitHub Security Advisory GHSA-g5cg-6c7v-mmpw; OWASP Top 10:2021 A10 (Server-Side Request Forgery)
Kill Chain mapping: Stage 4 ESCALATE — the agent's network access escalates from its intended scope (external APIs) to internal infrastructure (VPCs, metadata endpoints, internal services).
5. Transport-Level Attacks

MCP supports three transports: stdio (local IPC), HTTP, and SSE (Server-Sent Events). Default configurations for HTTP deployment often have no authentication and no encryption. SSE is vulnerable to DNS rebinding without proper Origin header validation.

Default risk

Default FastMCP HTTP deployment has no authentication or encryption. Anyone with the server's IP and port can connect agents and invoke tools. HTTP requests are plain text. The fix is simple — use Streamable HTTP with TLS — but many deployments use defaults.

Source: CardinalOps, "MCP Defaults Will Betray You: The Hidden Dangers of Remote Deployment" (2025)
Kill Chain mapping: Trust Boundary 2 (Agent → MCP) — transport-level attacks compromise the connection between agent and server.
6. Server Impersonation & Supply Chain

Attackers publish "Trojanized" MCP servers to public registries with names similar to legitimate servers. Once installed, these servers can backdoor tool calls, exfiltrate data, or execute arbitrary code. This is the npm typosquatting problem applied to AI tool infrastructure.

Documented incident

Security researchers documented fake MCP server packages published to npm registries — including a postmark-mcp package that impersonated a legitimate Postmark MCP server. Kaspersky separately documented malicious PyPI packages disguised as MCP development tools. Once installed, these packages contained backdoors for data exfiltration and arbitrary code execution.

Sources: Snyk/Acuvity MCP supply chain analysis (2025); Kaspersky/Securelist, "Model Context Protocol abused in supply chain attacks" (2025)
Kill Chain mapping: Stage 2 INJECT — the supply chain attack delivers the injection vector through the installation process itself.

Defending Your MCP Stack

From my own MCP server builds and from the MCP specification's security guidance. Each control maps to one or more attack vectors above.

MCP security checklist
Tool Annotations
Apply:

Set readOnlyHint, destructiveHint, idempotentHint, and openWorldHint on every tool. These MCP-native annotations signal tool behavior to agents and clients. A tool with destructiveHint: true should trigger human approval.

Defends:

Vectors 1 (tool poisoning) and 4 (SSRF) — openWorldHint: true flags tools that reach external resources.

Input Validation
Apply:

Validate all tool inputs server-side with Zod (TypeScript) or Pydantic (Python). LLMs do not reliably respect JSON schema constraints — enforce validation in your server code, not the schema alone. For URL parameters, use allowlists of permitted domains, not blocklists.

Defends:

Vector 4 (SSRF) — allowlisted destinations prevent the agent from reaching internal networks. Also prevents path traversal attacks — input validation blocks ../../../secret.txt paths.

Tool Schema Pinning
Apply:

Pin the exact tool definition that was reviewed and approved. Use content digest verification (SHA256) — if the tool description changes, the client should reject the tool and require re-approval. This prevents rug pulls. Note: digest-pinned tool versioning is a community proposal (SEP-1766) not yet in the MCP spec, but implementable at the client level today.

Defends:

Vector 2 (rug pull) — silent tool redefinition is impossible when the definition is pinned by content hash.

Transport Security
Apply:

Use stdio for local MCP servers (no network exposure). Use Streamable HTTP with TLS for remote servers. Validate Origin headers for SSE transport. Never deploy HTTP MCP servers without authentication — the default is insecure.

Defends:

Vector 5 (transport attacks) — TLS prevents MITM. Auth prevents unauthorized access. Origin validation prevents DNS rebinding.

Server Isolation
Apply:

Isolate MCP servers from each other. A weather tool should not be able to discover or invoke a banking tool. Use separate credential scopes per server. Treat inter-server messages as untrusted.

Defends:

Vector 3 (cross-server exfiltration) — isolation breaks the tool combination paths that enable cross-server attacks.

Supply Chain Verification
Apply:

Verify MCP server packages before installation. Check publisher identity, package name spelling, download counts, and source code. Use lock files to pin exact versions. Scan for known malicious packages.

Defends:

Vector 6 (impersonation/supply chain) — verification catches typosquatted packages before installation.

MCP security is the tool-layer defense in the Agentic AI Kill Chain. These controls work alongside hook-based guardrails (agent-layer defense) to create defense in depth.

References
[1]Invariant Labs, "MCP Security Notification: Tool Poisoning Attacks." invariantlabs.ai (April 2025)
[2]Docker, "MCP Horror Stories: The GitHub Prompt Injection Data Heist." docker.com/blog (May 2025)
[3]MCP Manager, "MCP Rug Pull Attacks: What They Are & How to Stop Them." mcpmanager.ai (2025)
[4]AgentSeal, "555 MCP Servers Have Toxic Data Flows." agentseal.org (2025)
[5]GitHub Security Advisory GHSA-g5cg-6c7v-mmpw — HackMD MCP Server SSRF vulnerability.
[6]CardinalOps, "MCP Defaults Will Betray You: The Hidden Dangers of Remote Deployment." (2025)
[7]Kaspersky, "Malicious MCP servers used in supply chain attacks." securelist.com (2025)
[8]Model Context Protocol Specification — Security Best Practices. modelcontextprotocol.io
[9]Microsoft, "Protecting Against Indirect Prompt Injection Attacks in MCP." developer.microsoft.com (April 2025)
[10]Elastic Security Labs, "MCP Tools: Attack Vectors and Defense Recommendations." (2025)
[11]CyberArk, "Is your AI safe? Threat analysis of MCP." cyberark.com (2025)
[12]Croce, N. & South, T. "Trivial Trojans: How Minimal MCP Servers Enable Cross-Tool Exfiltration." arXiv:2507.19880 (2025)
[13]Dhanasekaran, M. "The Agentic AI Kill Chain." magesh.ai/kill-chain (2026)

This work represents the author's independent research and personal views. It is not related to or endorsed by the author's employer.