Blog
MCP in medtech: what the protocol actually unlocks for regulated documentation
A quality lead I worked with last year was maintaining three manual exports per week to keep her AI tool's context current. PLM export Monday. QMS export Wednesday. Test management export Friday. About 40 minutes each. The week before submission, she caught a version mismatch anyway — a design input revised since the Wednesday export.
That's not a discipline problem. That's what copy-paste between regulated systems looks like at volume. The fix isn't better manual discipline. It's an architecture that lets the agent reach into the live record directly.
Why the multi-system problem doesn't have a manual solution
A serious medical device company's truth is distributed across five to seven systems before you count legacy spreadsheets. PLM for design history. QMS for risk files and change records. Test management for verification evidence. CAD for the physical artifact. Clinical data systems for human factors and performance data. Standards library. FDA 510(k) database for predicate research.
When an engineer drafts a 510(k) section manually, she opens five browser tabs, exports a BOM from the PLM, copies a risk summary from the QMS, finds the predicate from the FDA database, assembles by hand. Every step introduces a version risk. The BOM export might be three days old. The risk summary might reflect a superseded revision. The predicate comparison might miss the most recent 510(k) clearances.
The same problem hits AI-assisted workflows. A general-purpose LLM fed documents manually works from whatever you copied in, not the current authoritative record. If the document you uploaded is stale, the output is stale.
Manual copy-paste between regulated systems isn't a minor inconvenience. It's a version control gap, an audit trail gap, and in certain interpretations a 21 CFR 820 design control gap — because your documentation may no longer accurately reflect your design.
What MCP is and why it matters here
Model Context Protocol is an open standard from Anthropic, released publicly in late 2024, now supported natively by Anthropic, OpenAI, Google, and Microsoft. Over 6,400 registered MCP servers as of February 2026. PLM vendors, QMS platforms, ERP providers, and clinical data systems are all investing in MCP integrations because their customers are asking for them.
The protocol has three parts: client, server, transport. The client is the AI agent. The server is the system exposing data — your QMS, your PLM, your FDA database integration. The transport is how they communicate: stdio for local processes, streamable HTTP for remote services (which superseded the protocol's original HTTP-with-Server-Sent-Events transport).
The key design decision is that servers don't expose raw database access. They expose tools — defined, named operations with typed parameters and typed return values. A QMS MCP server doesn't give the agent a SQL connection. It gives callable functions: get_document, search_by_version, get_change_history. The agent calls the tool. The server executes it against the underlying system. The result comes back structured.
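That tool boundary can be sketched in a few lines. This is a hypothetical illustration, not the MCP SDK: the registry, the decorator, and the in-memory document store are all stand-ins, though the tool name get_document and its typed parameters mirror the examples above.

```python
from typing import Any, Callable

# Stand-in for the QMS backend: keyed by (document id, revision).
_QMS = {("SOP-042", "2.1"): {"id": "SOP-042", "version": "2.1",
                             "status": "approved"}}

# The server's entire surface area: named tools, nothing else.
TOOLS: dict[str, Callable[..., dict[str, Any]]] = {}

def tool(fn: Callable[..., dict[str, Any]]) -> Callable[..., dict[str, Any]]:
    """Register a named operation; the agent can call only what is listed here."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_document(id: str, version: str) -> dict[str, Any]:
    """Return one document at one revision -- never a raw SQL connection."""
    return _QMS[(id, version)]

# Agent side: look the tool up by name, call it with typed parameters,
# get a structured result back.
result = TOOLS["get_document"](id="SOP-042", version="2.1")
```

The point of the shape is the blast radius: anything not registered in TOOLS simply does not exist from the agent's perspective.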
This boundary matters for two reasons. It limits blast radius — the agent can only do what the server's tool definitions permit. And it makes every data access a discrete, loggable event. That second property is what makes MCP interesting in a regulated context.
What MCP servers look like for medtech systems
An MCP server is an interface definition for a regulated system. It exposes a curated set of operations that cover the access patterns an AI agent actually needs — not the database, not the API internals.
A QMS MCP server exposes things like get_document(id, version) to retrieve a specific document at a specific revision, get_change_history(id) to retrieve the full change record including approvers and dates, get_risk_controls(hazard_id) to return risk controls associated with a specific hazard. A PLM server exposes get_design_input(id, version), get_bom(product_id, revision), list_design_outputs(design_input_id). A test management server exposes get_test_results(requirement_id), get_verification_summary(version), get_open_defects(severity).
The agent doesn't need to know whether your QMS is Greenlight Guru, MasterControl, or Veeva. It calls the same tool names and gets the same structured response. The server handles the translation.
What composition looks like in a real workflow
The real power is composition. One agent, multiple servers, one orchestrated workflow.
Take 510(k) predicate analysis. To generate a substantive predicate comparison you need: the predicate's cleared indications from the FDA 510(k) database; your device's design inputs and intended use from the PLM; your design outputs and specifications; the relevant FDA guidance from your standards library. Manual assembly takes hours and introduces every version risk described above.
With four MCP servers composed — FDA 510(k) database, PLM, QMS, standards library — the agent executes this in one workflow. Call fda_510k.search_predicates(device_type, intended_use). Call fda_510k.get_clearance_summary(k_number) for each candidate. Call plm.get_design_input(id, version="current"). Call qms.get_document(id="intended_use_statement", version="approved"). Call standards.get_guidance(topic="510k_comparison").
Authoritative, current data from four systems in one workflow. No human copying anything. Every call logged. Every piece of data versioned. The output document cites exactly which revision of your design input it compared against, and which predicate clearance it referenced.
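The composed workflow above can be sketched end-to-end. Everything here is hypothetical scaffolding: each of the four servers is stubbed as a plain Python object with canned responses, and the stubs ignore their parameters. The server, tool, and parameter names mirror the text.

```python
class Stub:
    """Fake MCP server: any tool name returns its canned response."""
    def __init__(self, **responses):
        self._r = responses
    def __getattr__(self, tool):
        return lambda **params: self._r[tool]  # params ignored in this sketch

fda_510k = Stub(
    search_predicates=[{"k_number": "K240001"}],
    get_clearance_summary={"k_number": "K240001",
                           "decision": "substantially equivalent"},
)
plm = Stub(get_design_input={"id": "DI-007", "version": "2.3"})
qms = Stub(get_document={"id": "intended_use_statement", "version": "approved"})
standards = Stub(get_guidance={"topic": "510k_comparison"})

def predicate_analysis() -> dict:
    """One orchestrated workflow across four servers -- no human copying."""
    candidates = fda_510k.search_predicates(device_type="x", intended_use="y")
    return {
        "predicates": [fda_510k.get_clearance_summary(k_number=c["k_number"])
                       for c in candidates],
        "design_input": plm.get_design_input(id="DI-007", version="current"),
        "intended_use": qms.get_document(id="intended_use_statement",
                                         version="approved"),
        "guidance": standards.get_guidance(topic="510k_comparison"),
    }

report = predicate_analysis()
```

In production each Stub would be an MCP client session against a real server; the composition logic is the part that stays the same.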
The audit trail advantage most teams don't see immediately
MCP tool calls are more auditable than manual workflows. Not less.
When an engineer manually assembles a submission section, the audit trail is thin — maybe a change record saying "document approved," with no record of which version of which PLM artifact was referenced, or when the reference was captured, or whether it was current at the time.
Every MCP tool call is a discrete, loggable event. Server name. Tool name. Parameters passed. Timestamp. Result hash. That gives you a complete provenance record for every piece of data that contributed to a generated document: what was retrieved, from where, at what version, at what time.
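A minimal sketch of such a provenance record, using only the fields named above. The logging function and the example call are hypothetical; a real gateway would also record the acting identity and persist the entries.

```python
import hashlib
import json
from datetime import datetime, timezone

def log_tool_call(server: str, tool: str, params: dict, result: dict) -> dict:
    """One audit entry per MCP tool call: what was asked, of which server,
    with which parameters, when, and a hash of what came back."""
    return {
        "server": server,
        "tool": tool,
        "params": params,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "result_hash": hashlib.sha256(
            json.dumps(result, sort_keys=True).encode()).hexdigest(),
    }

entry = log_tool_call(
    server="qms",
    tool="get_document",
    params={"id": "RMF-001", "version": "3.0"},
    result={"id": "RMF-001", "version": "3.0", "status": "approved"},
)
```

The result hash is what lets you later prove the retrieved data has not changed: re-run the retrieval, re-hash, and compare.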
When an FDA reviewer asks how a specific claim in your 510(k) was substantiated, you point to the retrieval record. When an internal auditor asks whether the risk summary reflects the approved version of your risk management file, you produce the call log. When a change control event requires re-verifying that submitted documents still reflect the current design, you re-run the retrieval and diff the results.
Manual workflows produce outputs. MCP workflows produce outputs with provenance. In a regulated context, that's the difference between "we believe this is current" and "here's the record showing it was current at 14:23 on November 14."
Four governance problems to solve before production
Getting MCP to work in a demo is straightforward. Getting it into production in a regulated environment requires solving problems most teams don't think about until they're already in the weeds.
Authentication and access control. Every data access needs to be attributable to a user or process. An MCP server accepting anonymous requests doesn't meet this bar. The 2026 production pattern is SSO-integrated authentication — the agent operates under a service account tied to a specific user context, with access scoped by project, device, and regulatory pathway. Not blanket read access to the entire system.
Version pinning. "Always retrieve current" is fine for document drafting. It's not fine for verification activities, where you need to prove the agent was working with the same design revision as your test records. MCP servers in a regulated context need explicit version parameters — get_design_input(id, version="2.3") — so that document generation workflows can pin to a specific design history state and produce a reproducible result.
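The pinned-versus-current distinction can be made concrete. In this hypothetical sketch the PLM is an in-memory dict and the IPX values are invented; the point is that a pinned call is reproducible while a "current" call moves with the record.

```python
# Stand-in PLM: two revisions of one design input, keyed by (id, version).
_PLM = {
    ("DI-007", "2.3"): {"id": "DI-007", "version": "2.3", "value": "IPX7"},
    ("DI-007", "2.4"): {"id": "DI-007", "version": "2.4", "value": "IPX8"},
}
_CURRENT = {"DI-007": "2.4"}  # what "current" resolves to at call time

def get_design_input(id: str, version: str = "current") -> dict:
    """Explicit version parameter: pin for reproducibility, or float on current."""
    if version == "current":
        version = _CURRENT[id]
    return _PLM[(id, version)]

pinned = get_design_input("DI-007", version="2.3")  # same answer every run
live = get_design_input("DI-007")                   # changes when the PLM does
```

A document generation workflow that must match existing test records calls the pinned form; a fresh draft can float on current.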
Write gates. Read operations and write operations are not equivalent risk. An agent reading your QMS to generate a draft is low risk. An agent writing to your QMS — creating records, marking items approved, closing change orders — is a different risk profile entirely. The right architecture is asymmetric: agents read freely; write operations require explicit human approval, routed through a human-in-the-loop gate that appears in the same audit log as the read operations.
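A sketch of the asymmetric gate, with all class, method, and record names hypothetical: reads pass straight through, while a write produces a pending ticket that only executes once a named human approves it.

```python
class GatedQMS:
    """Reads are ungated; writes queue for explicit human approval."""
    def __init__(self):
        self.pending: list[dict] = []  # write requests awaiting approval
        self.records: dict[str, dict] = {}

    def get_document(self, id: str) -> dict:
        # Read path: no gate, but still a loggable event.
        return self.records.get(id, {"id": id, "status": "missing"})

    def request_close_change_order(self, id: str) -> int:
        # Write path: nothing executes yet -- the agent only gets a ticket.
        self.pending.append({"op": "close_change_order", "id": id})
        return len(self.pending) - 1

    def approve(self, ticket: int, approver: str) -> dict:
        # Human-in-the-loop gate: the write runs under a named approver.
        req = self.pending[ticket]
        self.records[req["id"]] = {"id": req["id"], "status": "closed",
                                   "approved_by": approver}
        return self.records[req["id"]]

gated = GatedQMS()
ticket = gated.request_close_change_order("CO-118")
# Nothing has been written: the change order is still absent from the record.
record = gated.approve(ticket, approver="quality.lead")
```

Both the request and the approval land in the same audit log as the reads, so the provenance record covers the full lifecycle of the write.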
Data residency. If your device program is EU MDR-scoped and your clinical data is GDPR-scoped, the MCP transport cannot route that data through servers outside your permitted region. This is a deployment architecture question, not an MCP protocol question — but regulated teams need to answer it before production.
There's a fifth: agent access governance. Regulated companies have mature processes for human access control — an engineer requests QMS access for a project, a manager approves it, the approval is recorded. Those processes need to extend to agent access control with the same rigor, the same approval records, the same periodic review. This is new work for most quality systems.
MANKAIND
The teams that move fastest in the next two years won't be the ones with the most documentation headcount. They'll be the ones with AI agents connected directly to their engineering record — so that when a submission section needs drafting, the agent is working from the current, version-controlled truth.
MANKAIND exposes your engineering record through a governed MCP layer. Agents that generate documentation — submission sections, risk summaries, predicate analyses, design history narratives — draw from the current, version-controlled record, with every access logged for audit. Read operations run against live data. Write operations go through human approval. The gateway enforces access scope by project and device program. The audit log meets the same requirements as your eQMS.
You don't have to build the gateway. You don't have to negotiate MCP server implementations with your PLM vendor. You don't have to design the access governance model from scratch. The question is whether you want your next submission drafted from the current record or from last week's exports.
See how MANKAIND handles this
30-minute demo. Bring your hardest design controls question.