LLMs in Clinical Settings: What the FDA, HIPAA, and Your Hospital Client Actually Require

In this guide, you’ll learn:
- Where the FDA draws the line between a clinical LLM and a regulated medical device
- Three HIPAA risks LLMs create that most compliance programs overlook
- What hospital procurement teams review before approving an LLM product
- A practical pre-deployment checklist to use before any hospital discussion
One scenario keeps repeating across HealthTech. A founder or CTO builds a genuinely useful clinical LLM product. It summarizes notes, helps with documentation, and flags potential drug interactions. The demo goes well and the hospital is interested.
Then the IT security team sends a 90-question vendor questionnaire, and the clinical governance committee asks for the FDA regulatory classification and a HIPAA Business Associate Agreement.
Everything comes to a complete stop. Not because the product is bad, but because the team built an impressive AI product without first building a compliant one.
Under HIPAA, unauthorized disclosure of PHI can lead to penalties ranging from $141 to over $2.1 million per violation. Healthcare data breaches affected over 168 million individuals in 2024, driven largely by large-scale cyberattacks on healthcare vendors and clearinghouses.
When an LLM is in that chain, the liability does not sit with the model. It sits with you.
This guide covers what the FDA, HIPAA, and your hospital clients actually require before an LLM goes anywhere near a clinical workflow.
Understand What Makes a Clinical LLM Different From Any Other LLM
Unlike general AI models trained on internet content, medical LLMs learn from PubMed literature (35 million+ biomedical citations), clinical documentation, and electronic health records. But the data the model was trained on is not what determines your compliance obligations. What determines your obligations is what the model does with patient data in deployment.
The moment your LLM touches, processes, generates, or transmits Protected Health Information (PHI), HIPAA applies. The moment it produces outputs that influence clinical decisions about specific patients, FDA oversight becomes relevant.
These are two separate regulatory questions that require separate answers.
| LLM Use Case | HIPAA Applies? | FDA Oversight? |
|---|---|---|
| Ambient documentation (clinical note drafting) | Yes (processes PHI) | Generally no, if not decision support |
| Discharge summary generation | Yes | Generally no |
| Prior authorisation letter drafting | Yes | Generally no |
| Drug interaction checking | Yes | Yes (likely SaMD) |
| Diagnostic suggestion from patient history | Yes | Yes (SaMD, moderate to high risk) |
| Clinical risk stratification | Yes | Yes (SaMD) |
| Chatbot answering patient symptom questions | Yes | Depends on clinical specificity |
| Billing code suggestion (ICD-10) | Yes | Generally no |
| Research literature summarization (no PHI) | Only if PHI enters the prompts | No |
The most dangerous assumption in HealthTech right now is that an LLM used "just for documentation" sits outside the regulatory perimeter. If it processes PHI, it is inside HIPAA. If its output influences a clinical decision, it may also be inside FDA oversight.
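As a rough illustration of keeping the two questions separate, the sketch below models a use case with two independent flags. The `UseCase` class, its field names, and the `regulatory_questions` helper are hypothetical names invented for this example, and the result is a first-pass triage aid, not a regulatory determination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """Hypothetical description of one LLM feature."""
    name: str
    touches_phi: bool               # does PHI enter prompts, outputs, or logs?
    informs_patient_decision: bool  # does output influence care for a specific patient?


def regulatory_questions(use_case: UseCase) -> dict:
    """Answer the HIPAA and FDA questions independently of each other."""
    return {
        "hipaa_applies": use_case.touches_phi,
        "fda_review_needed": use_case.informs_patient_decision,
    }


ambient_notes = UseCase("Ambient documentation", True, False)
interaction_check = UseCase("Drug interaction checking", True, True)

print(regulatory_questions(ambient_notes))
print(regulatory_questions(interaction_check))
```

Both examples fall inside HIPAA, but only the second raises the FDA question, which is why "just documentation" does not mean "outside the perimeter".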
What HIPAA Actually Requires From Your LLM Architecture
Traditional HIPAA compliance focused on database access controls and encryption. LLMs introduce three distinct attack surfaces that standard compliance programmes do not automatically cover.
Attack Surface 1: PHI in Training Data
When you fine-tune a model on clinical notes, PHI can become embedded in model weights. Large language models can reproduce verbatim text from training data under adversarial prompting.
This means:
- Any LLM fine-tuned on patient records without proper de-identification is a standing HIPAA violation, regardless of how the model performs clinically
- De-identification must meet the HIPAA Safe Harbor standard (removing 18 specific identifiers) or Expert Determination (statistical proof that re-identification risk is very low)
- If your model was pre-trained or fine-tuned by a third party on clinical data, you need documented evidence of how that data was handled before you can deploy the model for your own clinical use cases
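To make the de-identification point concrete, here is a minimal sketch of a pre-training scrub step. The patterns and the `scrub` function are assumptions made for illustration and cover only a handful of the 18 Safe Harbor identifier categories. Regex alone will miss names, addresses, and free-text identifiers, so treat this as one layer in a pipeline, not as evidence of Safe Harbor compliance.

```python
import re

# Hypothetical patterns for a few identifier types only.
# A real pipeline needs coverage of all 18 Safe Harbor categories.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),
}


def scrub(text: str) -> tuple[str, dict]:
    """Replace matched identifiers with placeholders and count the hits."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, hits = pattern.subn(f"[{label}]", text)
        counts[label] = hits
    return text, counts


note = "Pt seen 03/14/2024, MRN: 448812, call 555-201-7788 or jdoe@example.com."
clean, counts = scrub(note)
print(clean)
print(counts)
```

Keeping the per-category counts is a deliberate choice: they give you something to show an auditor about what was removed from each training batch.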
Attack Surface 2: PHI in Inference Calls
Every time your application sends patient data to an LLM API, that data is transmitted to the model provider's infrastructure. If that infrastructure is not covered by a signed Business Associate Agreement (BAA), the transmission is a HIPAA violation.
BAA status for major LLM providers as of 2026:
| Provider | HIPAA BAA Available? | Notes |
|---|---|---|
| OpenAI API (direct) | No | No BAA available for direct API access |
| Azure OpenAI Service | Yes | BAA available through Microsoft enterprise agreements |
| Google Vertex AI (Gemini) | Yes | BAA available through Google Cloud healthcare agreements |
| Anthropic Claude API | Yes | BAA available for enterprise agreements |
| AWS Bedrock | Yes | BAA available through AWS healthcare agreements |
| Self-hosted open-source (Llama, Mistral) | Not applicable | You control the infrastructure. HIPAA applies to your setup, not a provider |
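While the direct OpenAI API is listed above without a BAA, Azure OpenAI Service provides access to OpenAI's models under Microsoft's BAA. Similarly, Anthropic offers a BAA for its Claude API under enterprise agreements. Note that there is no official HIPAA certification for vendors: what matters is a signed BAA that covers the specific services you use.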
The practical implication: if you are calling an LLM API directly and sending any patient data in the prompt, check whether a BAA exists. If it does not, you are in violation regardless of what your privacy policy says.
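One way to enforce this in code is to fail closed before any PHI leaves your application. The sketch below assumes a registry your compliance team maintains. The provider names, `BAA_REGISTRY`, `MissingBAAError`, and `send_prompt` are all hypothetical, and the actual API call is left as a placeholder.

```python
class MissingBAAError(RuntimeError):
    """Raised when PHI would be sent to a provider without a signed BAA."""


# Hypothetical registry: provider key -> is a signed BAA on file?
BAA_REGISTRY = {
    "provider-with-baa": True,
    "provider-without-baa": False,
}


def assert_baa_covered(provider: str, contains_phi: bool) -> None:
    """Block PHI transmission unless a BAA is recorded. Unknown providers are blocked."""
    if contains_phi and not BAA_REGISTRY.get(provider, False):
        raise MissingBAAError(f"No signed BAA on record for {provider!r}")


def send_prompt(provider: str, prompt: str, contains_phi: bool) -> str:
    assert_baa_covered(provider, contains_phi)
    # Placeholder: the real provider API call would go here.
    return f"sent {len(prompt)} chars to {provider}"


print(send_prompt("provider-with-baa", "Summarize this visit note.", contains_phi=True))
```

Treating an unknown provider the same as a provider without a BAA means a newly added integration cannot receive PHI until someone records its agreement.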
Attack Surface 3: PHI in Model Outputs and Logs
LLM outputs that contain patient information are themselves PHI if they identify or could identify a specific patient. This means:
- Application logs that capture LLM inputs and outputs must be treated as PHI
- Any analytics or monitoring on LLM responses that stores raw text must be on HIPAA-compliant infrastructure
- Caching of LLM responses that contain patient-specific content must follow HIPAA retention and deletion rules
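A common mitigation is to keep raw prompts and completions out of general-purpose application logs altogether. The sketch below records only metadata and a digest. The `log_safe_record` function and its field names are assumptions for illustration. A hash of a short or guessable string is not de-identification by itself, so the log store should still sit on HIPAA-eligible infrastructure.

```python
import hashlib
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm_app")


def log_safe_record(prompt: str, completion: str, model_version: str) -> dict:
    """Log metadata about an LLM call without writing the raw text."""
    record = {
        "model_version": model_version,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt_chars": len(prompt),
        "completion_chars": len(completion),
    }
    logger.info(json.dumps(record))
    return record


record = log_safe_record(
    prompt="Summarize the visit for Jane Roe, DOB 01/02/1960.",
    completion="Patient reports improved mobility.",
    model_version="example-model-v1",  # hypothetical identifier
)
```

The digest lets you correlate a log line with a prompt stored in a controlled PHI store without duplicating the text into every log sink.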
What HIPAA requires from your LLM infrastructure:
- All PHI processed by the LLM is on HIPAA-eligible infrastructure with a signed BAA
- Encryption in transit (TLS 1.2 minimum) for all API calls containing PHI
- Encryption at rest for all stored inputs, outputs, and logs
- Audit logging: who accessed what, when, and what the LLM produced
- Role-based access controls on all systems that handle LLM inputs or outputs
- A HIPAA risk assessment that specifically addresses your LLM use cases
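The TLS requirement in the list above can be pinned in code instead of being left to library defaults. This is a minimal sketch using Python's standard `ssl` module. The `phi_tls_context` name is made up for this example, and how you pass the context depends on your HTTP client.

```python
import ssl


def phi_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses anything below TLS 1.2."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = phi_tls_context()
print(ctx.minimum_version)
```

With the standard library, the context can be passed as `urllib.request.urlopen(url, context=ctx)` or `http.client.HTTPSConnection(host, context=ctx)`.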
What the FDA Requires: The SaMD Question for LLMs
The FDA's approach to LLMs in clinical settings flows from its existing Software as a Medical Device framework. The key question is the same one covered in our guide on building compliant AI for HealthTech products: does the LLM output inform, drive, or replace a clinical decision about a specific patient?
For LLMs specifically, the FDA's Clinical Decision Support (CDS) Software guidance, finalized in September 2022, provides the clearest framework:
Non-Device CDS (generally not requiring FDA clearance):
- The LLM displays standard medical reference information (like a drug reference tool)
- The clinician can independently verify the basis of the recommendation without relying on the LLM
- The LLM does not acquire patient-specific data beyond basic demographics in order to operate
Device CDS (likely requiring FDA clearance):
- The LLM acquires, processes, or analyses patient-specific data to provide a recommendation
- The basis of the recommendation is not independently verifiable by the clinician using the LLM
- The recommendation is intended to replace or reduce the clinical judgement required from the clinician
For LLM products specifically, this means:
| LLM Feature | Non-Device or Device CDS? | Regulatory Implication |
|---|---|---|
| Summarising a patient record without clinical interpretation | Non-Device | No clearance required |
| Generating a differential diagnosis list from patient history | Device CDS | 510(k) or De Novo likely required |
| Drafting a clinical note from ambient audio | Non-Device (if no clinical recommendation) | No clearance required |
| Flagging abnormal lab values with clinical interpretation | Device CDS | FDA clearance pathway needed |
| Answering patient symptom questions with clinical guidance | Device CDS if specific to patient | Clearance likely required |
| Extracting structured data from unstructured clinical text | Non-Device | No clearance required |
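The Device versus Non-Device criteria listed in this section can be turned into a first-pass triage helper for product reviews. The sketch below encodes only the three criteria as this guide states them. `FeatureProfile`, its fields, and `triage_cds` are hypothetical names, and the result is a prompt for a regulatory conversation, not an FDA determination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureProfile:
    """Hypothetical answers to the three CDS questions for one feature."""
    uses_patient_data_for_recommendation: bool
    basis_independently_verifiable: bool
    intended_to_replace_judgement: bool


def triage_cds(feature: FeatureProfile) -> tuple[str, list[str]]:
    """Return a provisional label plus the criteria that triggered it."""
    reasons = []
    if feature.uses_patient_data_for_recommendation:
        reasons.append("analyses patient-specific data to make a recommendation")
    if not feature.basis_independently_verifiable:
        reasons.append("clinician cannot independently verify the basis")
    if feature.intended_to_replace_judgement:
        reasons.append("intended to replace or reduce clinical judgement")
    label = "likely Device CDS" if reasons else "likely Non-Device CDS"
    return label, reasons


record_summary = FeatureProfile(False, True, False)
differential_list = FeatureProfile(True, False, False)

print(triage_cds(record_summary))
print(triage_cds(differential_list))
```

Returning the reasons alongside the label gives you the start of the written rationale that a clinical governance committee will ask for.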
The adaptive LLM problem:
Most modern LLMs can be updated, retrained, or prompted differently after deployment. This is the adaptive AI issue that the FDA's Predetermined Change Control Plan (PCCP) framework addresses.
If your LLM can change its clinical behaviour after deployment, you need a PCCP filed with the FDA as part of your clearance submission. Teams that deploy a fixed-version LLM and then update it without a PCCP are creating a significant regulatory problem.
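One practical guard against silent model changes is to refuse any model identifier that has not been explicitly validated. The sketch below is an assumption about how such a check might look. The version strings, `VALIDATED_VERSIONS`, `FLOATING_ALIASES`, and `check_deployable` are hypothetical, and the check does not replace a PCCP.

```python
# Hypothetical identifiers: versions that have passed clinical validation.
VALIDATED_VERSIONS = {"clinical-summarizer-2025-03-01"}

# Suffixes that typically indicate a moving target rather than a fixed version.
FLOATING_ALIASES = ("latest", "stable", "preview")


def check_deployable(model_id: str) -> str:
    """Return the model id if it is pinned and validated, otherwise raise."""
    if model_id.endswith(FLOATING_ALIASES):
        raise ValueError(f"{model_id!r} is a floating alias, not a pinned version")
    if model_id not in VALIDATED_VERSIONS:
        raise ValueError(f"{model_id!r} has not been clinically validated")
    return model_id


print(check_deployable("clinical-summarizer-2025-03-01"))
```

Running a check like this at application start-up means a changed model identifier fails loudly instead of quietly altering clinical behaviour.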
What Your Hospital Client Actually Checks
FDA classification and HIPAA compliance are the regulatory requirements. But hospital clients add a layer of practical requirements on top of these that many product teams are not prepared for.
A typical hospital enterprise procurement for a clinical LLM product will include:
From the IT Security Team:
- Evidence of HIPAA-eligible infrastructure and signed BAA with your LLM provider
- SOC 2 Type II report for your product
- Penetration test results within the last 12 months
- Data flow diagram showing exactly where PHI goes: from EHR to your application, into the LLM API, and back
- Confirmation of data residency (especially for UK NHS and GCC hospital systems)
- Confirmation that no patient data is used to train or improve the model without explicit consent
From the Clinical Governance Committee:
- FDA regulatory classification document confirming whether clearance is required and its status
- Clinical validation data: how was the model tested and what were the results on your target patient population
- Hallucination rate and what controls are in place to detect and handle incorrect outputs
- Version control policy: what version of the model is deployed and how are updates managed
- Audit trail: how do you log which model version produced which clinical output for which patient
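The audit trail question is easier to answer if every inference writes a small, structured record. The sketch below keeps records in an in-memory list purely for illustration. `InferenceAuditRecord`, `record_inference`, `model_version_for`, and the example identifiers are hypothetical, and a real system would write to append-only storage on HIPAA-eligible infrastructure.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class InferenceAuditRecord:
    output_id: str       # unique id attached to the generated output
    patient_ref: str     # internal patient reference, not a raw identifier
    user_id: str         # who triggered the inference
    model_version: str   # exact model version that produced the output
    timestamp_utc: str


AUDIT_LOG: list[InferenceAuditRecord] = []  # stand-in for append-only storage


def record_inference(patient_ref: str, user_id: str, model_version: str) -> InferenceAuditRecord:
    """Create and store one audit record per inference."""
    rec = InferenceAuditRecord(
        output_id=str(uuid.uuid4()),
        patient_ref=patient_ref,
        user_id=user_id,
        model_version=model_version,
        timestamp_utc=datetime.now(timezone.utc).isoformat(),
    )
    AUDIT_LOG.append(rec)
    return rec


def model_version_for(output_id: str):
    """Answer: which model version produced this specific output?"""
    for rec in AUDIT_LOG:
        if rec.output_id == output_id:
            return rec.model_version
    return None


rec = record_inference("patient-ref-001", "clinician-42", "example-model-v1")
print(model_version_for(rec.output_id))
```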
From the Legal and Procurement Team:
- HIPAA Business Associate Agreement
- Data Processing Agreement (DPA) for UK and EU deployments under UK GDPR or EU GDPR
- Indemnification terms covering AI-related clinical errors
- Subprocessor list: every third party that handles PHI on your behalf, including the LLM provider
93% of healthcare organisations were hit by a cyber attack in the previous 12 months. 35% of respondents identified employee failure to follow policies as the main reason behind data loss. Hospital procurement teams know these numbers. Their questions reflect that awareness.
International Compliance: UK NHS and GCC Requirements
UK NHS requirements for LLM products:
The NHS Data Security and Protection Toolkit (DSPT) is the baseline compliance framework for any product handling NHS patient data. For LLM products specifically:
- UK GDPR applies to all PHI processing (separate from EU GDPR post-Brexit)
- The MHRA's SaMD guidance mirrors FDA logic for decision-support AI
- NHS Digital's guidance on AI and data ethics applies to any LLM used in clinical care
- Data processed for NHS patients must remain within UK or adequately protected jurisdictions
- NICE evidence standards for AI products are increasingly referenced in NHS procurement
GCC requirements for LLM products:
- Saudi Arabia's PDPL requires that patient data is processed on Saudi-resident infrastructure. An LLM API call that routes PHI through a US or EU data centre for a Saudi patient is non-compliant regardless of model quality.
- UAE's MOHAP digital health framework is developing specific AI guidance. Current expectation is that PHI remains on UAE-based infrastructure.
- Saudi Arabia's NCA ECC 2:2024 adds mandatory cybersecurity controls that extend to AI systems handling health data.
- SFDA guidance for SaMD applies to any LLM that qualifies as a medical device under Saudi Arabia's framework.
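Residency rules like these are easier to honour when routing is explicit and fails closed. The sketch below maps a patient's jurisdiction to an in-region endpoint. The `REGION_ENDPOINTS` mapping, the example URLs, and `endpoint_for` are hypothetical, and the jurisdictions you need depend on your markets and legal advice.

```python
# Hypothetical in-region inference endpoints, one per jurisdiction.
REGION_ENDPOINTS = {
    "KSA": "https://llm.ksa.example.internal",
    "UAE": "https://llm.uae.example.internal",
    "UK": "https://llm.uk.example.internal",
}


def endpoint_for(patient_jurisdiction: str) -> str:
    """Return the in-region endpoint, or raise instead of falling back to a default."""
    try:
        return REGION_ENDPOINTS[patient_jurisdiction]
    except KeyError:
        raise ValueError(
            f"No approved in-region endpoint for {patient_jurisdiction!r}"
        ) from None


print(endpoint_for("KSA"))
```

The absence of a default endpoint is the point of the design: an unmapped jurisdiction stops the request instead of sending PHI to whichever region happens to be configured globally.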
Pre-Deployment Checklist for Clinical LLM Products
Work through this before your next hospital conversation or procurement submission.
Regulatory Classification
- FDA SaMD classification completed. Documented as Device CDS or Non-Device CDS with rationale.
- If SaMD: regulatory pathway identified (510(k), De Novo, or PMA).
- If adaptive model: PCCP filed or in progress.
- UK MHRA classification confirmed if selling to NHS.
- SFDA/MOHAP classification confirmed if selling in GCC.
HIPAA and Data Privacy
- BAA signed with every LLM provider that receives PHI.
- All infrastructure confirmed HIPAA-eligible (not just cloud provider, but specific services used).
- Training data de-identified under Safe Harbor or Expert Determination if fine-tuned on clinical data.
- Data flow diagram produced showing PHI flow from source to LLM to output to storage.
- HIPAA risk assessment covers LLM-specific use cases explicitly.
- UK GDPR DPA in place for NHS deployments.
- Data residency confirmed per market (UK, KSA, UAE).
Technical Controls
- Encryption in transit (TLS 1.2 minimum) for all LLM API calls.
- Encryption at rest for all stored inputs, outputs, and logs.
- Audit logging per inference: model version, timestamp, user, input category, output category.
- Model version control: you can answer which model version produced any specific output.
- Hallucination detection or confidence scoring in output layer.
- Role-based access controls on all systems handling LLM inputs and outputs.
Hospital Procurement Readiness
- SOC 2 Type II report available.
- Penetration test completed within 12 months.
- Clinical validation data documented with methodology and results.
- FDA regulatory status document prepared.
- HIPAA BAA template ready to sign.
- Subprocessor list current and available.
Conclusion
Building an LLM for healthcare is no longer just an AI challenge. It is a compliance, security, and regulatory challenge from day one. A model that summarizes notes today can quickly become a regulated clinical decision support tool tomorrow, depending on how it is used.
At the same time, HIPAA obligations extend far beyond databases to include training data, prompts, outputs, logs, and every vendor involved in the AI workflow. Hospital buyers understand these risks and increasingly evaluate AI products through the lens of governance, security, and clinical safety rather than model performance alone.
Teams that address FDA classification, HIPAA requirements, data residency, auditability, and procurement readiness early gain a significant advantage. Before your next hospital conversation, ensure your LLM architecture is designed not only to deliver value but also to withstand regulatory scrutiny and enterprise due diligence.