
Microsoft Foundry Blog

Automate Prior Authorization with AI Agents - Now Available as a Foundry Template

Apr 23, 2026

By Amit Mukherjee · Principal Solutions Engineer, Microsoft Health & Life Sciences

Lindsey Craft-Goins · Technology Leader - Cloud & AI Platforms, Health & Life Sciences

Joel Borellis · Director Solutions Engineering - Cloud & AI Platforms, Health & Life Sciences

 

Prior authorization (PA) is one of the most expensive bottlenecks in U.S. healthcare. Physicians complete an average of 39 PA requests per week, spending roughly 13 hours of physician-and-staff time on PA-related work (AMA 2024 Prior Authorization Physician Survey). Turnaround averages 5–14 business days, and PA alone accounts for an estimated $35 billion in annual administrative spending (Sahni et al., Health Affairs Scholar, 2024).

The regulatory clock is now ticking. CMS-0057-F mandates electronic PA with 72-hour urgent response starting in 2026. Forty-nine states plus DC already have PA laws on the books, and at least half of all U.S. state legislatures introduced new PA reform bills this year, including laws specifically targeting AI use in PA decisions (KFF Health News, April 2026).

Today we’re making the Prior Authorization Multi-Agent Solution Accelerator available as a Microsoft Foundry template. Health plan payers can deploy a working, four-agent PA review pipeline to Azure using the Azure Developer CLI (“azd”) with a single command in supported environments, then customize it to their policies, workflows, and EHR environment.

 

Try it now: Find the template in the Foundry template gallery, or clone directly from github.com/microsoft/Prior-Authorization-Multi-Agent-Solution-Accelerator

What the template delivers

The accelerator deploys four specialist Foundry hosted agents (Compliance, Clinical Reviewer, Coverage, and Synthesis), each independently containerized and managed by Foundry. In internal testing with synthetic demo cases, the pipeline completed the review workflow, from intake to recommendation, in under 5 minutes per case.

 

| Agent | Role | Key capability |
| --- | --- | --- |
| Compliance | Documentation check | 10-item checklist with blocking/non-blocking flags |
| Clinical Reviewer | Clinical evidence | ICD-10 validation, PubMed + ClinicalTrials.gov search |
| Coverage | Policy matching | CMS NCD/LCD lookup, per-criterion MET/NOT_MET mapping |
| Synthesis | Decision rubric | 3-gate APPROVE/PEND with weighted confidence scoring |

 

Compliance and Clinical run in parallel. Coverage runs after clinical findings are ready. Synthesis evaluates all three outputs through a three-gate rubric. The result is a structured recommendation with per-criterion confidence scores and a full audit trail, not a black-box answer.

Solution architecture

The accelerator runs entirely on Azure. The frontend and backend deploy as Azure Container Apps. The four specialist agents are hosted by Microsoft Foundry. Real-time healthcare data flows through third-party MCP servers.

 

Figure 1: Azure solution architecture

How the pipeline works

The four agents execute in a structured parallel-then-sequential pipeline. Compliance and Clinical run simultaneously in Phase 1. Coverage runs after clinical findings are ready. The Synthesis agent applies a three-gate decision rubric over all prior outputs.

 

Figure 2: Agentic architecture, hosted agent pipeline
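To make the phase ordering concrete, here is a minimal sketch of a parallel-then-sequential orchestrator in plain asyncio. It is not the accelerator's code: the four `run_*` functions, their return values, and the `case` dict are hypothetical stand-ins that sleep instead of calling hosted agents.

```python
import asyncio

# Hypothetical stand-ins for the four hosted agents. In a real deployment
# each would be a call to an agent endpoint; here each one sleeps briefly
# and returns a small dict so the ordering can be observed.
async def run_compliance(case: dict) -> dict:
    await asyncio.sleep(0.02)
    return {"agent": "compliance"}

async def run_clinical(case: dict) -> dict:
    await asyncio.sleep(0.03)
    return {"agent": "clinical"}

async def run_coverage(case: dict, clinical: dict) -> dict:
    # Coverage consumes the clinical output, so it cannot start earlier.
    await asyncio.sleep(0.01)
    return {"agent": "coverage", "used_clinical": clinical["agent"]}

async def run_synthesis(compliance: dict, clinical: dict, coverage: dict) -> dict:
    await asyncio.sleep(0.01)
    inputs = [compliance["agent"], clinical["agent"], coverage["agent"]]
    return {"agent": "synthesis", "inputs": inputs}

async def review_case(case: dict) -> dict:
    # Phase 1: Compliance and Clinical are independent, so run them concurrently.
    compliance, clinical = await asyncio.gather(
        run_compliance(case), run_clinical(case)
    )
    # Phase 2: Coverage waits for the clinical findings.
    coverage = await run_coverage(case, clinical)
    # Phase 3: Synthesis sees all three prior outputs.
    return await run_synthesis(compliance, clinical, coverage)

if __name__ == "__main__":
    result = asyncio.run(review_case({"case_id": "demo-001"}))
    print(result["inputs"])
```

In this shape, wall-clock time is roughly the slower of the two Phase 1 calls plus the two sequential calls, rather than the sum of all four.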

Compliance and Clinical run in parallel via asyncio.gather, since neither depends on the other. Coverage runs sequentially after Clinical because it needs the structured clinical profile for criterion mapping. Synthesis evaluates all three outputs through a three-gate rubric (Provider, Codes, Medical Necessity) with weighted confidence scoring: 40% coverage criteria + 30% clinical extraction + 20% compliance + 10% policy match. The total pipeline time is bounded by the slowest parallel agent plus the sequential agents, not by the sum of all four. In internal testing with synthetic demo cases, this architecture materially reduced processing time compared with sequential manual workflows.
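The weighting is easy to check by hand. The sketch below uses the four weights stated above; everything else is an assumption made for illustration, including the `GateResults` type, the 0.8 threshold, and the rule that all three gates must pass before an APPROVE is possible. The accelerator's actual rubric may combine gates and scores differently.

```python
from dataclasses import dataclass

# Weights as stated in the post; the dictionary keys are hypothetical names.
WEIGHTS = {
    "coverage_criteria": 0.40,
    "clinical_extraction": 0.30,
    "compliance": 0.20,
    "policy_match": 0.10,
}

@dataclass(frozen=True)
class GateResults:
    """Pass/fail outcome of the three gates named in the post."""
    provider: bool
    codes: bool
    medical_necessity: bool

def weighted_confidence(scores: dict[str, float]) -> float:
    """Combine per-component scores in [0, 1] into a single confidence value."""
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

def recommend(
    gates: GateResults, scores: dict[str, float], threshold: float = 0.8
) -> tuple[str, float]:
    # Assumption: APPROVE needs every gate to pass AND confidence at or above
    # the (hypothetical) threshold; anything else is PEND. There is no DENY.
    confidence = weighted_confidence(scores)
    all_pass = gates.provider and gates.codes and gates.medical_necessity
    decision = "APPROVE" if all_pass and confidence >= threshold else "PEND"
    return decision, round(confidence, 4)
```

For example, component scores of 0.9 (coverage criteria), 0.8 (clinical extraction), 1.0 (compliance), and 0.7 (policy match) give 0.36 + 0.24 + 0.20 + 0.07 = 0.87.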

Under the hood

For the architect in the room, here are four design decisions worth knowing about:

  • Foundry hosted agents: Each agent is independently containerized, versioned, and managed by Foundry’s runtime. The FastAPI backend is a pure HTTP dispatcher, and all reasoning happens inside the agent containers. No code changes are needed between local (Docker Compose) and production (Foundry); an environment variable is the only switch.
  • Structured output: Every agent uses MAF’s response_format enforcement to produce typed Pydantic schemas at the token level. No JSON parsing, no malformed fences, no free-form text. The orchestrator receives typed Python objects; the frontend receives a stable API contract.
  • Keyless security: DefaultAzureCredential throughout, so no API keys are stored anywhere. Managed Identity handles production; azd tokens handle local development. Role assignments are provisioned automatically by Bicep at deploy time.
  • Observability: All agents emit OpenTelemetry traces to Azure Application Insights. The Foundry portal shows per-agent spans correlated by case ID. End-to-end latency, per-agent contribution, and error rates are visible from day one with no additional configuration.
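To give a feel for what a typed agent result buys the orchestrator, here is a small sketch of a Compliance output with blocking and non-blocking checklist items. It deliberately uses standard-library dataclasses as a stand-in for the Pydantic models, and every class, field, and checklist item name here is hypothetical. The real schemas are in the GitHub repository.

```python
from dataclasses import dataclass
from enum import Enum

class ItemStatus(str, Enum):
    PRESENT = "PRESENT"
    MISSING = "MISSING"

@dataclass(frozen=True)
class ChecklistItem:
    name: str
    status: ItemStatus
    blocking: bool  # True if a missing item should hold up the case

@dataclass(frozen=True)
class ComplianceResult:
    case_id: str
    items: tuple[ChecklistItem, ...]

    def blocking_gaps(self) -> list[str]:
        """Names of items that are both blocking and missing."""
        return [
            item.name
            for item in self.items
            if item.blocking and item.status is ItemStatus.MISSING
        ]

def parse_compliance(payload: dict) -> ComplianceResult:
    """Build a typed result from a raw dict.

    Raises KeyError on a missing field and ValueError on an unknown status,
    so malformed output fails loudly instead of flowing downstream.
    """
    items = tuple(
        ChecklistItem(
            name=str(raw["name"]),
            status=ItemStatus(raw["status"]),
            blocking=bool(raw["blocking"]),
        )
        for raw in payload["items"]
    )
    return ComplianceResult(case_id=str(payload["case_id"]), items=items)
```

Downstream code can then call `result.blocking_gaps()` instead of probing nested dictionaries, which is the practical benefit of a stable, typed contract.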

For the full architecture documentation, agent specifications, Pydantic schemas, and extension guides, see the GitHub repository.

Why this matters now

Human-in-the-loop by design

The system runs in LENIENT mode by default: it produces only APPROVE or PEND and is not designed to produce automated DENY outcomes in its default configuration. Every recommendation requires a clinician to Accept or Override with documented rationale before the decision is finalized. Override records flow to the audit PDF, notification letters, and downstream systems. This directly addresses the emerging wave of state legislation governing AI use in PA decisions.
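A sketch of how that review step could be enforced in code follows. The type and function names are hypothetical, and the rule that only an Override needs a written rationale is one reading of the sentence above, not a statement of how the accelerator validates input.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Recommendation(str, Enum):
    # LENIENT mode: there is intentionally no DENY member.
    APPROVE = "APPROVE"
    PEND = "PEND"

class ReviewAction(str, Enum):
    ACCEPT = "ACCEPT"
    OVERRIDE = "OVERRIDE"

@dataclass(frozen=True)
class FinalDecision:
    case_id: str
    recommendation: Recommendation
    action: ReviewAction
    reviewer: str
    rationale: Optional[str]

def finalize(
    case_id: str,
    recommendation: Recommendation,
    action: ReviewAction,
    reviewer: str,
    rationale: Optional[str] = None,
) -> FinalDecision:
    """Record a clinician's decision; nothing is final without a reviewer."""
    if not reviewer.strip():
        raise ValueError("a clinician reviewer is required")
    # Assumption: an override must carry a non-empty documented rationale.
    if action is ReviewAction.OVERRIDE and not (rationale and rationale.strip()):
        raise ValueError("an override requires a documented rationale")
    return FinalDecision(case_id, recommendation, action, reviewer, rationale)
```

Because the recommendation type has no DENY value, an automated denial cannot be represented at all in this sketch, which is a stronger guarantee than checking for it at runtime.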

Domain experts own the rules

Agent behavior is defined in markdown skill files, not Python code. When CMS updates a coverage determination or a plan changes its commercial policy, a clinician or compliance officer edits a text file and redeploys. No engineering PR required.
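As a rough illustration of keeping rules in text rather than code, the sketch below loads every markdown file in a directory into a name-to-text mapping that could be passed to an agent as instructions. The directory layout, file names, and `load_skills` helper are assumptions for this example, not the accelerator's actual loader.

```python
import tempfile
from pathlib import Path

def load_skills(skills_dir: Path) -> dict[str, str]:
    """Map each markdown file's stem (e.g. 'coverage') to its full text."""
    return {
        path.stem: path.read_text(encoding="utf-8")
        for path in sorted(skills_dir.glob("*.md"))
    }

if __name__ == "__main__":
    # Demo with a throwaway directory and a made-up skill file.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "coverage.md").write_text("# Coverage rules\n", encoding="utf-8")
        print(sorted(load_skills(Path(tmp))))
```

With rules stored this way, a policy change is a text edit followed by a redeploy, and the Python above never changes.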

Real-time healthcare data via MCP

Agents connect to five MCP servers for real-time reference data: ICD-10 codes, NPI Registry, CMS Coverage policies, PubMed, and ClinicalTrials.gov. Agent recommendations are informed by these sources at review time.

 

Third-party MCP servers are included for demonstration with synthetic data only. Their inclusion does not constitute an endorsement by Microsoft. See the GitHub repository for production migration guidance.

Audit-ready from day one

Every case generates an 8-section audit justification PDF with per-criterion evidence, data source attribution, timestamps, and confidence breakdowns. Clinician overrides are recorded in Section 9. Notification letters (approval and pend) are generated automatically. These artifacts are designed to support CMS-0057-F documentation requirements.

Deploy in under 15 minutes

From the Foundry template gallery or from the command line:

 

git clone https://github.com/microsoft/Prior-Authorization-Multi-Agent-Solution-Accelerator

cd Prior-Authorization-Multi-Agent-Solution-Accelerator

azd up

 

That single command provisions Foundry, Azure Container Registry, and Container Apps; builds all Docker images; registers the four agents; and runs health checks. The demo is live with a synthetic sample case as soon as deployment completes.

 

| What’s included | What you customize |
| --- | --- |
| 4 Foundry hosted agents | Payer-specific coverage policies |
| FastAPI orchestrator + Next.js frontend | EHR/FHIR integration for clinical notes |
| 5 MCP healthcare data connections | Self-hosted MCP servers for production PHI |
| Audit PDF + notification letter generation | Authentication (Microsoft Entra ID) |
| Full Bicep infrastructure-as-code | Persistent storage (Cosmos DB / PostgreSQL) |
| OpenTelemetry + App Insights observability | Additional agents (Pharmacy, Financial) |

Built on

Microsoft Foundry + Foundry hosted agents · Microsoft Agent Framework (MAF) · Azure OpenAI gpt-5.4 · Azure Container Apps · Azure Developer CLI + Bicep · OpenTelemetry + Azure Application Insights · DefaultAzureCredential (keyless, no secrets)

Full architecture documentation, agent specifications, and extension guides are in the GitHub repository.

Get started

 

Disclaimers

Not a medical device. This solution accelerator is not a medical device, is not FDA-cleared, and is not intended for autonomous clinical decision-making. All AI recommendations require qualified clinical review before any authorization decision is finalized.

 

Not production-ready software. This is an open-source reference architecture (MIT License), not a supported Microsoft product. Customers are solely responsible for testing, validation, regulatory compliance, security hardening, and production deployment.

 

Performance figures are illustrative. Metrics cited (including processing time reductions) are based on internal testing with synthetic demo data. Actual results will vary based on case complexity, infrastructure, and configuration.

 

Third-party services included for demonstration only; not endorsed by Microsoft. Customers should evaluate providers against their compliance and data residency requirements.

 

The demo uses synthetic data only. Customers deploying real patient data are responsible for HIPAA compliance and establishing appropriate Business Associate Agreements.

 

This accelerator is intended to help customers align documentation workflows with CMS‑0057‑F requirements but has not been independently validated or certified for regulatory compliance.

 

Updated Apr 23, 2026
Version 1.0