A Complete Guide to Auditing, Cost Optimization, and Governance
Disclaimer: The views expressed in this article are those of the author and do not necessarily reflect the official policy or position of Microsoft Corporation.
Last week, I got a Teams call from a customer: "Our production app just went down. Deployment is throwing authentication errors and we can't figure out why."
Working as a Cloud Solution Architect at Microsoft, I've debugged my fair share of Azure OpenAI issues. This one didn't take long to figure out—they were running a model version that Microsoft had retired three months ago. The retirement announcement?
Buried somewhere in their inbox, probably marked as read but never actually read. The result? Several hours of downtime and some very stressed engineers.
Here's the thing: this keeps happening. As teams spin up more Azure OpenAI deployments, keeping track of everything manually just doesn't work anymore.
The Challenge: Managing Azure OpenAI Deployments at Scale
You start with one Azure OpenAI/Foundry resource, maybe two model deployments. Simple enough. Six months later? You've got 30+ deployments scattered across resource groups, different teams testing different model versions, and you're pretty sure you're paying for stuff nobody's using anymore.
Here are the main headaches I see teams dealing with:
- Model Retirements Sneak Up On You - Microsoft updates and retires models regularly (GPT-4, GPT-3.5, you name it). If you're not actively tracking this, you'll find out the hard way when production breaks.
- Ghost Deployments Everywhere - Remember that Provisioned Throughput Unit someone created for "just testing"? It's still running. Still costing $5,000/month. Still getting zero API calls. This stuff adds up fast.
- Compliance is a Mess - When your auditor asks "who's been accessing these AI models and from where," digging through Azure Portal logs manually is nobody's idea of a good time.
- No One Knows What's Actually Deployed - In bigger orgs, teams deploy models independently. Nobody has a complete picture of what's out there, where it's running, or what it's costing.
Tracking this manually doesn't scale. Spreadsheets are outdated the second you save them. You need something automated.
The Solution: An Open-Source Audit Tool
I built a tool that handles all of this automatically. It scans your Azure subscriptions, finds every OpenAI and AI Foundry deployment, pulls actual usage data from Azure Monitor, and flags models that are about to retire.
Here's what it does:
- Finds all your Azure OpenAI and AI Services accounts automatically
- Grabs real usage metrics—API calls, token counts, the works
- Compares what you've deployed against Microsoft's official retirement schedules
- Spits out CSV reports with everything you need: inventories, usage stats, retirement warnings
- Can even configure diagnostic settings and pull detailed logs from Log Analytics if you need them
Best part? Zero dependencies. Just Python standard library. It runs Azure CLI commands under the hood (which you probably already have installed anyway).
Grab it here: https://github.com/anishek-microsoft/foundry_model_audit
How This Can Help You
Catch Cost Leaks Before They Drain Your Budget
Ever wonder if you've got deployments sitting idle? The audit shows you exactly which ones have zero usage. Those Provisioned Throughput Units (PTUs) are expensive—if one's been sitting there doing nothing for weeks, you'll know immediately.
Plan Model Migrations Without the Panic
Instead of scrambling when a model gets retired, you'll see it coming months in advance. The tool flags everything that's approaching retirement and even shows Microsoft's suggested replacements. You get time to test, update configs, and migrate smoothly. No emergency meetings, no rushed deployments.
Make Compliance Audits Actually Manageable
Need detailed logs showing who accessed your AI models and when? Enable diagnostic settings and the tool pulls all that data from Log Analytics into a clean CSV. When audit season rolls around, you've got comprehensive access reports ready to go instead of manually piecing together Portal logs.
Get Visibility Across Your Whole Organization
If your Azure environment is anything like most I work with, you've got multiple teams deploying independently. This gives you one complete picture: every account, every deployment, every region. You'll finally know what you're actually running and what it's costing.
How It Actually Works
The tool ties into Azure Resource Manager, Azure Monitor, and Log Analytics. Here's the flow:
- Uses your existing `az login` session (no extra auth needed)
- Scans Azure Resource Manager for OpenAI and AI Services accounts
- Calls Azure REST APIs to list all deployments (handles different API versions automatically)
- Pulls metrics from Azure Monitor—API calls, token counts, last 7 days of data
- Checks deployments against a JSON file of Microsoft's official retirement dates
- Optionally queries Log Analytics with KQL for detailed usage logs
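To make that flow concrete, here's a rough illustration of what the CLI-driven part looks like. This is a sketch, not the tool's actual code: the metric name, aggregation window, and output fields are assumptions and may vary by API version.

import json
import subprocess

def az(args):
    # Run an Azure CLI command and return its parsed JSON output
    result = subprocess.run(["az"] + args + ["--output", "json"],
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

# Find every OpenAI / AI Services account visible to the current az login session
accounts = az(["cognitiveservices", "account", "list"])

for account in accounts:
    # List the model deployments in each account
    deployments = az(["cognitiveservices", "account", "deployment", "list",
                      "--name", account["name"],
                      "--resource-group", account["resourceGroup"]])

    # Pull 7 days of call volume from Azure Monitor (metric name is an assumption)
    metrics = az(["monitor", "metrics", "list",
                  "--resource", account["id"],
                  "--metric", "TotalCalls",
                  "--offset", "7d", "--interval", "PT6H"])

    for dep in deployments:
        model = dep["properties"]["model"]
        print(account["name"], dep["name"], model["name"], model["version"])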
I keep the retirement database (`model_retirements.json`) updated from Microsoft's official docs. There's a helper script if you want to update it yourself from CSV exports.
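Conceptually, the retirement check is just a date comparison against that file. The sketch below is purely illustrative: the field names and values are made up, so check `model_retirements.json` in the repo for the real format.

import json
from datetime import date

# Illustrative only -- these field names and values are made up; see
# model_retirements.json in the repo for the actual schema.
retirements = json.loads("""
[
  {"model": "gpt-4", "version": "0613",
   "retirement_date": "2025-06-01", "replacement": "gpt-4o"}
]
""")

def retirement_alert(model, version, warn_days=90):
    # Return a warning string if this model/version retires within the window
    for entry in retirements:
        if entry["model"] == model and entry["version"] == version:
            days_left = (date.fromisoformat(entry["retirement_date"]) - date.today()).days
            if days_left <= warn_days:
                return (f"{model} {version} retires in {days_left} days; "
                        f"suggested replacement: {entry['replacement']}")
    return None

print(retirement_alert("gpt-4", "0613"))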
Everything outputs to timestamped CSV files. Easy to open in Excel, diff between runs, or feed into your BI tools.
Getting Started
Three commands and you're running:
# Grab the code
git clone https://github.com/anishek-microsoft/foundry_model_audit.git
cd foundry_model_audit
# Log into Azure if you haven't already
az login
# Run it
python foundry_model_audit.py
You'll get a timestamped folder (like `foundry-audit-20260126-114221/`) with five CSV files:
- openai_deployments.csv - Everything you've got deployed
- targeted_deployments.csv - Specific models you're tracking with usage data
- model_retirement_alerts.csv - What's retiring soon
- log_analytics_detailed_logs.csv - Detailed audit logs (if you enabled diagnostics)
- openai_no_diagnostics.csv - Accounts that don't have logging turned on
Want to check specific models?
python foundry_model_audit.py --target-models '[{"ModelName":"gpt-4","Versions":["0613","1106-preview"]}]'
Enable detailed logging:
python foundry_model_audit.py --enable-diag --diag-workspace-id "/subscriptions/.../workspaces/my-workspace"
Full documentation, parameters, and examples are in the [README](https://github.com/anishek-microsoft/foundry_model_audit).
What to Look For in the Reports
- Find the dead weight: Check `targeted_deployments.csv` for anything with `totalCalls_7d = 0`. If it's been sitting idle for a month, time to shut it down.
- Spot the money burners: Filter for `sku = ProvisionedManaged` (those are PTUs) with low usage. You're paying fixed costs whether you use them or not. Low usage means you're probably wasting money.
- Watch for upcoming retirements: In `model_retirement_alerts.csv`, anything retiring in less than 90 days needs your attention. Microsoft usually suggests what to upgrade to, so you've got a migration path.
- Security check: In `log_analytics_detailed_logs.csv`, scan for weird `CallerIP` or `Identity` values. If you see API calls from places or accounts you don't recognize, that's worth investigating.
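If you'd rather script those checks than eyeball the CSVs, a few lines of standard-library Python will do it. The `totalCalls_7d` and `sku` columns come straight from the report; the other column names and the usage threshold are assumptions, so match them to the headers in your own output folder.

import csv

with open("targeted_deployments.csv", newline="") as f:
    for row in csv.DictReader(f):
        calls = float(row.get("totalCalls_7d") or 0)

        # Idle deployments: no API calls in the last 7 days
        if calls == 0:
            print(f"Idle: {row.get('accountName')}/{row.get('deploymentName')}")

        # Underused PTUs: fixed cost, low traffic (threshold is arbitrary)
        if row.get("sku") == "ProvisionedManaged" and calls < 1000:
            print(f"Underused PTU: {row.get('accountName')}/{row.get('deploymentName')} "
                  f"({calls:.0f} calls in 7 days)")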
Things I'd Recommend
- Run this regularly, not just once. Set up a weekly job (Azure Function or scheduled task, whatever works). Track how things change over time—usage patterns, costs, new deployments.
- Don't let retirements surprise you. Set up some kind of alert for models retiring in the next 90 days. Give yourself time to plan migrations instead of firefighting.
- Be smart about logging. Turn on diagnostics for production stuff where you need compliance trails. For test/dev environments? Maybe skip it to save on Log Analytics costs. (First 5GB/month is free, but it adds up if you're logging everything.)
- Keep audit data secure. These logs have IP addresses, identities, sometimes request details. Don't commit them to Git. Use Azure Blob Storage with proper access controls. Encrypt if you're in a regulated industry.
Establish an audit cadence. Here's what I recommend:
- Weekly: Run the full audit to catch new deployments and usage changes
- Monthly: Review retirement alerts and plan migrations for anything < 90 days out
- Quarterly: Deep-dive cost analysis—look for PTU optimization opportunities and capacity right-sizing
This schedule aligns well with Microsoft's typical model retirement announcement cadence (usually 90+ days notice).
Taking It Further: Automation and Dashboards
Running this manually is useful, but you're probably wondering: "Can I automate this whole thing?" Yep. Here's how I'd set it up:
Run Audits Automatically with Azure Functions
Deploy the script as an Azure Function with a timer trigger. Set it to run every Monday morning, or whatever cadence works for you.
Basic setup:
- Timer trigger kicks off the audit weekly
- Use Managed Identity so you don't have to mess with credentials
- Save CSV files to Blob Storage
- Event Grid notifies you when new reports are ready
Why this works well:
- No servers to maintain
- Scales automatically if needed
- Built-in logs and monitoring
- Consumption plan keeps costs low
Sample deployment:
import azure.functions as func
import os
import subprocess
from azure.storage.blob import BlobServiceClient
from datetime import datetime

def main(mytimer: func.TimerRequest) -> None:
    # Run the audit script
    result = subprocess.run(['python', 'foundry_model_audit.py'],
                            capture_output=True, text=True)

    # Upload results to Blob Storage under a timestamped prefix
    timestamp = datetime.utcnow().strftime('%Y%m%d-%H%M%S')
    blob_service = BlobServiceClient.from_connection_string(os.environ['STORAGE_CONNECTION'])

    # Upload each CSV file (the audit writes into a timestamped folder,
    # so adjust these paths to point at the latest output directory)
    for csv_file in ['openai_deployments.csv', 'targeted_deployments.csv', 'model_retirement_alerts.csv']:
        blob_client = blob_service.get_blob_client(container='audit-reports',
                                                   blob=f'{timestamp}/{csv_file}')
        with open(csv_file, 'rb') as data:
            blob_client.upload_blob(data)
Build Dashboards with Power BI
Once audit data is flowing to Blob Storage, hook up Power BI for some actual visibility:
Useful dashboard views:
- Cost tracking
  - How many deployments per account and region
  - PTU deployments sitting idle (easy cost savings)
  - Cost trends month-over-month
  - Top 10 most expensive underused deployments
- Retirement timeline
  - Calendar showing when stuff's retiring
  - Group by urgency (30/60/90 days out)
  - Track migration progress
- Usage patterns
  - API call trends
  - Token usage (prompt vs completion)
  - Which deployments are actually getting hit
  - Spot unusual spikes
- Compliance view
  - Which accounts have logging enabled
  - Access patterns by user/service principal
  - Overall audit coverage
Setup is pretty standard:
Connect to Blob Storage, import the CSVs with Power Query, build some visuals, set auto-refresh, publish to Power BI Service. Set up alerts for critical stuff (like models retiring in 30 days).
Get Alerts in Microsoft Teams
Use Power Automate to push notifications to Teams when stuff needs attention:
Flow setup:
- Trigger when a new blob shows up (new audit report)
- Parse the CSV for important stuff (retirements, unused PTUs)
- Post an adaptive card to your Teams channel
You'll get messages like: "Hey, found 5 unused Provisioned deployments worth $12K/month" or "3 models retiring in the next 90 days." Beats checking manually.
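If you'd rather skip Power Automate, the same Azure Function can post straight to a Teams incoming webhook. A minimal sketch, assuming a webhook already configured on the channel and stored in a (hypothetical) TEAMS_WEBHOOK_URL app setting:

import json
import os
import urllib.request

def notify_teams(message):
    # Post a simple text message to a Teams incoming webhook.
    # TEAMS_WEBHOOK_URL is a placeholder environment variable.
    payload = json.dumps({"text": message}).encode("utf-8")
    req = urllib.request.Request(
        os.environ["TEAMS_WEBHOOK_URL"],
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()

notify_teams("Foundry audit: 3 models retiring in the next 90 days - see audit-reports for details")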
Enterprise Workflows with Logic Apps
For bigger setups, Logic Apps can orchestrate more complex stuff:
- Loop through multiple subscriptions automatically
- Route alerts to the right team owners
- Create work items in Azure DevOps for migrations
- Send exec summaries via email weekly
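For the work-item step specifically, Logic Apps' built-in Azure DevOps connector handles it without code, but if you're scripting it yourself the underlying REST call is small. A hedged sketch: the organization, project, and AZDO_PAT environment variable are placeholders.

import base64
import json
import os
import urllib.request

def create_migration_task(title):
    # Create a Task work item via the Azure DevOps work items REST API.
    # "my-org", "my-project" and AZDO_PAT are placeholders.
    org, project = "my-org", "my-project"
    url = f"https://dev.azure.com/{org}/{project}/_apis/wit/workitems/$Task?api-version=7.1"
    body = json.dumps([
        {"op": "add", "path": "/fields/System.Title", "value": title}
    ]).encode("utf-8")
    auth = base64.b64encode(f":{os.environ['AZDO_PAT']}".encode()).decode()
    req = urllib.request.Request(url, data=body, method="POST", headers={
        "Content-Type": "application/json-patch+json",
        "Authorization": f"Basic {auth}",
    })
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

create_migration_task("Migrate gpt-4 0613 deployments before retirement")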
Basically turns this from a one-off script into a proper governance system that runs itself.
Wrapping Up
Managing Azure OpenAI at scale isn't easy. The cloud moves fast, models retire, costs creep up, and keeping track of everything manually just doesn't work past a certain point.
This tool won't solve every problem, but it'll give you visibility. You'll know what's deployed, what's actually being used, what's wasting money, and what's about to retire. That's a huge step up from flying blind.
Want to try it?
- Grab the code: https://github.com/anishek-microsoft/foundry_model_audit
- Run `python foundry_model_audit.py` on your subscription
- See what you find
Set this up, run it regularly, and save yourself some headaches.
Related reading:
- Azure OpenAI Service Documentation
- Azure AI Foundry Model Lifecycle and Retirement
- Azure Monitor Metrics for Cognitive Services
- Azure Cost Management and Optimization
Got questions or ideas? Drop a comment or open an issue on GitHub. I'd love to hear what you think and what features would make this more useful.