Forum Discussion
Need Guidance on cost breakdown of Microsoft Foundry Agent portal I created
I have developed a complaint handling portal for customers and employees using Azure AI Foundry. The solution is built with Foundry agents, models from the catalog, input/output caching, agent logging/tracing, and other Foundry capabilities. The frontend and orchestration layer are deployed on Azure Container Apps.
While Azure Cost Analysis provides an overview of spending, several parts remain unclear or act as a black box for accurate estimation, including:
- Token consumption assumptions (input/output tokens across different models and agents)
- User concurrency, sessions, and behavior patterns
- Agent logging and observability costs
- Impact of input/output caching
- Detailed resource consumption and billing in Azure Container Apps
What is the best way to accurately calculate or estimate the total running cost for such an Azure AI Foundry-based platform with a Container Apps frontend?
Is there official Microsoft documentation, such as pricing guides or reference architectures, for cost breakdown? How do companies typically present costs for such AI platforms to attract customers (e.g., TCO models or per-user pricing)? I want to know how the platform costs are shown to customers.
Thank you.
1 Reply
Hi Tasmia_Monzoor, this is a very common challenge with AI Foundry solutions right now. Azure Cost Analysis gives overall spend, but detailed AI-agent cost attribution is still not very transparent.
For a Foundry + Container Apps architecture, the main cost drivers are usually:
- Model token usage (input/output tokens)
- Number of agent calls/tool executions
- Concurrent users & session duration
- Container Apps scaling (CPU/memory replicas)
- Logging/tracing/Application Insights ingestion
- Vector/search/storage components
- Caching effectiveness
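To make the Container Apps scaling driver concrete, here is a rough sketch that assumes consumption-style billing by vCPU-seconds and GiB-seconds. The function name, the rates, and the replica sizes are all placeholders I made up for illustration; it also ignores free grants, idle pricing, and per-request charges, so check the current Container Apps pricing page before relying on it.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def container_apps_monthly(avg_replicas: float,
                           vcpu_per_replica: float,
                           gib_per_replica: float,
                           vcpu_second_rate: float,
                           gib_second_rate: float,
                           seconds: int = SECONDS_PER_MONTH) -> float:
    """Estimate monthly compute cost from average replica count and size.

    All rates are placeholders, not real Azure prices.
    """
    vcpu_cost = avg_replicas * vcpu_per_replica * seconds * vcpu_second_rate
    memory_cost = avg_replicas * gib_per_replica * seconds * gib_second_rate
    return vcpu_cost + memory_cost


# Hypothetical inputs: 2 replicas on average, 0.5 vCPU and 1 GiB each.
aca_cost = container_apps_monthly(
    avg_replicas=2,
    vcpu_per_replica=0.5,
    gib_per_replica=1.0,
    vcpu_second_rate=0.00002,   # placeholder rate
    gib_second_rate=0.000003,   # placeholder rate
)
print(f"Estimated Container Apps compute: {aca_cost:.2f} per month")
```

The average replica count is the number to get from load testing or Azure Monitor, since it depends on your scale rules and traffic pattern.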
For estimating costs more accurately, most teams combine:
- Azure Pricing Calculator
- Azure Monitor + Application Insights metrics
- Token usage telemetry from models/endpoints
- Load testing for concurrency/session patterns
A practical approach is to calculate:
- Cost per request/conversation
- Average tokens per interaction
- Expected monthly active users/concurrency
- Infrastructure baseline (Container Apps minimum replicas, monitoring, storage, etc.)
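The calculation above can be sketched roughly as follows. This is only an illustration: the `ModelRates` class, the function names, and every number are hypothetical placeholders, and the cached-input discount is an assumption about how your model is billed. Plug in the rates from the pricing page and the token averages from your own telemetry.

```python
from dataclasses import dataclass


@dataclass
class ModelRates:
    """Placeholder prices per 1,000 tokens (not real Azure prices)."""
    input_per_1k: float
    output_per_1k: float
    cached_input_per_1k: float


def cost_per_conversation(rates: ModelRates,
                          input_tokens: float,
                          output_tokens: float,
                          cache_hit_ratio: float = 0.0,
                          calls: int = 1) -> float:
    """Cost of one conversation made of `calls` model/agent calls.

    Token counts are averages per call; cache_hit_ratio is the share
    of input tokens billed at the cached rate.
    """
    cached = input_tokens * cache_hit_ratio
    uncached = input_tokens - cached
    per_call = (uncached / 1000 * rates.input_per_1k
                + cached / 1000 * rates.cached_input_per_1k
                + output_tokens / 1000 * rates.output_per_1k)
    return per_call * calls


def monthly_estimate(conversation_cost: float,
                     active_users: int,
                     conversations_per_user: float,
                     infra_baseline: float) -> float:
    """Variable AI cost plus the fixed infrastructure baseline."""
    return conversation_cost * active_users * conversations_per_user + infra_baseline


# Hypothetical inputs: replace with measured values.
rates = ModelRates(input_per_1k=0.002, output_per_1k=0.008, cached_input_per_1k=0.001)
conv_cost = cost_per_conversation(rates, input_tokens=3000, output_tokens=500,
                                  cache_hit_ratio=0.5, calls=4)
monthly = monthly_estimate(conv_cost, active_users=1000,
                           conversations_per_user=20, infra_baseline=300.0)
print(f"Per conversation: {conv_cost:.4f}, per month: {monthly:.2f}")
```

If you use several models or agents, run `cost_per_conversation` once per model with its own rates and token averages, then sum the results.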
For customer-facing pricing, companies typically present:
- Per-user/month
- Per-conversation/request
- Tiered usage bundles
- Platform + consumption-based pricing
And internally they build a TCO model including:
- AI inference
- Hosting
- Observability
- Support/operations
- Buffer for scaling peaks
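As a rough illustration of how the TCO components above can roll up into a per-user price, here is a small sketch. The component amounts, the 20% peak buffer, and the target margin are made-up assumptions, and the function names are hypothetical; this is one simple way to do it, not a standard formula.

```python
def tco_monthly(components: dict[str, float], peak_buffer: float = 0.2) -> float:
    """Sum the monthly cost components and add a buffer for scaling peaks."""
    subtotal = sum(components.values())
    return subtotal * (1 + peak_buffer)


def price_per_user(tco: float, users: int, target_margin: float) -> float:
    """Per-user monthly price that covers the TCO at the given gross margin."""
    if users <= 0:
        raise ValueError("users must be positive")
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    return tco / users / (1 - target_margin)


# Hypothetical monthly figures: replace with your own estimates.
components = {
    "ai_inference": 680.0,
    "hosting": 200.0,
    "observability": 70.0,
    "support_operations": 50.0,
}
tco = tco_monthly(components, peak_buffer=0.2)
per_user = price_per_user(tco, users=1000, target_margin=0.4)
print(f"Monthly TCO: {tco:.2f}, price per user: {per_user:.2f}")
```

The same TCO figure can be divided by expected conversations instead of users if you prefer per-conversation pricing.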
Microsoft does have useful references across:
- Azure AI Foundry pricing docs
- Azure OpenAI pricing
- Azure Container Apps pricing
- Well-Architected Framework (Cost Optimization pillar)
But today, there’s still some manual estimation involved, especially around agent orchestration and token behavior.