Live AMA: Demystifying Azure pricing (AM session) | Microsoft Community Hub

Event details

⏱️ This live AMA is on January 22nd, 2026 at 9:00 AM PT. This same session is also scheduled at 5:00 PM PT on January 22nd. SESSION DETAILS This session breaks down the complexity of Azure pr...

Updated Jan 23, 2026

Copper Contributor

Jan 22, 2026

How should teams decide the right PTU sizing to balance performance and cost, and what common mistakes lead to unexpectedly high PTU charges?

kyleikeda
Microsoft
Jan 22, 2026
Thanks for the question. Here are some best practices to consider when purchasing PTUs:

Workload Characteristics: Different workloads consume different amounts of processing capacity. Generated (output) tokens require more capacity than prompt (input) tokens. Analyze historical token usage data, or estimate your call shape (input tokens, output tokens, and requests per minute), to approximate the PTUs needed.

Traffic Patterns: A wide distribution of call shapes, including some large calls, may lead to lower throughput per PTU compared to a narrower distribution with similar average sizes.

Capacity Planning Tools: Use the Foundry calculator to size specific workload shapes and estimate the required PTUs from their input and output token counts.

Benchmarking: The most accurate way to determine capacity is to benchmark a deployment with a representative workload for your use case.
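As a rough illustration of the call-shape estimate described above, here is a minimal back-of-the-envelope sketch. Everything in it is an assumption for illustration only: the function name, the output-token weight, the per-PTU throughput, and the minimum and increment sizes are hypothetical placeholders, not published Azure values. Real figures vary by model and deployment type, so take them from the Foundry calculator, and treat a benchmark as the final word.

```python
import math


def estimate_ptus(requests_per_minute, input_tokens, output_tokens,
                  output_weight, tpm_per_ptu, min_ptus, increment):
    """Rough PTU estimate from an average call shape.

    All parameters are illustrative placeholders:
      output_weight - how much more capacity one output token uses than
                      one input token (hypothetical, model-specific)
      tpm_per_ptu   - weighted tokens per minute one PTU can serve
                      (hypothetical, model-specific)
      min_ptus      - smallest deployment size you can purchase
      increment     - step size above the minimum
    """
    # Weight output tokens more heavily than input tokens.
    weighted_tokens_per_call = input_tokens + output_weight * output_tokens
    weighted_tpm = requests_per_minute * weighted_tokens_per_call

    raw_ptus = weighted_tpm / tpm_per_ptu
    if raw_ptus <= min_ptus:
        return min_ptus

    # Round up to the next purchasable increment above the minimum.
    steps = math.ceil((raw_ptus - min_ptus) / increment)
    return min_ptus + steps * increment


# Example with made-up numbers: 60 requests/min, 1,000 input tokens and
# 200 output tokens per call.
print(estimate_ptus(60, 1000, 200, output_weight=4.0,
                    tpm_per_ptu=2500, min_ptus=15, increment=5))
```

Because this uses a single average call shape, it will understate the need for workloads with a wide spread of call sizes. Running it against peak-hour figures, and for the largest typical call as well as the average, gives a more cautious range to benchmark against.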

Resources to learn more:

The Foundry PTU quota calculator: https://ai.azure.com/resource/calculator

Understanding costs associated with PTUs: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/provisioned-throughput-onboarding?…