Azure AI Foundry Blog

DeepSeek R1: Improved Performance, Higher Limits, and Transparent Pricing

Sharmichock
Microsoft
Feb 26, 2025

On Jan 29, 2025, we introduced DeepSeek R1 in the model catalog in Azure AI Foundry, bringing one of the most popular open-weight models to developers and enterprises looking for high-performance AI capabilities. At launch, we made DeepSeek R1 available free of charge while we gathered insights on real-world usage and performance. 

Now, we’re excited to share that the model offers lower latency, higher throughput, and competitive pricing, making it easier to integrate DeepSeek R1 into your applications while keeping costs predictable. 

Scaling to Meet Demand: Performance Optimizations in Action 

High adoption brought a few challenges: early users experienced capacity constraints and performance fluctuations due to the surge in demand. Our product and engineering teams moved quickly, optimizing infrastructure and fine-tuning system performance. 

You can expect higher rate limits and improved response times starting from Feb 26, 2025, and we are continuing to roll out further improvements. You can learn more about rate limits on the Azure AI model inference quotas and limits documentation page. 
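Even with higher rate limits, clients should handle transient rate-limit (HTTP 429) responses gracefully rather than failing outright. Below is a minimal retry sketch with exponential backoff and jitter; `RateLimitError` is a hypothetical placeholder for whatever exception your client library raises on a 429, not an Azure SDK type:

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the 429 error your client raises."""


def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on rate-limit errors with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # Give up after the final attempt.
            # Sleep 1s, 2s, 4s, ... plus random jitter to spread out retries.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

In real code, substitute the exception type your SDK raises and wrap your completion call, e.g. `with_backoff(lambda: client.complete(...))`.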

Thanks to these improvements, we’ve significantly increased model efficiency, reduced latency, and improved throughput, ensuring a smoother experience for all users. 

DeepSeek R1 Pricing 

With these optimizations, DeepSeek R1 now delivers a strong price-to-performance ratio. Whether you’re building chatbots, document summarization tools, or AI-driven search experiences, you get a high-quality model at a competitive cost, making it easier to scale AI workloads without breaking the bank. 

Model SKU             Input (USD per 1K tokens)   Output (USD per 1K tokens)
DeepSeek-R1 Global    $0.00135                    $0.0054
DeepSeek-R1 Regional  $0.001485                   $0.00594
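At these rates, estimating spend is a simple per-token multiplication. A quick sketch using the Global SKU prices from the table above (actual billing may differ from this back-of-the-envelope estimate):

```python
# Per-1K-token prices in USD, taken from the table above (Global SKU).
INPUT_PRICE_PER_1K = 0.00135
OUTPUT_PRICE_PER_1K = 0.0054


def estimate_cost(input_tokens, output_tokens):
    """Estimate the USD cost of a single request on the Global SKU."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K


# Example: a 2,000-token prompt with a 1,000-token completion
# costs 2 * $0.00135 + 1 * $0.0054 = $0.0081.
cost = estimate_cost(2000, 1000)
```

Note that reasoning models like DeepSeek R1 can emit long chains of thought, so output tokens often dominate the bill.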

 

What’s Next? 

We’re committed to continuously improving DeepSeek-R1’s availability as we scale. If you haven’t tried it yet, now is the perfect time to explore how DeepSeek-R1 on Azure AI Foundry can power your AI applications with state-of-the-art capabilities. 

Learn how to get started with DeepSeek-R1 in Azure AI Foundry.

 

Updated Mar 07, 2025
Version 2.0

2 Comments

  • geacm
    Copper Contributor

    Thank you for the update on DeepSeek R1's performance and pricing. While the blog post mentions improved performance, I haven't observed these improvements in practice, echoing the sentiment of the previous commenter regarding the East US region.

    I'm particularly concerned about the 4,096 token context window for the DeepSeek-r1 model in Azure AI Foundry. For a model marketed for reasoning capabilities, this limit seems surprisingly restrictive. When testing with complex reasoning tasks, such as evaluating intricate mathematical integrals like:

    Evaluate the following integral: \[\int_0^{\pi} \max\left(|2\sin(x)|, |2 \cos(2x) - 1| \right)^2 \cdot
    \min\left(|\sin(2x)|, |\cos(3x)| \right)^2 \, dx\]

    I've encountered truncation of the model's reasoning process, indicated by the output being cut off mid-thought within the model's thinking tags. This suggests the context limit is being reached prematurely and hindering the model's ability to fully process and respond to complex queries.

    It's worth noting that other providers offer significantly larger context windows (e.g., 32,768 tokens on AWS for the same model), which is often crucial for effective reasoning tasks.

    Since the performance update announcement on February 26th, 2025, I have tested DeepSeek-r1 daily. Unfortunately, the current performance and context limitations significantly impact its utility for my use cases, particularly those requiring in-depth reasoning.

    I hope this feedback is helpful as you continue to optimize DeepSeek-r1 on Azure AI.

  • TawandaEwing
    Copper Contributor

    Very excited to start using DeepSeek properly! However, it seems like the performance hasn't improved at all in the eastus region ☹️. Are there regions that will give better performance right now?