RAFT: A new way to teach LLMs to be better at RAG
In this article, we will look at the limitations of RAG and domain-specific fine-tuning for adapting LLMs to existing knowledge, and how a team of UC Berkeley researchers, Tianjun Zhang and Shishir G. Patil, may have just discovered a better approach.

Announcing Healthcare AI Models in Azure AI Model Catalog
Modern medicine encompasses various data modalities, including medical imaging, genomics, clinical records, and other structured and unstructured data sources. Understanding the intricacies of this multimodal environment, Azure AI onboards specialized healthcare AI models that go beyond traditional text-based applications, providing robust solutions tailored to healthcare's unique challenges.

Azure OpenAI Model Upgrades: Prompt Safety Pitfalls with GPT-4o and Beyond
Upgrading to New Azure OpenAI Models? Beware: Your Old Prompts Might Break.

I recently worked on upgrading our Azure OpenAI integration from gpt-35-turbo to gpt-4o-mini, expecting it to be a straightforward configuration change. Just update the Azure Foundry resource endpoint, change the model name, deploy the code, and voilà, everything should work as before. Right? Not quite.

The Unexpected Roadblock

As soon as I deployed the updated code, I started seeing 400 status errors from the OpenAI endpoint. The message was cryptic: "The response was filtered due to the prompt triggering Azure OpenAI's content management policy." At first, I assumed it was a bug in my SDK call or a malformed payload. But after digging deeper, I realized this wasn’t a technical failure; it was a content safety filter kicking in before the prompt even reached the model.

The Prompt That Broke It

Here’s the original system prompt that worked perfectly with gpt-35-turbo:

YOU ARE A QNA EXTRACTOR IN TEXT FORMAT. YOU WILL GET A SET OF SURVEYJS QNA JSONS. YOU WILL CONVERT THAT INTO A TEXT DOCUMENT. FOR THE QUESTIONS WHERE NO ANSWER WAS GIVEN, MARK THOSE AS NO ANSWER. HERE IS THE QNA: BE CREATIVE AND PROFESSIONAL. I WANT TO GENERATE A DOCUMENT TO BE PUBLISHED. {{$style}} +++++ {{$input}} +++++

This prompt had been reliable for months. But with gpt-4o-mini, it triggered Azure’s new input safety layer, introduced in mid-2024.

What Changed with GPT-4o-mini?

Unlike gpt-35-turbo, the gpt-4o family:
- Applies stricter content filtering, not just on the output but also on the input prompt.
- Treats system messages and user messages as role-based chat messages, passing them through moderation before the model sees them.
- Flags prompts that resemble prompt-injection attempts, such as aggressive instructions like “YOU ARE…”, “BE CREATIVE”, “GENERATE”, or “PROFESSIONAL”.
- Flags unusual formatting (like `+++++`), artificial delimiters, or token markers, since these can look like encoded content.
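To make the pattern concrete, here is a small illustrative linter for the prompt features that tripped moderation in my case. It is purely a heuristic of my own, not Azure's actual filter logic, and the function name is hypothetical:

```python
import re

# Hypothetical heuristic, NOT Azure's real moderation: flags the prompt
# features this article found likely to trigger the input safety layer.
def prompt_risk_flags(prompt: str) -> list[str]:
    flags = []
    # Aggressive all-caps directives such as "YOU ARE" or "BE CREATIVE"
    if re.search(r"\b(YOU ARE|YOU WILL|BE CREATIVE|GENERATE)\b", prompt):
        flags.append("aggressive all-caps directive")
    # Runs of artificial delimiters such as +++++
    if re.search(r"([+=*#~-])\1{3,}", prompt):
        flags.append("artificial delimiter run")
    # Raw template syntax such as {{$input}} left in the prompt
    if re.search(r"\{\{\$\w+\}\}", prompt):
        flags.append("raw template syntax")
    return flags

print(prompt_risk_flags("YOU ARE A QNA EXTRACTOR. {{$input}} +++++"))
# → all three flags fire for the original prompt style
```

Running a check like this over your prompt library before an upgrade is a cheap way to find candidates for softening.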
In short, the model didn’t even get a chance to process my prompt; it was blocked at the gate.

Fixing It: Softening the Prompt

The solution wasn’t to rewrite the entire logic, but to soften the system prompt and remove formatting that could be misinterpreted. Here’s what helped:
- Replacing “YOU ARE…” with a gentler instruction like “Please help convert the following Q&A data…”
- Removing creative directives like “BE CREATIVE” or “PROFESSIONAL” unless clearly contextualized.
- Avoiding raw JSON markers and template syntax (`{{ }}`, `+++++`) in the prompt.

Once I made these changes, the model responded smoothly, and the upgrade was finally complete.

Evolving the Prompt — Not Abandoning It

Interestingly, for some prompts I didn’t have to completely eliminate the “YOU ARE…” structure. Instead, I refined it to be more natural and less directive. Here’s a comparison:

❌ Old Prompt (Blocked):

YOU ARE A SOURCING AND PROCUREMENT MANAGER. YOU WILL GET BUYER'S REQUIREMENTS IN QNA FORMAT. HERE IS THE QNA: {{$input}} +++++ YOU WILL GENERATE TOP 10 {{$category}} RELATED QUESTIONS THAT CAN BE ASKED OF A SUPPLIER IN JSON FORMAT. THE JSON MUST HAVE QUESTION NUMBER AS THE KEY AND QUESTION TEXT AS THE QUESTION. DON'T ADD ANY DESCRIPTION TEXT OR FORMATTING IN THE OUTPUT. BE CREATIVE AND PROFESSIONAL. I WANT TO GENERATE AN RFX.

✅ New Prompt (Accepted):

You are an AI assistant that helps clarify sourcing requirements. You will receive buyer's requirements in QnA format. Here is the QnA: {$input}

Your task is to generate the top 10 {$category} related questions that can be asked of a supplier, in JSON format.
- The JSON must use the question number as the key and the question text as the value.
- Do not include any description text or formatting in the output.
- Focus on creating clear, professional, and relevant questions that will help prepare an RFX.
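In code, the role-based layout and the filter check look roughly like this. This is a minimal sketch: the function names are my own, and only the HTTP 400 error code `content_filter` reflects Azure OpenAI's documented behavior, so verify the details against the current docs:

```python
# Sketch of the request/response shapes involved; helper names are
# illustrative, not part of any SDK.
def build_chat_payload(deployment: str, system_prompt: str, user_content: str) -> dict:
    # gpt-4o-family deployments moderate every role's content on the way in,
    # so the softened system prompt matters as much as the user message.
    return {
        "model": deployment,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content},
        ],
    }

def is_input_content_filter_error(status: int, body: dict) -> bool:
    # Azure OpenAI returns HTTP 400 with error code "content_filter" when the
    # *prompt* (not the completion) trips moderation.
    return status == 400 and body.get("error", {}).get("code") == "content_filter"

payload = build_chat_payload(
    "gpt-4o-mini",
    "You are an AI assistant that helps clarify sourcing requirements.",
    '{"q1": "What is your lead time?"}',
)
```

When `is_input_content_filter_error` is true, retrying the same payload is pointless; the fix is softening the prompt as shown above.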
Key Takeaways
- Model upgrades aren’t just about configuration changes; they can introduce new moderation layers that affect prompt design.
- Prompt safety filtering is now a first-class citizen in Azure OpenAI, especially for newer models.
- System prompts need to be rewritten with moderation in mind, not just clarity or creativity.

This experience reminded me that even small upgrades can surface big learning moments. If you're planning to move to gpt-4o-mini or any newer Azure OpenAI model, take a moment to review your prompts; they might need a little more finesse than before.

Black Forest Labs FLUX.1 Kontext [pro] and FLUX1.1 [pro] Now Available in Azure AI Foundry
We're excited to announce that Azure AI Foundry Models now hosts FLUX.1 Kontext [pro] and FLUX1.1 [pro] as Direct from Azure models, giving developers a first-party, enterprise-ready path to Black Forest Labs’ (BFL) state-of-the-art image models. You get secure endpoints with pay-as-you-go pricing, Azure billing, and Content Safety integration, with no GPU wrangling required.

Meet the Models

FLUX.1 Kontext [pro]
- Core task: in-context image generation and editing (text + image prompt)
- What’s new: a single model unifies local edits, full scene re-generation, style transfer, character consistency, and iterative editing
- Speed: up to 8× faster than other SOTA editors
- Resolution / IO: 1024 x 1024 default; iterative multi-turn editing

FLUX1.1 [pro]
- Core task: text-to-image
- What’s new: Ultra mode for 4 MP images; Raw mode for a natural “camera” look
- Speed: 6× faster than FLUX.1 [pro]; 10 s for a 4 MP frame
- Resolution / IO: up to 4 MP, strong prompt adherence

Under the hood: both models sit on a rectified flow transformer backbone, BFL’s answer to diffusion and latent consistency models, yielding better sample diversity and lower inference latency.

Image Capabilities & Enterprise Use-case Patterns

Exploring the power of FLUX.1 Kontext [pro], we put its in-context image generation and editing capabilities to the test in Azure AI Foundry, transforming simple prompts into stunning, detailed visuals that showcase just how far generative AI has come.

Prompt 1: “Two children sailing a paper boat down a winding river, surrounded by lush jungles and curious animals”
Prompt 2: “Abstract digital painting of a futuristic city at sunset, with glowing neon lights and flying vehicles, in cyberpunk style”
Prompt 3: “Surreal landscape made of floating islands, waterfalls spilling into the sky, and glowing crystal trees”

With these Black Forest Labs models now available in Azure AI Foundry, enterprises can accelerate creative pipelines, generate e-commerce variants, automate marketing workflows, and simulate digital twins at scale.
Scenario patterns to try:
- Creative Pipeline Acceleration: use FLUX1.1 [pro] for storyboard ideation, then pass frames into Kontext [pro] for surgical tweaks without PSD layers.
- E-commerce Variant Generation: inject a product hero shot plus a prompt into FLUX.1 Kontext [pro] to auto-paint seasonal backdrops while preserving SKU angles.
- Marketing Automation: pair Azure OpenAI GPT-4o for copy with FLUX images via Logic Apps; send variants to A/B email testing.
- Digital Twin Simulation: use iterative editing to visualize wear and tear on equipment over time in maintenance portals.

Benchmarks & Economics

Latency: FLUX.1 Kontext [pro] averages 0.9 s per 1024 x 1024 edit, eight times faster than leading diffusion-based editors on identical A100s.
Quality: on KontextBench, FLUX.1 Kontext [pro] ranks #1 on text-guided editing and character consistency, while FLUX1.1 [pro] tops aesthetics and prompt-following in T2I tests.

Pricing
- FLUX1.1 [pro] (Global): $40 per 1K images
- FLUX.1 Kontext [pro] (Global): $40 per 1K images

Tips for Production Readiness
- Seed for determinism: both models accept a seed for repeatable outputs; store it alongside prompt history.
- Step budget: Ultra-mode images look best with 40-50 inference steps; FLUX.1 Kontext [pro] edits converge in under 30.
- Guard-rail chaining: pipe outputs through Azure AI Content Safety and your own watermark classifier.
- Caching: for high-traffic apps, cache intermediate latent representations (Kontext) to speed multi-turn edits.

Why Azure AI Foundry?

Direct from Azure models give you the fastest time-to-value on cutting-edge foundation models, while Azure AI Foundry supplies the right tools: evaluation, deployment, safety, and the lifecycle plumbing needed by real-world enterprises.

What you get, and why it matters:
- Unified access: all Direct from Azure models (OpenAI, DeepSeek, FLUX, Llama, Grok) share the same REST/SDK surface, auth (keys + Entra ID), metrics, and portal UX.
This lets you switch or chain models without rewriting code or juggling separate keys/resources.
- Enterprise-ready SLAs & security: models are hosted and sold by Microsoft under the Microsoft Product Terms, with built-in content safety, RBAC, network isolation, and Azure Monitor logging. This meets compliance officers where they live: no third-party contracts, guaranteed uptime.
- Scalable deployments: choose pay-as-you-go standard endpoints or capacity-backed PTU deployments that autoscale on A100/H100 pools. Start small in dev, then flip to prod traffic without re-deploying.
- Deep toolchain hook-ups: Prompt Flow, Azure CLI/Bicep/Terraform, Azure DevOps/GitHub Actions, Cost Management reservations, Policy, Purview, and Sentinel signals all work out of the box, shortening the path from hack-day demo to governed production workload.

Build Trustworthy AI Solutions

Black Forest Labs models on Azure AI Foundry are delivered under the Microsoft Product Terms, giving you enterprise-grade security and compliance out of the box. Each FLUX endpoint offers secure Content Safety controls and guardrails. Runtime protections include built-in content-safety filters, role-based access control, virtual-network isolation, and automatic Azure Monitor logging. Governance signals stream directly into Azure Policy, Purview, and Microsoft Sentinel, giving security and compliance teams real-time visibility. Together, these capabilities let you create with more confidence, knowing that privacy, security, and safety are woven into every Black Forest Labs deployment from day one.

How to Deploy BFL Models in Azure AI Foundry

1. If you don’t have an Azure subscription, sign up for an Azure account.
2. Search for the model name (FLUX.1-Kontext-pro or FLUX-1.1-pro) in the model catalog in Azure AI Foundry.
3. Open the model card in the model catalog.
4. Click Deploy to obtain the inference API and key, and to access the playground.
5. You should land on the deployment page, which shows you the API and key, in less than a minute.
You can try out your prompts in the playground, and use the API and key with various clients.

The FLUX family has already redefined speed/quality trade-offs in open image generation. Landing FLUX.1 Kontext [pro] and FLUX1.1 [pro] inside Azure AI Foundry brings those capabilities, with Azure’s scalability, governance, and integrated tooling, to every developer building imaging workflows. Happy generating!

Learn More
▶️ RSVP for the next Model Monday LIVE on YouTube or On-Demand
👩💻 Explore Azure AI Foundry Models
👋 Continue the conversation on Discord

Deepening our Partnership with Mistral AI on Azure AI Foundry
We’re excited to mark a new chapter in our collaboration with Mistral AI, a leading European AI innovator, with the launch of Mistral Document AI in Azure AI Foundry Models. This marks the first in a series of Mistral models coming to Azure as a serverless API, giving customers seamless access to Mistral’s cutting-edge capabilities, fully hosted, managed, and integrated into the Foundry ecosystem.

This launch also deepens our support for sovereign cloud customers, especially in Europe. At Microsoft, we believe Sovereign AI is essential for enabling organizations and regulated industries to harness the full potential of AI while maintaining control over their security, data, and governance. As Satya Nadella has said, “We want every country, every organization, to build AI in a way that respects their sovereignty—of data, of applications, and of infrastructure.” By combining Mistral’s state-of-the-art models with Azure’s enterprise-grade reliability and scale, we’re enabling customers to confidently deploy AI that meets strict regulatory and data sovereignty requirements.

Mistral Document AI

By the Mistral AI Team: “Enterprises today are overwhelmed with documents—contracts, forms, research papers, invoices—holding critical information that’s often trapped in scanned images and PDFs. With nearly 90% of enterprise data stored in unstructured formats, traditional OCR simply can’t keep up. Mistral Document AI is built with a multimodal approach that combines vision and language understanding; it interprets documents with contextual intelligence and delivers structured outputs that reflect the original layout—tables remain tables, headings remain headings, and images are preserved alongside the text.”

Key Capabilities
- Document Parsing: Mistral Document AI interprets complex layouts and extracts rich structures such as tables, charts, and LaTeX-formatted equations with markdown-style clarity.
- Multilingual & Multimodal: the model supports dozens of languages and understands both text and visual elements, making it well-suited for global, diverse datasets.
- Structured Output & Doc-as-Prompt: Mistral Document AI delivers results in structured formats like JSON, enabling easy downstream integration with databases or AI agents. This supports use cases like Retrieval-Augmented Generation (RAG), where document content becomes a prompt for subsequent queries.

Use Cases
- Document Digitization: process archives of scanned PDFs or handwritten forms into structured digital records.
- Knowledge Extraction: transform research papers, technical manuals, or customer guides into machine-readable formats.
- RAG Pipelines and Intelligent Agents: integrate structured output into pipelines that feed AI systems for Q&A, summarization, and more.

Mistral Document AI on Azure AI Foundry

You can now access Mistral Document AI’s capabilities through Azure AI Foundry as a serverless Azure model, sold directly by Microsoft.
- One-Click Deployment (Serverless): with a few clicks, you can deploy the model as a serverless REST API, without needing to provision any GPU machines or container hosts. This makes it easy to get started.
- Enterprise-Grade Security & Privacy: because the model runs within your Azure environment, you get network isolation and data security out of the box. All inferencing happens in Azure’s cloud under your account, so your documents aren’t sent to a third-party server. Azure AI Foundry ensures your data stays private (no data leaves the Azure region you choose) and offers compliance with enterprise security standards. This is critical for sensitive use cases like banking or healthcare documents.
- Integrated Responsible AI Capabilities: with Mistral Document AI running in Azure AI Foundry, you can apply Azure’s built-in Responsible AI tools, such as content filtering, safety system monitoring, and evaluation frameworks, to ensure your deployments align with your organization’s ethical and compliance standards.
- Observability & Monitoring: Foundry’s monitoring features give you full visibility into model usage, performance, and cost. You can track API calls, latency, and error rates, enabling proactive troubleshooting and optimization.
- Agent Services Enablement: you can connect Mistral Document AI to Azure AI Agent Service, enabling intelligent agents to process, reason over, and act on extracted document data, unlocking new automation and decision-making scenarios.
- Azure Ecosystem Integration: once deployed, the Mistral Document AI endpoint can easily plug into your existing Azure workflows. And because it’s part of Foundry, you can manage it alongside other models in a unified way. This interoperability accelerates the development of intelligent applications.

Getting Started: Deploying and Using Mistral Document AI on Azure

Setting up Mistral Document AI on Azure AI Foundry is straightforward. Here’s a quick guide to get you up and running:
1. Create an Azure AI Foundry workspace: ensure you have an Azure subscription (pay-as-you-go, not a free trial) and create an AI Foundry hub and project in the Azure portal.
2. Deploy the Mistral Document AI model: in the Azure AI Foundry model catalog, search for “mistral-document-ai-2505”, click Deploy, and choose a pricing plan when prompted.
3. Call the Mistral Document AI API: once deployed, using the model is as easy as calling a REST API. You can do this from any programming language or even a command-line tool like cURL.
4. Integrate and iterate: with the OCR results in hand, you can integrate Mistral Document AI into your workflows.
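As a sketch of the "integrate and iterate" step, here is one way to fold a structured OCR response into RAG-ready chunks. The response shape assumed below (a list of pages, each carrying markdown text) is an illustration only; check the Mistral Document AI documentation for the actual schema:

```python
# Assumed response shape for illustration: {"pages": [{"markdown": "..."}]}.
# Verify the real schema against the Mistral Document AI docs.
def pages_to_chunks(ocr_response: dict, max_chars: int = 2000) -> list[str]:
    """Flatten per-page markdown into size-bounded chunks for a RAG index."""
    chunks, current = [], ""
    for page in ocr_response.get("pages", []):
        text = page.get("markdown", "").strip()
        if not text:
            continue
        # Start a new chunk when adding this page would exceed the budget.
        if current and len(current) + len(text) > max_chars:
            chunks.append(current)
            current = ""
        current = (current + "\n\n" + text).strip()
    if current:
        chunks.append(current)
    return chunks

sample = {"pages": [{"markdown": "# Invoice\n| Item | Qty |"},
                    {"markdown": "Total due on receipt."}]}
print(pages_to_chunks(sample))
```

Because the output preserves markdown, tables and headings survive into the chunks that later become prompts, which is exactly the doc-as-prompt pattern described above.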
Conclusion

Mistral Document AI joins Azure AI Foundry as one of several tools available to help organizations unlock insights from unstructured documents. This launch reflects our continued commitment to bringing the latest, most capable models into Foundry, giving developers and enterprises more choice than ever. Whether you’re digitizing records, building knowledge bases, or enhancing your AI workflows, Azure AI Foundry offers powerful and accessible solutions.

Pricing
- mistral-document-ai-2505 (Global): $3 per 1K pages
- mistral-document-ai-2505 (DataZone): $3.30 per 1K pages
- Mistral OCR (Global): $1 per 1K pages

Resources
- Explore Mistral Document AI on MS Learn
- GitHub Code Samples
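To make the per-1K-page meters concrete, a back-of-the-envelope estimate using the rates listed above. Whether Azure prorates partial thousands is my assumption here, so confirm the actual billing granularity against your meter:

```python
# Rates taken from the pricing list above, in USD per 1K pages.
RATES_PER_1K_PAGES = {
    "mistral-document-ai-2505 (Global)": 3.00,
    "mistral-document-ai-2505 (DataZone)": 3.30,
    "Mistral OCR (Global)": 1.00,
}

def estimate_cost(meter: str, pages: int) -> float:
    # Assumes linear proration of partial thousands; check your meter.
    return round(RATES_PER_1K_PAGES[meter] * pages / 1000, 2)

# e.g. digitizing a 250,000-page archive on the Global meter
print(estimate_cost("mistral-document-ai-2505 (Global)", 250_000))  # → 750.0
```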