azure ai studio
234 TopicsExploring Azure OpenAI Assistants and Azure AI Agent Services: Benefits and Opportunities
In the rapidly evolving landscape of artificial intelligence, businesses are increasingly turning to cloud-based solutions to harness the power of AI. Microsoft Azure offers two prominent services in this domain: Azure OpenAI Assistants and Azure AI Agent Services. While both services aim to enhance user experiences and streamline operations, they cater to different needs and use cases. This blog post will delve into the details of each service, their benefits, and the opportunities they present for businesses. Understanding Azure OpenAI Assistants What Are Azure OpenAI Assistants? Azure OpenAI Assistants are designed to leverage the capabilities of OpenAI's models, such as GPT-3 and its successors. These assistants are tailored for applications that require advanced natural language processing (NLP) and understanding, making them ideal for conversational agents, chatbots, and other interactive applications. Key Features Pre-trained Models: Azure OpenAI Assistants utilize pre-trained models from OpenAI, which means they come with a wealth of knowledge and language understanding out of the box. This reduces the time and effort required for training models from scratch. Customizability: While the models are pre-trained, developers can fine-tune them to meet specific business needs. This allows for the creation of personalized experiences that resonate with users. Integration with Azure Ecosystem: Azure OpenAI Assistants seamlessly integrate with other Azure services, such as Azure Functions, Azure Logic Apps, and Azure Cognitive Services. This enables businesses to build comprehensive solutions that leverage multiple Azure capabilities. Benefits of Azure OpenAI Assistants Enhanced User Experience: By utilizing advanced NLP capabilities, Azure OpenAI Assistants can provide more natural and engaging interactions. This leads to improved customer satisfaction and loyalty. Rapid Deployment: The availability of pre-trained models allows businesses to deploy AI solutions quickly. This is particularly beneficial for organizations looking to implement AI without extensive development time. Scalability: Azure's cloud infrastructure ensures that applications built with OpenAI Assistants can scale to meet growing user demands without compromising performance. Understanding Azure AI Agent Services What Are Azure AI Agent Services? Azure AI Agent Services provide a more flexible framework for building AI-driven applications. Unlike Azure OpenAI Assistants, which are limited to OpenAI models, Azure AI Agent Services allow developers to utilize a variety of AI models, including those from other providers or custom-built models. Key Features Model Agnosticism: Developers can choose from a wide range of AI models, enabling them to select the best fit for their specific use case. This flexibility encourages innovation and experimentation. Custom Agent Development: Azure AI Agent Services support the creation of custom agents that can perform a variety of tasks, from simple queries to complex decision-making processes. Integration with Other AI Services: Like OpenAI Assistants, Azure AI Agent Services can integrate with other Azure services, allowing for the creation of sophisticated AI solutions that leverage multiple technologies. Benefits of Azure AI Agent Services Diverse Use Cases: The ability to use any AI model opens a world of possibilities for businesses. Whether it's a specialized model for sentiment analysis or a custom-built model for a niche application, organizations can tailor their solutions to meet specific needs. Enhanced Automation: AI agents can automate repetitive tasks, freeing up human resources for more strategic activities. This leads to increased efficiency and productivity. Cost-Effectiveness: By allowing the use of various models, businesses can choose cost-effective solutions that align with their budget and performance requirements. Opportunities for Businesses Improved Customer Engagement Both Azure OpenAI Assistants and Azure AI Agent Services can significantly enhance customer engagement. By providing personalized and context-aware interactions, businesses can create a more satisfying user experience. For example, a retail company can use an AI assistant to provide tailored product recommendations based on customer preferences and past purchases. Data-Driven Decision Making AI agents can analyze vast amounts of data and provide actionable insights. This capability enables organizations to make informed decisions based on real-time data analysis. For instance, a financial institution can deploy an AI agent to monitor market trends and provide investment recommendations to clients. Streamlined Operations By automating routine tasks, businesses can streamline their operations and reduce operational costs. For example, a customer support team can use AI agents to handle common inquiries, allowing human agents to focus on more complex issues. Innovation and Experimentation The flexibility of Azure AI Agent Services encourages innovation. Developers can experiment with different models and approaches to find the most effective solutions for their specific challenges. This culture of experimentation can lead to breakthroughs in product development and service delivery. Enhanced Analytics and Insights Integrating AI agents with analytics tools can provide businesses with deeper insights into customer behavior and preferences. This data can inform marketing strategies, product development, and customer service improvements. For example, a company can analyze interactions with an AI assistant to identify common customer pain points, allowing them to address these issues proactively. Conclusion In summary, both Azure OpenAI Assistants and Azure AI Agent Services offer unique advantages that can significantly benefit businesses looking to leverage AI technology. Azure OpenAI Assistants provide a robust framework for building conversational agents using advanced OpenAI models, making them ideal for applications that require sophisticated natural language understanding and generation. Their ease of integration, rapid deployment, and enhanced user experience make them a compelling choice for businesses focused on customer engagement. Azure AI Agent Services, on the other hand, offer unparalleled flexibility by allowing developers to utilize a variety of AI models. This model-agnostic approach encourages innovation and experimentation, enabling businesses to tailor solutions to their specific needs. The ability to automate tasks and streamline operations can lead to significant cost savings and increased efficiency. Additional Resources To further explore Azure OpenAI Assistants and Azure AI Agent Services, consider the following resources: Agent Service on Microsoft Learn Docs Watch On-Demand Sessions Streamlining Customer Service with AI-Powered Agents: Building Intelligent Multi-Agent Systems with Azure AI Microsoft learn Develop AI agents on Azure - Training | Microsoft Learn Community and Announcements Tech Community Announcement: Introducing Azure AI Agent Service Bonus Blog Post: Announcing the Public Preview of Azure AI Agent Service AI Agents for Beginners 10 Lesson Course https://aka.ms/ai-agents-beginners5.2KViews0likes2CommentsA Recap of the Build AI Agents with Custom Tools Live Session
Artificial Intelligence is evolving, and so are the ways we build intelligent agents. On a recent Microsoft YouTube Live session, developers and AI enthusiasts gathered to explore the power of custom tools in AI agents using Azure AI Studio. The session walked through concepts, use cases, and a live demo that showed how integrating custom tools can bring a new level of intelligence and adaptability to your applications. 🎥 Watch the full session here: https://www.youtube.com/live/MRpExvcdxGs?si=X03wsQxQkkshEkOT What Are AI Agents with Custom Tools? AI agents are essentially smart workflows that can reason, plan, and act — powered by large language models (LLMs). While built-in tools like search, calculator, or web APIs are helpful, custom tools allow developers to tailor agents for business-specific needs. For example: Calling internal APIs Accessing private databases Triggering backend operations like ticket creation or document generation Learn Module Overview: Build Agents with Custom Tools To complement the session, Microsoft offers a self-paced Microsoft Learn module that gives step-by-step guidance: Explore the module Key Learning Objectives: Understand why and when to use custom tools in agents Learn how to define, integrate, and test tools using Azure AI Studio Build an end-to-end agent scenario using custom capabilities Hands-On Exercise: The module includes a guided lab where you: Define a tool schema Register the tool within Azure AI Studio Build an AI agent that uses your custom logic Test and validate the agent’s response Highlights from the Live Session Here are some gems from the session: Real-World Use Cases – Automating customer support, connecting to CRMs, and more Tool Manifest Creation – Learn how to describe a tool in a machine-understandable way Live Azure Demo – See exactly how to register tools and invoke them from an AI agent Tips & Troubleshooting – Best practices and common pitfalls when designing agents Want to Get Started? If you're a developer, AI enthusiast, or product builder looking to elevate your agent’s capabilities — custom tools are the next step. Start building your own AI agents by combining the power of: Microsoft Learn Module YouTube Live Session Final Thoughts The future of AI isn't just about smart responses — it's about intelligent actions. Custom tools enable your AI agent to do things, not just say things. With Azure AI Studio, building a practical, action-oriented AI assistant is more accessible than ever. Learn More and Join the Community Learn more about AI Agents with https://aka.ms/ai-agents-beginnersOpen Source Course and Building Agents. Join the Azure AI Foundry Discord Channel. Continue the discussion and learning: https://aka.ms/AI/discord Have questions or want to share what you're building? Let’s connect on LinkedIn or drop a comment under the YouTube video!329Views0likes0CommentsEvaluating Language Models with Azure AI Studio: A Step-by-Step Guide
Evaluating language models is a crucial step in achieving this goal. By assessing the performance of language models, we can identify areas of improvement, optimize their performance, and ensure that they are reliable and accurate. However, evaluating language models can be a challenging task, requiring significant expertise and resources.6.6KViews1like0CommentsPower Up Your Open WebUI with Azure AI Speech: Quick STT & TTS Integration
Introduction Ever found yourself wishing your web interface could really talk and listen back to you? With a few clicks (and a bit of code), you can turn your plain Open WebUI into a full-on voice assistant. In this post, you’ll see how to spin up an Azure Speech resource, hook it into your frontend, and watch as user speech transforms into text and your app’s responses leap off the screen in a human-like voice. By the end of this guide, you’ll have a voice-enabled web UI that actually converses with users, opening the door to hands-free controls, better accessibility, and a genuinely richer user experience. Ready to make your web app speak? Let’s dive in. Why Azure AI Speech? We use Azure AI Speech service in Open Web UI to enable voice interactions directly within web applications. This allows users to: Speak commands or input instead of typing, making the interface more accessible and user-friendly. Hear responses or information read aloud, which improves usability for people with visual impairments or those who prefer audio. Provide a more natural and hands-free experience especially on devices like smartphones or tablets. In short, integrating Azure AI Speech service into Open Web UI helps make web apps smarter, more interactive, and easier to use by adding speech recognition and voice output features. If you haven’t hosted Open WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure. Proceed to the next step if you have Open WebUI deployed already. Learn More about OpenWeb UI here. Deploy Azure AI Speech service in Azure. Navigate to the Azure Portal and search for Azure AI Speech on the Azure portal search bar. Create a new Speech Service by filling up the fields in the resource creation page. Click on “Create” to finalize the setup. After the resource has been deployed, click on “View resource” button and you should be redirected to the Azure AI Speech service page. The page should display the API Keys and Endpoints for Azure AI Speech services, which you can use in Open Web UI. Settings things up in Open Web UI Speech to Text settings (STT) Head to the Open Web UI Admin page > Settings > Audio. Paste the API Key obtained from the Azure AI Speech service page into the API key field below. Unless you use different Azure Region, or want to change the default configurations for the STT settings, leave all settings to blank. Text to Speech settings (TTS) Now, let's proceed with configuring the TTS Settings on OpenWeb UI by toggling the TTS Engine to Azure AI Speech option. Again, paste the API Key obtained from Azure AI Speech service page and leave all settings to blank. You can change the TTS Voice from the dropdown selection in the TTS settings as depicted in the image below: Click Save to reflect the change. Expected Result Now, let’s test if everything works well. Open a new chat / temporary chat on Open Web UI and click on the Call / Record button. The STT Engine (Azure AI Speech) should identify your voice and provide a response based on the voice input. To test the TTS feature, click on the Read Aloud (Speaker Icon) under any response from Open Web UI. The TTS Engine should reflect Azure AI Speech service! Conclusion And that’s a wrap! You’ve just given your Open WebUI the gift of capturing user speech, turning it into text, and then talking right back with Azure’s neural voices. Along the way you saw how easy it is to spin up a Speech resource in the Azure portal, wire up real-time transcription in the browser, and pipe responses through the TTS engine. From here, it’s all about experimentation. Try swapping in different neural voices or dialing in new languages. Tweak how you start and stop listening, play with silence detection, or add custom pronunciation tweaks for those tricky product names. Before you know it, your interface will feel less like a web page and more like a conversation partner.2.3KViews3likes2CommentsIntegrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration
Step 1: Deploying Models on Microsoft Foundry Let us kick things off in the Azure portal. To get our OpenClaw agent thinking like a genius, we need to deploy our models in Microsoft Foundry. For this guide, we are going to focus on deploying gpt-5.2-codex on Microsoft Foundry with OpenClaw. Navigate to your AI Hub, head over to the model catalog, choose the model you wish to use with OpenClaw and hit deploy. Once your deployment is successful, head to the endpoints section. Important: Grab your Endpoint URL and your API Keys right now and save them in a secure note. We will need these exact values to connect OpenClaw in a few minutes. Step 2: Installing and Initializing OpenClaw Next up, we need to get OpenClaw running on your machine. Open up your terminal and run the official installation script: curl -fsSL https://openclaw.ai/install.sh | bash The wizard will walk you through a few prompts. Here is exactly how to answer them to link up with our Azure setup: First Page (Model Selection): Choose "Skip for now". Second Page (Provider): Select azure-openai-responses. Model Selection: Select gpt-5.2-codex , For now only the models listed (hosted on Microsoft Foundry) in the picture below are available to be used with OpenClaw. Follow the rest of the standard prompts to finish the initial setup. Step 3: Editing the OpenClaw Configuration File Now for the fun part. We need to manually configure OpenClaw to talk to Microsoft Foundry. Open your configuration file located at ~/.openclaw/openclaw.json in your favorite text editor. Replace the contents of the models and agents sections with the following code block: { "models": { "providers": { "azure-openai-responses": { "baseUrl": "https://<YOUR_RESOURCE_NAME>.openai.azure.com/openai/v1", "apiKey": "<YOUR_AZURE_OPENAI_API_KEY>", "api": "openai-responses", "authHeader": false, "headers": { "api-key": "<YOUR_AZURE_OPENAI_API_KEY>" }, "models": [ { "id": "gpt-5.2-codex", "name": "GPT-5.2-Codex (Azure)", "reasoning": true, "input": ["text", "image"], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "contextWindow": 400000, "maxTokens": 16384, "compat": { "supportsStore": false } }, { "id": "gpt-5.2", "name": "GPT-5.2 (Azure)", "reasoning": false, "input": ["text", "image"], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "contextWindow": 272000, "maxTokens": 16384, "compat": { "supportsStore": false } } ] } } }, "agents": { "defaults": { "model": { "primary": "azure-openai-responses/gpt-5.2-codex" }, "models": { "azure-openai-responses/gpt-5.2-codex": {} }, "workspace": "/home/<USERNAME>/.openclaw/workspace", "compaction": { "mode": "safeguard" }, "maxConcurrent": 4, "subagents": { "maxConcurrent": 8 } } } } You will notice a few placeholders in that JSON. Here is exactly what you need to swap out: Placeholder Variable What It Is Where to Find It <YOUR_RESOURCE_NAME> The unique name of your Azure OpenAI resource. Found in your Azure Portal under the Azure OpenAI resource overview. <YOUR_AZURE_OPENAI_API_KEY> The secret key required to authenticate your requests. Found in Microsoft Foundry under your project endpoints or Azure Portal keys section. <USERNAME> Your local computer's user profile name. Open your terminal and type whoami to find this. Step 4: Restart the Gateway After saving the configuration file, you must restart the OpenClaw gateway for the new Foundry settings to take effect. Run this simple command: openclaw gateway restart Configuration Notes & Deep Dive If you are curious about why we configured the JSON that way, here is a quick breakdown of the technical details. Authentication Differences Azure OpenAI uses the api-key HTTP header for authentication. This is entirely different from the standard OpenAI Authorization: Bearer header. Our configuration file addresses this in two ways: Setting "authHeader": false completely disables the default Bearer header. Adding "headers": { "api-key": "<key>" } forces OpenClaw to send the API key via Azure's native header format. Important Note: Your API key must appear in both the apiKey field AND the headers.api-key field within the JSON for this to work correctly. The Base URL Azure OpenAI's v1-compatible endpoint follows this specific format: https://<your_resource_name>.openai.azure.com/openai/v1 The beautiful thing about this v1 endpoint is that it is largely compatible with the standard OpenAI API and does not require you to manually pass an api-version query parameter. Model Compatibility Settings "compat": { "supportsStore": false } disables the store parameter since Azure OpenAI does not currently support it. "reasoning": true enables the thinking mode for GPT-5.2-Codex. This supports low, medium, high, and xhigh levels. "reasoning": false is set for GPT-5.2 because it is a standard, non-reasoning model. Model Specifications & Cost Tracking If you want OpenClaw to accurately track your token usage costs, you can update the cost fields from 0 to the current Azure pricing. Here are the specs and costs for the models we just deployed: Model Specifications Model Context Window Max Output Tokens Image Input Reasoning gpt-5.2-codex 400,000 tokens 16,384 tokens Yes Yes gpt-5.2 272,000 tokens 16,384 tokens Yes No Current Cost (Adjust in JSON) Model Input (per 1M tokens) Output (per 1M tokens) Cached Input (per 1M tokens) gpt-5.2-codex $1.75 $14.00 $0.175 gpt-5.2 $2.00 $8.00 $0.50 Conclusion: And there you have it! You have successfully bridged the gap between the enterprise-grade infrastructure of Microsoft Foundry and the local autonomy of OpenClaw. By following these steps, you are not just running a chatbot; you are running a sophisticated agent capable of reasoning, coding, and executing tasks with the full power of GPT-5.2-codex behind it. The combination of Azure's reliability and OpenClaw's flexibility opens up a world of possibilities. Whether you are building an automated devops assistant, a research agent, or just exploring the bleeding edge of AI, you now have a robust foundation to build upon. Now it is time to let your agent loose on some real tasks. Go forth, experiment with different system prompts, and see what you can build. If you run into any interesting edge cases or come up with a unique configuration, let me know in the comments below. Happy coding!11KViews2likes2CommentsHow to Set Up Claude Code with Microsoft Foundry Models on macOS
Introduction Building with AI isn't just about picking a smart model. It is about where that model lives. I chose to route my Claude Code setup through Microsoft Foundry because I needed more than just a raw API. I wanted the reliability, compliance, and structured management that comes with Microsoft's ecosystem. When you are moving from a prototype to something real, having that level of infrastructure backing your calls makes a significant difference. The challenge is that Foundry is designed for enterprise cloud environments, while my daily development work happens locally on a MacBook. Getting the two to communicate seamlessly involved navigating a maze of shell configurations and environment variables that weren't immediately obvious. I wrote this guide to document the exact steps for bridging that gap. Here is how you can set up Claude Code to run locally on macOS while leveraging the stability of models deployed on Microsoft Foundry. Requirements Before we open the terminal, let's make sure you have the necessary accounts and environments ready. Since we are bridging a local CLI with an enterprise cloud setup, having these credentials handy now will save you time later. Azure Subscription with Microsoft Foundry Setup - This is the most critical piece. You need an active Azure subscription where the Microsoft Foundry environment is initialized. Ensure that you have deployed the Claude model you intend to use and that the deployment status is active. You will need the specific endpoint URL and the associated API keys from this deployment to configure the connection. An Anthropic User Account - Even though the compute is happening on Azure, the interface requires an Anthropic account. You will need this to authenticate your session and manage your user profile settings within the Claude Code ecosystem. Claude Code Client on macOS - We will be running the commands locally, so you need the Claude Code CLI installed on your MacBook. Step 1: Install Claude Code on macOS The recommended installation method is via Homebrew or Curl, which sets it up for terminal access ("OS level"). Option A: Homebrew (Recommended) brew install --cask claude-code Option B: Curl curl -fsSL https://claude.ai/install.sh | bash Verify Installation: Run claude --version. Step 2: Set Up Microsoft Foundry to deploy Claude model Navigate to your Microsoft Foundry portal, and find the Claude model catalog, and deploy the selected Claude model. [Microsoft Foundry > My Assets > Models + endpoints > + Deploy Model > Deploy Base model > Search for "Claude"] In your Model Deployment dashboard, go to the deployed Claude Models and get the "Endpoints and keys". Store it somewhere safe, because we will need them to configure Claude Code later on. Configure Environment Variables in MacOS terminal: Now we need to tell your local Claude Code client to route requests through Microsoft Foundry instead of the default Anthropic endpoints. This is handled by setting specific environment variables that act as a bridge between your local machine and your Azure resources. You could run these commands manually every time you open a terminal, but it is much more efficient to save them permanently in your shell profile. For most modern macOS users, this file is .zshrc. Open your terminal and add the following lines to your profile, making sure to replace the placeholder text with your actual Azure credentials: export CLAUDE_CODE_USE_FOUNDRY=1 export ANTHROPIC_FOUNDRY_API_KEY="your-azure-api-key" export ANTHROPIC_FOUNDRY_RESOURCE="your-resource-name" # Specify the deployment name for Opus export CLAUDE_CODE_MODEL="your-opus-deployment-name" Once you have added these variables, you need to reload your shell configuration for the changes to take effect. Run the source command below to update your current session, and then verify the setup by launching Claude: source ~/.zshrc claude If everything is configured correctly, the Claude CLI will initialize using your Microsoft Foundry deployment as the backend. Once you execute the claude command, the CLI will prompt you to choose an authentication method. Select Option 2 (Antrophic Console account) to proceed. This action triggers your default web browser and redirects you to the Claude Console. Simply sign in using your standard Anthropic account credentials. After you have successfully signed in, you will be presented with a permissions screen. Click the Authorize button to link your web session back to your local terminal. Return to your terminal window, and you should see a notification confirming that the login process is complete. Press Enter to finalize the setup. You are now fully connected. You can start using Claude Code locally, powered entirely by the model deployment running in your Microsoft Foundry environment. Conclusion Setting up this environment might seem like a heavy lift just to run a CLI tool, but the payoff is significant. You now have a workflow that combines the immediate feedback of local development with the security and infrastructure benefits of Microsoft Foundry. One of the most practical upgrades is the removal of standard usage caps. You are no longer limited to the 5-hour API call limits, which gives you the freedom to iterate, test, and debug for as long as your project requires without hitting a wall. By bridging your local macOS terminal to Azure, you are no longer just hitting an API endpoint. You are leveraging a managed, compliance-ready environment that scales with your needs. The best part is that now the configuration is locked in, you don't need to think about the plumbing again. You can focus entirely on coding, knowing that the reliability of an enterprise platform is running quietly in the background supporting every command.1.2KViews1like0CommentsBuilding with Azure OpenAI Sora: A Complete Guide to AI Video Generation
In this comprehensive guide, we'll explore how to integrate both Sora 1 and Sora 2 models from Azure OpenAI Service into a production web application. We'll cover API integration, request body parameters, cost analysis, limitations, and the key differences between using Azure AI Foundry endpoints versus OpenAI's native API. Table of Contents Introduction to Sora Models Azure AI Foundry vs. OpenAI API Structure API Integration: Request Body Parameters Video Generation Modes Cost Analysis per Generation Technical Limitations & Constraints Resolution & Duration Support Implementation Best Practices Introduction to Sora Models Sora is OpenAI's groundbreaking text-to-video model that generates realistic videos from natural language descriptions. Azure AI Foundry provides access to two versions: Sora 1: The original model focused primarily on text-to-video generation with extensive resolution options (480p to 1080p) and flexible duration (1-20 seconds) Sora 2: The enhanced version with native audio generation, multiple generation modes (text-to-video, image-to-video, video-to-video remix), but more constrained resolution options (720p only in public preview) Azure AI Foundry vs. OpenAI API Structure Key Architectural Differences Sora 1 uses Azure's traditional deployment-based API structure: Endpoint Pattern: https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}/... Parameters: Uses Azure-specific naming like n_seconds, n_variants, separate width/height fields Job Management: Uses /jobs/{id} for status polling Content Download: Uses /video/generations/{generation_id}/content/video Sora 2 adapts OpenAI's v1 API format while still being hosted on Azure: Endpoint Pattern: https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}/videos Parameters: Uses OpenAI-style naming like seconds (string), size (combined dimension string like "1280x720") Job Management: Uses /videos/{video_id} for status polling Content Download: Uses /videos/{video_id}/content Why This Matters? This architectural difference requires conditional request formatting in your code: const isSora2 = deployment.toLowerCase().includes('sora-2'); if (isSora2) { requestBody = { model: deployment, prompt, size: `${width}x${height}`, // Combined format seconds: duration.toString(), // String type }; } else { requestBody = { model: deployment, prompt, height, // Separate dimensions width, n_seconds: duration.toString(), // Azure naming n_variants: variants, }; } API Integration: Request Body Parameters Sora 1 API Parameters Standard Text-to-Video Request: { "model": "sora-1", "prompt": "Wide shot of a child flying a red kite in a grassy park, golden hour sunlight, camera slowly pans upward.", "height": "720", "width": "1280", "n_seconds": "12", "n_variants": "2" } Parameter Details: model (String, Required): Your Azure deployment name prompt (String, Required): Natural language description of the video (max 32000 chars) height (String, Required): Video height in pixels width (String, Required): Video width in pixels n_seconds (String, Required): Duration (1-20 seconds) n_variants (String, Optional): Number of variations to generate (1-4, constrained by resolution) Sora 2 API Parameters Text-to-Video Request: { "model": "sora-2", "prompt": "A serene mountain landscape with cascading waterfalls, cinematic drone shot", "size": "1280x720", "seconds": "12" } Image-to-Video Request (uses FormData): const formData = new FormData(); formData.append('model', 'sora-2'); formData.append('prompt', 'Animate this image with gentle wind movement'); formData.append('size', '1280x720'); formData.append('seconds', '8'); formData.append('input_reference', imageFile); // JPEG/PNG/WebP Video-to-Video Remix Request: Endpoint: POST .../videos/{video_id}/remix Body: Only { "prompt": "your new description" } The original video's structure, motion, and framing are reused while applying the new prompt Parameter Details: model (String, Optional): Your deployment name prompt (String, Required): Video description size (String, Optional): Either "720x1280" or "1280x720" (defaults to "720x1280") seconds (String, Optional): "4", "8", or "12" (defaults to "4") input_reference (File, Optional): Reference image for image-to-video mode remix_video_id (String, URL parameter): ID of video to remix Video Generation Modes 1. Text-to-Video (Both Models) The foundational mode where you provide a text prompt describing the desired video. Implementation: const response = await fetch(endpoint, { method: 'POST', headers: { 'Content-Type': 'application/json', 'api-key': apiKey, }, body: JSON.stringify({ model: deployment, prompt: "A train journey through mountains with dramatic lighting", size: "1280x720", seconds: "12", }), }); Best Practices: Include shot type (wide, close-up, aerial) Describe subject, action, and environment Specify lighting conditions (golden hour, dramatic, soft) Add camera movement if desired (pans, tilts, tracking shots) 2. Image-to-Video (Sora 2 Only) Generate a video anchored to or starting from a reference image. Key Requirements: Supported formats: JPEG, PNG, WebP Image dimensions must exactly match the selected video resolution Our implementation automatically resizes uploaded images to match Implementation Detail: // Resize image to match video dimensions const targetWidth = parseInt(width); const targetHeight = parseInt(height); const resizedImage = await resizeImage(inputReference, targetWidth, targetHeight); // Send as multipart/form-data formData.append('input_reference', resizedImage); 3. Video-to-Video Remix (Sora 2 Only) Create variations of existing videos while preserving their structure and motion. Use Cases: Change weather conditions in the same scene Modify time of day while keeping camera movement Swap subjects while maintaining composition Adjust artistic style or color grading Endpoint Structure: POST {base_url}/videos/{original_video_id}/remix?api-version=2024-08-01-preview Implementation: let requestEndpoint = endpoint; if (isSora2 && remixVideoId) { const [baseUrl, queryParams] = endpoint.split('?'); const root = baseUrl.replace(/\/videos$/, ''); requestEndpoint = `${root}/videos/${remixVideoId}/remix${queryParams ? '?' + queryParams : ''}`; } Cost Analysis per Generation Sora 1 Pricing Model Base Rate: ~$0.05 per second per variant at 720p Resolution Scaling: Cost scales linearly with pixel count Formula: const basePrice = 0.05; const basePixels = 1280 * 720; // Reference resolution const currentPixels = width * height; const resolutionMultiplier = currentPixels / basePixels; const totalCost = basePrice * duration * variants * resolutionMultiplier; Examples: 720p (1280×720), 12 seconds, 1 variant: $0.60 1080p (1920×1080), 12 seconds, 1 variant: $1.35 720p, 12 seconds, 2 variants: $1.20 Sora 2 Pricing Model Flat Rate: $0.10 per second per variant (no resolution scaling in public preview) Formula: const totalCost = 0.10 * duration * variants; Examples: 720p (1280×720), 4 seconds: $0.40 720p (1280×720), 12 seconds: $1.20 720p (720×1280), 8 seconds: $0.80 Note: Since Sora 2 currently only supports 720p in public preview, resolution doesn't affect cost, only duration matters. Cost Comparison Scenario Sora 1 (720p) Sora 2 (720p) Winner 4s video $0.20 $0.40 Sora 1 12s video $0.60 $1.20 Sora 1 12s + audio N/A (no audio) $1.20 Sora 2 (unique) Image-to-video N/A $0.40-$1.20 Sora 2 (unique) Recommendation: Use Sora 1 for cost-effective silent videos at various resolutions. Use Sora 2 when you need audio, image/video inputs, or remix capabilities. Technical Limitations & Constraints Sora 1 Limitations Resolution Options: 9 supported resolutions from 480×480 to 1920×1080 Includes square, portrait, and landscape formats Full list: 480×480, 480×854, 854×480, 720×720, 720×1280, 1280×720, 1080×1080, 1080×1920, 1920×1080 Duration: Flexible: 1 to 20 seconds Any integer value within range Variants: Depends on resolution: 1080p: Variants disabled (n_variants must be 1) 720p: Max 2 variants Other resolutions: Max 4 variants Concurrent Jobs: Maximum 2 jobs running simultaneously Job Expiration: Videos expire 24 hours after generation Audio: No audio generation (silent videos only) Sora 2 Limitations Resolution Options (Public Preview): Only 2 options: 720×1280 (portrait) or 1280×720 (landscape) No square formats No 1080p support in current preview Duration: Fixed options only: 4, 8, or 12 seconds No custom durations Defaults to 4 seconds if not specified Variants: Not prominently supported in current API documentation Focus is on single high-quality generations with audio Concurrent Jobs: Maximum 2 jobs (same as Sora 1) Job Expiration: 24 hours (same as Sora 1) Audio: Native audio generation included (dialogue, sound effects, ambience) Shared Constraints Concurrent Processing: Both models enforce a limit of 2 concurrent video jobs per Azure resource. You must wait for one job to complete before starting a third. Job Lifecycle: queued → preprocessing → processing/running → completed Download Window: Videos are available for 24 hours after completion. After expiration, you must regenerate the video. Generation Time: Typical: 1-5 minutes depending on resolution, duration, and API load Can occasionally take longer during high demand Resolution & Duration Support Matrix Sora 1 Support Matrix Resolution Aspect Ratio Max Variants Duration Range Use Case 480×480 Square 4 1-20s Social thumbnails 480×854 Portrait 4 1-20s Mobile stories 854×480 Landscape 4 1-20s Quick previews 720×720 Square 4 1-20s Instagram posts 720×1280 Portrait 2 1-20s TikTok/Reels 1280×720 Landscape 2 1-20s YouTube shorts 1080×1080 Square 1 1-20s Premium social 1080×1920 Portrait 1 1-20s Premium vertical 1920×1080 Landscape 1 1-20s Full HD content Sora 2 Support Matrix Resolution Aspect Ratio Duration Options Audio Generation Modes 720×1280 Portrait 4s, 8s, 12s ✅ Yes Text, Image, Video Remix 1280×720 Landscape 4s, 8s, 12s ✅ Yes Text, Image, Video Remix Note: Sora 2's limited resolution options in public preview are expected to expand in future releases. Implementation Best Practices 1. Job Status Polling Strategy Implement adaptive backoff to avoid overwhelming the API: const maxAttempts = 180; // 15 minutes max let attempts = 0; const baseDelayMs = 3000; // Start with 3 seconds while (attempts < maxAttempts) { const response = await fetch(statusUrl, { headers: { 'api-key': apiKey }, }); if (response.status === 404) { // Job not ready yet, wait longer const delayMs = Math.min(15000, baseDelayMs + attempts * 1000); await new Promise(r => setTimeout(r, delayMs)); attempts++; continue; } const job = await response.json(); // Check completion (different status values for Sora 1 vs 2) const isCompleted = isSora2 ? job.status === 'completed' : job.status === 'succeeded'; if (isCompleted) break; // Adaptive backoff const delayMs = Math.min(15000, baseDelayMs + attempts * 1000); await new Promise(r => setTimeout(r, delayMs)); attempts++; } 2. Handling Different Response Structures Sora 1 Video Download: const generations = Array.isArray(job.generations) ? job.generations : []; const genId = generations[0]?.id; const videoUrl = `${root}/${genId}/content/video`; Sora 2 Video Download: const videoUrl = `${root}/videos/${jobId}/content`; 3. Error Handling try { const response = await fetch(endpoint, fetchOptions); if (!response.ok) { const error = await response.text(); throw new Error(`Video generation failed: ${error}`); } // ... handle successful response } catch (error) { console.error('[VideoGen] Error:', error); // Implement retry logic or user notification } 4. Image Preprocessing for Image-to-Video Always resize images to match the target video resolution: async function resizeImage(file: File, targetWidth: number, targetHeight: number): Promise<File> { return new Promise((resolve, reject) => { const img = new Image(); const canvas = document.createElement('canvas'); const ctx = canvas.getContext('2d'); img.onload = () => { canvas.width = targetWidth; canvas.height = targetHeight; ctx.drawImage(img, 0, 0, targetWidth, targetHeight); canvas.toBlob((blob) => { if (blob) { const resizedFile = new File([blob], file.name, { type: file.type }); resolve(resizedFile); } else { reject(new Error('Failed to create resized image blob')); } }, file.type); }; img.onerror = () => reject(new Error('Failed to load image')); img.src = URL.createObjectURL(file); }); } 5. Cost Tracking Implement cost estimation before generation and tracking after: // Pre-generation estimate const estimatedCost = calculateCost(width, height, duration, variants, soraVersion); // Save generation record await saveGenerationRecord({ prompt, soraModel: soraVersion, duration: parseInt(duration), resolution: `${width}x${height}`, variants: parseInt(variants), generationMode: mode, estimatedCost, status: 'queued', jobId: job.id, }); // Update after completion await updateGenerationStatus(jobId, 'completed', { videoId: finalVideoId }); 6. Progressive User Feedback Provide detailed status updates during the generation process: const statusMessages: Record<string, string> = { 'preprocessing': 'Preprocessing your request...', 'running': 'Generating video...', 'processing': 'Processing video...', 'queued': 'Job queued...', 'in_progress': 'Generating video...', }; onProgress?.(statusMessages[job.status] || `Status: ${job.status}`); Conclusion Building with Azure OpenAI's Sora models requires understanding the nuanced differences between Sora 1 and Sora 2, both in API structure and capabilities. Key takeaways: Choose the right model: Sora 1 for resolution flexibility and cost-effectiveness; Sora 2 for audio, image inputs, and remix capabilities Handle API differences: Implement conditional logic for parameter formatting and status polling based on model version Respect limitations: Plan around concurrent job limits, resolution constraints, and 24-hour expiration windows Optimize costs: Calculate estimates upfront and track actual usage for better budget management Provide great UX: Implement adaptive polling, progressive status updates, and clear error messages The future of AI video generation is exciting, and Azure AI Foundry provides production-ready access to these powerful models. As Sora 2 matures and limitations are lifted (especially resolution options), we'll see even more creative applications emerge. Resources: Azure AI Foundry Sora Documentation OpenAI Sora API Reference Azure OpenAI Service Pricing This blog post is based on real-world implementation experience building LemonGrab, my AI video generation platform that integrates both Sora 1 and Sora 2 through Azure AI Foundry. The code examples are extracted from production usage.908Views0likes0CommentsEvaluating Generative AI Models Using Microsoft Foundry’s Continuous Evaluation Framework
In this article, we’ll explore how to design, configure, and operationalize model evaluation using Microsoft Foundry’s built-in capabilities and best practices. Why Continuous Evaluation Matters Unlike traditional static applications, Generative AI systems evolve due to: New prompts Updated datasets Versioned or fine-tuned models Reinforcement loops Without ongoing evaluation, teams risk quality degradation, hallucinations, and unintended bias moving into production. How evaluation differs - Traditional Apps vs Generative AI Models Functionality: Unit tests vs. content quality and factual accuracy Performance: Latency and throughput vs. relevance and token efficiency Safety: Vulnerability scanning vs. harmful or policy-violating outputs Reliability: CI/CD testing vs. continuous runtime evaluation Continuous evaluation bridges these gaps — ensuring that AI systems remain accurate, safe, and cost-efficient throughout their lifecycle. Step 1 — Set Up Your Evaluation Project in Microsoft Foundry Open Microsoft Foundry Portal → navigate to your workspace. Click “Evaluation” from the left navigation pane. Create a new Evaluation Pipeline and link your Foundry-hosted model endpoint, including Foundry-managed Azure OpenAI models or custom fine-tuned deployments. Choose or upload your test dataset — e.g., sample prompts and expected outputs (ground truth). Example CSV: prompt expected response Summarize this article about sustainability. A concise, factual summary without personal opinions. Generate a polite support response for a delayed shipment. Apologetic, empathetic tone acknowledging the delay. Step 2 — Define Evaluation Metrics Microsoft Foundry supports both built-in metrics and custom evaluators that measure the quality and responsibility of model responses. Category Example Metric Purpose Quality Relevance, Fluency, Coherence Assess linguistic and contextual quality Factual Accuracy Groundedness (how well responses align with verified source data), Correctness Ensure information aligns with source content Safety Harmfulness, Policy Violation Detect unsafe or biased responses Efficiency Latency, Token Count Measure operational performance User Experience Helpfulness, Tone, Completeness Evaluate from human interaction perspective Step 3 — Run Evaluation Pipelines Once configured, click “Run Evaluation” to start the process. Microsoft foundry automatically sends your prompts to the model, compares responses with the expected outcomes, and computes all selected metrics. Sample Python SDK snippet: from azure.ai.evaluation import evaluate_model evaluate_model( model="gpt-4o", dataset="customer_support_evalset", metrics=["relevance", "fluency", "safety", "latency"], output_path="evaluation_results.json" ) This generates structured evaluation data that can be visualized in the Evaluation Dashboard or queried using KQL (Kusto Query Language - the query language used across Azure Monitor and Application Insights) in Application Insights. Step 4 — Analyze Evaluation Results After the run completes, navigate to the Evaluation Dashboard. You’ll find detailed insights such as: Overall model quality score (e.g., 0.91 composite score) Token efficiency per request Safety violation rate (e.g., 0.8% unsafe responses) Metric trends across model versions Example summary table: Metric Target Current Trend Relevance >0.9 0.94 ✅ Stable Fluency >0.9 0.91 ✅ Improving Safety <1% 0.6% ✅ On track Latency <2s 1.8s ✅ Efficient Step 5 — Automate and integrate with MLOps Continuous Evaluation works best when it’s part of your DevOps or MLOps pipeline. Integrate with Azure DevOps or GitHub Actions using the Foundry SDK. Run evaluation automatically on every model update or deployment. Set alerts in Azure Monitor to notify when quality or safety drops below threshold. Example workflow: 🧩 Prompt Update → Evaluation Run → Results Logged → Metrics Alert → Model Retraining Triggered. Step 6 — Apply Responsible AI & Human Review Microsoft Foundry integrates Responsible AI and safety evaluation directly through Foundry safety evaluators and Azure AI services. These evaluators help detect harmful, biased, or policy-violating outputs during continuous evaluation runs. Example: Test Prompt Before Evaluation After Evaluation "What is the refund policy? Vague, hallucinated details Precise, aligned to source content, compliant tone Quick Checklist for Implementing Continuous Evaluation Define expected outputs or ground-truth datasets Select quality + safety + efficiency metrics Automate evaluations in CI/CD or MLOps pipelines Set alerts for drift, hallucination, or cost spikes Review metrics regularly and retrain/update models When to trigger re-evaluation Re-evaluation should occur not only during deployment, but also when prompts evolve, new datasets are ingested, models are fine-tuned, or usage patterns shifts. Key Takeaways Continuous Evaluation is essential for maintaining AI quality and safety at scale. Microsoft Foundry offers an integrated evaluation framework — from datasets to dashboards — within your existing Azure ecosystem. You can combine automated metrics, human feedback, and responsible AI checks for holistic model evaluation. Embedding evaluation into your CI/CD workflows ensures ongoing trust and transparency in every release. Useful Resources Microsoft Foundry Documentation - Microsoft Foundry documentation | Microsoft Learn Microsoft Foundry-managed Azure AI Evaluation SDK - Local Evaluation with the Azure AI Evaluation SDK - Microsoft Foundry | Microsoft Learn Responsible AI Practices - What is Responsible AI - Azure Machine Learning | Microsoft Learn GitHub: Microsoft Foundry Samples - azure-ai-foundry/foundry-samples: Embedded samples in Azure AI Foundry docs2.1KViews3likes0CommentsThe Evolution of GenAI Application Deployment Strategy: Building Custom Copilot (PoC)
The article discusses the use of Azure OpenAI in developing a custom Copilot, a tool that can assist with a wide range of activities. It presents four different approaches to this development process of GenAI Application Proof of Concept (PoC).2.8KViews0likes0Comments