azure openai
47 TopicsStop Drawing Architecture Diagrams Manually: Meet the Open-Source AI Architecture Review Agents
Designing and documenting software architecture is often a battle against static diagrams that become outdated the moment they are drawn. The Architecture Review Agent changes that by turning your design process into a dynamic, AI-powered workflow. In this post, we explore how to leverage Microsoft Foundry Hosted Agents, Azure OpenAI, and Excalidraw to build an open-source tool that instantly converts messy text descriptions, YAML, or README files into editable architecture diagrams. Beyond just drawing boxes, the agent acts as a technical co-pilot, delivering prioritized risk assessments, highlighting single points of failure, and mapping component dependencies. Discover how to eliminate manual diagramming, catch security flaws early, and deploy your own enterprise-grade agent with zero infrastructure overhead.7.6KViews6likes5CommentsIntegrating Microsoft Foundry with OpenClaw: Step by Step Model Configuration
Step 1: Deploying Models on Microsoft Foundry Let us kick things off in the Azure portal. To get our OpenClaw agent thinking like a genius, we need to deploy our models in Microsoft Foundry. For this guide, we are going to focus on deploying gpt-5.2-codex on Microsoft Foundry with OpenClaw. Navigate to your AI Hub, head over to the model catalog, choose the model you wish to use with OpenClaw and hit deploy. Once your deployment is successful, head to the endpoints section. Important: Grab your Endpoint URL and your API Keys right now and save them in a secure note. We will need these exact values to connect OpenClaw in a few minutes. Step 2: Installing and Initializing OpenClaw Next up, we need to get OpenClaw running on your machine. Open up your terminal and run the official installation script: curl -fsSL https://openclaw.ai/install.sh | bash The wizard will walk you through a few prompts. Here is exactly how to answer them to link up with our Azure setup: First Page (Model Selection): Choose "Skip for now". Second Page (Provider): Select azure-openai-responses. Model Selection: Select gpt-5.2-codex , For now only the models listed (hosted on Microsoft Foundry) in the picture below are available to be used with OpenClaw. Follow the rest of the standard prompts to finish the initial setup. Step 3: Editing the OpenClaw Configuration File Now for the fun part. We need to manually configure OpenClaw to talk to Microsoft Foundry. Open your configuration file located at ~/.openclaw/openclaw.json in your favorite text editor. Replace the contents of the models and agents sections with the following code block: { "models": { "providers": { "azure-openai-responses": { "baseUrl": "https://<YOUR_RESOURCE_NAME>.openai.azure.com/openai/v1", "apiKey": "<YOUR_AZURE_OPENAI_API_KEY>", "api": "openai-responses", "authHeader": false, "headers": { "api-key": "<YOUR_AZURE_OPENAI_API_KEY>" }, "models": [ { "id": "gpt-5.2-codex", "name": "GPT-5.2-Codex (Azure)", "reasoning": true, "input": ["text", "image"], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "contextWindow": 400000, "maxTokens": 16384, "compat": { "supportsStore": false } }, { "id": "gpt-5.2", "name": "GPT-5.2 (Azure)", "reasoning": false, "input": ["text", "image"], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "contextWindow": 272000, "maxTokens": 16384, "compat": { "supportsStore": false } } ] } } }, "agents": { "defaults": { "model": { "primary": "azure-openai-responses/gpt-5.2-codex" }, "models": { "azure-openai-responses/gpt-5.2-codex": {} }, "workspace": "/home/<USERNAME>/.openclaw/workspace", "compaction": { "mode": "safeguard" }, "maxConcurrent": 4, "subagents": { "maxConcurrent": 8 } } } } You will notice a few placeholders in that JSON. Here is exactly what you need to swap out: Placeholder Variable What It Is Where to Find It <YOUR_RESOURCE_NAME> The unique name of your Azure OpenAI resource. Found in your Azure Portal under the Azure OpenAI resource overview. <YOUR_AZURE_OPENAI_API_KEY> The secret key required to authenticate your requests. Found in Microsoft Foundry under your project endpoints or Azure Portal keys section. <USERNAME> Your local computer's user profile name. Open your terminal and type whoami to find this. Step 4: Restart the Gateway After saving the configuration file, you must restart the OpenClaw gateway for the new Foundry settings to take effect. Run this simple command: openclaw gateway restart Configuration Notes & Deep Dive If you are curious about why we configured the JSON that way, here is a quick breakdown of the technical details. Authentication Differences Azure OpenAI uses the api-key HTTP header for authentication. This is entirely different from the standard OpenAI Authorization: Bearer header. Our configuration file addresses this in two ways: Setting "authHeader": false completely disables the default Bearer header. Adding "headers": { "api-key": "<key>" } forces OpenClaw to send the API key via Azure's native header format. Important Note: Your API key must appear in both the apiKey field AND the headers.api-key field within the JSON for this to work correctly. The Base URL Azure OpenAI's v1-compatible endpoint follows this specific format: https://<your_resource_name>.openai.azure.com/openai/v1 The beautiful thing about this v1 endpoint is that it is largely compatible with the standard OpenAI API and does not require you to manually pass an api-version query parameter. Model Compatibility Settings "compat": { "supportsStore": false } disables the store parameter since Azure OpenAI does not currently support it. "reasoning": true enables the thinking mode for GPT-5.2-Codex. This supports low, medium, high, and xhigh levels. "reasoning": false is set for GPT-5.2 because it is a standard, non-reasoning model. Model Specifications & Cost Tracking If you want OpenClaw to accurately track your token usage costs, you can update the cost fields from 0 to the current Azure pricing. Here are the specs and costs for the models we just deployed: Model Specifications Model Context Window Max Output Tokens Image Input Reasoning gpt-5.2-codex 400,000 tokens 16,384 tokens Yes Yes gpt-5.2 272,000 tokens 16,384 tokens Yes No Current Cost (Adjust in JSON) Model Input (per 1M tokens) Output (per 1M tokens) Cached Input (per 1M tokens) gpt-5.2-codex $1.75 $14.00 $0.175 gpt-5.2 $2.00 $8.00 $0.50 Conclusion: And there you have it! You have successfully bridged the gap between the enterprise-grade infrastructure of Microsoft Foundry and the local autonomy of OpenClaw. By following these steps, you are not just running a chatbot; you are running a sophisticated agent capable of reasoning, coding, and executing tasks with the full power of GPT-5.2-codex behind it. The combination of Azure's reliability and OpenClaw's flexibility opens up a world of possibilities. Whether you are building an automated devops assistant, a research agent, or just exploring the bleeding edge of AI, you now have a robust foundation to build upon. Now it is time to let your agent loose on some real tasks. Go forth, experiment with different system prompts, and see what you can build. If you run into any interesting edge cases or come up with a unique configuration, let me know in the comments below. Happy coding!5.2KViews1like2CommentsHow to Set Up Claude Code with Microsoft Foundry Models on macOS
Introduction Building with AI isn't just about picking a smart model. It is about where that model lives. I chose to route my Claude Code setup through Microsoft Foundry because I needed more than just a raw API. I wanted the reliability, compliance, and structured management that comes with Microsoft's ecosystem. When you are moving from a prototype to something real, having that level of infrastructure backing your calls makes a significant difference. The challenge is that Foundry is designed for enterprise cloud environments, while my daily development work happens locally on a MacBook. Getting the two to communicate seamlessly involved navigating a maze of shell configurations and environment variables that weren't immediately obvious. I wrote this guide to document the exact steps for bridging that gap. Here is how you can set up Claude Code to run locally on macOS while leveraging the stability of models deployed on Microsoft Foundry. Requirements Before we open the terminal, let's make sure you have the necessary accounts and environments ready. Since we are bridging a local CLI with an enterprise cloud setup, having these credentials handy now will save you time later. Azure Subscription with Microsoft Foundry Setup - This is the most critical piece. You need an active Azure subscription where the Microsoft Foundry environment is initialized. Ensure that you have deployed the Claude model you intend to use and that the deployment status is active. You will need the specific endpoint URL and the associated API keys from this deployment to configure the connection. An Anthropic User Account - Even though the compute is happening on Azure, the interface requires an Anthropic account. You will need this to authenticate your session and manage your user profile settings within the Claude Code ecosystem. Claude Code Client on macOS - We will be running the commands locally, so you need the Claude Code CLI installed on your MacBook. Step 1: Install Claude Code on macOS The recommended installation method is via Homebrew or Curl, which sets it up for terminal access ("OS level"). Option A: Homebrew (Recommended) brew install --cask claude-code Option B: Curl curl -fsSL https://claude.ai/install.sh | bash Verify Installation: Run claude --version. Step 2: Set Up Microsoft Foundry to deploy Claude model Navigate to your Microsoft Foundry portal, and find the Claude model catalog, and deploy the selected Claude model. [Microsoft Foundry > My Assets > Models + endpoints > + Deploy Model > Deploy Base model > Search for "Claude"] In your Model Deployment dashboard, go to the deployed Claude Models and get the "Endpoints and keys". Store it somewhere safe, because we will need them to configure Claude Code later on. Configure Environment Variables in MacOS terminal: Now we need to tell your local Claude Code client to route requests through Microsoft Foundry instead of the default Anthropic endpoints. This is handled by setting specific environment variables that act as a bridge between your local machine and your Azure resources. You could run these commands manually every time you open a terminal, but it is much more efficient to save them permanently in your shell profile. For most modern macOS users, this file is .zshrc. Open your terminal and add the following lines to your profile, making sure to replace the placeholder text with your actual Azure credentials: export CLAUDE_CODE_USE_FOUNDRY=1 export ANTHROPIC_FOUNDRY_API_KEY="your-azure-api-key" export ANTHROPIC_FOUNDRY_RESOURCE="your-resource-name" # Specify the deployment name for Opus export CLAUDE_CODE_MODEL="your-opus-deployment-name" Once you have added these variables, you need to reload your shell configuration for the changes to take effect. Run the source command below to update your current session, and then verify the setup by launching Claude: source ~/.zshrc claude If everything is configured correctly, the Claude CLI will initialize using your Microsoft Foundry deployment as the backend. Once you execute the claude command, the CLI will prompt you to choose an authentication method. Select Option 2 (Antrophic Console account) to proceed. This action triggers your default web browser and redirects you to the Claude Console. Simply sign in using your standard Anthropic account credentials. After you have successfully signed in, you will be presented with a permissions screen. Click the Authorize button to link your web session back to your local terminal. Return to your terminal window, and you should see a notification confirming that the login process is complete. Press Enter to finalize the setup. You are now fully connected. You can start using Claude Code locally, powered entirely by the model deployment running in your Microsoft Foundry environment. Conclusion Setting up this environment might seem like a heavy lift just to run a CLI tool, but the payoff is significant. You now have a workflow that combines the immediate feedback of local development with the security and infrastructure benefits of Microsoft Foundry. One of the most practical upgrades is the removal of standard usage caps. You are no longer limited to the 5-hour API call limits, which gives you the freedom to iterate, test, and debug for as long as your project requires without hitting a wall. By bridging your local macOS terminal to Azure, you are no longer just hitting an API endpoint. You are leveraging a managed, compliance-ready environment that scales with your needs. The best part is that now the configuration is locked in, you don't need to think about the plumbing again. You can focus entirely on coding, knowing that the reliability of an enterprise platform is running quietly in the background supporting every command.763Views1like0CommentsBuilding with Azure OpenAI Sora: A Complete Guide to AI Video Generation
In this comprehensive guide, we'll explore how to integrate both Sora 1 and Sora 2 models from Azure OpenAI Service into a production web application. We'll cover API integration, request body parameters, cost analysis, limitations, and the key differences between using Azure AI Foundry endpoints versus OpenAI's native API. Table of Contents Introduction to Sora Models Azure AI Foundry vs. OpenAI API Structure API Integration: Request Body Parameters Video Generation Modes Cost Analysis per Generation Technical Limitations & Constraints Resolution & Duration Support Implementation Best Practices Introduction to Sora Models Sora is OpenAI's groundbreaking text-to-video model that generates realistic videos from natural language descriptions. Azure AI Foundry provides access to two versions: Sora 1: The original model focused primarily on text-to-video generation with extensive resolution options (480p to 1080p) and flexible duration (1-20 seconds) Sora 2: The enhanced version with native audio generation, multiple generation modes (text-to-video, image-to-video, video-to-video remix), but more constrained resolution options (720p only in public preview) Azure AI Foundry vs. OpenAI API Structure Key Architectural Differences Sora 1 uses Azure's traditional deployment-based API structure: Endpoint Pattern: https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}/... Parameters: Uses Azure-specific naming like n_seconds, n_variants, separate width/height fields Job Management: Uses /jobs/{id} for status polling Content Download: Uses /video/generations/{generation_id}/content/video Sora 2 adapts OpenAI's v1 API format while still being hosted on Azure: Endpoint Pattern: https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}/videos Parameters: Uses OpenAI-style naming like seconds (string), size (combined dimension string like "1280x720") Job Management: Uses /videos/{video_id} for status polling Content Download: Uses /videos/{video_id}/content Why This Matters? This architectural difference requires conditional request formatting in your code: const isSora2 = deployment.toLowerCase().includes('sora-2'); if (isSora2) { requestBody = { model: deployment, prompt, size: `${width}x${height}`, // Combined format seconds: duration.toString(), // String type }; } else { requestBody = { model: deployment, prompt, height, // Separate dimensions width, n_seconds: duration.toString(), // Azure naming n_variants: variants, }; } API Integration: Request Body Parameters Sora 1 API Parameters Standard Text-to-Video Request: { "model": "sora-1", "prompt": "Wide shot of a child flying a red kite in a grassy park, golden hour sunlight, camera slowly pans upward.", "height": "720", "width": "1280", "n_seconds": "12", "n_variants": "2" } Parameter Details: model (String, Required): Your Azure deployment name prompt (String, Required): Natural language description of the video (max 32000 chars) height (String, Required): Video height in pixels width (String, Required): Video width in pixels n_seconds (String, Required): Duration (1-20 seconds) n_variants (String, Optional): Number of variations to generate (1-4, constrained by resolution) Sora 2 API Parameters Text-to-Video Request: { "model": "sora-2", "prompt": "A serene mountain landscape with cascading waterfalls, cinematic drone shot", "size": "1280x720", "seconds": "12" } Image-to-Video Request (uses FormData): const formData = new FormData(); formData.append('model', 'sora-2'); formData.append('prompt', 'Animate this image with gentle wind movement'); formData.append('size', '1280x720'); formData.append('seconds', '8'); formData.append('input_reference', imageFile); // JPEG/PNG/WebP Video-to-Video Remix Request: Endpoint: POST .../videos/{video_id}/remix Body: Only { "prompt": "your new description" } The original video's structure, motion, and framing are reused while applying the new prompt Parameter Details: model (String, Optional): Your deployment name prompt (String, Required): Video description size (String, Optional): Either "720x1280" or "1280x720" (defaults to "720x1280") seconds (String, Optional): "4", "8", or "12" (defaults to "4") input_reference (File, Optional): Reference image for image-to-video mode remix_video_id (String, URL parameter): ID of video to remix Video Generation Modes 1. Text-to-Video (Both Models) The foundational mode where you provide a text prompt describing the desired video. Implementation: const response = await fetch(endpoint, { method: 'POST', headers: { 'Content-Type': 'application/json', 'api-key': apiKey, }, body: JSON.stringify({ model: deployment, prompt: "A train journey through mountains with dramatic lighting", size: "1280x720", seconds: "12", }), }); Best Practices: Include shot type (wide, close-up, aerial) Describe subject, action, and environment Specify lighting conditions (golden hour, dramatic, soft) Add camera movement if desired (pans, tilts, tracking shots) 2. Image-to-Video (Sora 2 Only) Generate a video anchored to or starting from a reference image. Key Requirements: Supported formats: JPEG, PNG, WebP Image dimensions must exactly match the selected video resolution Our implementation automatically resizes uploaded images to match Implementation Detail: // Resize image to match video dimensions const targetWidth = parseInt(width); const targetHeight = parseInt(height); const resizedImage = await resizeImage(inputReference, targetWidth, targetHeight); // Send as multipart/form-data formData.append('input_reference', resizedImage); 3. Video-to-Video Remix (Sora 2 Only) Create variations of existing videos while preserving their structure and motion. Use Cases: Change weather conditions in the same scene Modify time of day while keeping camera movement Swap subjects while maintaining composition Adjust artistic style or color grading Endpoint Structure: POST {base_url}/videos/{original_video_id}/remix?api-version=2024-08-01-preview Implementation: let requestEndpoint = endpoint; if (isSora2 && remixVideoId) { const [baseUrl, queryParams] = endpoint.split('?'); const root = baseUrl.replace(/\/videos$/, ''); requestEndpoint = `${root}/videos/${remixVideoId}/remix${queryParams ? '?' + queryParams : ''}`; } Cost Analysis per Generation Sora 1 Pricing Model Base Rate: ~$0.05 per second per variant at 720p Resolution Scaling: Cost scales linearly with pixel count Formula: const basePrice = 0.05; const basePixels = 1280 * 720; // Reference resolution const currentPixels = width * height; const resolutionMultiplier = currentPixels / basePixels; const totalCost = basePrice * duration * variants * resolutionMultiplier; Examples: 720p (1280×720), 12 seconds, 1 variant: $0.60 1080p (1920×1080), 12 seconds, 1 variant: $1.35 720p, 12 seconds, 2 variants: $1.20 Sora 2 Pricing Model Flat Rate: $0.10 per second per variant (no resolution scaling in public preview) Formula: const totalCost = 0.10 * duration * variants; Examples: 720p (1280×720), 4 seconds: $0.40 720p (1280×720), 12 seconds: $1.20 720p (720×1280), 8 seconds: $0.80 Note: Since Sora 2 currently only supports 720p in public preview, resolution doesn't affect cost, only duration matters. Cost Comparison Scenario Sora 1 (720p) Sora 2 (720p) Winner 4s video $0.20 $0.40 Sora 1 12s video $0.60 $1.20 Sora 1 12s + audio N/A (no audio) $1.20 Sora 2 (unique) Image-to-video N/A $0.40-$1.20 Sora 2 (unique) Recommendation: Use Sora 1 for cost-effective silent videos at various resolutions. Use Sora 2 when you need audio, image/video inputs, or remix capabilities. Technical Limitations & Constraints Sora 1 Limitations Resolution Options: 9 supported resolutions from 480×480 to 1920×1080 Includes square, portrait, and landscape formats Full list: 480×480, 480×854, 854×480, 720×720, 720×1280, 1280×720, 1080×1080, 1080×1920, 1920×1080 Duration: Flexible: 1 to 20 seconds Any integer value within range Variants: Depends on resolution: 1080p: Variants disabled (n_variants must be 1) 720p: Max 2 variants Other resolutions: Max 4 variants Concurrent Jobs: Maximum 2 jobs running simultaneously Job Expiration: Videos expire 24 hours after generation Audio: No audio generation (silent videos only) Sora 2 Limitations Resolution Options (Public Preview): Only 2 options: 720×1280 (portrait) or 1280×720 (landscape) No square formats No 1080p support in current preview Duration: Fixed options only: 4, 8, or 12 seconds No custom durations Defaults to 4 seconds if not specified Variants: Not prominently supported in current API documentation Focus is on single high-quality generations with audio Concurrent Jobs: Maximum 2 jobs (same as Sora 1) Job Expiration: 24 hours (same as Sora 1) Audio: Native audio generation included (dialogue, sound effects, ambience) Shared Constraints Concurrent Processing: Both models enforce a limit of 2 concurrent video jobs per Azure resource. You must wait for one job to complete before starting a third. Job Lifecycle: queued → preprocessing → processing/running → completed Download Window: Videos are available for 24 hours after completion. After expiration, you must regenerate the video. Generation Time: Typical: 1-5 minutes depending on resolution, duration, and API load Can occasionally take longer during high demand Resolution & Duration Support Matrix Sora 1 Support Matrix Resolution Aspect Ratio Max Variants Duration Range Use Case 480×480 Square 4 1-20s Social thumbnails 480×854 Portrait 4 1-20s Mobile stories 854×480 Landscape 4 1-20s Quick previews 720×720 Square 4 1-20s Instagram posts 720×1280 Portrait 2 1-20s TikTok/Reels 1280×720 Landscape 2 1-20s YouTube shorts 1080×1080 Square 1 1-20s Premium social 1080×1920 Portrait 1 1-20s Premium vertical 1920×1080 Landscape 1 1-20s Full HD content Sora 2 Support Matrix Resolution Aspect Ratio Duration Options Audio Generation Modes 720×1280 Portrait 4s, 8s, 12s ✅ Yes Text, Image, Video Remix 1280×720 Landscape 4s, 8s, 12s ✅ Yes Text, Image, Video Remix Note: Sora 2's limited resolution options in public preview are expected to expand in future releases. Implementation Best Practices 1. Job Status Polling Strategy Implement adaptive backoff to avoid overwhelming the API: const maxAttempts = 180; // 15 minutes max let attempts = 0; const baseDelayMs = 3000; // Start with 3 seconds while (attempts < maxAttempts) { const response = await fetch(statusUrl, { headers: { 'api-key': apiKey }, }); if (response.status === 404) { // Job not ready yet, wait longer const delayMs = Math.min(15000, baseDelayMs + attempts * 1000); await new Promise(r => setTimeout(r, delayMs)); attempts++; continue; } const job = await response.json(); // Check completion (different status values for Sora 1 vs 2) const isCompleted = isSora2 ? job.status === 'completed' : job.status === 'succeeded'; if (isCompleted) break; // Adaptive backoff const delayMs = Math.min(15000, baseDelayMs + attempts * 1000); await new Promise(r => setTimeout(r, delayMs)); attempts++; } 2. Handling Different Response Structures Sora 1 Video Download: const generations = Array.isArray(job.generations) ? job.generations : []; const genId = generations[0]?.id; const videoUrl = `${root}/${genId}/content/video`; Sora 2 Video Download: const videoUrl = `${root}/videos/${jobId}/content`; 3. Error Handling try { const response = await fetch(endpoint, fetchOptions); if (!response.ok) { const error = await response.text(); throw new Error(`Video generation failed: ${error}`); } // ... handle successful response } catch (error) { console.error('[VideoGen] Error:', error); // Implement retry logic or user notification } 4. Image Preprocessing for Image-to-Video Always resize images to match the target video resolution: async function resizeImage(file: File, targetWidth: number, targetHeight: number): Promise<File> { return new Promise((resolve, reject) => { const img = new Image(); const canvas = document.createElement('canvas'); const ctx = canvas.getContext('2d'); img.onload = () => { canvas.width = targetWidth; canvas.height = targetHeight; ctx.drawImage(img, 0, 0, targetWidth, targetHeight); canvas.toBlob((blob) => { if (blob) { const resizedFile = new File([blob], file.name, { type: file.type }); resolve(resizedFile); } else { reject(new Error('Failed to create resized image blob')); } }, file.type); }; img.onerror = () => reject(new Error('Failed to load image')); img.src = URL.createObjectURL(file); }); } 5. Cost Tracking Implement cost estimation before generation and tracking after: // Pre-generation estimate const estimatedCost = calculateCost(width, height, duration, variants, soraVersion); // Save generation record await saveGenerationRecord({ prompt, soraModel: soraVersion, duration: parseInt(duration), resolution: `${width}x${height}`, variants: parseInt(variants), generationMode: mode, estimatedCost, status: 'queued', jobId: job.id, }); // Update after completion await updateGenerationStatus(jobId, 'completed', { videoId: finalVideoId }); 6. Progressive User Feedback Provide detailed status updates during the generation process: const statusMessages: Record<string, string> = { 'preprocessing': 'Preprocessing your request...', 'running': 'Generating video...', 'processing': 'Processing video...', 'queued': 'Job queued...', 'in_progress': 'Generating video...', }; onProgress?.(statusMessages[job.status] || `Status: ${job.status}`); Conclusion Building with Azure OpenAI's Sora models requires understanding the nuanced differences between Sora 1 and Sora 2, both in API structure and capabilities. Key takeaways: Choose the right model: Sora 1 for resolution flexibility and cost-effectiveness; Sora 2 for audio, image inputs, and remix capabilities Handle API differences: Implement conditional logic for parameter formatting and status polling based on model version Respect limitations: Plan around concurrent job limits, resolution constraints, and 24-hour expiration windows Optimize costs: Calculate estimates upfront and track actual usage for better budget management Provide great UX: Implement adaptive polling, progressive status updates, and clear error messages The future of AI video generation is exciting, and Azure AI Foundry provides production-ready access to these powerful models. As Sora 2 matures and limitations are lifted (especially resolution options), we'll see even more creative applications emerge. Resources: Azure AI Foundry Sora Documentation OpenAI Sora API Reference Azure OpenAI Service Pricing This blog post is based on real-world implementation experience building LemonGrab, my AI video generation platform that integrates both Sora 1 and Sora 2 through Azure AI Foundry. The code examples are extracted from production usage.563Views0likes0CommentsAzure OpenAI GPT model to review Pull Requests for Azure DevOps
In recent months, the use of Generative Pre-trained Transformer (GPT) models for natural language processing (NLP) has gained significant traction. GPT models, which are based on the Transformer architecture, can generate text from arbitrary sources of input data and can be trained to identify errors and detect anomalies in text. As such, GPT models are increasingly being used for a variety of applications, ranging from natural language understanding to text summarization and question-answering. In the software development world, developers use pull requests to submit proposed changes to a codebase. However, reviews by other developers can sometimes take a long time and not accurate, and in some cases, these reviews can introduce new bugs and issues. In order to reduce this risk, During my research I found the integration of GPT models is possible and we can add Azure OpenAI service as pull request reviewers for Azure Pipelines service. The GPT models are trained on developer codebases and are able to detect potential coding issues such as typos, syntax errors, style inconsistencies and code smells. In addition, they can also assess code structure and suggest improvements to the overall code quality. Once the GPT models have been trained, they can be integrated into the Azure Pipelines service so that they can automatically review pull requests and provide feedback. This helps to reduce the time taken for code reviews, as well as reduce the likelihood of introducing bugs and issues.47KViews4likes13CommentsAzure Logic App AI-Powered Monitoring Solution: Automate, Analyze, and Act on Your Azure Data
Introduction In today’s cloud-driven world, monitoring and analyzing application health is critical for business continuity and operational excellence. However, the sheer volume of monitoring data can make it challenging to extract actionable insights quickly. Enter the Azure Logic App AI-Powered Monitoring Solution—an intelligent, serverless pipeline that leverages Azure Logic Apps and Azure OpenAI to automate monitoring, analyze data, and deliver comprehensive reports right to your inbox. This solution is ideal for organizations seeking to modernize their monitoring workflows, reduce manual analysis, and empower teams with AI-driven insights for faster decision-making. What Does This Solution Accomplish? The Azure Logic App AI-Powered Monitoring Solution creates an automated pipeline that: Extracts monitoring data from Azure Log Analytics using KQL queries. Analyzes data with AI using the Azure OpenAI GPT-4o model. Generates intelligent reports and sends them via email. Runs automatically on a daily schedule. Uses managed identity for secure authentication across Azure services. Business Case Solved Automated Monitoring: No more manual log reviews—let AI do the heavy lifting. Actionable Insights: Receive daily, AI-generated summaries highlighting system health, key metrics, potential issues, and recommendations. Operational Efficiency: Reduce time-to-insight and empower teams to act faster on critical events. Secure and Scalable: Built on Azure’s serverless and identity-driven architecture. Key Features Serverless Architecture: Built on Azure Logic Apps Standard for scalability and cost efficiency. AI-Powered Insights: Uses Azure OpenAI for advanced data analysis and summarization. Infrastructure as Code: Deployable via Bicep templates for reproducibility and automation. Secure by Design: Managed identity and Azure RBAC ensure secure access. Cost Effective: Pay-per-execution model with optimized resource usage. Customizable: Easily modify KQL queries and AI prompts to fit your monitoring needs. Solution Architecture Technologies Involved Azure Logic Apps Standard: Orchestrates the workflow. Azure OpenAI Service (GPT-4o): Performs AI-powered data analysis and summarization. Azure Log Analytics: Source for monitoring data, queried via KQL. Application Insights: Monitors workflow execution and telemetry. Azure Storage Account: Stores Logic App runtime data. Managed Identity: Secures authentication across Azure services. Infrastructure as Code (Bicep): Enables automated, repeatable deployments. Office 365 Connector: Sends email notifications. Support Documentation: https://docs.microsoft.com/en-us/azure/logic-apps/ Issues: https://github.com/vinod-soni-microsoft/logicapp-ai-summarize/issues Star this repository if you find it helpful!1.3KViews0likes0CommentsConfigure Embedding Models on Azure AI Foundry with Open Web UI
Introduction Let’s take a closer look at an exciting development in the AI space. Embedding models are the key to transforming complex data into usable insights, driving innovations like smarter chatbots and tailored recommendations. With Azure AI Foundry, Microsoft’s powerful platform, you’ve got the tools to build and scale these models effortlessly. Add in Open Web UI, a intuitive interface for engaging with AI systems, and you’ve got a winning combo that’s hard to beat. In this article, we’ll explore how embedding models on Azure AI Foundry, paired with Open Web UI, are paving the way for accessible and impactful AI solutions for developers and businesses. Let’s dive in! To proceed with configuring the embedding model from Azure AI Foundry on Open Web UI, please firstly configure the requirements below. Requirements: Setup Azure AI Foundry Hub/Projects Deploy Open Web UI – refer to my previous article on how you can deploy Open Web UI on Azure VM. Optional: Deploy LiteLLM with Azure AI Foundry models to work on Open Web UI - refer to my previous article on how you can do this as well. Deploying Embedding Models on Azure AI Foundry Navigate to the Azure AI Foundry site and deploy an embedding model from the “Model + Endpoint” section. For the purpose of this demonstration, we will deploy the “text-embedding-3-large” model by OpenAI. You should be receiving a URL endpoint and API Key to the embedding model deployed just now. Take note of that credential because we will be using it in Open Web UI. Configuring the embedding models on Open Web UI Now head to the Open Web UI Admin Setting Page > Documents and Select Azure Open AI as the Embedding Model Engine. Copy and Paste the Base URL, API Key, the Embedding Model deployed on Azure AI Foundry and the API version (not the model version) into the fields below: Click “Save” to reflect the changes. Expected Output Now let us look into the scenario for when the embedding model configured on Open Web UI and when it is not. Without Embedding Models configured. With Azure Open AI Embedding models configured. Conclusion And there you have it! Embedding models on Azure AI Foundry, combined with the seamless interaction offered by Open Web UI, are truly revolutionizing how we approach AI solutions. This powerful duo not only simplifies the process of building and deploying intelligent systems but also makes cutting-edge technology more accessible to developers and businesses of all sizes. As we move forward, it’s clear that such integrations will continue to drive innovation, breaking down barriers and unlocking new possibilities in the AI landscape. So, whether you’re a seasoned developer or just stepping into this exciting field, now’s the time to explore what Azure AI Foundry and Open Web UI can do for you. Let’s keep pushing the boundaries of what’s possible!1.9KViews0likes0CommentsBuilding a Basic Chatbot with Azure OpenAI
Overview In this turorial, we'll build a simple chatbot that uses Azure OpenAI to generate responses to user queries. To create a basic chatbot, we need to set up a language model resource that enables conversation capabilities. In this tutorial, we will: Set up the Azure OpenAI resource using the Azure AI Foundry portal. Retrieve the API key needed to connect the resource to your chatbot application. Once the API key is configured in your code, you will be able to integrate the language model into your chatbot and enable it to generate responses. By the end of this tutorial, you'll have a working chatbot that can generate responses using the Azure OpenAI model. Signing In and Setting Up Your Azure AI Foundry Workspace Signing In to Azure AI Foundry Open the Azure AI Foundry page in your web browser. Login to your Azure account. If you don't have an account, you can sign up. Setting Up Your Azure AI Foundry Workspace Select + Create project to create a new project. Perform the following tasks: Enter Project name. It must be a unique value. Select Hub you'd like to use (create a new one if needed). Select Create. Setting Up the Azure OpenAI Resource in Azure AI Foundry In this step, you'll learn how to set up the Azure OpenAI resource in Azure AI Foundry. Azure OpenAI is a pre-trained language model that can generate responses to user queries. We'll be using it in our chatbot. Select Models + endpoints from the left side menu. On this page, you can deploy language models and set up Azure AI resources. In this step, we will deploy the Azure OpenAI GPT-4 language model. Select + Deploy model. Select Deploy base model. In this tutorial, we will deploy the GPT-4o model. Select GPT-4o. Select Confirm. Select Deploy. The model will be deployed. Once the deployment is complete, you will see the model listed on the Models + endpoints page. Now that the model is deployed, you can retrieve the API key needed to connect the model to your chatbot application. Select the model you deployed on the Models + endpoints page. ` On the model details page, you can view information about the model, including the API key. We will come back this page later to add the required information into the environment variables. Setting Up the Project and Install the Libraries Now, you will create a folder to work in and set up a virtual environment to develop a program. Creating a Folder to Work Inside It Open a terminal window and type the following command to create a folder named basic-chatbot in the default path. mkdir basic-chatbot Type the following command inside your terminal to navigate to the basic-chatbot folder you created. cd basic-chatbot Creating a Virtual Environment Type the following command inside your terminal to create a virtual environment named .venv. python -m venv .venv Type the following command inside your terminal to activate the virtual environment. .venv\Scripts\activate.bat NOTE If it worked, you should see (.venv) before the command prompt. Installing the Required Packages Type the following commands inside your terminal to install the required packages. openai: A Python library that provides integration with the Azure OpenAI API. python-dotenv: A Python library for managing environment variables stored in an .env file. pip install openai python-dotenv Setting up the Project in Visual Studio Code To create a basic chatbot program, you will need two files: example.py: This file will contain the code to interact with Azure resources. .env: This file will store the Azure credentials and configuration details. NOTE Purpose of the .env File The .env file is essential for storing the Azure information required to connect and use the resources you created. By keeping the Azure credentials in the .env file, you can ensure a secure and organized way to manage sensitive information. Setting Up example.py File Open Visual Studio Code. Select File from the menu bar. Select Open Folder. Select the basic-chatbot folder that you created, which is located at C:\Users\yourUserName\basic-chatbot. In the left pane of Visual Studio Code, right-click and select New File to create a new file named example.py. Add the following code to the example.py file to import the required libraries. from openai import AzureOpenAI from dotenv import load_dotenv import os # Load environment variables from the .env file load_dotenv() # Retrieve environment variables AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT") AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY") AZURE_OPENAI_MODEL_NAME = os.getenv("AZURE_OPENAI_MODEL_NAME") AZURE_OPENAI_CHAT_DEPLOYMENT_NAME = os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME") AZURE_OPENAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION") # Initialize Azure OpenAI client client = AzureOpenAI( api_key=AZURE_OPENAI_API_KEY, api_version=AZURE_OPENAI_API_VERSION, base_url=f"{AZURE_OPENAI_ENDPOINT}/openai/deployments/{AZURE_OPENAI_CHAT_DEPLOYMENT_NAME}" ) print("Chatbot: Hello! How can I assist you today? Type 'exit' to end the conversation.") while True: user_input = input("You: ") if user_input.lower() == "exit": print("Chatbot: Ending the conversation. Have a great day!") break response = client.chat.completions.create( model=AZURE_OPENAI_MODEL_NAME, messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": user_input} ], max_tokens=200 ) print("Chatbot:", response.choices[0].message.content.strip()) Setting Up .env File To set up your development environment, we will create a .env file and store the necessary credentials directly. NOTE Complete folder structure: └── YourUserName . └── basic-chatbot . ├── example.py . └── .env In the left pane of Visual Studio Code, right-click and select New File to create a new file named .env. Add the following code to the .env file to include your Azure information. AZURE_OPENAI_API_KEY=your_azure_openai_api_key AZURE_OPENAI_ENDPOINT=https://your_azure_openai_endpoint AZURE_OPENAI_MODEL_NAME=your_model_name AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=your_deployment_name AZURE_OPENAI_API_VERSION=your_api_version Retrieving Environment Variables from Azure AI Foundry Now, you will retrieve the required information from Azure AI Foundry and update the .env file. Go to the Models + endpoints page and select your deployed model. On the Model Details page, copy the following information in to the .env file.: AZURE_OPENAI_API_KEY AZURE_OPENAI_ENDPOINT AZURE_OPENAI_MODEL_NAME AZURE_OPENAI_CHAT_DEPLOYMENT_NAME Paste this information into the .env file in the respective placeholders. Running the Chatbot Program Type the following command inside your terminal to run the program and see if it can answer questions. python example.py Interact with the chatbot by typing your questions or messages. The chatbot will generate responses based on the Azure OpenAI model you deployed. NOTE You can find the full example of this chatbot, including the code and .env template, in my GitHub repository: GitHub Repository2.8KViews2likes1CommentBuilding AI-Powered Clinical Knowledge Stores with Azure AI Search
👀 Missed Session 01? Don’t worry—you can still catch up. But first, here’s what AI HLS Ignited is all about: What is AI HLS Ignited? AI HLS Ignited is a Microsoft-led technical series for healthcare innovators, solution architects, and AI engineers. Each session brings to life real-world AI solutions that are reshaping the Healthcare and Life Sciences (HLS) industry. Through live demos, architectural deep dives, and GitHub-hosted code, we equip you with the tools and knowledge to build with confidence. Session 01 Recap: In our first session, we introduced the accelerator MedIndexer - which is an indexing framework designed for the automated creation of structured knowledge bases from unstructured clinical sources. Whether you're dealing with X-rays, clinical notes, or scanned documents, MedIndexer converts these inputs into a schema-driven format optimized for Azure AI Search. This will allow your applications to leverage state-of-the-art retrieval methodologies, including vector search and re-ranking. Moreover, by applying a well-defined schema and vectorizing the data into high-dimensional representations, MedIndexer empowers AI applications to retrieve more precise and context-aware information... The result? AI systems that surface more relevant, accurate, and context-aware insights—faster. 🔍 Turning Your Unstructured Data into Value "About 80% of medical data remains unstructured and untapped after it is created (e.g., text, image, signal, etc.)" — Healthcare Informatics Research, Chungnam National University In the era of AI, the rise of AI copilots and assistants has led to a shift in how we access knowledge. But retrieving clinical data that lives in disparate formats is no trivial task. Building retrieval systems takes effort—and how you structure your knowledge store matters. It’s a cyclic, iterative, and constantly evolving process. That’s why we believe in leveraging enterprise-ready retrieval platforms like Azure AI Search—designed to power intelligent search experiences across structured and unstructured data. It serves as the foundation for building advanced retrieval systems in healthcare. However, implementing Azure AI Search alone is not enough. Mastering its capabilities and applying well-defined patterns can significantly enhance your ability to address repetitive tasks and complex retrieval scenarios. This project aims to accelerate your ability to transform raw clinical data into high-fidelity, high-value knowledge structures that can power your next-generation AI healthcare applications. 🚀 How to Get Started with MedIndexer New to Azure AI Search? Begin with our guided labs to build a strong foundation and get hands-on with the core capabilities. Already familiar with the tech? Jump ahead to the real-world use cases—learn how to build Coded Policy Knowledge Stores and X-ray Knowledge Stores. 🧪 Labs 🧪 Building Your Azure AI Search Index: 🧾 Notebook - Building your first Index Learn how to create and configure an Azure AI Search index to enable intelligent search capabilities for your applications. 🧪 Indexing Data into Azure AI Search: 🧾 Notebook - Ingest and Index Clinical Data Understand how to ingest, preprocess, and index clinical data into Azure AI Search using schema-first principles. 🧪 Retrieval Methods for Azure AI Search: 🧾 Notebook - Exploring Vector Search and Hybrid Retrieval Dive into retrieval techniques such as vector search, hybrid retrieval, and reranking to enhance the accuracy and relevance of search results. 🧪 Evaluation Methods for Azure AI Search: 🧾 Notebook - Evaluating Search Quality and Relevance Learn how to evaluate the performance of your search index using relevance metrics and ground truth datasets to ensure high-quality search results. 🏥 Use Cases 📝 Creating Coded Policy Knowledge Stores In many healthcare systems, policy documents such as pre-authorization guidelines are still trapped in static, scanned PDFs. These documents are critical—they contain ICD codes, drug name coverage, and payer-specific logic—but are rarely structured or accessible in real-time. To solve this, we built a pipeline that transforms these documents into intelligent, searchable knowledge stores. This diagram shows how pre-auth policy PDFs are ingested via blob storage, passed through an OCR and embedding skillset, and then indexed into Azure AI Search. The result: fast access to coded policy data for AI apps. 🧾 Notebook - Creating Coded Policies Knowledge Stores Transform payer policies into machine-readable formats. This use case includes: Preprocessing and cleaning PDF documents Building custom OCR skills Leveraging out-of-the-box Indexer capabilities and embedding skills Enabling real-time AI-assisted querying for ICDs, payer names, drug names, and policy logic Why it matters: This streamlines prior authorization and coding workflows for providers and payors, reducing manual effort and increasing transparency. 🩻 Creating X-ray Knowledge Stores In radiology workflows, X-ray reports and image metadata contain valuable clinical insights—but these are often underutilized. Traditionally, they’re stored as static entries in PACS systems or loosely connected databases. The goal of this use case is to turn those X-ray reports into a searchable, intelligent asset that clinicians can explore and interact with in meaningful ways. This diagram illustrates a full retrieval pipeline where radiology reports are uploaded, enriched through foundational models, embedded, and indexed. The output powers an AI-driven web app for similarity search and decision support. 🧾 Notebook - Creating X-rays Knowledge Stores Turn imaging reports and metadata into a searchable knowledge base. This includes: Leveraging push APIs with custom event-driven indexing pipeline triggered on new X-ray uploads Generating embeddings using Microsoft Healthcare foundation models Providing an AI-powered front-end for X-ray similarity search Why it matters: Supports clinical decision-making by retrieving similar past cases, aiding diagnosis and treatment planning with contextual relevance. 📣 Join Us for the Next Session Help shape the future of healthcare by sharing AI HLS Ignited with your network—and don’t miss what’s coming next! 📅 Register for the upcoming session → AI HLS Ignited Event Page 💻 Explore the code, demos, and architecture → AI HLS Ignited GitHub Repository