Azure OpenAI Service
Mastering Model Context Protocol (MCP): Building Multi Server MCP with Azure OpenAI
Create complex multi-MCP, agentic AI applications. A deep dive into multi-server MCP implementation, connecting both local custom and ready-made MCP servers in a single client session through a custom chatbot interface.
Enter new era of enterprise communication with Microsoft Translator Pro & document image translation

Microsoft Translator Pro: standalone, native mobile experience

We are thrilled to unveil the gated public preview of Microsoft Translator Pro, our robust solution designed for enterprises seeking to dismantle language barriers in the workplace. Available on iOS, Microsoft Translator Pro offers a standalone, native experience, enabling speech-to-speech translated conversations among coworkers, users, or clients within your enterprise ecosystem.

Watch how Microsoft Translator Pro transforms a hotel check-in experience by breaking down language barriers. In this video, a hotel receptionist speaks in English, and the app translates and plays the message aloud in Chinese for the traveler. The traveler responds in Chinese, and the app translates and plays the message aloud in English for the receptionist.

Key features of the public preview

Our enterprise version of the app is packed with features tailored to meet the stringent demands of enterprises:

- Core feature - speech-to-speech translation: Break language barriers with real-time speech-to-speech translation, allowing seamless communication with individuals speaking different languages.
- Unified experience: View or hear both transcription and translation simultaneously on a single device, ensuring smooth and efficient conversations.
- On-device translation: Harness the app's speech-to-speech translation capability without an internet connection in a limited set of languages, ensuring your productivity remains unhampered.
- Full administrator control: Enterprise IT administrators wield extensive control over the app's deployment and usage within your organization. They can fine-tune settings to manage conversation history, audit, and diagnostic logs, with the ability to disable history or configure automatic export of the history to cloud storage.
- Uncompromised privacy and security: Microsoft Translator Pro provides enterprises with a high level of translation quality and robust security. We know that privacy and security are top priorities for you. Once granted access by your organization's admin, you can sign in to the app with your organizational credentials. Your conversational data remains strictly yours, safeguarded within your Azure tenant. Neither Microsoft nor any external entities have access to your data.

Join the Preview

To embark on this journey with us, please complete the gating form. Upon meeting the criteria, we will grant your organization access to the paid version of the Microsoft Translator Pro app, which is now available in the US. Learn more and get started: Microsoft Translator Pro documentation.

Document translation translates text embedded in images

Our commitment to advancing cross-language communication takes a major step forward with a new enhancement in Azure AI Translator's Document Translation (DT) feature. Previously, Document Translation supported fully digital documents and scanned PDFs. Starting January 2025, with this latest update, the service can also process mixed-content documents, translating both digital text and text embedded within images.

Sample document translated from English to Spanish (frames in order: source document; translated output document with the image not translated; translated output document with image translation).

How It Works

To enable this feature, the Document Translation service now leverages the Microsoft Azure AI Vision API to detect, extract, and translate text from images within documents.
This capability is especially useful for scenarios where documents contain a mix of digital text and image-based text, ensuring complete translations without manual intervention.

Getting Started

To take advantage of this feature, customers can use the new optional parameter when setting up a translation request.

Request: A new parameter under "options" called "translateTextWithinImage" has been introduced. This parameter is of type Boolean, accepting "true" or "false." The default value is "false," so you'll need to set it to "true" to activate the image text translation capability.

Response: When this feature is enabled, the response will include additional details for transparency on image processing:

- totalImageScansSucceeded: The count of successfully translated image scans.
- totalImageScansFailed: The count of image scans that encountered processing issues.

Usage and cost

For this feature, customers will need to use the Azure AI Services resource, as this new feature leverages Azure AI Vision services along with Azure AI Translator. The OCR service incurs additional charges based on usage. Pricing details for the OCR service can be found here: Pricing details.

Learn more and get started (starting January 2025): Translator Documentation.

These new advancements reflect our dedication to pushing boundaries in Document Translation, empowering enterprises to connect and collaborate more effectively, regardless of language. Stay tuned for more innovations as we continue to expand the reach and capabilities of Microsoft Azure AI Translator.
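To make the request shape concrete, here is a minimal sketch of a batch Document Translation call with image text translation turned on, using plain REST via the Python requests library. The endpoint path, API version, and the placement of the "options" object at the request root are assumptions for illustration; only the translateTextWithinImage flag itself comes from the announcement, so check the Translator documentation for the exact contract.

```python
import requests

# Assumptions: resource endpoint, key header, and API version are illustrative placeholders.
TRANSLATOR_ENDPOINT = "https://<your-translator-resource>.cognitiveservices.azure.com"
TRANSLATOR_KEY = "<your-key>"

payload = {
    "inputs": [
        {
            "source": {"sourceUrl": "https://<account>.blob.core.windows.net/source?<sas-token>"},
            "targets": [
                {
                    "targetUrl": "https://<account>.blob.core.windows.net/target-es?<sas-token>",
                    "language": "es",
                }
            ],
        }
    ],
    # New optional flag: defaults to false, set to true to translate text inside images.
    "options": {"translateTextWithinImage": True},
}

response = requests.post(
    f"{TRANSLATOR_ENDPOINT}/translator/document/batches",
    params={"api-version": "2024-05-01"},  # assumed version; the image feature may require a newer preview version
    headers={"Ocp-Apim-Subscription-Key": TRANSLATOR_KEY},
    json=payload,
)
response.raise_for_status()

# The operation is asynchronous; poll the Operation-Location header for status.
# When the feature is enabled, the status payload should surface
# totalImageScansSucceeded / totalImageScansFailed as described above.
print(response.headers.get("Operation-Location"))
```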
AI Automation in Azure Foundry through turnkey MCP Integration and Computer Use Agent Models

The Fashion Trends Discovery Scenario

In this walkthrough, we'll explore a sample application that demonstrates the power of combining Computer Use (CUA) models with Playwright browser automation to autonomously compile trend information from the internet, while leveraging MCP integration to intelligently catalog and store insights in Azure Blob Storage.

The User Experience

A fashion analyst simply provides a query like "latest trends in sustainable fashion" to our command-line interface. What happens next showcases the power of agentic AI—the system requires no further human intervention to:

- Autonomous Web Navigation: The agent launches Pinterest, intelligently locates search interfaces, and performs targeted queries
- Intelligent Content Discovery: Systematically identifies and interacts with trend images, navigating to detailed pages
- Advanced Content Analysis: Applies computer vision to analyze fashion elements, colors, patterns, and design trends
- Intelligent Compilation: Consolidates findings into comprehensive, professionally formatted markdown reports
- Contextual Storage: Recognizes the value of preserving insights and autonomously offers cloud storage options

Technical capabilities leveraged

Behind this seamless experience lies a coordination of AI models:

- Pinterest Navigation: The CUA model visually understands Pinterest's interface layout, identifying search boxes and navigation elements with pixel-perfect precision
- Search Results Processing: Rather than relying on traditional DOM parsing, our agent uses visual understanding to identify trend images and calculate precise interaction coordinates
- Content Analysis: Each discovered trend undergoes detailed analysis using GPT-4o's advanced vision capabilities, extracting insights about fashion elements, seasonal trends, and style patterns
- Autonomous Decision Making: The agent contextually understands when information should be preserved and automatically engages with cloud storage systems

Technology Stack Overview

At the heart of this solution lies an orchestration of several AI technologies, each serving a specific purpose in creating a truly autonomous agent.

The architecture used:

```
┌─────────────────────────────────────────────────────────────────┐
│                         Azure AI Foundry                         │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                       Responses API                        │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌───────────────────┐  │  │
│  │  │  CUA Model  │  │   GPT-4o    │  │   Built-in MCP    │  │  │
│  │  │ (Interface) │  │  (Content)  │  │      Client       │  │  │
│  │  └─────────────┘  └─────────────┘  └───────────────────┘  │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
              ┌─────────────────────────────────────────┐
              │         Function Calling Layer          │
              │        (Workflow Orchestration)         │
              └─────────────────────────────────────────┘
                                 │
                                 ▼
    ┌─────────────────┐                     ┌──────────────────┐
    │   Playwright    │◄───────────────────►│ Trends Compiler  │
    │   Automation    │                     │      Engine      │
    └─────────────────┘                     └──────────────────┘
                                 │
                                 ▼
                      ┌─────────────────────┐
                      │     Azure Blob      │
                      │    Storage (MCP)    │
                      └─────────────────────┘
```

Azure OpenAI Responses API

At the core of the agentic architecture in this solution, the Responses API provides intelligent decision-making capabilities that determine when to invoke Computer Use models for web crawling versus when to engage MCP servers for data persistence. This API serves as the brain of our agent, contextually understanding user intent and autonomously choosing the appropriate tools to fulfill complex multi-step workflows.
Computer Use (CUA) Model

Our specialized CUA model excels at visual understanding of web interfaces, providing precise coordinate mapping for browser interactions, layout analysis, and navigation planning. Unlike general-purpose language models, the CUA model is specifically trained to understand web page structures, identify interactive elements, and provide actionable coordinates for automated browser control.

Playwright Browser Automation

Acting as the hands of our agent, Playwright executes the precise actions determined by the CUA model. This robust automation framework translates AI insights into real-world browser interactions, handling everything from clicking and typing to screenshot capture and page navigation with pixel-perfect accuracy.

GPT-4o Vision Model for Content Analysis

While the CUA model handles interface understanding, GPT-4o provides domain-specific content reasoning. This powerful vision model analyzes fashion trends, extracts meaningful insights from images, and provides rich semantic understanding of visual content—capabilities that complement rather than overlap with the CUA model's interface-focused expertise.

Model Context Protocol (MCP) Integration

The application showcases the power of agentic AI through its autonomous decision-making around data persistence. The agent intelligently recognizes when compiled information needs to be stored and automatically engages with Azure Blob Storage through MCP integration, without requiring explicit user instruction for each storage operation. Unlike traditional function calling patterns where custom applications must relay MCP calls through client libraries, the Responses API includes a built-in MCP client that directly communicates with MCP servers. This eliminates the need for complex relay logic, making MCP integration as simple as defining tool configurations.

Function Calling Orchestration

Function calling orchestrates the complex workflow between CUA model insights and Playwright actions. Each step is verified and validated before proceeding, ensuring robust autonomous operation without human intervention throughout the entire trend discovery and analysis process.

Let me walk you through the code used in the application.
Agentic Decision Making in Action

Let's examine how our application demonstrates true agentic behavior through the main orchestrator in `app.py`:

```python
async def main() -> str:
    """Main entry point demonstrating agentic decision making."""
    conversation_history = []
    generated_reports = []

    while True:
        user_query = input("Enter your query for fashion trends:-> ")

        # Add user input to conversation context
        new_user_message = {
            "role": "user",
            "content": [{"type": "input_text", "text": user_query}],
        }
        conversation_history.append(new_user_message)

        # The agent analyzes context and decides on appropriate actions
        response = ai_client.create_app_response(
            instructions=instructions,
            conversation_history=conversation_history,
            mcp_server_url=config.mcp_server_url,
            available_functions=available_functions,
        )

        # Process autonomous function calls and MCP tool invocations
        for output in response.output:
            if output.type == "function_call":
                # Agent decides to compile trends
                function_to_call = available_functions[output.name]
                function_args = json.loads(output.arguments)
                function_response = await function_to_call(**function_args)
            elif output.type == "mcp_tool_call":
                # Agent decides to use MCP tools for storage
                print(f"MCP tool call: {output.name}")
                # MCP calls handled automatically by Responses API
```

Key Agentic Behaviors Demonstrated:

- Contextual Analysis: The agent examines conversation history to understand whether the user wants trend compilation or storage operations
- Autonomous Tool Selection: Based on context, the agent chooses between function calls (for trend compilation) and MCP tools (for storage)
- State Management: The agent maintains conversation context across multiple interactions, enabling sophisticated multi-turn workflows

Function Calling Orchestration: Autonomous Web Intelligence

The `TrendsCompiler` class in `compiler.py` demonstrates sophisticated autonomous workflow orchestration:

```python
class TrendsCompiler:
    """Autonomous trends compilation with multi-step verification."""

    async def compile_trends(self, user_query: str) -> str:
        """Main orchestration loop with autonomous step progression."""
        async with LocalPlaywrightComputer() as computer:
            state = {"trends_compiled": False}
            step = 0

            while not state["trends_compiled"]:
                try:
                    if step == 0:
                        # Step 1: Autonomous Pinterest navigation
                        await self._launch_pinterest(computer)
                        step += 1
                    elif step == 1:
                        # Step 2: CUA-driven search and coordinate extraction
                        coordinates = await self._search_and_get_coordinates(
                            computer, user_query
                        )
                        if coordinates:
                            step += 1
                    elif step == 2:
                        # Step 3: Autonomous content analysis and compilation
                        await self._process_image_results(
                            computer, coordinates, user_query
                        )
                        markdown_report = await self._generate_markdown_report(
                            user_query
                        )
                        state["trends_compiled"] = True
                except Exception as e:
                    print(f"Autonomous error handling in step {step}: {e}")
                    state["trends_compiled"] = True

            return markdown_report
```

Autonomous Operation Highlights:

- Self-Verifying Steps: Each step validates completion before advancing
- Error Recovery: Autonomous error handling without human intervention
- State-Driven Progression: The agent maintains its own execution state
- No User Prompts: Complete automation from query to final report

Pinterest's Unique Challenge: Visual Coordinate Intelligence

One of the most impressive demonstrations of CUA model capabilities lies in solving Pinterest's hidden URL challenge:

```python
async def _detect_search_results(self, computer) -> List[Tuple[int, int, int, int]]:
    """Use CUA model to extract image coordinates from search results."""
    # Take screenshot for CUA analysis
    screenshot_bytes = await computer.screenshot()
    screenshot_b64 = base64.b64encode(screenshot_bytes).decode()

    # CUA model analyzes visual layout and identifies image boundaries
    prompt = """
    Analyze this Pinterest search results page and identify all trend/fashion images displayed.
    For each image, provide the exact bounding box coordinates in the format:
    <click>x1,y1,x2,y2</click>
    Focus on the main content images, not navigation or advertisement elements.
    """

    response = await self.ai_client.create_cua_response(
        prompt=prompt,
        screenshot_b64=screenshot_b64
    )

    # Extract coordinates using specialized parser
    coordinates = self.coordinate_parser.extract_coordinates(response.content)
    print(f"CUA model identified {len(coordinates)} image regions")
    return coordinates
```

The Coordinate Calculation:

```python
def calculate_centers(self, coordinates: List[Tuple[int, int, int, int]]) -> List[Tuple[int, int]]:
    """Calculate center coordinates for precise clicking."""
    centers = []
    for x1, y1, x2, y2 in coordinates:
        center_x = (x1 + x2) // 2
        center_y = (y1 + y2) // 2
        centers.append((center_x, center_y))
    return centers
```

Key takeaways with this approach:

- No DOM Dependency: Pinterest's hover-based URL revelation becomes irrelevant
- Visual Understanding: The CUA model sees what humans see—image boundaries
- Pixel-Perfect Targeting: Calculated center coordinates ensure reliable clicking
- Robust Navigation: Works regardless of Pinterest's frontend implementation changes

Model Specialization: The Right AI for the Right Job

Our solution demonstrates sophisticated AI model specialization:

```python
async def _analyze_trend_page(self, computer, user_query: str) -> Dict[str, Any]:
    """Use GPT-4o for domain-specific content analysis."""
    # Capture the detailed trend page
    screenshot_bytes = await computer.screenshot()
    screenshot_b64 = base64.b64encode(screenshot_bytes).decode()

    # GPT-4o analyzes fashion content semantically
    analysis_prompt = f"""
    Analyze this fashion trend page for the query: "{user_query}"

    Provide detailed analysis of:
    1. Fashion elements and style characteristics
    2. Color palettes and patterns
    3. Seasonal relevance and trend timing
    4. Target demographics and style categories
    5. Design inspiration and cultural influences

    Format as structured markdown with clear sections.
    """

    # Note: Using GPT-4o instead of CUA model for content reasoning
    response = await self.ai_client.create_vision_response(
        model=self.config.vision_model_name,  # GPT-4o
        prompt=analysis_prompt,
        screenshot_b64=screenshot_b64
    )

    return {
        "analysis": response.content,
        "timestamp": datetime.now().isoformat(),
        "query_context": user_query
    }
```

Model Selection Rationale:

- CUA Model: Perfect for understanding "where to click" and "how to navigate"
- GPT-4o: Excels at "what does this mean" and "how is this relevant"
- Specialized Strengths: Each model operates in its domain of expertise
- Complementary Intelligence: Combined capabilities exceed individual model limitations

Compilation and Consolidation

```python
async def _generate_markdown_report(self, user_query: str) -> str:
    """Consolidate all analyses into comprehensive markdown report."""
    if not self.image_analyses:
        return "No trend data collected for analysis."
    # Intelligent report structuring
    report_sections = [
        f"# Fashion Trends Analysis: {user_query}",
        f"*Generated on {datetime.now().strftime('%B %d, %Y')}*",
        "",
        "## Executive Summary",
        await self._generate_executive_summary(),
        "",
        "## Detailed Trend Analysis"
    ]

    # Process each analyzed trend with intelligent categorization
    for idx, analysis in enumerate(self.image_analyses, 1):
        trend_section = [
            f"### Trend Item {idx}",
            analysis.get('analysis', 'No analysis available'),
            f"*Analysis timestamp: {analysis.get('timestamp', 'Unknown')}*",
            ""
        ]
        report_sections.extend(trend_section)

    # Add intelligent trend synthesis
    report_sections.extend([
        "## Trend Synthesis and Insights",
        await self._generate_trend_synthesis(),
        "",
        "## Recommendations",
        await self._generate_recommendations()
    ])

    return "\n".join(report_sections)
```

Intelligent Compilation Features:

- Automatic Structuring: Creates professional report formats automatically
- Content Synthesis: Combines individual analyses into coherent insights
- Temporal Context: Maintains timestamp and query context
- Executive Summaries: Generates high-level insights from detailed data

Autonomous Storage Intelligence

Note that there is no MCP client code that needs to be implemented here. The integration is completely turnkey, through configuration alone.

```python
# In app_client.py - MCP tool configuration
def create_app_tools(self, mcp_server_url: str, available_functions: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Configure tools with automatic MCP integration."""
    tools = [
        {
            "type": "mcp",
            "server_label": "azure-storage-mcp-server",
            "server_url": mcp_server_url,
            "require_approval": "never",  # Autonomous operation
            "allowed_tools": ["create_container", "list_containers", "upload_blob"],
        }
    ]
    return tools

# Agent instructions demonstrate contextual intelligence
instructions = f"""
Step1: Compile trends based on user query using computer use agent.
Step2: Prompt user to store trends report in Azure Blob Storage.
Use MCP Server tools to perform this action autonomously.
IMPORTANT: Maintain context of previously generated reports.
If user asks to store a report, use the report generated in this session.
"""
```

Turnkey MCP Integration:

- Direct API Calls: MCP tools called directly by the Responses API
- No Relay Logic: No custom MCP client implementation required
- Autonomous Tool Selection: Agent chooses appropriate MCP tools based on context
- Contextual Storage: Agent understands what to store and when

Demo and Code reference

Here is the GitHub repo of the application described in this post. See a demo of this application in action.

Conclusion: Entering the Age of Practical Agentic AI

The Fashion Trends Compiler Agent exemplifies agentic AI applications that work autonomously in real-world scenarios. By combining Azure AI Foundry's turnkey MCP integration with specialized AI models and robust automation frameworks, we've created an agent that doesn't just follow instructions but intelligently navigates complex multi-step workflows with minimal human oversight. Ready to build your own agentic AI solutions? Start exploring Azure AI Foundry's MCP integration and Computer Use capabilities to create the next generation of intelligent automation.
Use Azure OpenAI and APIM with the OpenAI Agents SDK

The OpenAI Agents SDK provides a powerful framework for building intelligent AI assistants with specialised capabilities. In this blog post, I'll demonstrate how to integrate Azure OpenAI Service and Azure API Management (APIM) with the OpenAI Agents SDK to create a banking assistant system with specialised agents.

Key Takeaways:

- Learn how to connect the OpenAI Agents SDK to Azure OpenAI Service
- Understand the differences between direct Azure OpenAI integration and using Azure API Management
- Implement tracing with the OpenAI Agents SDK for monitoring and debugging
- Create a practical banking application with specialized agents and handoff capabilities

The OpenAI Agents SDK

The OpenAI Agents SDK is a powerful toolkit that enables developers to create AI agents with specialised capabilities, tools, and the ability to work together through handoffs. It's designed to work seamlessly with OpenAI's models, but can be integrated with Azure services for enterprise-grade deployments.

Setting Up Your Environment

To get started with the OpenAI Agents SDK and Azure, you'll need to install the necessary packages:

```
pip install openai openai-agents python-dotenv
```

You'll also need to set up your environment variables. Create a `.env` file with your Azure OpenAI or APIM credentials.

For Direct Azure OpenAI Connection:

```
# .env file for Azure OpenAI
AZURE_OPENAI_API_KEY=your_api_key
AZURE_OPENAI_API_VERSION=2024-08-01-preview
AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=your-deployment-name
```

For Azure API Management (APIM) Connection:

```
# .env file for Azure APIM
AZURE_APIM_OPENAI_SUBSCRIPTION_KEY=your_subscription_key
AZURE_APIM_OPENAI_API_VERSION=2024-08-01-preview
AZURE_APIM_OPENAI_ENDPOINT=https://your-apim-name.azure-api.net/
AZURE_APIM_OPENAI_DEPLOYMENT=your-deployment-name
```

Connecting to Azure OpenAI Service

The OpenAI Agents SDK can be integrated with Azure OpenAI Service in two ways: direct connection or through Azure API Management (APIM).

Option 1: Direct Azure OpenAI Connection

```python
from openai import AsyncAzureOpenAI
from agents import set_default_openai_client
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Create OpenAI client using Azure OpenAI
openai_client = AsyncAzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT")
)

# Set the default OpenAI client for the Agents SDK
set_default_openai_client(openai_client)
```

Option 2: Azure API Management (APIM) Connection

```python
from openai import AsyncAzureOpenAI
from agents import set_default_openai_client
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Create OpenAI client using Azure APIM
openai_client = AsyncAzureOpenAI(
    api_key=os.getenv("AZURE_APIM_OPENAI_SUBSCRIPTION_KEY"),  # Note: Using subscription key
    api_version=os.getenv("AZURE_APIM_OPENAI_API_VERSION"),
    azure_endpoint=os.getenv("AZURE_APIM_OPENAI_ENDPOINT"),
    azure_deployment=os.getenv("AZURE_APIM_OPENAI_DEPLOYMENT")
)

# Set the default OpenAI client for the Agents SDK
set_default_openai_client(openai_client)
```

Key Difference: When using Azure API Management, you use a subscription key instead of an API key. This provides an additional layer of management, security, and monitoring for your OpenAI API access.
Creating Agents with the OpenAI Agents SDK

Once you've set up your Azure OpenAI or APIM connection, you can create agents using the OpenAI Agents SDK:

```python
from agents import Agent
from openai.types.chat import ChatCompletionMessageParam

# Create a banking assistant agent
banking_assistant = Agent(
    name="Banking Assistant",
    instructions="You are a helpful banking assistant. Be concise and professional.",
    model="gpt-4o",  # This will use the deployment specified in your Azure OpenAI/APIM client
    tools=[check_account_balance]  # A function tool defined elsewhere
)
```

The OpenAI Agents SDK automatically uses the Azure OpenAI or APIM client you've configured, making it seamless to switch between different Azure environments or configurations.

Implementing Tracing with Azure OpenAI

The OpenAI Agents SDK includes powerful tracing capabilities that can help you monitor and debug your agents. When using Azure OpenAI or APIM, you can implement two types of tracing.

1. Console Tracing for Development

Console logging is rather verbose; if you would like to explore the spans, enable it like below:

```python
from agents import Agent, HandoffInputData, Runner, function_tool, handoff, trace, set_default_openai_client, set_tracing_disabled, OpenAIChatCompletionsModel, set_tracing_export_api_key, add_trace_processor
from agents.tracing.processors import ConsoleSpanExporter, BatchTraceProcessor

# Set up console tracing
console_exporter = ConsoleSpanExporter()
console_processor = BatchTraceProcessor(exporter=console_exporter)
add_trace_processor(console_processor)
```

2. OpenAI Dashboard Tracing

Currently the spans are sent to https://api.openai.com/v1/traces/ingest:

```python
from agents import Agent, HandoffInputData, Runner, function_tool, handoff, trace, set_default_openai_client, set_tracing_disabled, OpenAIChatCompletionsModel, set_tracing_export_api_key, add_trace_processor

set_tracing_export_api_key(os.getenv("OPENAI_API_KEY"))
```

Tracing is particularly valuable when working with Azure deployments, as it helps you monitor usage, performance, and behavior across different environments. Note that at the time of writing this article, there is an ongoing bug where the OpenAI Agents SDK fetches the old input_tokens and output_tokens fields instead of the new prompt_tokens and completion_tokens returned by newer Chat Completion APIs. You would therefore need to manually update the agents/run.py file per https://github.com/openai/openai-agents-python/pull/65/files to make this work.

Running Agents with Azure OpenAI

To run your agents with Azure OpenAI or APIM, use the Runner class from the OpenAI Agents SDK:

```python
from agents import Runner
import asyncio

async def main():
    # Run the banking assistant
    result = await Runner.run(
        banking_assistant,
        input="Hi, I'd like to check my account balance."
    )
    print(f"Response: {result.response.content}")

if __name__ == "__main__":
    asyncio.run(main())
```

Practical Example: Banking Agents System

Let's look at how we can use Azure OpenAI or APIM with the OpenAI Agents SDK to create a banking system with specialized agents and handoff capabilities.

1. Define Specialized Banking Agents

We'll create several specialized agents (a minimal sketch of the check_account_balance tool and the specialist agents is shown after this list):

- General Banking Assistant: Handles basic inquiries and account information
- Loan Specialist: Focuses on loan options and payment calculations
- Investment Specialist: Provides guidance on investment options
- Customer Service Agent: Routes inquiries to specialists
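The post references check_account_balance and the two specialist agents without showing their definitions, so here is a minimal sketch of what they could look like. The names follow the article; the tool body and the agents' instructions are illustrative assumptions, not the post's exact implementation.

```python
from agents import Agent, function_tool

# Hypothetical tool body: the article only says check_account_balance is
# "a function tool defined elsewhere", so this stub stands in for a real lookup.
@function_tool
def check_account_balance(account_id: str) -> str:
    """Return the current balance for the given account."""
    sample_balances = {"12345": "$2,450.10", "67890": "$310.75"}  # illustrative data only
    return f"Account {account_id} balance: {sample_balances.get(account_id, 'account not found')}"

# Specialist agents referenced by the handoff configuration below.
# Instructions are assumptions; tailor them to your own scenario.
loan_specialist_agent = Agent(
    name="Loan Specialist",
    instructions="You are a loan specialist at a bank. Explain loan options, rates, and payment calculations clearly.",
    model="gpt-4o",
)

investment_specialist_agent = Agent(
    name="Investment Specialist",
    instructions="You are an investment specialist at a bank. Provide guidance on investment options and portfolio management.",
    model="gpt-4o",
)
```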
2. Implement Handoff Between Agents

```python
from agents import handoff, HandoffInputData
from agents.extensions import handoff_filters

# Define a filter for handoff messages
def banking_handoff_message_filter(handoff_message_data: HandoffInputData) -> HandoffInputData:
    # Remove any tool-related messages from the message history
    handoff_message_data = handoff_filters.remove_all_tools(handoff_message_data)
    return handoff_message_data

# Create customer service agent with handoffs
customer_service_agent = Agent(
    name="Customer Service Agent",
    instructions="""You are a customer service agent at a bank.
    Help customers with general inquiries and direct them to specialists when needed.
    If the customer asks about loans or mortgages, handoff to the Loan Specialist.
    If the customer asks about investments or portfolio management, handoff to the Investment Specialist.""",
    handoffs=[
        handoff(loan_specialist_agent, input_filter=banking_handoff_message_filter),
        handoff(investment_specialist_agent, input_filter=banking_handoff_message_filter),
    ],
    tools=[check_account_balance],
)
```

3. Trace the Conversation Flow

```python
from agents import trace

async def main():
    # Trace the entire run as a single workflow
    with trace(workflow_name="Banking Assistant Demo"):
        # Run the customer service agent
        result = await Runner.run(
            customer_service_agent,
            input="I'm interested in taking out a mortgage loan. Can you help me understand my options?"
        )
        print(f"Response: {result.response.content}")

if __name__ == "__main__":
    asyncio.run(main())
```

Benefits of Using Azure OpenAI/APIM with the OpenAI Agents SDK

Integrating Azure OpenAI or APIM with the OpenAI Agents SDK offers several advantages:

- Enterprise-Grade Security: Azure provides robust security features, compliance certifications, and private networking options
- Scalability: Azure's infrastructure can handle high-volume production workloads
- Monitoring and Management: APIM provides additional monitoring, throttling, and API management capabilities
- Regional Deployment: Azure allows you to deploy models in specific regions to meet data residency requirements
- Cost Management: Azure provides detailed usage tracking and cost management tools

Conclusion

The OpenAI Agents SDK combined with Azure OpenAI Service or Azure API Management provides a powerful foundation for building intelligent, specialized AI assistants. By leveraging Azure's enterprise features and the OpenAI Agents SDK's capabilities, you can create robust, scalable, and secure AI applications for production environments. Whether you choose direct Azure OpenAI integration or Azure API Management depends on your specific needs for API management, security, and monitoring. Both approaches work seamlessly with the OpenAI Agents SDK, making it easy to build sophisticated agent-based applications.

Repo: https://github.com/hieumoscow/azure-openai-agents
Video demo: https://www.youtube.com/watch?v=gJt-bt-vLJY
We're excited to announce Sora, now available in AI Foundry. Learn more about Sora in Azure OpenAI, and its API and video playground, here. To dive deeper into video playground experience, check out this blog post Why does this matter? AI is no longer *just* about text. Here’s why: multimodal models enable deeper understanding. Today AI doesn’t just understand words: it understands context, visuals, and motion. From prompt to product, imagine going from a product description to a full marketing campaign. In many ways, the rise of multimodal AI today is comparable to the inception of photography in the 19 th century—introducing a new creative medium that, like photography, didn’t replace painting but expanded the boundaries of artistic expression. Just a little over a month ago, we released OpenAI's GPT-image-1, and now we’re thrilled to announce Sora, available in public preview in AI Foundry. These models are designed to unlock creativity, accelerate content creation, and empower entire industries. Video in particular has remained the most complex and underdeveloped modality. Teams across industries from marketing to education are bottlenecked by the cost, time, and tools required to create high fidelity video content. What makes these models especially powerful on Azure is the enterprise-grade foundation they run on. Customers get the benefit of first-party infrastructure, network security, identity, billing, and governance—all in one place. That’s huge when you're deploying generative AI at scale. Both models are API-first, meaning they can be deeply integrated into workflows, apps, or customer-facing tools. And for teams looking to explore before they build, we offer an intuitive Image Playground and Video Playground to experiment with different models side by side: in iterative workflows like video generation, go from experimentation to production seamlessly. Of course, it’s not just about capabilities—it’s also about responsibility. These models are designed with safety and provenance in mind, including features like C2PA integration and abuse monitoring, so enterprises can innovate with confidence. Automate pipelines and scale production workflows Sora in Azure OpenAI is uniquely available as an API-so creative teams can build it directly in their tool. Currently Sora supports text-to-video, with image-to-video coming soon, up to 20s in duration, 1080p resolution, and landscape/portrait/square aspect ratios. This level of integration is especially valuable for customers like WPP, where the inability to easily show early concepts and scale big ideas through production in video has long been a creative and operational bottleneck. Through the Sora API, customers can collaborate more effectively with clients by delivering personalized, scalable solutions. The Sora API in particular is great for asynchronous tasks. "API access integrates and speeds up my workflow in a way that other services just couldn't. It just makes my life so much easier” - WPP Want to see this in action? 
Want to see this in action? Check out our visionary lab using both gpt-image-1 and Sora, tailored to industry use cases.

Go from experimentation to production with video playground

With built-in features such as port to VS Code, customers can seamlessly go from experimentation to production:

- Iterate faster: Experiment with text prompts and adjust generation controls like aspect ratio, resolution, and duration
- Optimize prompts: Re-write prompt syntax with AI and visually compare outcomes across variations using prebuilt industry prompts
- API-based interface: What works in video playground translates directly into code, with predictability

Industry Use Cases

Sora

🌱 Sustainability
- Environmental Campaign Videos: Generate compelling short-form content to raise awareness around climate action, recycling, or conservation.
- Product Sustainability Stories: Visualize supply chain journeys or carbon offset initiatives to showcase brand responsibility.

📚 Education
- Interactive Learning Content: Generate visual aids for K-12 or higher ed topics like science (e.g. protein creation) or language learning.
- Corporate Training: Produce bite-sized videos for onboarding, compliance, or skill development.

✈️ Travel & Lifestyle
- Destination Marketing: Generate stunning visuals of destinations to attract visitors.
- Experience Sharing: Create cinematic recaps of trips or aspirational journeys.

GPT-image-1

🎮 Gaming
- Concept Art & World Design: Instantly generate immersive environments, characters, and assets from simple prompts—ideal for prototyping and speeding up creative workflows.
- In-game UI Mockups: Use GPT-image-1 to visualize interfaces and inventory screens without manual design.

📢 Advertising & Marketing
- Ad Storyboarding: Generate compelling image or video ad drafts for client pitches or social media A/B testing.
- Localization at Scale: Tailor visuals and videos to different regions or audiences while keeping brand consistency.
- Campaign Automation: Go from a product name or brief to full creative assets—banners, thumbnails, short video clips.

🛒 E-commerce
- Product Showcases: Create studio-quality product images in various styles and settings, including on-model shots or lifestyle scenes.
- Seasonal Promotions: Refresh visuals for holidays or trends instantly without reshoots or manual editing.

Explore how you can integrate these powerful tools into your operations and start transforming your business today.

Call to Action

- Get started with Sora and build your own application
- Learn more about Sora and its capabilities
- Deploy Sora today
- Learn more about Multimodal AI in AI Foundry and revisit the BUILD session
- Deploy GPT-image-1 today
- Learn more about GPT-image-1's API
Ways to simplify your data ingestion pipeline with Azure AI Search

Azure AI Search introduces new features to simplify RAG (retrieval-augmented generation) data preparation and indexing. Key updates include the GenAI Prompt Skill (in public preview), which leverages Azure AI Foundry and OpenAI chat-completion models to enrich data with transformations like content summarization, image verbalization, and sentiment classification. The Logic Apps integration provides a no-code ingestion wizard for creating RAG-ready indexes, supporting multiple connectors like SharePoint and Amazon S3 for seamless data ingestion. Together, these capabilities reduce data preparation time, help improve search relevance, and enhance user experiences. Responsible AI guidelines enable ethical usage, while new portal tools streamline workflows for developers.
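To give a sense of what wiring a chat-completion-based enrichment into a skillset could look like, here is a minimal sketch that defines a summarization skill and pushes the skillset to the service over REST. The skill's @odata.type, its input/output field names, the literal-value source syntax, and the api-version are assumptions for illustration only; consult the Azure AI Search documentation for the exact GenAI Prompt Skill schema.

```python
import requests

SEARCH_ENDPOINT = "https://<your-search-service>.search.windows.net"
SEARCH_ADMIN_KEY = "<admin-key>"

# Assumed skill shape: only the idea (chat-completion-based enrichment) comes from the post.
skillset = {
    "name": "rag-enrichment-skillset",
    "description": "Summarize document content with a chat-completion model before indexing.",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Custom.ChatCompletionSkill",  # assumed type name
            "name": "genai-prompt-summarizer",
            "context": "/document",
            # Assumed: points at an Azure OpenAI chat-completions deployment.
            "uri": "https://<your-aoai-resource>.openai.azure.com/openai/deployments/gpt-4o/chat/completions",
            "inputs": [
                {"name": "systemMessage", "source": "='Summarize the passage in two sentences.'"},  # literal-value syntax is an assumption
                {"name": "userMessage", "source": "/document/content"},
            ],
            "outputs": [{"name": "response", "targetName": "summary"}],
        }
    ],
}

resp = requests.put(
    f"{SEARCH_ENDPOINT}/skillsets/{skillset['name']}",
    params={"api-version": "2025-03-01-preview"},  # assumed preview version
    headers={"api-key": SEARCH_ADMIN_KEY, "Content-Type": "application/json"},
    json=skillset,
)
resp.raise_for_status()
print("Skillset created or updated:", resp.status_code)
```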
NLWeb Pioneers: Success Stories & Use Cases

NLWeb, unveiled at Build 2025 in Satya Nadella's keynote with Kevin Scott, is an open-source "HTML for chat" that adds a conversational layer to any website—making pages queryable in natural language by both people and AI agents. Early adopters like Tripadvisor, Qdrant, O'Reilly Media, Eventbrite, Inception Labs, and Delish have demonstrated use cases ranging from travel planning and semantic search to expert-style Q&A and personalized recipe discovery. Each NLWeb endpoint also serves as a Model Context Protocol (MCP) server, paving the way for Agent-to-Agent (A2A) workflows and an open, protocol-based agentic web. To explore it yourself, visit the NLWeb GitHub repository and start building your own conversational web experiences.
How Amdocs CCoE leveraged Azure AI Agent Service to build intelligent email support agent.

In this blog post you will learn how the Amdocs CCoE team improved its SLA by providing technical support for IT and cloud infrastructure questions and queries. The team used Azure AI Agent Service to build an intelligent email agent that helps Amdocs employees with their technical issues. This post describes the development phases, solution details, and the roadmap ahead.
SuperRAG – How to achieve higher accuracy with Retrieval Augmented Generation

The benefit of this approach is that it can dramatically increase the amount of information retrieved and increase the chances of finding the correct answer. A vector search, which is commonly used in RAG applications, excels at making semantic connections like synonym recognition and misspellings, but doesn't really understand intent the way a human or LLM does. So, by retrieving many more documents and letting an LLM like GPT-3.5 decide whether each document answers the question, we can achieve higher accuracy with our generated answers.
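As an illustration of the pattern described above (not the article's exact implementation), the sketch below retrieves a larger-than-usual candidate set from Azure AI Search and asks a chat model to judge whether each document actually answers the question before the final answer is generated. The index name, the "content" field, and the deployment names are assumptions.

```python
import os
from openai import AzureOpenAI
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential

# Assumed resource names, index schema ("content" field), and deployment names.
search = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="docs-index",
    credential=AzureKeyCredential(os.environ["SEARCH_KEY"]),
)
aoai = AzureOpenAI(
    azure_endpoint="https://<your-aoai-resource>.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

def answers_question(question: str, passage: str) -> bool:
    """Ask the model to grade a single retrieved passage for relevance."""
    verdict = aoai.chat.completions.create(
        model="gpt-35-turbo",  # assumed deployment name
        messages=[
            {"role": "system", "content": "Reply only YES or NO."},
            {"role": "user", "content": f"Does this passage answer the question?\nQuestion: {question}\nPassage: {passage}"},
        ],
    ).choices[0].message.content
    return verdict.strip().upper().startswith("YES")

question = "What is the warranty period for product X?"

# Retrieve many more documents than a typical top-3/top-5 RAG setup would.
candidates = [doc["content"] for doc in search.search(question, top=50)]
relevant = [p for p in candidates if answers_question(question, p)]

# Generate the final answer only from passages the model judged relevant.
final = aoai.chat.completions.create(
    model="gpt-4o",  # assumed deployment name
    messages=[
        {"role": "system", "content": "Answer using only the provided passages."},
        {"role": "user", "content": f"Question: {question}\n\nPassages:\n" + "\n---\n".join(relevant)},
    ],
)
print(final.choices[0].message.content)
```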