Unlock Multimodal Data Insights with Azure AI Content Understanding: New Code Samples Available
We are excited to share code samples that leverage the Azure AI Content Understanding service to help you extract insights from your images, documents, videos, and audio content. These code samples are available on GitHub and cover the following:

Azure AI integrations
Visual Document Search: Leverage Azure Document Intelligence, Content Understanding, Azure Search, and Azure OpenAI to unlock natural language search of document contents for a complex document with pictures of charts and diagrams.
Video Chapter Generation: Generate video chapters using Azure Content Understanding and Azure OpenAI. This allows you to break long videos into smaller, labeled parts with key details, making it easier to find, share, and access the most relevant content.
Video Content Discovery: Learn how to use Content Understanding, Azure Search, and Azure OpenAI models to process videos and create a searchable index for AI-driven content discovery.

Content Understanding Operations
Analyzer Templates: An Analyzer enables you to tailor Content Understanding to extract valuable insights from your content based on your specific needs. Start quickly with these ready-made templates.
Content Extraction: Learn how the Content Understanding API can extract semantic information from various files, including performing OCR to recognize tables in documents, transcribing audio files, and analyzing faces in videos.
Field Extraction: This example demonstrates how to extract specific fields from your content. For instance, you can identify the invoice amount in a document, capture names mentioned in an audio file, or generate a summary of a video.
Analyzer Training: For document scenarios, you can further enhance field extraction performance by providing a few labeled samples.
Analyzer Management: Create a minimal analyzer, list all analyzers in your resource, and delete any analyzers you no longer need.

Azure AI Content Understanding: Turn Multimodal Content into Structured Data
Azure AI Content Understanding is a cutting-edge Azure AI offering designed to help businesses seamlessly extract insights from various content types. Built with and for Generative AI, it empowers organizations to develop GenAI solutions using the latest models, without needing advanced AI expertise. Content Understanding simplifies the processing of unstructured data stores of documents, images, videos, and audio—transforming them into structured, actionable insights. It is versatile and adaptable across numerous industries and use-case scenarios, offering customization and support for input from multiple data types. Here are a few example use cases:
Retrieval Augmented Generation (RAG): Enhance and integrate content from any format to power effective content searches or provide answers to frequent questions in scenarios like customer service or enterprise-wide data retrieval.
Post-call analytics: Organizations use Content Understanding to analyze call center or meeting recordings, extracting insights like sentiment, speaker details, and topics discussed, including names, companies, and other relevant data.
Insurance claims processing: Automate time-consuming processes like analyzing and handling insurance claims or other low-latency batch processing tasks.
Media asset management and content creation: Extract essential features from images and videos to streamline media asset organization and enable entity-based searches for brands, settings, key products, and people.
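To make the flow concrete, here is a minimal sketch of analyzing a file with a Content Understanding analyzer over REST. The endpoint path, API version, analyzer ID, and polling pattern are assumptions based on the preview service; the GitHub samples above are the authoritative reference.

import time
import requests

# Assumptions: an Azure AI services resource endpoint and key, plus an analyzer ID
# (for example, one created from the ready-made templates mentioned above).
endpoint = "https://<your-ai-services-resource>.services.ai.azure.com"
api_key = "<your-api-key>"
analyzer_id = "<your-analyzer-id>"
api_version = "2024-12-01-preview"  # assumed preview API version

# Submit a document/image/audio/video URL for analysis (long-running operation)
resp = requests.post(
    f"{endpoint}/contentunderstanding/analyzers/{analyzer_id}:analyze",
    params={"api-version": api_version},
    headers={"Ocp-Apim-Subscription-Key": api_key},
    json={"url": "https://example.com/sample-invoice.pdf"},
)
resp.raise_for_status()

# Poll the operation until the extracted content and fields are ready
operation_url = resp.headers["Operation-Location"]
while True:
    result = requests.get(operation_url, headers={"Ocp-Apim-Subscription-Key": api_key}).json()
    if result.get("status", "").lower() not in ("running", "notstarted"):
        break
    time.sleep(2)

print(result)  # structured content plus any custom fields defined on the analyzer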
Resources & Documentation To begin extracting valuable insights from your multimodal content, explore the following resources: Azure Content Understanding Overview Azure Content Understanding in Azure AI Foundry FAQs Want to get in touch? We’d love to hear from you! Send us an email at cu_contact@microsoft.com

Working with the Realtime API of gpt-4o in python
This post is organized into sections that cover how to: Connect to the Realtime API Handle audio conversations Handle text conversations Handle tool calling The sample web application is built using Chainlit.

Connecting to the Realtime API
Refer to the code snippet below to establish a WebSocket connection to the Server (API). After establishing that:
1. Implement the receive function to accept responses from the Server. It is used to handle the response content from the server, be it audio or text. More details on this function are provided later in the post, under each section.

url = f"{base_url}openai/realtime?api-version={api_version}&deployment={model_name}&api-key={api_key}" async def connect(self): """Connects the client using a WS Connection to the Realtime API.""" if self.is_connected(): # raise Exception("Already connected") self.log("Already connected") self.ws = await websockets.connect( url, additional_headers={ "Authorization": f"Bearer {api_key}", "OpenAI-Beta": "realtime=v1", }, ) print(f"Connected to realtime API....") asyncio.create_task(self.receive()) await self.update_session()

2. Send a client event to update the session, setting session-level configurations like the system prompt the model should use, the choice of using text or speech or both during the conversation, the neural voice to use in the response, and so forth.

self.system_prompt = system_prompt self.event_handlers = defaultdict(list) self.session_config = { "modalities": ["text", "audio"], "instructions": self.system_prompt, "voice": "shimmer", "input_audio_format": "pcm16", "output_audio_format": "pcm16", "input_audio_transcription": {"model": "whisper-1"}, "turn_detection": { "type": "server_vad", "threshold": 0.5, "prefix_padding_ms": 300, "silence_duration_ms": 500, # "create_response": True, ## do not enable this attribute, since it prevents function calls from being detected }, "tools": tools_list, "tool_choice": "auto", "temperature": 0.8, "max_response_output_tokens": 4096, }

Handling audio conversation
1. Capture user voice input
Chainlit provides events to capture the user voice input from the microphone.

@cl.on_audio_chunk async def on_audio_chunk(chunk: cl.InputAudioChunk): openai_realtime: RTWSClient = cl.user_session.get("openai_realtime") if openai_realtime: if openai_realtime.is_connected(): await openai_realtime.append_input_audio(chunk.data) else: print("RealtimeClient is not connected")

2. Process the user voice input
a) Convert the audio input received in the previous step to a base64 encoded string. Send the Client event input_audio_buffer.append to the Server, with this audio payload.

async def append_input_audio(self, array_buffer): # Check if the array buffer is not empty and send the audio data to the input buffer if len(array_buffer) > 0: await self.send( "input_audio_buffer.append", { "audio": array_buffer_to_base64(np.array(array_buffer)), }, )
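The helpers array_buffer_to_base64 and base64_to_array_buffer used in the audio snippets of this post are not shown. A minimal sketch of what they could look like, assuming the buffers hold 16-bit PCM samples in a NumPy array:

import base64
import numpy as np


def array_buffer_to_base64(array_buffer: np.ndarray) -> str:
    # Serialize the 16-bit PCM samples to raw bytes and base64-encode them for the JSON payload
    pcm_bytes = array_buffer.astype(np.int16).tobytes()
    return base64.b64encode(pcm_bytes).decode("ascii")


def base64_to_array_buffer(b64_audio: str) -> np.ndarray:
    # Decode the base64 payload from the server back into 16-bit PCM samples
    return np.frombuffer(base64.b64decode(b64_audio), dtype=np.int16)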
b) Once the Server is done receiving the audio chunks, it sends an input_audio_buffer.committed event. c) Once this event is picked up in the receive function, send a Client Event response.create to the Server to elicit a response.

async def receive(self): async for message in self.ws: event = json.loads(message) ................................ elif event["type"] == "input_audio_buffer.committed": # user has stopped speaking. The audio delta input from the user captured till now should now be processed by the server. # Hence we need to send a 'response.create' event to signal the server to respond await self.send("response.create", {"response": self.response_config}) .................................

3. Receiving the response audio
Once the response audio events start flowing in from the server: Handle the Server event response.audio.delta by converting the audio chunks from a base64 encoded string to bytes. Relay this to the UI to play the audio chunks over the speaker. The dispatch function is used to raise this event (see snippet below).

async def receive(self): async for message in self.ws: event = json.loads(message) ............................ if event["type"] == "response.audio.delta": # response audio delta events received from server that need to be relayed # to the UI for playback delta = event["delta"] array_buffer = base64_to_array_buffer(delta) append_values = array_buffer.tobytes() _event = {"audio": append_values} # send event to chainlit UI to play this audio self.dispatch("conversation.updated", _event) elif event["type"] == "response.audio.done": # server has finished sending back the audio response to the user query # let the chainlit UI know that the response audio has been completely received self.dispatch("conversation.updated", event) ..........................

Play the received audio chunks
The Chainlit UI then plays this audio out over the speaker.

async def handle_conversation_updated(event): """Used to play the response audio chunks as they are received from the server.""" _audio = event.get("audio") if _audio: await cl.context.emitter.send_audio_chunk( cl.OutputAudioChunk( mimeType="pcm16", data=_audio, track=cl.user_session.get("track_id") ) )

Handling text conversation
1. Capture user text input
Apart from handling audio conversation, we can handle the associated transcripts from the audio response, so that the user can have a 'multimodal' way of interacting with the AI Assistant. Chainlit provides events to capture the user input from the chat interface.

@cl.on_message async def on_message(message: cl.Message): openai_realtime: RTWSClient = cl.user_session.get("openai_realtime") if openai_realtime and openai_realtime.is_connected(): await openai_realtime.send_user_message_content( [{"type": "input_text", "text": message.content}] ) else: await cl.Message( content="Please activate voice mode before sending messages!" ).send()

2. Process the user text input
With the user text input received above:
1. Send a Client Event conversation.item.create to the Server with the user text input in the payload.
2. Follow that up with a Client Event response.create to the Server to elicit a response.
3. Raise a custom event 'conversation.interrupted' to the UI so that it can stop playing any audio response from the previous user query.

async def send_user_message_content(self, content=[]): if content: await self.send( "conversation.item.create", { "item": { "type": "message", "role": "user", "content": content, } }, ) # this is the trigger to the server to start responding to the user query await self.send("response.create", {"response": self.response_config}) # raise this event to the UI to pause the audio playback, in case it is doing so already, # when the user submits a query in the chat interface _event = {"type": "conversation_interrupted"} # signal the UI to stop playing audio self.dispatch("conversation.interrupted", _event)
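Two pieces of client plumbing referenced throughout these snippets, self.send(...) and self.dispatch(...), are not shown in the post. A minimal sketch of what they might look like, assuming client events are plain JSON objects carrying a type field over the WebSocket, and that event_handlers is the defaultdict(list) created alongside the session config:

import asyncio
import json
from collections import defaultdict


class RTWSClient:
    # Only the plumbing is sketched here; connect/receive/session handling are as shown above
    def __init__(self):
        self.ws = None
        self.event_handlers = defaultdict(list)

    async def send(self, event_name: str, data: dict = None):
        # Client events are JSON objects with a "type" plus event-specific fields
        event = {"type": event_name, **(data or {})}
        await self.ws.send(json.dumps(event))

    def on(self, event_name: str, handler):
        # The UI registers async callbacks for custom events such as "conversation.updated"
        self.event_handlers[event_name].append(handler)

    def dispatch(self, event_name: str, event):
        # Fan the event out to every registered handler without blocking the receive loop
        for handler in self.event_handlers[event_name]:
            asyncio.create_task(handler(event))

On the Chainlit side, the handlers would then be registered once per session, for example openai_realtime.on("conversation.updated", handle_conversation_updated).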
3. Receiving the text response
Use the Server Event response.audio_transcript.delta to get the stream of the text data response. This is a transcription of what is already playing as audio on the UI. Relay this data to the UI through a custom event, to populate the chat conversation. The response text gets streamed and displayed in the Chainlit UI.

async def receive(self): async for message in self.ws: .................................. elif event["type"] == "response.audio_transcript.delta": # this event is received when the transcript of the server's audio response to the user has started to come in. # send this to the UI to display the transcript in the chat window, even as the audio of the response gets played delta = event["delta"] item_id = event["item_id"] _event = {"transcript": delta, "item_id": item_id} # signal the UI to display the transcript of the response audio in the chat window self.dispatch("conversation.text.delta", _event) elif ( event["type"] == "conversation.item.input_audio_transcription.completed" ): ...............................

Handling tool calling
As a part of the session update event discussed earlier, we pass a payload of the tools (functions) that this Assistant has access to. In this application, I am using a search function implemented using Tavily.

self.session_config = { "modalities": ["text", "audio"], "instructions": self.system_prompt, "voice": "shimmer", ..................... "tools": tools_list, "tool_choice": "auto", "temperature": 0.8, "max_response_output_tokens": 4096, }

The function definition and implementation used in this sample application:

tools_list = [ { "type": "function", "name": "search_function", "description": "call this function to bring up-to-date information on the user's query when it pertains to current affairs", "parameters": { "type": "object", "properties": {"search_term": {"type": "string"}}, "required": ["search_term"], }, } ] # Function to perform search using Tavily def search_function(search_term: str): print("performing search for the user query > ", search_term) return TavilySearchResults().invoke(search_term) available_functions = {"search_function": search_function}

Handling the response from tool calling
When a user request entails a function call, the Server Event response.done does not return audio. It instead returns the functions that match the intent, along with the arguments to invoke them. In the 'receive' function:
Check for function call hints in the response.
Get the function name and arguments from the response.
Invoke the function and get the response.
Send Client Event conversation.item.create to the server with the function call output.
Follow that up with Client Event response.create to elicit a response from the Server that will then be played out as audio and text.

async def receive(self): async for message in self.ws: ........................................................... elif event["type"] == "response.done": ..........................................
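# Not shown behind the elisions above: 'output_type' used in the check below would be
# read from the first item of the response output, presumably something like:
# output_type = event.get("response", {}).get("output", [{}])[0].get("type", None)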
if "function_call" == output_type: function_name = ( event.get("response", {}) .get("output", [{}])[0] .get("name", None) ) arguments = json.loads( event.get("response", {}) .get("output", [{}])[0] .get("arguments", None) ) tool_call_id = ( event.get("response", {}) .get("output", [{}])[0] .get("call_id", None) ) function_to_call = available_functions[function_name] # invoke the function with the arguments and get the response response = function_to_call(**arguments) print( f"called function {function_name}, and the response is:", response, ) # send the function call response to the server(model) await self.send( "conversation.item.create", { "item": { "type": "function_call_output", "call_id": tool_call_id, "output": json.dumps(response), } }, ) # signal the model(server) to generate a response based on the function call output sent to it await self.send( "response.create", {"response": self.response_config} ) ...............................................

Reference links: Watch a short video of this sample application here The Documentation on the Realtime API is available here The GitHub Repo for the application in this post is available here

Announcing Data Zones for Azure OpenAI Batch
In Nov 2024, we announced Data Zones on Azure OpenAI Service. Today, we’re excited to expand that support to Azure OpenAI Service Batch with Data Zone Batch deployments. They enable you to utilize Azure’s global infrastructure to dynamically route traffic to data centers within the Microsoft-defined data zones, ensuring optimal availability for each request.

Azure OpenAI Data Zones is a new deployment option that provides enterprises with even more flexibility and control over their data privacy and residency needs. Tailored for organizations in the United States and European Union, Data Zones allow customers to process and store their data within specific geographic boundaries, ensuring compliance with regional data residency requirements while maintaining optimal performance. By spanning multiple regions within these areas, Data Zones offer a balance between the cost-efficiency of global deployments and the control of regional deployments, making it easier for enterprises to manage their AI applications without sacrificing security or speed.

Models supported at launch:
gpt-4o (version 2024-08-06)
gpt-4o-mini (version 2024-07-18)
Support for newer models will be continuously added.

Pricing
Data Zone Batch will have a 50% discount on Data Zone Standard pricing.

Get started
Ready to try Data Zone support in the Azure OpenAI Service Batch API? Take it for a spin here.

Learn more
Deployment types in Azure OpenAI Service
Azure OpenAI Service Batch
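For illustration, here is a minimal sketch of submitting a batch job against a Data Zone Batch deployment using the OpenAI Python SDK. The deployment name, API version, and JSONL contents are placeholders and assumptions; see the Batch API documentation linked above for the authoritative steps.

from openai import AzureOpenAI

# Assumptions: a Data Zone Batch deployment of gpt-4o named "gpt-4o-dz-batch",
# and a requests.jsonl file whose lines reference that deployment in their "model" field.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-10-21",  # assumed; use a version that supports the Batch API
)

# Upload the JSONL file of requests for batch processing
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")

# Create the batch job; results are returned within the 24-hour completion window
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)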
Enhancing Workplace Safety and Efficiency with Azure AI Foundry's Content Understanding
Discover how Azure AI Foundry’s Content Understanding service, featuring the Video Shot Analysis template, revolutionizes workplace safety and efficiency. By leveraging Generative AI to analyze video data, businesses can gain actionable insights into worker actions, posture, safety risks, and environmental conditions. Learn how this cutting-edge tool transforms operations across industries like manufacturing, logistics, and healthcare.

Introducing Azure AI Agent Service
Introducing Azure AI Agent Service at Microsoft Ignite 2024
Discover how Azure AI Agent Service is revolutionizing the development and deployment of AI agents. This service empowers developers to build, deploy, and scale high-quality AI agents tailored to business needs within hours. With features like rapid development, extensive data connections, flexible model selection, and enterprise-grade security, Azure AI Agent Service sets a new standard in AI automation.

From Vector Databases to Integrated Vector Databases: Revolutionizing AI-Powered Search
Semantic Search and Vector Search have been pivotal capabilities powering AI Assistants driven by Generative AI. They excel when dealing with unstructured data—such as PDF documents, text files, or Word documents—where embeddings can unlock contextually rich and meaningful search results. But what happens when the data ecosystem is more complex? Imagine structured data like customer feedback ratings for timeliness, cleanliness, and professionalism intertwined with unstructured textual comments. To extract actionable insights, such as identifying service quality improvements across centers, traditional vector search alone won’t suffice. Enter Integrated Vector Databases.

What Makes Integrated Vector Databases a Game-Changer?
Unlike traditional vector databases that require frequent incremental updates of indexes stored separately from the original data, integrated vector databases seamlessly combine structured and unstructured data within the same environment. This integration eliminates the need for periodic indexing runs, enabling real-time search and analytics with reduced overhead. Furthermore, data and embeddings co-reside, streamlining workflows and improving query performance. Major cloud providers, including Azure, now offer managed Integrated Vector Databases such as Azure SQL Database, Azure PostgreSQL Database, and Azure Cosmos DB. This evolution is critical for scenarios that require hybrid search capabilities across both structured and unstructured data.

A Real-World Scenario: Hybrid Feedback Analysis
To showcase the power of Integrated Vector Databases, let’s dive into a practical application: customer feedback analysis for a service business. Here’s what this entails:
Structured Data: Ratings on aspects like overall work quality, timeliness, politeness, and cleanliness.
Unstructured Data: Free-flowing textual feedback from customers.
Using Python, the feedback is stored in an Azure SQL Database, with embeddings generated for the textual comments via Azure OpenAI’s embedding model. The data is then inserted into the database using a stored procedure, combining the structured ratings with vectorized embeddings for efficient retrieval and analysis.

Key Code Highlights
1. Generating Embeddings: The get_embedding function interfaces with Azure OpenAI to convert the textual customer feedback into vector embeddings:

def get_embedding(text): url = f"{az_openai_endpoint}openai/deployments/{az_openai_embedding_deployment_name}/embeddings?api-version=2023-05-15" response = requests.post(url, headers={"Content-Type": "application/json", "api-key": az_openai_key}, json={"input": text}) if response.status_code != 200: raise Exception("Embedding failed") return response.json()["data"][0]["embedding"]

2. Storing Feedback: A stored procedure inserts both structured ratings and text embeddings into the database:

# Call the stored procedure stored_procedure = """ EXEC InsertServiceFeedback ?, ?, ?, ?, ?, ?, ?, ?, ?, ? """ cursor.execute( stored_procedure, ( schedule_id, customer_id, feedback_text, json.dumps(json.loads(str(get_embedding(feedback_text)))), rating_quality_of_work, rating_timeliness, rating_politeness, rating_cleanliness, rating_overall_experience, feedback_date, ), ) connection.commit() print("Feedback inserted successfully.") response_message = ( "Service feedback captured successfully for the schedule_id: " + str(schedule_id) )
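With ratings and embeddings stored side by side, a single query can combine a structured filter with vector similarity ranking. Below is a minimal sketch of such a hybrid lookup, assuming the table behind the InsertServiceFeedback procedure exposes the rating columns plus an embedding column that uses Azure SQL's vector support (the VECTOR type and VECTOR_DISTANCE function, in preview at the time of writing); the table name, column names, and vector dimension are all assumptions.

import json
import pyodbc


def find_similar_feedback(connection: pyodbc.Connection, query_text: str,
                          min_overall_rating: int = 1, top_n: int = 5):
    # Embed the analyst's question with the same embedding model used at ingestion time
    query_vector = json.dumps(get_embedding(query_text))

    # Hybrid query: filter on a structured rating column, rank by vector similarity.
    # Schema details here are assumptions about what the stored procedure writes to.
    sql = """
        SELECT TOP (?) feedback_text,
               rating_overall_experience,
               VECTOR_DISTANCE('cosine', feedback_embedding, CAST(? AS VECTOR(1536))) AS distance
        FROM ServiceFeedback
        WHERE rating_overall_experience >= ?
        ORDER BY distance ASC;
    """
    cursor = connection.cursor()
    cursor.execute(sql, top_n, query_vector, min_overall_rating)
    return cursor.fetchall()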
Building an Autonomous Agent with LangGraph
The next step is building an intelligent system that automates operations based on customer input. Here’s where LangGraph, a framework for Agentic Systems, shines. The application we’re discussing empowers customers to:
View available service appointment slots.
Book service appointments.
Submit feedback post-service.
Search for information using an AI-powered search index over product manuals.

What Makes This Agent Special?
This agent exhibits autonomy through:
Tool Calling: Based on customer input and context, it decides which tools to invoke without manual intervention.
State Awareness: The agent uses a centralized state object to maintain context (e.g., customer details, past service records, current datetime) for dynamic tool execution.
Natural Interactions: Customer interactions are processed naturally, with no custom logic required to integrate data or format inputs.
For example, when a customer provides feedback, the agent autonomously: Prompts for all necessary details. Generates embeddings for textual feedback. Inserts the data into the Integrated Vector Database after confirming the input.

Code Walkthrough: Creating the Agent
1. Define Tools: Tools are the building blocks of the agent, enabling operations like fetching service slots or storing feedback:

tools = [ store_service_feedback, fetch_customer_information, get_available_service_slots, create_service_appointment_slot, perform_search_based_qna, ]

2. Define State: State ensures the agent remembers user context across interactions:

class State(TypedDict): messages: list[AnyMessage] customer_info: str current_datetime: str # fetch the customer information from the database and load that into the context in the State def customer_info(state: State): if state.get("customer_info"): return {"customer_info": state.get("customer_info")} else: state["customer_info"] = fetch_customer_information.invoke({}) return {"customer_info": state.get("customer_info")}

3. Build the Graph: LangGraph’s state graph defines how tools, states, and prompts interact:

builder = StateGraph(State) builder.add_node("chatbot", Assistant(service_scheduling_runnable)) builder.add_node("fetch_customer_info", customer_info) builder.add_edge("fetch_customer_info", "chatbot") builder.add_node("tools", tool_node) builder.add_edge(START, "fetch_customer_info") builder.add_edge("tools", "chatbot") graph = builder.compile()

There is no custom code required to invoke the tools. It is automatically done based on the intent in the customer input.

4. Converse with the Agent: The application seamlessly transitions between tools based on user input and state:

def stream_graph_updates(user_input: str): events = graph.stream( {"messages": [("user", user_input)]}, config, subgraphs=True, stream_mode="values", ) l_events = list(events) msg = list(l_events[-1]) r1 = msg[-1]["messages"] # response_to_user = msg[-1].messages[-1].content print(r1[-1].content) while True: try: user_input = input("User: ") if user_input.lower() in ["quit", "exit", "q"]: print("Goodbye!") break stream_graph_updates(user_input) except Exception as e: print("An error occurred:", e) traceback.print_exc() # stream_graph_updates(user_input) break
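One detail elided from the graph-construction snippet in step 3 is how the chatbot node hands off to the tools node. In LangGraph this is typically wired with a conditional edge using the prebuilt tools_condition router, along these lines (an assumption about the full source, which is linked below):

from langgraph.prebuilt import ToolNode, tools_condition

# Route to the "tools" node whenever the chatbot's latest message contains tool calls;
# otherwise the turn ends and control returns to the user
tool_node = ToolNode(tools=tools)
builder.add_conditional_edges("chatbot", tools_condition)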
Agent Demo
See a demo of this app in action here:
The source code of this Agent App is available in this GitHub Repo.

Conclusion
The fusion of Integrated Vector Databases with LangGraph’s agentic capabilities unlocks a new era of AI-powered applications. By unifying structured and unstructured data in a single system and empowering agents to act autonomously, organizations can streamline workflows and gain deeper insights from their data. This approach demonstrates the power of evolving from simple vector search to hybrid, integrated systems—paving the way for smarter, more autonomous AI solutions.

Dify Works with Microsoft AI Search
Please refer to my repo to get more AI resources; you are welcome to star it: https://github.com/xinyuwei-david/david-share.git
This article is from one of my repos: https://github.com/xinyuwei-david/david-share/tree/master/LLMs/ollama-Dify

Dify Works with Microsoft AI Search
Dify is an open-source platform for developing large language model (LLM) applications. It combines the concepts of Backend as a Service (BaaS) and LLMOps, enabling developers to quickly build production-grade generative AI applications. Dify offers various types of tools, including first-party and custom tools. These tools can extend the capabilities of LLMs, such as web search, scientific calculations, image generation, and more. On Dify, you can create more powerful AI applications, like intelligent assistant-type applications, which can complete complex tasks through task reasoning, step decomposition, and tool invocation.

Dify works with AI Search: Demo
As of now, Dify cannot integrate with Azure AI Search directly via the default Dify web portal. Let me show how to achieve it. See my demo video on YouTube: https://www.youtube.com/watch?v=20GjS6AtjTo

Dify works with AI Search: Configuration steps
Configure AI Search: create the index and make sure you can get results from the AI Search index.

Run Dify on a VM via Docker:

root@a100vm:~# docker ps |grep -i dify 5d6c32a94313 langgenius/dify-api:0.8.3 "/bin/bash /entrypoi…" 3 months ago Up 3 minutes 5001/tcp docker-worker-1 264e477883ee langgenius/dify-api:0.8.3 "/bin/bash /entrypoi…" 3 months ago Up 3 minutes 5001/tcp docker-api-1 2eb90cd5280a langgenius/dify-sandbox:0.2.9 "/main" 3 months ago Up 3 minutes (healthy) docker-sandbox-1 708937964fbb langgenius/dify-web:0.8.3 "/bin/sh ./entrypoin…" 3 months ago Up 3 minutes 3000/tcp docker-web-1

Create a custom tool in the Dify portal and set the schema. Schema details:

{
 "openapi": "3.0.0",
 "info": { "title": "Azure Cognitive Search Integration", "version": "1.0.0" },
 "servers": [ { "url": "https://ai-search-eastus-xinyuwei.search.windows.net" } ],
 "paths": {
  "/indexes/wukong-doc1/docs": {
   "get": {
    "operationId": "getSearchResults",
    "parameters": [
     { "name": "api-version", "in": "query", "required": true, "schema": { "type": "string", "example": "2024-11-01-preview" } },
     { "name": "search", "in": "query", "required": true, "schema": { "type": "string" } }
    ],
    "responses": {
     "200": {
      "description": "Successful response",
      "content": {
       "application/json": {
        "schema": {
         "type": "object",
         "properties": {
          "@odata.context": { "type": "string" },
          "value": {
           "type": "array",
           "items": {
            "type": "object",
            "properties": {
             "@search.score": { "type": "number" },
             "chunk_id": { "type": "string" },
             "parent_id": { "type": "string" },
             "title": { "type": "string" },
             "chunk": { "type": "string" },
             "text_vector": { "type": "array", "items": { "type": "number" } }
            }
           }
          }
         }
        }
       }
      }
     }
    }
   }
  }
 }
}

Remaining steps: set the AI Search API key; do a search test with your input words; create a workflow on Dify; check the AI Search stage and the LLM stage; then run the workflow and get the workflow result.
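To sanity-check the index outside of Dify, the same REST call described by the schema can be issued directly. A minimal sketch using requests, with the endpoint, index name, and api-version taken from the schema above and a placeholder query key:

import requests

search_endpoint = "https://ai-search-eastus-xinyuwei.search.windows.net"
index_name = "wukong-doc1"
api_version = "2024-11-01-preview"
api_key = "<your-azure-ai-search-query-key>"  # placeholder

# Issue the same GET /indexes/{index}/docs search that the Dify custom tool performs
response = requests.get(
    f"{search_endpoint}/indexes/{index_name}/docs",
    params={"api-version": api_version, "search": "wukong"},
    headers={"api-key": api_key},
)
response.raise_for_status()
for doc in response.json()["value"]:
    print(doc["@search.score"], doc.get("title"))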
How to Evaluate & Upgrade Model Versions in the Azure OpenAI Service
As an Azure OpenAI customer, you have access to the most advanced artificial intelligence models powered by OpenAI. These models are constantly improving and evolving, which means that you can benefit from the latest innovations and enhancements, including improved speed, improved safety systems, and reduced costs. However, this also means that older model versions will eventually be deprecated and retired. Learn how to use evaluation tools to test your prompts and applications with new model versions, to highlight any necessary adjustments or optimizations.

Exploring Azure AI Agent Service: A Leap in Conversational AI
The Azure OpenAI Assistants API introduced powerful features such as:
Conversation State Management: Efficiently handling token limits and state.
Automated Tool Actions: Automatically interpreting user inputs to execute actions, such as writing Python code (Code Interpreter) or performing vector searches.
Function Calling: Dynamically identifying and suggesting appropriate custom functions for execution.

Azure AI Agent Service enhances these capabilities by integrating advanced tools and actions, streamlining the developer experience even further:
REST APIs: Supports REST APIs compliant with OpenAPI 3.0. It automatically identifies and invokes the appropriate API based on user intent using Swagger definitions.
Azure Function Apps: Azure Function Apps can be automatically invoked based on user intent, as with the REST APIs mentioned above.
Knowledge Tools: Includes Bing Search and Azure AI Search for data grounding.
Azure Logic Apps: Tooling support for Azure Logic Apps is expected but not available at the time of this writing. Currently, function calling must be used to identify and invoke the appropriate Logic App manually.
Choice of Language Models: Apart from the gpt-4o and gpt-4o-mini models, it also supports the use of Llama 3.1-70B-instruct, Mistral-large-2407, and Cohere Command R+.

Working with the Azure AI Agent Service
I have explored the Azure AI Agent Service’s capabilities through the sample Contoso Retail Shopping Assistant, a simple conversational AI bot. It implements a combination of tool actions, mentioned below.
REST API Integration: Users can search for orders by category, by category and price, or place orders. Using Swagger definitions, the Azure AI Agent Service dynamically identifies and calls APIs based on user input without custom code.
Azure Logic Apps: Handles shipment order creation post-purchase. Manual function calling is used for creating shipment orders.
Apart from that, the App uses:
Natural Language Processing: Leverages the gpt-4o-mini model to understand and process user queries effectively.
Microsoft Bot Framework is used to build the sample App.

Creating the Agent
Create the AI Agent with access to the necessary tools:
OpenApiTool - for REST API invocation
FunctionTool - for function calling (to trigger the Azure Logic App)

functions = FunctionTool(functions=user_functions) def create_agent(): project_client = AIProjectClient.from_connection_string( credential=DefaultAzureCredential(), conn_str=config.az_agentic_ai_service_connection_string, ) # read the swagger file for the Contoso Retail Fashion API definition with open("./data-files/swagger.json", "r") as f: openapi_spec = jsonref.loads(f.read()) auth = OpenApiAnonymousAuthDetails() # Initialize agent OpenApi tool using the read in OpenAPI spec api_tool = OpenApiTool( name="contoso_retail_fashion_api", spec=openapi_spec, description="help users order a product based on id and quantity, search products by category, and search for products based on their category and price.", auth=auth, ) # Initialize agent toolset with user functions toolset = ToolSet() toolset.add(functions) toolset.add(api_tool) # print("toolsets definition", toolset.definitions) agent = project_client.agents.create_agent( model="gpt-4o-mini", name="contoso-retail-fashions-ai-agent", instructions="You are an AI Assistant tasked with helping the customers of Contoso retail fashions with their shopping requirements. You have access to the APIs in contoso_retail_fashion_api that you need to call to respond to the user queries", tools=toolset.definitions, ) print(f"created agent with id {agent.id}")
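The user_functions set handed to FunctionTool at the top of this snippet is not shown in the excerpt. Presumably it is just the set of Python callables the agent is allowed to invoke locally, along these lines (an assumption; check the linked GitHub repo for the exact definition):

from typing import Any, Callable, Set

# In this sample the only locally executed function is the Logic App trigger shown below
user_functions: Set[Callable[..., Any]] = {create_delivery_order}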
The function that makes the call to Azure Logic Apps to create the delivery order:

def create_delivery_order(order_id: str, destination: str) -> str: """ creates a consignment delivery order (i.e. a shipment order) for the given order_id and destination location :param order_id (str): The order number of the purchase made by the user. :param destination (str): The location where the order is to be delivered. :return: generated delivery order number. :rtype: Any """ api_url = config.az_logic_app_url print("making the Logic app call.................") # make an HTTP POST API call with a json payload response = requests.post( api_url, json={"order_id": order_id, "destination": destination}, headers={"Content-Type": "application/json"}, ) print("response from logic app", response.text) return json.dumps(response.text)

For simplicity, the HTTP Callable endpoint from the Azure Portal is used directly, complete with the SAS Token. The function 'create_delivery_order' calls this URL directly to trigger the Logic App. The agent needs to be created only once. The id of the created agent must be set in the '.env' config of the Bot App.

Using the Agent in the Bot App
Retrieve the agent created in the previous step, based on the agent id.

# Create Azure OpenAI client self.project_client = AIProjectClient.from_connection_string( credential=DefaultAzureCredential(), conn_str=self.config.az_agentic_ai_service_connection_string, ) # retrieve the agent already created self.agent = self.project_client.agents.get_agent(DefaultConfig.az_assistant_id) print("retrieved agent with id ", self.agent.id)

Create a thread for the User Session, which the Bot App persists across conversation exchanges with the user. Every message from the user during the session gets added to this thread, and 'run' is called to execute the user request.
l_thread = conversation_data.thread if l_thread is None: # Create a thread conversation_data.thread = self.project_client.agents.create_thread() l_thread = conversation_data.thread # Create message to thread message = self.project_client.agents.create_message( thread_id=l_thread.id, role="user", content=turn_context.activity.text ) run = self.project_client.agents.create_run( thread_id=l_thread.id, assistant_id=self.agent.id ) print(f"Created thread run, ID: {run.id}") while run.status in ["queued", "in_progress", "requires_action"]: time.sleep(1) run = self.project_client.agents.get_run( thread_id=l_thread.id, run_id=run.id ) if run.status == "requires_action" and isinstance( run.required_action, SubmitToolOutputsAction ): print("Run requires function call to be done..") tool_calls = run.required_action.submit_tool_outputs.tool_calls if not tool_calls: print("No tool calls provided - cancelling run") self.project_client.agents.cancel_run( thread_id=l_thread.id, run_id=run.id ) break tool_outputs = [] for tool_call in tool_calls: if isinstance(tool_call, RequiredFunctionToolCall): try: print(f"Executing tool call: {tool_call}") output = functions.execute(tool_call) tool_outputs.append( ToolOutput( tool_call_id=tool_call.id, output=output, ) ) except Exception as e: print(f"Error executing tool_call {tool_call.id}: {e}") print(f"Tool outputs: {tool_outputs}") if tool_outputs: self.project_client.agents.submit_tool_outputs_to_run( thread_id=l_thread.id, run_id=run.id, tool_outputs=tool_outputs, ) print(f"Current run status: {run.status}")

Note that there is no code to be written to make the REST API calls. The OpenApiTool calls the appropriate API automatically during the thread run. Only in the scenario where a delivery order needs to be created does the execution flow into the function-calling block, where the identified function is manually executed. The source code of the Bot App discussed in this post is available here.

See a demo
Watch a demo of the sample app, below.

Additional Resources
For more details, check out: Azure AI Agent Service documentation. Microsoft Bot Framework Documentation. Azure Logic Apps Documentation.

Announcing the Azure AI RAG Vercel Next.JS Template
We’re excited to introduce the Azure AI RAG Vercel Next.JS Template, now available in the Vercel Templates Marketplace. This new template empowers developers to easily deploy retrieval-augmented generation (RAG) applications using Azure AI Search and Azure OpenAI—all with just a few clicks.

What’s Included?
One-Click RAG Integration: Quickly stand up a RAG application that incorporates battle-tested retrieval from Azure AI Search and advanced modes like Hybrid Search and Semantic Ranking. This lets you ground Azure OpenAI model responses in relevant, up-to-date data with the top-quality candidates.
Fully-Managed AI Stack: Leverage Vercel’s global hosting platform and the Vercel AI SDK’s Azure provider to run inference and embedding generation on Azure OpenAI.
Modern Web App Foundation: Built with Next.js and integrated with the useChat hook for real-time streaming responses, this template ensures a smooth user experience. It also includes pre-configuration for Tailwind CSS and a vector embedding retrieval workflow backed by Azure AI Search.

Why It Matters
By combining Vercel’s performance-optimized hosting with Azure AI’s robust retrieval and OpenAI model integrations, this template slashes setup time for a RAG solution. Instead of spending days configuring vector databases, embedding pipelines, and streaming endpoints, you can start delivering answers and insights in minutes with a powerful web framework in Next.js. It’s a powerful jumpstart for any project that wants to harness the power of Generative AI.

Learn More & Get Started Today
Template: Azure AI RAG Chatbot
GitHub Repo: Azure-Samples/azure-ai-vercel-rag-starter: Sample retrieval-augmented generation (RAG) app template using Azure AI Search, Azure OpenAI, and Vercel AI SDK
Learn more about Azure AI Search: Azure AI Search
Learn more about Azure OpenAI: Azure OpenAI