AzureAI
8 Topics

Unable to locate and add a VM (GPU family) to my available VM options.
I am using Azure AI Foundry and need to run a GPU workload, but N-series VM options do not appear when I try to add quota; only CPU families like D and E are listed. How can I enable or request N-series GPU VMs in my subscription and region?
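Not an authoritative answer, but one way to see which GPU families your subscription exposes in a region is to list compute usage/quota programmatically. A minimal sketch, assuming the azure-identity and azure-mgmt-compute packages and a placeholder subscription id; a limit of 0 (or a missing family) usually means a quota increase request is needed via Azure AI Foundry or the portal:

```python
# Hedged sketch: list N-series (GPU) vCPU quota for a region.
# Assumes azure-identity + azure-mgmt-compute; subscription id is a placeholder.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

for usage in client.usage.list("eastus"):
    family = usage.name.value  # e.g. "standardNCFamily"
    if family.lower().startswith("standardn"):
        print(f"{usage.name.localized_value}: {usage.current_value}/{usage.limit}")
# Families missing from this list are typically not offered (or not enabled)
# for the subscription in that region.
```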
Chaining and Streaming with Responses API in Azure

The Responses API is an enhancement of the existing Chat Completions API. It is stateful and supports agentic capabilities. As a superset of the Chat Completions API, it continues to support chat completions functionality, and reasoning models like GPT-5 deliver better model intelligence through it than through Chat Completions. It offers input flexibility, supporting a range of input types. It is currently available in select Azure regions and can be used with all the models available in the region. The API supports response streaming, chaining, and function calling.

In the examples below, we use the gpt-5-nano model for a simple response, a chained response, and streaming responses. To get started, update the installed openai library:

```
pip install --upgrade openai
```

Simple Message

1) Build the client:

```python
from openai import OpenAI

client = OpenAI(
    base_url=endpoint,
    api_key=api_key,
)
```

2) Create the response. The object returned carries an id, which can then be used to retrieve the message:

```python
# Non-streaming request
resp = client.responses.create(
    model=deployment,
    input=messages,
)
```

3) The message is retrieved using the response id from the previous step:

```python
response = client.responses.retrieve(resp.id)
```

Chaining

For a chained message, the extra step is sharing the context. This is done by sending the previous response id in subsequent requests:

```python
resp = client.responses.create(
    model=deployment,
    previous_response_id=resp.id,
    input=[{"role": "user", "content": "Explain this at a level that could be understood by a college freshman"}],
)
```

Streaming

A different function call is used for streaming queries, and the streaming response has to be handled event by event until the end of the event stream:

```python
text_out = []
with client.responses.stream(
    model=deployment,
    input=messages,  # structured messages
) as s:
    for event in s:
        # Accumulate only text deltas for clean output
        if event.type == "response.output_text.delta":
            delta = event.delta or ""
            text_out.append(delta)
            # Echo streaming output to console as it arrives
            print(delta, end="", flush=True)
```
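The post mentions that the Responses API also supports function calling. As a hedged sketch of that flow (not from the linked repo; the get_weather tool and its lookup are made up for illustration, and field names should be checked against the current API reference), it reuses the same client and deployment as above:

```python
import json

# Hypothetical tool definition; the Responses API uses a flattened function-tool shape.
tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

resp = client.responses.create(
    model=deployment,
    input=[{"role": "user", "content": "What is the weather in Athens?"}],
    tools=tools,
)

# If the model decided to call the tool, run it and send the result back,
# chaining via previous_response_id just like a normal follow-up.
for item in resp.output:
    if item.type == "function_call":
        args = json.loads(item.arguments)
        result = {"city": args["city"], "temp_c": 24}  # stand-in for a real lookup
        followup = client.responses.create(
            model=deployment,
            previous_response_id=resp.id,
            input=[{
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(result),
            }],
        )
        print(followup.output_text)
```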
The code is available in the following GitHub repo: https://github.com/arunacarunac/ResponsesAPI

Additional details are available in the following links: Azure OpenAI Responses API - Azure OpenAI | Microsoft Learn

PacketMind: My Take on Building a Smarter DPI Tool with Azure AI

Just wanted to share a small but meaningful project I recently put together: PacketMind. It's a lightweight Deep Packet Inspection (DPI) tool designed to help detect suspicious network traffic using Azure's AI capabilities. Honestly, this project is a personal experiment that stemmed from one simple thought: why does DPI always have to be bulky, expensive, and stuck in legacy systems?

I mean, think about it. Most of the time, we have to jump through hoops just to get basic packet inspection features, let alone advanced AI-powered traffic analysis. So I figured: let's see how far we can go by combining Azure's language models with some good old packet sniffing on Linux.

What's Next?

Let's be honest: PacketMind is an early prototype. There's a lot I'd love to add:

- GUI interface for easier use
- Custom model integration (right now it's tied to a specific Azure model)
- More protocol support, beyond HTTP/S
- Alerting features, maybe even Slack/Discord hooks

But for now, I'm keeping it simple and focusing on making the core functionality solid.

Why Share This?

You know, I could've just kept this as a side project on my machine, but sharing is part of the fun. If even one person finds PacketMind useful or gets inspired to build something similar, I'll consider it a win. So, if you're into networking, AI, or just like to mess with packet data for fun: check it out. Fork it, test it, break it, and let me know how you'd make it better.

Here's the repo: https://github.com/DrHazemAli/packetmind

Would love to hear your thoughts, suggestions, or just a thumbs up if you think it's cool. Cheers!
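For readers curious how the sniff-then-classify idea can look in practice, here is a minimal sketch. This is not PacketMind's actual code (see the repo for that); it assumes scapy for capture, an Azure OpenAI chat deployment, and placeholder endpoint, key, and deployment values:

```python
# Hedged sketch of sniff-then-classify; not the PacketMind implementation.
# Assumes: pip install scapy openai, an Azure OpenAI chat deployment,
# and root privileges for packet capture.
from scapy.all import Raw, TCP, sniff
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<api-key>",                                        # placeholder
    api_version="2024-06-01",
)

def classify(payload: str) -> str:
    # Ask the model for a coarse verdict on the captured payload.
    resp = client.chat.completions.create(
        model="<deployment-name>",  # placeholder deployment
        messages=[
            {"role": "system",
             "content": "Label this network payload BENIGN or SUSPICIOUS and give one short reason."},
            {"role": "user", "content": payload[:2000]},
        ],
    )
    return resp.choices[0].message.content

def handle(pkt):
    # Only look at TCP packets that actually carry a payload.
    if pkt.haslayer(TCP) and pkt.haslayer(Raw):
        text = bytes(pkt[Raw].load).decode("utf-8", errors="replace")
        print(classify(text))

# Capture a handful of plain-HTTP packets and classify each payload.
sniff(filter="tcp port 80", prn=handle, store=False, count=10)
```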
Using artificial intelligence to verify document compliance

Organizations of all domains and sizes are actively exploring ways to leverage artificial intelligence and infuse it into their business. There are several business challenges in which AI technologies have already made a significant impact on organizations' bottom lines; one of these is in the domain of legal document review, processing, and compliance.

Any business that regularly reviews and processes legal documents (e.g., financial services, professional services, legal firms) is inundated with both open contracts and repositories of previously executed agreements, all of which have historically been managed by humans. Though humans may bring the required domain expertise, their review of dense and lengthy legal agreements is manual, slow, and subject to human error. Efforts to modernize these operations began with documents being digitized (i.e., contracts either originating in digital form or being uploaded as PDFs post-execution). The next opportunity to innovate in the legal document domain is to process these digitized documents through AI services to extract key dates, phrases, or contract terms, and to create rules that identify outliers or flag terms and conditions for further review. As a note, humans are still involved in the document compliance process, but further down the value chain, where their ability to reason and leverage their domain expertise is required.

Whether it's a vendor agreement that must include an arbitration clause, or a loan document requiring specific disclosures, ensuring the presence of these clauses may prove vital in reducing an organization's legal exposure. With AI, we can now automate much of the required analysis and due diligence that takes place before a legal agreement is ever signed. From classical algorithms like cosine similarity to advanced reasoning using large language models (LLMs), Microsoft Azure offers powerful tools that enable AI solutions to compare documents and validate their contents.

Attached is a link to an Azure AI Document Compliance Proof of Concept-Toolkit. This repo will help you rapidly build AI-powered document compliance proof-of-concepts. It leverages multiple similarity-analysis techniques to verify that legal, financial, or other documents include all required clauses, and exposes them via a simple REST API.

Key features of the Document Compliance PoC toolkit:

- Clause Verification: detect and score the presence of required clauses in any document.
- Multi-Technique Similarity: compare documents using TF-IDF, cosine similarity over embeddings, and more.
- Modular Architecture: swap in your preferred NLP models or similarity algorithms with minimal changes.
- Extensible Examples: sample configs and test documents to help you get started in minutes.

Please note, this repo is in active development; the API and UI are not yet operational.
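To make the clause-verification idea concrete, here is a minimal sketch of the TF-IDF plus cosine-similarity technique the toolkit lists. This is an illustration, not the toolkit's own code; the clause, passages, and 0.5 threshold are made up:

```python
# Hedged sketch of clause verification via TF-IDF + cosine similarity.
# Illustrative only; the PoC toolkit's actual implementation may differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

required_clause = (
    "Any dispute arising under this agreement shall be settled by binding arbitration."
)
# Candidate passages from the document (real code would chunk a full contract).
document_passages = [
    "The parties agree to settle any dispute under this agreement through binding arbitration.",
    "Payment is due within thirty days of the invoice date.",
]

# Fit one vocabulary over clause + passages so the vectors are comparable.
vectorizer = TfidfVectorizer().fit([required_clause] + document_passages)
clause_vec = vectorizer.transform([required_clause])
passage_vecs = vectorizer.transform(document_passages)

# The highest-scoring passage is the best candidate for the required clause.
scores = cosine_similarity(clause_vec, passage_vecs)[0]
best = scores.argmax()
print(f"best match (score {scores[best]:.2f}): {document_passages[best]}")
print("clause present" if scores[best] >= 0.5 else "clause missing or too dissimilar")
```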
Learning: AgentChat Swarm

I am trying to use the AutoGen AgentChat Swarm team, specifically in a WebSocket-based application, and I am facing issues with the async setup and Swarm usage. If someone has done work in the AgentChat or Swarm domain and has some tutorial code to share, that would be of great help. Thanks!
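Not an authoritative answer, but a minimal sketch of the Swarm pattern, assuming autogen-agentchat/autogen-ext 0.4+ and an OPENAI_API_KEY in the environment (class names and signatures may differ between versions, so verify against the current AgentChat docs). Because run_stream is an async generator, the same loop can forward each event over a WebSocket instead of printing it:

```python
# Hedged sketch; verify imports and signatures against the AutoGen docs
# for your installed autogen-agentchat / autogen-ext version.
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.teams import Swarm
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")  # assumed model name

    planner = AssistantAgent(
        "planner",
        model_client=model_client,
        handoffs=["writer"],  # Swarm routes turns through explicit handoffs
        system_message="Break the task down, then hand off to writer.",
    )
    writer = AssistantAgent(
        "writer",
        model_client=model_client,
        system_message="Write the final answer.",
    )

    team = Swarm([planner, writer], termination_condition=MaxMessageTermination(6))

    # run_stream yields events as they happen; in a WebSocket handler you
    # would `await websocket.send_text(str(event))` here instead of printing.
    async for event in team.run_stream(task="Explain what a swarm team does."):
        print(event)

asyncio.run(main())
```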
Multi Model Deployment with Azure AI Foundry Serverless, Python and Container Apps

Intro

Azure AI Foundry is a comprehensive AI suite with a vast set of serverless and managed model offerings, designed to democratize AI deployment. Whether you're running a small startup or a Fortune 500 enterprise, Azure AI Foundry provides the flexibility and scalability needed to implement and manage machine learning and AI models seamlessly. By leveraging Azure's robust cloud infrastructure, you can focus on innovating and delivering value, while Azure takes care of the heavy lifting behind the scenes.

In this demonstration, we delve into building an Azure Container Apps stack. This approach allows us to deploy a web app that facilitates interaction with three powerful models: GPT-4, DeepSeek, and Phi-3. Users can select from these models for chat completions, gaining invaluable insights into their actual performance, token consumption, and overall efficiency through real-time metrics. This deployment not only showcases the versatility and robustness of Azure AI Foundry but also provides a practical framework for businesses to observe and measure AI effectiveness, paving the way for data-driven decision-making and optimized AI solutions.

Azure AI Foundry: The evolution

Azure AI Foundry represents the next evolution in Microsoft's AI offerings, building on the success of Azure AI and Cognitive Services. This unified platform is designed to streamline the development, deployment, and management of AI solutions, providing developers and enterprises with a comprehensive suite of tools and services. With Azure AI Foundry, users gain access to a robust model catalog, collaborative GenAIOps tools, and enterprise-grade security features. The platform's unified portal simplifies the AI development lifecycle, allowing seamless integration of various AI models and services, with deep insights and a fast adoption path for users. The Model Catalog lets us filter and compare models per our requirements and easily create deployments directly from the interface.

Building the Application

Before describing the methodology and the process, we have to make sure our dependencies are in place, so let's have a quick look at the prerequisites of our deployment.

GitHub repo: passadis/ai-foundry-multimodels (Azure AI Foundry multi-model utilization and performance metrics web app): https://github.com/passadis/ai-foundry-multimodels

Prerequisites

- Azure subscription
- Azure AI Foundry hub with a project in East US (the models used here are all supported in East US)
- VS Code with the Azure Resources extension

There is no need to show the Azure resource deployment steps, since there are numerous ways to do it and I have also showcased that in previous posts. In fact, it is a standard set of services to support our microservices infrastructure: Azure Container Registry, Azure Key Vault, an Azure user-assigned managed identity, an Azure Container Apps environment, and finally our Azure AI Foundry model deployments.

Frontend – Vite + React + TS

The frontend is built using Vite and React and features a dropdown menu for model selection, a text area for user input, real-time response display, as well as loading states and error handling.
Key considerations in the frontend implementation include the use of modern React patterns and hooks, ensuring a responsive design for various screen sizes, providing clear feedback for user interactions, and incorporating elegant error handling. The current implementation allows us to switch models even after we have initiated a conversation, and we can keep up to 5 messages as chat history. What makes our frontend distinctive is the performance information we get for each response: tokens, tokens per second, and total time.

Backend – Python + FastAPI

The backend is built with FastAPI and is responsible for model selection and configuration, integrating with Azure AI Foundry, processing requests and responses, and handling errors and validation. A directory structure as follows can help us organize our services and utilize the modular strengths of Python:

```
backend/
├── app/
│   ├── __init__.py
│   ├── main.py
│   ├── config.py
│   ├── api/
│   │   ├── __init__.py
│   │   └── routes.py
│   ├── models/
│   │   ├── __init__.py
│   │   └── request_models.py
│   └── services/
│       ├── __init__.py
│       └── azure_ai.py
├── run.py          # For local runs
├── Dockerfile
├── requirements.txt
└── .env
```

Azure Container Apps

A powerful combination allows us to easily integrate the frontend and backend using Dapr, since it is natively supported and integrated in Azure Container Apps:

```javascript
try {
  const response = await fetch('/api/v1/generate', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: selectedModel,
      prompt: userInput,
      parameters: {
        temperature: 0.7,
        max_tokens: 800
      }
    }),
  });
  // ... parse the JSON response and update component state
} catch (err) {
  // ... surface the error in the UI
}
```

However, we need to correctly configure NGINX to proxy the request to the Dapr sidecar, since the frontend is served from a container image:

```nginx
# API endpoints via Dapr
location /api/v1/ {
    proxy_pass http://localhost:3500/v1.0/invoke/backend/method/api/v1/;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
}
```

Azure Key Vault

As always, all our secret variables, like the API endpoints and the API keys, are stored in Key Vault. We create a Key Vault client in our backend and fetch each secret only at the moment we need it. That makes our deployment more secure and efficient.
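As a sketch of that on-demand pattern (not the repo's exact code; the vault URL and secret names are placeholders), the backend can resolve secrets lazily with the azure-identity and azure-keyvault-secrets packages:

```python
# Hedged sketch: lazy, cached secret lookup from Key Vault.
# Placeholders: vault URL and secret names; the repo's code may differ.
from functools import lru_cache

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

_client = SecretClient(
    vault_url="https://<your-vault>.vault.azure.net",  # placeholder
    # In Container Apps this resolves to the user-assigned managed identity
    # (set AZURE_CLIENT_ID if more than one identity is attached).
    credential=DefaultAzureCredential(),
)

@lru_cache(maxsize=None)
def get_secret(name: str) -> str:
    # Fetched from the vault on first use only, then cached in-process.
    return _client.get_secret(name).value

# e.g. get_secret("GPT4-API-KEY") right before calling that model's endpoint
```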
Deployment Considerations

When deploying your application:

- Set up proper environment variables
- Configure CORS settings appropriately
- Implement monitoring and logging
- Set up appropriate scaling policies

Azure AI Foundry: Multi Model Architecture

The solution is built on Azure Container Apps for serverless scalability. The frontend and backend containers are hosted in Azure Container Registry and deployed to Container Apps with Dapr integration for service-to-service communication. Azure Key Vault manages sensitive configurations like API keys through a user-assigned managed identity. The backend connects to three Azure AI Foundry models (DeepSeek, GPT-4, and Phi-3), each with its own endpoint and configuration. This serverless architecture ensures high availability, secure secret management, and efficient model interaction while maintaining cost efficiency through consumption-based pricing.

Conclusion

This Azure AI Foundry models demo showcases the power of serverless AI integration in modern web applications. By leveraging Azure Container Apps, Dapr, and Azure Key Vault, we've created a secure, scalable, and cost-effective solution for AI model comparison and interaction. The project demonstrates how different AI models can be effectively compared and utilized, providing insights into their unique strengths and performance characteristics. Whether you're a developer exploring AI capabilities, an architect designing AI solutions, or a business evaluating AI models, this demo offers practical insights into Azure's AI infrastructure and serverless computing potential.

References

- Azure AI Foundry
- Azure Container Apps
- Azure AI – Documentation
- AI learning hub
- CloudBlogger: Text To Speech with Containers
How to create your personal AI powered Email Assistant

Crafting an AI Powered Email Assistant with Semantic Kernel and Neon Serverless PostgreSQL

Intro

In the realm of artificial intelligence, crafting applications that seamlessly blend advanced capabilities with user-friendly design is no small feat. Today, we take you behind the scenes of building an AI-powered email assistant, a project that leverages Semantic Kernel for embedding generation and indexing, Neon PostgreSQL for vector storage, and the Azure OpenAI API for generative AI capabilities. This blog post is a practical guide to implementing a powerful AI-driven solution from scratch.

The Vision

Our AI-powered email assistant is designed to:

- Draft emails automatically using input prompts.
- Enable easy approval, editing, and sending via Microsoft Graph API.
- Create and store embeddings of draft and sent emails in a Neon serverless PostgreSQL database.
- Provide a search feature to retrieve similar emails based on contextual embeddings.

This application combines cutting-edge AI technologies and modern web development practices, offering a seamless user experience for drafting and managing emails.

The Core Technologies of our AI Powered Email Assistant

1. Semantic Kernel

Semantic Kernel simplifies the integration of AI services into applications. It provides robust tools for text generation, embedding creation, and memory management. For our project, Semantic Kernel acts as the foundation for generating email drafts via Azure OpenAI and creating embeddings for storing and retrieving contextual data.

2. Vector Indexing with Neon PostgreSQL

Neon, a serverless PostgreSQL solution, allows seamless storage and retrieval of embeddings using the pgvector extension. Its serverless nature ensures scalability and reliability, making it perfect for real-time AI applications.

3. Azure OpenAI API

With Azure OpenAI, the project harnesses models like gpt-4 and text-embedding-ada-002 for generative text and embedding creation. These APIs offer unparalleled flexibility and power for building AI-driven workflows.

How We Built our AI Powered Email Assistant

Step 1: Frontend – A React-Based Interface

The frontend, built in React, provides users with a sleek interface to:

- Input recipient details, subject, and email description.
- Generate email drafts with a single click.
- Approve, edit, and send emails directly.

We incorporated a loading spinner to enhance user feedback, and search functionality for retrieving similar emails.

Key features:

- State management: for handling draft generation and email sending.
- API integration: React fetch calls connect seamlessly to backend APIs.
- Dynamic UI: a real-time experience for generating and reviewing drafts.

Step 2: Backend – ASP.NET Core with Semantic Kernel

The backend, powered by ASP.NET Core, uses Semantic Kernel for AI services and Neon for vector indexing. Key backend components include:

Semantic Kernel services:

- Text embedding generation: uses Azure OpenAI's text-embedding-ada-002 to create embeddings for email content.
- Draft generation: the assistant creates email drafts based on user inputs using the Azure OpenAI gpt-4 model (OpenAI skill):

```csharp
public async Task<string> GenerateEmailDraftAsync(string subject, string description)
{
    try
    {
        var chatCompletionService = _kernel.GetRequiredService<IChatCompletionService>();

        var message = new ChatMessageContent(
            AuthorRole.User,
            $"Draft a professional email with the following details:\nSubject: {subject}\nDescription: {description}"
        );

        var result = await chatCompletionService.GetChatMessageContentAsync(message.Content ?? string.Empty);
        return result?.Content ?? string.Empty;
    }
    catch (Exception ex)
    {
        throw new Exception($"Error generating email draft: {ex.Message}", ex);
    }
}
```

Vector indexing with Neon:

- Embedding storage: stores embeddings in Neon using the pgvector extension.
- Contextual search: retrieves similar emails by calculating vector similarity.
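The project's backend implements this search in C#, but as a language-agnostic sketch of what that similarity query looks like (hedged: placeholder connection string, a stand-in query vector, and the repo's actual SQL may differ), here it is from Python with psycopg2:

```python
# Hedged sketch of a pgvector similarity search on the embeddings table.
# Placeholders: connection string and query vector; the repo's code may differ.
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@<neon-host>/neondb")  # placeholder
query_embedding = [0.01] * 1536  # stand-in for a text-embedding-ada-002 vector
vector_literal = "[" + ",".join(map(str, query_embedding)) + "]"

with conn, conn.cursor() as cur:
    # "<=>" is pgvector's cosine-distance operator; smaller means more similar.
    cur.execute(
        """
        SELECT subject, content, embedding <=> %s::vector AS distance
        FROM embeddings
        ORDER BY distance
        LIMIT 5
        """,
        (vector_literal,),
    )
    for subject, content, distance in cur.fetchall():
        print(f"{distance:.4f}  {subject}")
```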
Email sending via Microsoft Graph:

- Enables sending emails directly through an authenticated Microsoft Graph API integration.

Key backend features:

- Middleware for PIN authentication: adds a secure layer so that only authorized users access the application.
- CORS policies: allow safe frontend-backend communication.
- Swagger documentation: simplifies API testing during development.

Step 3: Integration with Neon

The pgvector extension in Neon PostgreSQL facilitates efficient vector storage and similarity search. Here's how we integrated Neon into the project:

- Table design: a dedicated table for embeddings, with columns for subject, content, type, embedding, and created_at. The type column holds one of two values, draft or sent, in case users want to revisit previous unsent drafts.
- Index optimization: tuning the index can save a lot of time and effort before performance issues ever appear:

```sql
CREATE INDEX ON embeddings USING ivfflat (embedding) WITH (lists = 100);
```

- Search implementation: SQL queries with vector operations find the most relevant embeddings.
- Enhanced serverless out of the box: even the free SKU offers a read replica and autoscaling, making it enterprise-ready.

Why This Approach Stands Out

- Efficiency: by storing embeddings instead of just raw data, the system maintains privacy while enabling rich contextual searches.
- Scalability: leveraging Neon's serverless capabilities (autoscale is enabled) ensures that the application can grow without bottlenecks.
- User-centric design: the combination of React's dynamic frontend and Semantic Kernel's advanced AI delivers a polished user experience.

Prerequisites

- Azure account with OpenAI access
- Microsoft 365 developer account
- Neon PostgreSQL account
- .NET 8 SDK
- Node.js and npm
- Visual Studio Code or Visual Studio 2022

Step 1: Setting Up Azure Resources

Azure OpenAI setup:

- Create an Azure OpenAI resource.
- Deploy two models: gpt-4 for text generation and text-embedding-ada-002 for embeddings.
- Note down the endpoint and API key.

Entra ID app registration:

- Create a new app registration.
- Required API permissions: Microsoft Graph Mail.Send (Application) and Mail.ReadWrite (Application).
- Generate a client secret.
- Note down the client ID and tenant ID.

Step 2: Database Setup

Neon PostgreSQL:

- Create a new project.
- Create a database.
- Enable the pgvector extension.
- Save the connection string.

Step 3: Backend Implementation (.NET)

Project structure:

```
/Controllers
  - EmailController.cs (handles email operations)
  - HomeController.cs (root routing)
  - VectorSearchController.cs (similarity search)
/Services
  - EmailService.cs (Graph API integration)
  - SemanticKernelService.cs (AI operations)
  - VectorSearchService.cs (embedding operations)
  - OpenAISkill.cs (email generation)
```

Key components:

- SemanticKernelService: initializes Semantic Kernel, manages AI model connections, and handles prompt engineering.
- EmailService: Microsoft Graph API integration, email sending functionality, and authentication management.
- VectorSearchService: generates embeddings, manages vector storage, and performs similarity searches.

Step 4: Configuration

Create a new .NET project with:

```
dotnet new webapi -n SemanticKernelEmailAssistant
```

Configure appsettings.json for your connections.
Install Semantic Kernel (look into SemanticKernelEmailAssistant.csproj for all packages and versions; versions are important!). When all of your files are complete, you can execute:

```
dotnet build && dotnet publish -c Release
```

To test locally, simply run:

```
dotnet run
```

Step 5: React Frontend

- Start a new React app with: npx create-react-app ai-email-assistant
- Change directory into the newly created app.
- Copy all files from Git and run npm install.
- Initialize Tailwind with npx tailwindcss init (if you see any related errors).

Step 6: Deploy to Azure

Both our apps are containerized with Docker, so pay attention to get the Dockerfile for each app. Build, tag, and push the backend:

```
docker build -t backend .
docker tag backend {acrname}.azurecr.io/backend:v1
docker push {acrname}.azurecr.io/backend:v1
```

The same applies for the frontend. Make sure to log in to Azure Container Registry with:

```
az acr login --name $(az acr list -g myresourcegroup --query "[].{name: name}" -o tsv)
```

We will then be able to see our new repos on Azure Container Registry and deploy our web apps.

Troubleshooting and Maintenance

Backend issues:

- Use Swagger (/docs) for API testing and debugging.
- Check Azure Key Vault for PIN and credential updates.

Embedding errors:

- Ensure pgvector is correctly configured in Neon PostgreSQL.
- Verify the Azure OpenAI API key and endpoint are correct.

Frontend errors:

- Use browser dev tools to debug fetch requests.
- Ensure environment variables are correctly set during build and runtime.

Conclusion

In today's rapidly evolving tech landscape, building an AI-powered application is no longer a daunting task, thanks largely to technologies like Semantic Kernel, Neon PostgreSQL, and Azure OpenAI. This project demonstrates how these tools can work together to deliver a robust, scalable, and user-friendly solution: Semantic Kernel streamlines AI orchestration and prompt management, Neon PostgreSQL provides serverless database capabilities that scale with your application's needs, and Azure OpenAI's API and language models ensure high-quality responses and content generation. Whether you're developing a customer service bot, a content creation tool, or a data analysis platform, this stack offers the flexibility and power to bring your ideas to life. If you're ready to create your own AI application, the combination of Semantic Kernel and Neon is an ideal starting point: a foundation that balances sophisticated functionality with straightforward implementation, and scales as your project grows.

References:

- Semantic Kernel
- Vector Store Embeddings
- NEON Project
How to process multiple receipts on one scan

Dear Community,

I am building a simple receipt recognizer solution, and I face an issue when users fill a full A4 page with receipts. As far as I understand, the built-in receipt model does not work with multiple items on one page. I tried a few libraries to process these scans with Python (like cv2), but in general edge detection does not work, because often there is no visible edge: receipts are rotated or partly pushed under each other. Is there any (AI) solution that can help me first extract the distinct receipts before feeding them to Document Intelligence? I don't need the text at this point, only the receipts as images.

Thanks, vm