Azure AI Search

Pantone’s Palette Generator enhances creative exploration with agentic AI on Azure
Color can be powerful. When creative professionals shape the mood and direction of their work, color plays a vital role because it provides context and cues for the end product or creation. For more than 60 years, creatives from all areas of design—including fashion, product, and digital—have turned to Pantone color guides to translate inspiration into precise, reproducible color choices. These guides offer a shared language for colors, as well as inspiration and communication across industries. Once rooted in physical tools, Pantone has evolved to meet the needs of modern creators through its trend forecasting, consulting services, and digital platform.

Today, Pantone Connect and its multi-agent solution called the Pantone Palette Generator seamlessly bring color inspiration and accuracy into everyday design workflows (as well as the New York City mayoral race). Simply by typing in a prompt, designers can generate palettes in seconds. Available in Pantone Connect, the tool uses Azure services like Microsoft Foundry, Azure AI Search, and Azure Cosmos DB to serve up the company’s vast collection of trend and color research from the color experts at the Pantone Color Institute. “…reached in seconds instead of days. Now, with Microsoft Foundry, creatives can use agents to get instant color palettes and suggestions based on human insights and trend direction.”

Turning Pantone’s color legacy into an AI offering

The Palette Generator accelerates the process of researching colors and helps designers find inspiration or validate some of their ideas through trend-backed research. “Pantone wants to be where our customers are,” says Rohani Jotshi, Director of Software Engineering and Data at Pantone. “As workflows become increasingly digital, we wanted to give our customers a way to find inspiration while keeping the same level of accuracy and trust they expect from Pantone.” The Palette Generator taps into thousands of articles from Pantone’s Color Insider library, as well as trend guides and physical color books, in a way that preserves the company’s color standards science while streamlining the creative process. Built entirely on Microsoft Foundry, the solution uses Azure AI Search for agentic retrieval-augmented generation (RAG) and Azure OpenAI in Foundry Models to reason over the data. It quickly serves up palette options in response to questions like “Show me soft pastels for an eco-friendly line of baby clothes” or “I want to see vibrant metallics for next spring.”

Over the course of two months, the Pantone team built the initial proof of concept for the Palette Generator, using GitHub Copilot to streamline the process and save over 200 hours of work across multiple sprints. This allowed Pantone’s engineers to focus on improving prompt engineering, adding new agent capabilities, and refining orchestration logic rather than writing repetitive code.

Building a multi-agent architecture that accelerates creativity

The Pantone team worked with Microsoft to develop the multi-agent architecture, which is made up of three connected agents. Using Microsoft Agent Framework—an open source development kit for building AI orchestration systems—it was a straightforward process to bring the agents together into one workflow. “The Microsoft team recommended Microsoft Agent Framework and when we tried it, we saw how it was extremely fast and easy to create architectural patterns,” says Kristijan Risteski, Solutions Architect at Pantone. “With Microsoft Agent Framework, we can spin up a model in five lines of code to connect our agents.”
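The article does not include Pantone's source code, but the three-agent pattern it goes on to describe (an orchestrator that delegates to a retrieval agent and a color-science agent) can be sketched in plain Python without committing to any framework API. Every name below is hypothetical, not Pantone's implementation:

```python
# Minimal framework-agnostic sketch of the orchestrator pattern; all names hypothetical.
from dataclasses import dataclass


@dataclass
class PaletteRequest:
    prompt: str  # e.g. "soft pastels for an eco-friendly line of baby clothes"


def retrieval_agent(prompt: str) -> list[dict]:
    """Fetch trend/color-psychology passages (vector search in the real system)."""
    return [{"source": "Color Insider", "text": "Pastels signal calm and renewal."}]


def color_science_agent(insights: list[dict]) -> list[str]:
    """Apply harmony/contrast/accessibility rules to assemble a balanced palette."""
    return ["PANTONE placeholder 1", "PANTONE placeholder 2"]  # placeholder values


def orchestrator(request: PaletteRequest) -> list[str]:
    # Route the prompt through the specialized agents in sequence.
    insights = retrieval_agent(request.prompt)
    return color_science_agent(insights)


print(orchestrator(PaletteRequest("soft pastels for an eco-friendly baby line")))
```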
When a user types in a question, they interact with an orchestrator agent that routes prompts and coordinates the more specialized agents. Behind the scenes, an additional agent retrieves contextually relevant insights from Pantone’s proprietary Color Insider dataset. Using Azure AI Search with vectorized data indexing, this agent interprets the semantics of a user’s query rather than relying solely on keywords. A third agent then applies rules from color science to assemble a balanced palette. This agent ensures the output is a color combination that meets harmony, contrast, and accessibility standards. The result is a set of Pantone-curated colors that match the emotional and aesthetic tone of the request. “All of this happens in seconds,” says Risteski.

To manage conversation flow and achieve long-term data persistence, Pantone uses Azure Cosmos DB, which stores user sessions, prompts, and results. The database not only enables designers to revisit past palette explorations but also provides Pantone with valuable usage intelligence to refine the system over time. “We use Azure Cosmos DB to track inputs and outputs,” says Risteski. “That data helps us fine-tune prompts, measure engagement, and plan how we’ll train future models.”

Improving accuracy and performance with Azure AI Search

With Azure AI Search, the Palette Generator can understand the nuance of color language. Instead of relying solely on keyword searches that might miss the complexity of words like “vibrant” or “muted,” Pantone’s team decided to use a vectorized index for more accurate palette results. Using the built-in vectorization capability of Azure AI Search, the team converted their color knowledge base—including text-based color psychology and trend articles—into numerical embeddings. “Overall, vector search gave us better results because it could understand the intent of the prompt, not just the words,” says Risteski. “If someone types, ‘Show me colors that feel serene and oceanic,’ the system understands intent. It finds the right references across our color psychology and trend archives and delivers them instantly.”

The team also found ways to reduce latency as they evolved their proof of concept. Initially, they encountered slow inference times and performance lags when retrieving search results. Switching from GPT-4.1 to GPT-5 improved latency, and using Azure AI Search to manage ranking and filtering of results reduced the number of calls to the large language model (LLM). “With Azure, we just get the articles, put them in a bucket, and say ‘index it now,’” says Risteski. “It takes one or two minutes—and that’s it. The results are so much better than traditional search.”
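For readers who want to try the same pattern, a query against an index that uses Azure AI Search integrated vectorization looks roughly like this with the azure-search-documents Python SDK. The endpoint, index name, and field names here are placeholders, not Pantone's:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizableTextQuery

# Hypothetical index of color-psychology and trend articles.
client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",
    index_name="color-insights",
    credential=AzureKeyCredential("<query-key>"),
)

# With integrated vectorization, the service embeds the query text itself;
# no client-side embedding call is needed.
results = client.search(
    search_text="colors that feel serene and oceanic",  # keeps keyword recall
    vector_queries=[
        VectorizableTextQuery(
            text="colors that feel serene and oceanic",
            k_nearest_neighbors=5,
            fields="content_vector",  # hypothetical vector field name
        )
    ],
    top=5,
)
for doc in results:
    print(doc["title"])  # hypothetical field
```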
Moving from inspiration to palettes faster

The Palette Generator has transformed how designers and color enthusiasts interact with Pantone’s expertise. What once took weeks of research and review can now be done in seconds. “Typically, if someone wanted to develop a palette for a product launch, it might take many months of research,” says Jotshi. “Now, they can type one sentence to describe their inspiration and then immediately find Pantone-backed insight and options. Human curation will still be hugely important, but a strong set of starting options can significantly accelerate the palette development process.”

Expanding the palette: The next phase for Pantone’s design agent

Rapidly launching the Palette Generator in beta has redefined what the Pantone engineering team thought was possible. “We’re a small development team, but with Azure we built an enterprise-grade AI system in a matter of weeks,” says Risteski. “That’s a huge win for us.” Next up, the team plans to migrate the entire orchestration layer to Azure Functions, moving to a fully scalable, serverless deployment. This will allow Pantone to run its agents more efficiently, handle variable workloads automatically, and integrate seamlessly with other Azure products such as Microsoft Foundry and Azure Cosmos DB. At the same time, Pantone plans to expand its multi-agent system with new specialized agents, including one focused on palette harmony and another focused on trend prediction.

Foundry IQ: Unlocking ubiquitous knowledge for agents
Introducing Foundry IQ by Azure AI Search in Microsoft Foundry. Foundry IQ is a centralized knowledge layer that connects agents to data with the next generation of retrieval-augmented generation (RAG). Foundry IQ includes the following features:

- Knowledge bases: Available directly in the new Foundry portal, knowledge bases are reusable, topic-centric collections that ground multiple agents and applications through a single API.
- Automated indexed and federated knowledge sources: Expand what data an agent can reach by connecting to both indexed and remote knowledge sources. For indexed sources, Foundry IQ delivers automatic indexing, vectorization, and enrichment for text, images, and complex documents.
- Agentic retrieval engine in knowledge bases: A self-reflective query engine that uses AI to plan, select sources, search, rank, and synthesize answers across sources with configurable “retrieval reasoning effort.”
- Enterprise-grade security and governance: Support for document-level access control, alignment with existing permissions models, and options for both indexed and remote data.

Foundry IQ is available in public preview through the new Foundry portal and the Azure portal with Azure AI Search. Foundry IQ is part of Microsoft's intelligence layer alongside Fabric IQ and Work IQ.

Foundry IQ: boost response relevance by 36% with agentic retrieval
The latest RAG performance evaluations and results for knowledge bases and the built-in agentic retrieval engine. Foundry IQ by Azure AI Search is a unified knowledge layer for agents, designed to improve response performance, automate RAG workflows, and enable enterprise-ready grounding. These evaluations tested RAG performance for knowledge bases and new features, including retrieval reasoning effort and federated sources like web and SharePoint for M365. Foundry IQ and Azure AI Search are part of Microsoft Foundry.

Push method for Azure AI Search
You may be aware that you can build indexes in Azure AI Search by pulling data from known data sources, like Azure Blob storage, SQL Server, OneLake, etc. When using the ‘pull’ method, the built-in indexers run either on a schedule you define or on demand, at intervals of no less than five minutes. What you may not realize is that there is an alternative way to send data to an index: the ‘push’ method. With this approach, the search client can push data to AI Search for initial data ingestion, incremental updates, or deletions.

Push vs. Pull: Which Approach Fits Your Scenario?

Both Push and Pull methods in Azure AI Search are powerful ways to load data into an index. Each has its strengths, and the right choice depends on your requirements. Here’s how they compare:

| Feature | Push Model (APIs) | Pull Model (Indexers) |
|---|---|---|
| Control & Flexibility | Full control over timing, batch size, and operations (upload, merge, delete). | Limited to scheduled runs and indexer configuration. |
| Latency | Near real-time updates—trigger indexing as often as needed. | Dependent on scheduled polling intervals or on-demand runs. |
| Data Source Support | Works with any source that can produce JSON matching your schema. | Limited to supported connectors (Azure Blob, SQL, SharePoint, etc.). |
| Parallelism | Customizable—design your own pipeline for concurrency and throughput. | Managed internally by the indexer; less granular control. |
| AI Enrichment | Requires custom implementation if needed. | Built-in skillsets for enrichment and integrated vectorization. |
| Best For | Higher indexing performance and more complex orchestration; demanding indexing timelines. | Rapid setup and automation where indexing frequency and performance are less critical. |

Reference this article for more information about the Push mechanism: Data import and data ingestion - Azure AI Search | Microsoft Learn

The push model uses APIs to upload documents into an existing search index. You can upload documents individually or in batches of up to 1,000 documents or 16 MB per batch, whichever limit comes first.

Step-by-Step: Pushing Data to Your Index

Here’s an example of how to use the REST API POST method to ‘push’ new content to an existing index, and of the CURL command to search the new content. Let’s say you have an existing search service: https://xxxxxxxxxxxxxxxx.search.windows.net

Step 1. Get the search service URL from the Azure portal.
Step 2. Get the index name from the AI Search service.
Step 3. Get the API key from the AI Search service.
Step 4. Get the index fields.
Step 5. Use the POST command to ‘push’ the new content to the existing index. When pushing data to the index, you need to specify the field names. The original example inserts two new chunks with document key ‘chunk_id’ and values ‘chunk-003’ and ‘chunk-004’ (a reconstruction follows at the end of this section).
Step 6. Verify the new content is searchable in the index. Since the new documents were inserted as chunks, you can search by using keywords with a CURL-style command in PowerShell (also reconstructed below).

If you have any feedback or questions about this article and how this is useful to you, don’t hesitate to reach out. What’s next? Learn how to push embeddings into an AI Search index for vector search. Set a vectorizer to automatically embed text queries.
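The POST example from Step 5 and the PowerShell search command from Step 6 appeared as screenshots in the original post. A reconstruction is sketched below; the index name, the chunk_content field, and the API version are assumptions, while chunk_id, chunk-003, and chunk-004 come from the text above:

```powershell
$service   = "xxxxxxxxxxxxxxxx"   # your search service name
$indexName = "my-index"           # placeholder index name
$headers   = @{ 'api-key' = 'YOUR-ADMIN-KEY'; 'Content-Type' = 'application/json' }

# Step 5: push two new chunk documents (document key: chunk_id)
$body = @{
    value = @(
        @{ '@search.action' = 'upload'; chunk_id = 'chunk-003'; chunk_content = 'First new chunk of text.' },
        @{ '@search.action' = 'upload'; chunk_id = 'chunk-004'; chunk_content = 'Second new chunk of text.' }
    )
} | ConvertTo-Json -Depth 5

Invoke-RestMethod `
    -Uri "https://$service.search.windows.net/indexes/$indexName/docs/index?api-version=2024-07-01" `
    -Method POST -Headers $headers -Body $body

# Step 6: verify the new content is searchable by keyword
Invoke-RestMethod `
    -Uri "https://$service.search.windows.net/indexes/$indexName/docs?search=chunk&api-version=2024-07-01" `
    -Headers $headers
```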
Simplify Search Development with the New Azure AI Search Wizard

Azure AI Search has introduced the new “Import Data” wizard—a unified, modernized experience that streamlines index creation across keyword, RAG, and multimodal RAG workflows. By merging the legacy keyword search wizard with the vectorization flow used for advanced AI scenarios, this update simplifies how users connect to data sources, configure skillsets, and build query-ready indexes. The new wizard supports semantic ranking, integrated vectorization, and multimodal enrichment, with expanded connector options like Azure Queues, OneDrive for Business, and SharePoint Online via Logic Apps. During the phased rollout, both the classic and new wizards will coexist, but users are encouraged to switch early to take advantage of enhanced capabilities and prepare for the eventual retirement of the legacy experience. Whether you're building traditional search or intelligent retrieval systems, the new wizard offers a faster, more intuitive path to production-ready indexes.

The Future of AI: From Noise to Insight - An AI Agent for Customer Feedback
This post explores how Microsoft’s AI Futures team built a multi-agent system to transform scattered customer feedback into actionable insights. The solution aggregates feedback from multiple channels, uses advanced language models to cluster themes, summarize content, and identify sentiment, and delivers prioritized insights directly in Microsoft Teams. With human-in-the-loop safeguards, the system accelerates triage, prioritization, and follow-ups while maintaining compliance and traceability. Future enhancements include richer automation, trend visualization, and expanded feedback sources.

BYOPI - Design your own custom private AI Search indexer with no code ADF (SQLServer on private VM)
Executive Summary

Building a fully private search indexing solution using Azure Data Factory (ADF) to sync SQL Server data from a private VM to Azure AI Search is achievable but comes with notable complexities and limitations. This blog shares my journey, discoveries, and honest assessment of the BYOPI (Build Your Own Private Indexer) architecture.

Architectural flow:

Table of Contents
1. Overall Setup
2. How ADF works in this approach with Azure AI Search
3. Challenges - discovered
4. Pros and Cons: An Honest Assessment
5. Conclusion and Recommendations

1. Overall Setup:

Phase 1: Resource Group & Network Setup: create a resource group and vNET (virtual network) in any region of your choice.

Phase 2: Deploy the SQL Server VM.

Phase 3: Create the Azure services - ADF (Azure Data Factory), Azure AI Search, and AKV (Azure Key Vault) - from the portal or with your choice of deployment method.

Phase 4: Create private endpoints for all the services in their dedicated subnets.

Phase 5: Configure SQL Server on the VM: connect to the VM via Bastion and set up the database, tables, and stored procedure. Sample metadata used as below:

```sql
CREATE DATABASE BYOPI_DB;
GO
USE BYOPI_DB;
GO
CREATE TABLE Products (
    ProductId INT IDENTITY(1,1) PRIMARY KEY,
    ProductName NVARCHAR(200) NOT NULL,
    Description NVARCHAR(MAX),
    Category NVARCHAR(100),
    Price DECIMAL(10,2),
    InStock BIT DEFAULT 1,
    Tags NVARCHAR(500),
    IsDeleted BIT DEFAULT 0,
    CreatedDate DATETIME DEFAULT GETDATE(),
    ModifiedDate DATETIME DEFAULT GETDATE()
);

CREATE TABLE WatermarkTable (
    TableName NVARCHAR(100) PRIMARY KEY,
    WatermarkValue DATETIME
);

INSERT INTO WatermarkTable VALUES ('Products', '2024-01-01');
GO

CREATE PROCEDURE sp_update_watermark
    @TableName NVARCHAR(100),
    @NewWatermark DATETIME
AS
BEGIN
    UPDATE WatermarkTable
    SET WatermarkValue = @NewWatermark
    WHERE TableName = @TableName;
END;
GO

INSERT INTO Products (ProductName, Description, Category, Price, Tags) VALUES
('Laptop Pro', 'High-end laptop', 'Electronics', 1299.99, 'laptop,computer'),
('Office Desk', 'Adjustable desk', 'Furniture', 599.99, 'desk,office'),
('Wireless Mouse', 'Bluetooth mouse', 'Electronics', 29.99, 'mouse,wireless');
```

Phase 6: Install the Self-Hosted Integration Runtime

Create the SHIR in ADF:
1. Go to the ADF resource in the Azure portal.
2. Click "Open Azure Data Factory Studio". Note: you need to access it from a VM in the same VNet or via VPN, since ADF is private.
3. In ADF Studio, click Manage (toolbox icon).
4. Select Integration runtimes → "+ New".
5. Select "Azure, Self-Hosted" → "Self-Hosted".
6. Name: SHIR-BYOPI (or a name of your choice).
7. Click "Create".
8. Copy Key1 (save it).

Install the SHIR on the VM (via Bastion):
1. Open a browser and go to: https://www.microsoft.com/download/details.aspx?id=39717
2. Download and install the Integration Runtime.
3. During setup, launch Configuration Manager, paste the Key1 you saved, and click "Register".
4. Wait for "Connected" status.

Phase 7: Create the search index through the PowerShell script below, saving it as search_index.ps1:

```powershell
$searchService = "search-byopi"
$apiKey = "YOUR-ADMIN-KEY"
$headers = @{
    'api-key' = $apiKey
    'Content-Type' = 'application/json'
}
$index = @{
    name = "products-index"
    fields = @(
        @{name="id"; type="Edm.String"; key=$true}
        @{name="productName"; type="Edm.String"; searchable=$true}
        @{name="description"; type="Edm.String"; searchable=$true}
        @{name="category"; type="Edm.String"; filterable=$true; facetable=$true}
        @{name="price"; type="Edm.Double"; filterable=$true}
        @{name="inStock"; type="Edm.Boolean"; filterable=$true}
        @{name="tags"; type="Collection(Edm.String)"; searchable=$true}
    )
} | ConvertTo-Json -Depth 10

Invoke-RestMethod `
    -Uri "https://$searchService.search.windows.net/indexes/products-index?api-version=2020-06-30" `
    -Method PUT `
    -Headers $headers `
    -Body $index
```
"https://$searchService.search.windows.net/indexes/products-index?api-version=2020-06-30" ` -Method PUT ` -Headers $headers ` -Body $index Phase 8: Configure AKV & ADF Components - Link AKV and ADF for secrets Create Key Vault Secrets Navigate to kv-byopi (created AKV resource) in Portal Go to "Access policies" Click "+ Create" Select permissions: Get, List for secrets Select principal: adf-byopi-private Create Go to "Secrets" → "+ Generate/Import": Name: sql-password, Value: <> Name: search-api-key, Value: Your search key Create Linked Services in ADF Access ADF Studio from the VM (since it's private): Key Vault Linked Service: Manage → Linked services → "+ New" Search "Azure Key Vault" Configure: Name: LS_KeyVault Azure Key Vault: kv-byopi Integration runtime: AutoResolveIntegrationRuntime Test connection → Create SQL Server Linked Service: "+ New" → "SQL Server" Configure: Name: LS_SqlServer Connect via: SHIR-BYOPI Server name: localhost Database: BYOPI_DB Authentication: SQL Authentication User: sqladmin Password: Select from Key Vault → LS_KeyVault → sql-password Test → Create Azure Search Linked Service: "+ New" → "Azure Search" Configure: Name: LS_AzureSearch URL: https://search-byopi.search.windows.net Connect via: SHIR-BYOPI - Important - use SHIR API Key: From Key Vault → LS_KeyVault → search-api-key Test → Create Phase 9: Create ADF Datasets and PipelineCreate Datasets SQL Products Dataset: Author → Datasets → "+" → "New dataset" Select "SQL Server" → Continue Select "Table" → Continue Properties: Name: DS_SQL_Products Linked service: LS_SqlServer Table: Select Products click OK Watermark Dataset: Repeat with: Name: DS_SQL_Watermark Table: WatermarkTable Search Dataset: "+" → "Azure Search" Properties: Name: DS_Search_Index Linked service: LS_AzureSearch Index name: products-index Create Pipeline Author → Pipelines → "+" → "Pipeline" Name: PL_BYOPI_Private From Activities → General, drag "Lookup" activity Configure Lookup 1: Name: LookupOldWatermark Settings: Source dataset: DS_SQL_Watermark Query: below sql SELECT WatermarkValue FROM WatermarkTable WHERE TableName='Products' - **First row only**: ✓ Add another Lookup: Name: LookupNewWatermark Query: below sql SELECT MAX(ModifiedDate) as NewWatermark FROM Products Add Copy Data activity: Name: CopyToSearchIndex Source: Dataset: DS_SQL_Products Query: sql SELECT CAST(ProductId AS NVARCHAR(50)) as id, ProductName as productName, Description as description, Category as category, Price as price, InStock as inStock, Tags as tags, CASE WHEN IsDeleted = 1 THEN 'delete' ELSE 'upload' END as [@search.action] FROM Products WHERE ModifiedDate > '@{activity('LookupOldWatermark').output.firstRow.WatermarkValue}' AND ModifiedDate <= '@{activity('LookupNewWatermark').output.firstRow.NewWatermark}' Sink: Dataset: DS_Search_Index Write behavior: Merge Batch size: 1000 Add Stored Procedure activity: Name: UpdateWatermark SQL Account: LS_SqlServer Stored procedure: sp_update_watermark Parameters: TableName: Products NewWatermark: @{activity('LookupNewWatermark').output.firstRow.NewWatermark} Connect activities with success conditions Phase 10: Test and Schedule Test Pipeline Click "Debug" in pipeline Monitor in Output panel Check for green checkmarks Create Trigger In pipeline, click "Add trigger" → "New/Edit" Click "+ New" Configure: Name: TR_Hourly Type: Schedule Recurrence: Every 1 Hour OK → Publish All Monitor Go to Monitor tab View Pipeline runs Check Trigger runs Your pipeline should look like this: Phase 11: Validation & Testing Verify 
Phase 11: Validation & Testing

Verify private connectivity. From the VM, run PowerShell:

```powershell
# Test DNS resolution (should return private IPs)
nslookup adf-byopi-private.datafactory.azure.net   # should show a private IP like 10.0.2.x
nslookup search-byopi.search.windows.net           # should show a private IP like 10.0.2.x
nslookup kv-byopi.vault.azure.net                  # should show a private IP like 10.0.2.x

# Test Search
$headers = @{ 'api-key' = 'YOUR-KEY' }
Invoke-RestMethod -Uri "https://search-byopi.search.windows.net/indexes/products-index/docs?`$count=true&api-version=2020-06-30" -Headers $headers
```

Test data sync by adding a few records and verifying them in the search index:

```sql
-- Add test record
INSERT INTO Products (ProductName, Description, Category, Price, Tags)
VALUES ('Test Product Private', 'Testing private pipeline', 'Test', 199.99, 'test,private');

-- Trigger pipeline manually or wait for schedule
-- Then verify in Search index
```

2. How ADF works in this approach with Azure AI Search:

Azure AI Search uses a REST API for indexing (also called uploading). When the ADF sink uploads data to AI Search, it's actually making HTTP POST requests, for example:

```http
POST https://search-byopi.search.windows.net/indexes/products-index/docs/index?api-version=2020-06-30
Content-Type: application/json
api-key: YOUR-ADMIN-KEY

{
  "value": [
    {
      "@search.action": "upload",
      "id": "1",
      "productName": "Laptop",
      "price": 999.99
    },
    {
      "@search.action": "delete",
      "id": "2"
    }
  ]
}
```

The delete action used here is a soft delete, not a hard delete.

The pipeline query:

```sql
SELECT
    CAST(ProductId AS NVARCHAR(50)) as id,   -- Renamed to match index field
    ProductName as productName,              -- Renamed to match index field
    Description as description,
    Category as category,
    Price as price,
    InStock as inStock,
    Tags as tags,
    CASE WHEN IsDeleted = 1 THEN 'delete' ELSE 'upload' END as [@search.action]  -- Special field with @ prefix
FROM Products
WHERE ModifiedDate > '2024-01-01'
```

returns this resultset:

```
id | productName  | description     | category    | price | inStock | tags            | @search.action
---|--------------|-----------------|-------------|-------|---------|-----------------|---------------
1  | Laptop Pro   | High-end laptop | Electronics | 1299  | 1       | laptop,computer | upload
2  | Office Chair | Ergonomic chair | Furniture   | 399   | 1       | chair,office    | upload
3  | Deleted Item | Old product     | Archive     | 0     | 0       | old             | delete
```

The @search.action Field - The Magic Control

This special field tells Azure AI Search what to do with each document:

| @search.action | What It Does | When to Use | What Happens If Document... |
|---|---|---|---|
| upload | Insert OR update | Most common - upsert operation | Exists: updates it. Doesn't exist: creates it |
| merge | Update only | When you know it exists | Exists: updates specified fields. Doesn't exist: ERROR |
| mergeOrUpload | Update OR insert | Safe update | Exists: updates fields. Doesn't exist: creates it |
| delete | Remove from index | To remove documents | Exists: deletes it. Doesn't exist: ignores (no error) |

ADF automatically converts SQL results to the JSON format required by Azure Search:

```json
{
  "value": [
    {
      "@search.action": "upload",
      "id": "1",
      "productName": "Laptop Pro",
      "description": "High-end laptop",
      "category": "Electronics",
      "price": 1299.00,
      "inStock": true,
      "tags": "laptop,computer"
    },
    {
      "@search.action": "upload",
      "id": "2",
      "productName": "Office Chair",
      "description": "Ergonomic chair",
      "category": "Furniture",
      "price": 399.00,
      "inStock": true,
      "tags": "chair,office"
    },
    {
      "@search.action": "delete",
      "id": "3"
    }
  ]
}
```

(For the delete action, only the document ID is needed.)
ADF doesn't send all records at once. It batches them based on writeBatchSize, and each batch is a separate HTTP POST to Azure Search.

How ADF detects new changes and runs batches: the watermark is updated after each successful ADF run to detect new changes, as below.

Handling different scenarios:

```text
Scenario 1: No Changes Between Runs
Run at 10:00 AM:
- Old Watermark: 09:45:00
- New Watermark: 10:00:00
- Query: WHERE ModifiedDate > '09:45' AND <= '10:00'
- Result: 0 rows
- Action: Still update watermark to 10:00
- Why: Prevents reprocessing if changes come later

Scenario 2: Bulk Insert Happens
Someone inserts 5000 records at 10:05 AM
Run at 10:15 AM:
- Old Watermark: 10:00:00
- New Watermark: 10:15:00
- Query: WHERE ModifiedDate > '10:00' AND <= '10:15'
- Result: 5000 rows
- Action: Process all 5000, update watermark to 10:15

Scenario 3: Pipeline Fails
Run at 10:30 AM:
- Old Watermark: 10:15:00 (unchanged from last success)
- Pipeline fails during Copy activity
- Watermark NOT updated (still 10:15:00)

Next Run at 10:45 AM:
- Old Watermark: 10:15:00 (still the last successful)
- New Watermark: 10:45:00
- Query: WHERE ModifiedDate > '10:15' AND <= '10:45'
- Result: Gets ALL changes from 10:15 to 10:45 (30 minutes of data)
- No data loss!
```

Note: There is still room for improvement by refining this logic to handle more advanced scenarios. However, I have not examined the logic in depth, as the goal here is to review how the overall setup functions, identify its limitations, and compare it with the indexing solutions available in AI Search.

3. Challenges - discovered:

When I set out to build a private search indexer for SQL Server data residing on an Azure VM with no public IP, the solution seemed straightforward: use Azure Data Factory to orchestrate the data movement to Azure AI Search. The materials made it sound simple. The reality? It's possible, but the devil is in the details.

What We Needed:
✅ SQL Server on private VM (no public IP)
✅ Azure AI Search with private endpoint
✅ No data over public internet
✅ Support for full CRUD operations
✅ Near real-time synchronization
✅ No-code/low-code solution

Reality Check:
⚠️ DELETE operations not natively supported in the ADF sink
⚠️ Complex networking requirements
⚠️ Higher costs than expected
⚠️ Significant setup complexity
✅ But it IS possible with workarounds

Components Required:
- Azure VM: ~$150/month (D4s_v3)
- Self-Hosted Integration Runtime: free (runs on the VM)
- Private Endpoints: ~$30/month (approx. 3 endpoints)
- Azure Data Factory: ~$15-60/month (depends on frequency)
- Azure AI Search: ~$75/month (Basic tier)
- Total: ~$270-315/month

The DELETE Challenge: despite the Azure AI Search REST API fully supporting delete operations via @search.action, ADF's native Azure Search sink does NOT support delete operations.

```sql
-- This SQL query with delete action
SELECT
    ProductId as id,
    CASE WHEN IsDeleted = 1 THEN 'delete' ELSE 'upload' END as [@search.action]
FROM Products
-- Will NOT delete documents in Azure Search when using the Copy activity
-- The @search.action = 'delete' is ignored by the ADF sink!
```

Nevertheless, there is a workaround: use a Web Activity to call the REST API from the ADF side and perform the delete operation:
{ "name": "DeleteViaREST", "type": "Web", "typeProperties": { "url": "https://search.windows.net/indexes/index/docs/index", "method": "POST", "body": { "value": [ {"@search.action": "delete", "id": "123"} ] } } } Development Challenges No Direct Portal Access: With ADF private, you need: Jump box in the same VNet VPN connection Bastion for access Testing Complexity: Can't use Postman from local machine Need to test from within VNet Debugging requires multiple tools 4. Pros and Cons: An Honest Assessment: Pros: Security: Complete network isolation Compliance: Meets strict requirements No-Code: Mostly configuration-based Scalability: Can handle large datasets Monitoring: Built-in ADF monitoring Managed Service: Microsoft handles updates Cons: DELETE Complexity: Not natively supported Cost: Higher than expected Setup Complexity: Many moving parts Debugging: Difficult with private endpoints Hidden Gotchas: - SHIR requires Windows VM (Linux in preview) - Private endpoint DNS propagation delays - ADF Studio timeout with private endpoints - SHIR auto-update can break pipelines 5. Conclusion and Recommendations: When to Use BYOPI: ✅ Good Fit: Strict security requirements Needs indexing from an un-supported scenarios for example SQL server residing on private VM Budget > $500/month Team familiar with Azure networking Read-heavy workloads ❌ Poor Fit: Simple search requirements Budget conscious Need real-time updates Heavy DELETE operations Small team without Azure expertise BYOPI works, but it's more complex and expensive than initially expected. The lack of native DELETE support in ADF sink is a significant limitation that requires workarounds. Key Takeaways It works but requires significant effort DELETE (hard) operations need workarounds Costs will be higher than expected Complexity is substantial for a "no-code" solution Alternative solutions might be better for many scenarios Disclaimer: The sample scripts provided in this article are provided AS IS without warranty of any kind. The author is not responsible for any issues, damages, or problems that may arise from using these scripts. Users should thoroughly test any implementation in their environment before deploying to production. Azure services and APIs may change over time, which could affect the functionality of the provided scripts. Always refer to the latest Azure documentation for the most up-to-date information. Thanks for reading this blog! I hope you've found this approach of creating own private indexing solution for Azure AI Search (BYOPI) useful 😀206Views1like0CommentsInteractive AI Avatars: Building Voice Agents with Azure Voice Live API
Azure Voice Live API recently reached General Availability, marking a significant milestone in conversational AI technology. This unified API surface doesn't just enable speech-to-speech capabilities for AI agents—it revolutionizes the entire experience by streaming interactions through lifelike avatars. Built on the powerful speech-to-speech capabilities of the GPT-4o Realtime model, Azure Voice Live API offers developers unprecedented flexibility:

- Out-of-the-box or custom avatars from Azure AI Services
- A wide range of neural voices, including Indic languages like the one featured in this demo
- A single API interface that handles both audio processing and avatar streaming
- Real-time responsiveness with sub-second latency

In this post, I'll walk you through building a retail e-commerce voice agent that demonstrates this technology. While this implementation focuses on retail apparel, the architecture is entirely generic and can be adapted to any domain—healthcare, banking, education, or customer support—by simply changing the system prompt and implementing domain-specific tools integration.

The Challenge: Navigating Uncharted Territory

At the time of writing, documentation for implementing avatar features with Azure Voice Live API is minimal. The protocol-specific intricacies around avatar video streaming, and the complex sequence of steps required to establish a live avatar connection, were quite overwhelming. This is where Agent mode in GitHub Copilot in Visual Studio Code proved extremely useful. Through iterative conversations with the AI agent, I discovered a working approach to implementing avatar streaming without getting lost in low-level protocol details. Here's how different AI models contributed to this solution:

- Claude Sonnet 4.5: Rapidly architected the application structure, designing the hybrid WebSocket + WebRTC architecture with a TypeScript/Vite frontend and FastAPI backend
- GPT-5-Codex (Preview): Instrumental in implementing the complex avatar streaming components, handling WebRTC peer connections, and managing the bidirectional audio flow

Architecture Overview: A Hybrid Approach

The architecture comprises these components:

🐳 Container Application Architecture

Vite Server: Node.js-based development server that serves the React application. In development, it provides hot module replacement and proxies API calls to FastAPI. In production, the React app is built into static files served by FastAPI.

FastAPI with ASGI: Python web framework running on the uvicorn ASGI server. ASGI (Asynchronous Server Gateway Interface) enables handling many concurrent connections efficiently, which is crucial for WebSocket connections and real-time audio processing.
🤖 AI & Voice Services Integration

Azure Voice Live API: Primary service that manages the connection to the GPT-4o Realtime model, provides avatar video generation, neural text-to-speech, and WebSocket gateway functionality

GPT-4o Realtime model: Accessed through Azure Voice Live API for real-time audio processing, function calling, and intelligent conversation management

🔄 Communication Flows

Audio Flow: Browser → WebSocket → FastAPI → WebSocket → Azure Voice Live API → GPT-4o Realtime model

Video Flow: Browser ↔ WebRTC Direct Connection ↔ Azure Voice Live API (bypasses backend for performance)

Function Calls: GPT-4o Realtime (via Voice Live) → FastAPI Tools → Business APIs → Response → GPT-4o Realtime (via Voice Live)

🤖 Business Process Automation Workflows / RAG

Shipment Logic App Agent: Analyzes orders, validates data, creates shipping labels, and updates tracking information

Conversation Analysis Agent (Azure Logic App): Reviews complete conversations, performs sentiment analysis, generates quality scores with justification, and stores insights for continuous improvement

Knowledge Retrieval: Azure AI Search is used to reason over manuals and help respond to customer queries on policies and products

The solution implements a hybrid architecture that leverages both WebSocket proxying and direct WebRTC connections. This design ensures the conversational audio flow remains manageable and secure through the backend, while the bandwidth-intensive avatar video streams directly to the browser for optimal performance.

The flow used in the avatar communication:

```
Frontend                      FastAPI Backend              Azure Voice Live API
   │                              │                              │
   │ 1. Request Session           │                              │
   │─────────────────────────────►│                              │
   │                              │ 2. Create Session            │
   │                              │─────────────────────────────►│
   │                              │                              │
   │                              │ 3. Session Config            │
   │                              │    (with avatar settings)    │
   │                              │─────────────────────────────►│
   │                              │                              │
   │                              │ 4. session.updated           │
   │                              │    (ICE servers)             │
   │ 5. ICE servers               │◄─────────────────────────────│
   │◄─────────────────────────────│                              │
   │                              │                              │
   │ 6. Click "Start Avatar"      │                              │
   │                              │                              │
   │ 7. Create RTCPeerConn        │                              │
   │    with ICE servers          │                              │
   │                              │                              │
   │ 8. Generate SDP Offer        │                              │
   │                              │                              │
   │ 9. POST /avatar-offer        │                              │
   │─────────────────────────────►│                              │
   │                              │ 10. Encode & Send SDP        │
   │                              │─────────────────────────────►│
   │                              │                              │
   │                              │ 11. session.avatar.          │
   │                              │     connecting               │
   │                              │     (SDP answer)             │
   │ 12. SDP Answer               │◄─────────────────────────────│
   │◄─────────────────────────────│                              │
   │                              │                              │
   │ 13. setRemoteDescription     │                              │
   │                              │                              │
   │ 14. WebRTC Handshake         │                              │
   │◄─────────────────────────────┼─────────────────────────────►│
   │    (Direct Connection)       │                              │
   │                              │                              │
   │ 15. Video/Audio Stream       │                              │
   │◄────────────────────────────────────────────────────────────│
   │    (Bypasses Backend)        │                              │
```

For more technical details, refer to the GitHub repo shared in this post. Here is a video of the demo of the application in action.
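The repo holds the actual implementation; as a rough illustration of the audio-proxy leg described above (browser → WebSocket → FastAPI → WebSocket → Voice Live), a minimal relay could look like the sketch below. The Voice Live URL, API version, and header handling are assumptions here, not taken from the sample app:

```python
# Hypothetical minimal audio relay: browser <-> FastAPI <-> Azure Voice Live.
import asyncio

import websockets  # third-party client library used for the upstream leg
from fastapi import FastAPI, WebSocket

app = FastAPI()

# Placeholder endpoint; the exact path and api-version are assumptions.
VOICE_LIVE_WS = (
    "wss://<your-resource>.cognitiveservices.azure.com/voice-live/realtime"
    "?api-version=2025-05-01-preview&model=gpt-4o-realtime"
)

@app.websocket("/audio")
async def audio_proxy(browser_ws: WebSocket) -> None:
    await browser_ws.accept()
    async with websockets.connect(
        VOICE_LIVE_WS,
        additional_headers={"api-key": "<your-key>"},  # kwarg name varies by websockets version
    ) as upstream:
        async def pump_up() -> None:
            # Forward JSON events (including base64 PCM chunks) from the browser.
            while True:
                await upstream.send(await browser_ws.receive_text())

        async def pump_down() -> None:
            # Forward Voice Live events (audio deltas, transcripts) back down.
            async for message in upstream:
                await browser_ws.send_text(message)

        await asyncio.gather(pump_up(), pump_down())
```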
Using the Voice Live API in Azure AI Foundry

In this blog post, we'll explore the Voice Live API from Azure AI Foundry. Officially released for general availability on October 1, 2025, this API unifies speech recognition, generative AI, and text-to-speech capabilities into a single, streamlined interface. It removes the complexity of manually orchestrating multiple components and ensures a consistent developer experience across all models, making it easy to switch and experiment. What sets the Voice Live API apart are its advanced conversational enhancements, including:

- Semantic Voice Activity Detection (VAD) that's robust against background noise and accurately detects when a user intends to speak.
- Semantic end-of-turn detection that supports natural pauses in conversation.
- Server-side audio processing features like noise suppression and echo cancellation, simplifying client-side development.

Let's get started.

1. Getting Started with Voice Live API

The Voice Live API ships with an SDK that lets you open a single realtime WebSocket connection and then do everything—stream microphone audio up, receive synthesized audio/text/function-call events down—without writing any of the low-level networking plumbing. This is how the connection is opened with the Python SDK:

```python
from azure.ai.voicelive.aio import connect
from azure.core.credentials import AzureKeyCredential

async with connect(
    endpoint=VOICE_LIVE_ENDPOINT,  # https://<your-foundry-resource>.cognitiveservices.azure.com/
    credential=AzureKeyCredential(VOICE_LIVE_KEY),
    model="gpt-4o-realtime",
    connection_options={
        "max_msg_size": 10 * 1024 * 1024,  # allow streamed PCM
        "heartbeat": 20,                   # keep socket alive
        "timeout": 20,                     # network resilience
    },
) as connection:
    ...
```

Notice that you don't need to create an underlying deployment or manage any generative AI models, as the API handles all the underlying infrastructure.

Immediately after connecting, declare what kind of conversation you want. This is where you "teach" the session the model instructions, which voice to synthesize, what tool functions it may call, and how to detect speech turns:

```python
from azure.ai.voicelive.models import (
    RequestSession,
    Modality,
    AzureStandardVoice,
    InputAudioFormat,
    OutputAudioFormat,
    AzureSemanticVad,
    ToolChoiceLiteral,
    AudioInputTranscriptionOptions,
)

session_config = RequestSession(
    modalities=[Modality.TEXT, Modality.AUDIO],
    instructions="Assist the user with account questions succinctly.",
    voice=AzureStandardVoice(name="alloy", type="azure-standard"),
    input_audio_format=InputAudioFormat.PCM16,
    output_audio_format=OutputAudioFormat.PCM16,
    turn_detection=AzureSemanticVad(
        threshold=0.5,
        prefix_padding_ms=300,
        silence_duration_ms=500,
    ),
    tools=[  # optional
        {
            "name": "get_user_information",
            "description": "Retrieve profile and limits for a user",
            "input_schema": {
                "type": "object",
                "properties": {"user_id": {"type": "string"}},
                "required": ["user_id"],
            },
        }
    ],
    tool_choice=ToolChoiceLiteral.AUTO,
    input_audio_transcription=AudioInputTranscriptionOptions(model="whisper-1"),
)
await connection.session.update(session=session_config)
```

After session setup, it is pure event-driven flow:

```python
async for event in connection:
    if event.type == ServerEventType.RESPONSE_AUDIO_DELTA:
        playback_queue.put(event.delta)
    elif (
        event.type == ServerEventType.CONVERSATION_ITEM_CREATED
        and event.item.type == ItemType.FUNCTION_CALL
    ):
        handle_function_call(event)
```

That's the core: one connection, one session config message, then pure event-driven flow.
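One half the snippets above don't show is how microphone audio reaches the service. Following the realtime-protocol naming used elsewhere in the SDK (input audio buffer events carrying base64-encoded PCM16), the upstream loop looks roughly like this; the exact attribute and method names are an assumption, not confirmed SDK surface:

```python
import base64

CHUNK_BYTES = 3200  # ~100 ms of 16 kHz, 16-bit mono PCM

async def stream_microphone(connection, mic) -> None:
    """Forward raw PCM16 from an audio-capture layer to the Voice Live session."""
    while True:
        pcm = mic.read(CHUNK_BYTES)  # 'mic' is your capture abstraction (hypothetical)
        if not pcm:
            break
        # Realtime protocol: append base64 PCM chunks to the input audio buffer;
        # the service-side VAD decides when a turn starts and ends.
        await connection.input_audio_buffer.append(  # method name assumed
            audio=base64.b64encode(pcm).decode("ascii")
        )
```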
2. Deep Dive: Tool (Function) Handling in the Voice Live SDK

In the Voice Live context, "tools" are model-invocable functions you expose with a JSON schema. The SDK streams a structured function call request (name + incrementally streamed arguments), you execute real code locally, then feed the JSON result back so the model can incorporate it into its next spoken (and/or textual) turn. Let's unpack the full lifecycle.

First, the model emits a CONVERSATION_ITEM_CREATED event whose item.type == FUNCTION_CALL:

```python
if event.item.type == ItemType.FUNCTION_CALL:
    await self._handle_function_call_with_improved_pattern(event, connection)
```

Arguments stream (possibly token-by-token) until the SDK signals RESPONSE_FUNCTION_CALL_ARGUMENTS_DONE. Optionally, the SDK may also complete the "response" segment with RESPONSE_DONE before you run the tool. Then we execute the local Python function and explicitly request a new model response via connection.response.create(), telling the model to incorporate the tool result into a natural-language (and audio) answer.

```python
async def _handle_function_call(self, created_evt, connection):
    call_item = created_evt.item  # ResponseFunctionCallItem
    name = call_item.name
    call_id = call_item.call_id
    prev_id = call_item.id

    # 1. Wait until arguments are fully streamed
    args_done = await _wait_for_event(
        connection, {ServerEventType.RESPONSE_FUNCTION_CALL_ARGUMENTS_DONE}
    )
    assert args_done.call_id == call_id
    arguments = args_done.arguments  # JSON string

    # 2. (Optional) Wait for RESPONSE_DONE to avoid a race with the model finishing the segment
    await _wait_for_event(connection, {ServerEventType.RESPONSE_DONE})

    # 3. Execute
    func = self.available_functions.get(name)
    if not func:
        # Optionally send an error function output
        return
    result = await func(arguments)  # Implementations are async in this sample

    # 4. Send output
    output_item = FunctionCallOutputItem(call_id=call_id, output=json.dumps(result))
    await connection.conversation.item.create(
        previous_item_id=prev_id, item=output_item
    )

    # 5. Trigger follow-up model response
    await connection.response.create()
```

3. Sample App

Try the repo with the sample app we have created, together with all the required infrastructure already automated. This sample app simulates a friendly real-time contact-center rep who can listen continuously, understand you as you speak, instantly look up things like your credit card's upcoming due date or a product detail via function calls, and then answer back naturally in a Brazilian Portuguese neural voice with almost no lag. Behind the scenes it streams your microphone audio to Azure's Voice Live (GPT-4o realtime) model, transcribes and reasons on the fly, selectively triggers lightweight "get user information" or "get product information" lookups backed by Azure AI Search, and speaks responses right back to you.

Happy Coding!
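As a closing aside on the function-handling loop in section 2: the tool implementations themselves are ordinary async callables keyed by name. A minimal sketch, with the return data invented for illustration:

```python
import json

# Hypothetical tool implementation matching the schema declared in the session config.
async def get_user_information(arguments: str) -> dict:
    args = json.loads(arguments)  # arguments arrive as a JSON string
    user_id = args["user_id"]
    # A real app would call a CRM or billing API here; static data keeps the sketch runnable.
    return {"user_id": user_id, "name": "Ana", "credit_card_due_date": "2025-11-15"}

# The registry consulted by _handle_function_call via self.available_functions.get(name).
available_functions = {"get_user_information": get_user_information}
```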