azure openai
8 Topics
Staying in the flow: SleekFlow and Azure turn customer conversations into conversions
A customer adds three items to their cart but never checks out. Another asks about shipping, gets stuck waiting eight minutes, only to drop the call. A lead responds to an offer but is never followed up with in time. Each of these moments represents lost revenue, and they happen to businesses every day.

SleekFlow was founded in 2019 to help companies turn those almost-lost-customer moments into connection, retention, and growth. Today we serve more than 2,000 mid-market and enterprise organizations across industries including retail and e-commerce, financial services, healthcare, travel and hospitality, telecommunications, real estate, and professional services. In total, those customers rely on SleekFlow to orchestrate more than 600,000 daily customer interactions across WhatsApp, Instagram, web chat, email, and more.

Our name reflects what makes us different. Sleek is about unified, polished experiences: consolidating conversations into one intelligent, enterprise-ready platform. Flow is about orchestration: AI and human agents working together to move each conversation forward, from first inquiry to purchase to renewal.

The drive for enterprise-ready agentic AI

Enterprises today expect always-on, intelligent conversations, but delivering that at scale proved daunting. When we set out to build AgentFlow, our agentic AI platform, we quickly ran into familiar roadblocks: downtime that disrupted peak-hour interactions, vector search delays that hurt accuracy, and costs that ballooned under multi-tenant workloads. Development slowed from limited compatibility with other technologies, while customer onboarding stalled without clear compliance assurances. To move past these barriers, we needed a foundation that could deliver the performance, trust, and global scale enterprises demand.

The platform behind the flow: How Azure powers AgentFlow

We chose Azure because building AgentFlow required more than raw compute power.
Chatbots built on a single-agent model often stall out. They struggle to retrieve the right context, they miss critical handoffs, and they return answers too slowly to keep a customer engaged. To fix that, we needed an ecosystem capable of supporting a team of specialized AI agents working together at enterprise scale.

Azure Cosmos DB provides the backbone for memory and context, managing short-term interactions, long-term histories, and vector embeddings in containers that respond in 15–20 milliseconds. Our agents use Azure OpenAI models within Azure AI Foundry to understand and generate responses natively in multiple languages. Whether in English, Chinese, or Portuguese, the responses feel natural and aligned with the brand.

Semantic Kernel acts as the conductor, orchestrating multiple agents, each of which retrieves the necessary knowledge and context, including chat histories, transactional data, and vector embeddings, directly from Azure Cosmos DB. For example, one agent could be retrieving pricing data, another summarizing it, and a third preparing it for a human handoff.

The result is not just responsiveness but accuracy. A telecom provider can resolve a billing question while surfacing an upsell opportunity in the same dialogue. A financial advisor can walk into a call with a complete dossier prepared in seconds rather than hours. A retailer can save a purchase by offering an in-stock substitute before the shopper abandons the cart. Each of these conversations is different, yet the foundation is consistent on AgentFlow.

Fast, fluent, and focused: Azure keeps conversations moving

Speed is the heartbeat of a good conversation. A delayed answer feels like a dropped call, and an irrelevant one breaks trust. For AgentFlow to keep customers engaged, every operation behind the scenes has to happen in milliseconds. A single interaction can involve dozens of steps.
One agent pulls product information from embeddings, another checks it against structured policy data, and a third generates a concise, brand-aligned response. If any of these steps lag, the dialogue falters. On Azure, they don’t.

Azure Cosmos DB manages conversational memory and agent state across dedicated containers for short-term exchanges, long-term history, and vector search. Sharded DiskANN indexing powers semantic lookups that resolve in the 15–20 millisecond range, fast enough that the customer never feels a pause. Microsoft's Phi-4 model and Azure OpenAI in Foundry Models such as o3-mini and o4-mini provide the reasoning. Azure Container Apps scales elastically, so performance holds steady both during event-driven bursts, such as campaign broadcasts that can push the platform from a few to thousands of conversations per minute, and during daily peak-hour surges.

To support that level of responsiveness, we run Azure Container Apps on the Pay-As-You-Go consumption plan, using KEDA-based autoscaling to expand from five idle containers to more than 160 within seconds. Meanwhile, Microsoft Orleans coordinates lightweight in-memory clustering to keep conversations sleek and flowing.

The results are tangible. Retrieval-augmented generation recall improved from 50 to 70 percent. Execution speed is about 50 percent faster. For SleekFlow’s customers, that means carts are recovered before they’re abandoned, leads are qualified in real time, and support inquiries move forward instead of stalling out. With Azure handling the complexity under the hood, conversations flow naturally on the surface, and that’s what keeps customers engaged.

Secure enough for enterprises, human enough for customers

AgentFlow was built with security-by-design as a first principle, giving businesses confidence that every interaction is private, compliant, and reliable. On Azure, every AI agent operates inside guardrails enterprises can depend on.
Azure Cosmos DB enforces strict per-tenant isolation through logical partitioning, encryption, and role-based access control, ensuring chat histories, knowledge bases, and embeddings remain auditable and contained. Models deployed through Azure AI Foundry, including Azure OpenAI and Microsoft Phi, process data entirely within SleekFlow’s Azure environment, and that data is never used to train public models, with activity logged for transparency. And Azure’s certifications, including ISO 27001, SOC 2, and GDPR, are backed by continuous monitoring and regional data residency options, proving compliance at a global scale.

But trust is more than a checklist of certifications. AgentFlow brings human-like fluency and empathy to every interaction, powered by Azure OpenAI running with high token-per-second throughput so responses feel natural in real time. Quality control isn’t left to chance. Human override workflows are orchestrated through Azure Container Apps and Azure App Service, ensuring AI agents can carry conversations confidently until they’re ready for human agents. Enterprises gain the confidence to let AI handle revenue-critical moments, knowing Azure provides the foundation and SleekFlow provides the human-centered design.

Shaping the next era of conversational AI on Azure

The benefits of Azure show up not only in customer conversations but also in the way our own teams work. Faster processing speeds and high token-per-second throughput reduce latency, so we spend less time debugging and more time building. Stable infrastructure minimizes downtime and troubleshooting, lowering operational costs.

That same reliability and scalability have transformed the way we engineer AgentFlow. AgentFlow started as part of our monolithic system. Shipping new features used to take about a month of development and another week of heavy testing to make sure everything held together.
After moving AgentFlow to a microservices architecture on Azure Container Apps, we can now deploy updates almost daily with no downtime or customer impact, thanks to native support for rolling updates and blue-green deployments.

This agility is what excites us most about what's ahead. With Azure as our foundation, SleekFlow is not simply keeping pace with the evolution of conversational AI; we are shaping what comes next. Every interaction we refine, every second we save, and every workflow we streamline brings us closer to our mission: keeping conversations sleek, flowing, and valuable for enterprises everywhere.
263 Views, 3 likes, 0 Comments
Pantone’s Palette Generator enhances creative exploration with agentic AI on Azure
Color can be powerful. When creative professionals shape the mood and direction of their work, color plays a vital role because it provides context and cues for the end product or creation. For more than 60 years, creatives from all areas of design, including fashion, product, and digital, have turned to Pantone color guides to translate inspiration into precise, reproducible color choices. These guides offer a shared language for colors, as well as inspiration and communication across industries.

Once rooted in physical tools, Pantone has evolved to meet the needs of modern creators through its trend forecasting, consulting services, and digital platform. Today, Pantone Connect and its multi-agent solution called the Pantone Palette Generator seamlessly bring color inspiration and accuracy into everyday design workflows (as well as the New York City mayoral race). Simply by typing in a prompt, designers can generate palettes in seconds. Available in Pantone Connect, the tool uses Azure services like Microsoft Foundry, Azure AI Search, and Azure Cosmos DB to serve up the company’s vast collection of trend and color research from the color experts at the Pantone Color Institute.

reached in seconds instead of days. Now, with Microsoft Foundry, creatives can use agents to get instant color palettes and suggestions based on human insights and trend direction.”

Turning Pantone’s color legacy into an AI offering

The Palette Generator accelerates the process of researching colors and helps designers find inspiration or validate some of their ideas through trend-backed research. “Pantone wants to be where our customers are,” says Rohani Jotshi, Director of Software Engineering and Data at Pantone.
“As workflows become increasingly digital, we wanted to give our customers a way to find inspiration while keeping the same level of accuracy and trust they expect from Pantone.”

The Palette Generator taps into thousands of articles from Pantone’s Color Insider library, as well as trend guides and physical color books, in a way that preserves the company’s color standards science while streamlining the creative process. Built entirely on Microsoft Foundry, the solution uses Azure AI Search for agentic retrieval-augmented generation (RAG) and Azure OpenAI in Foundry Models to reason over the data. It quickly serves up palette options in response to questions like “Show me soft pastels for an eco-friendly line of baby clothes” or “I want to see vibrant metallics for next spring.”

Over the course of two months, the Pantone team built the initial proof of concept for the Palette Generator, using GitHub Copilot to streamline the process and save over 200 hours of work across multiple sprints. This allowed Pantone’s engineers to focus on improving prompt engineering, adding new agent capabilities, and refining orchestration logic rather than writing repetitive code.

Building a multi-agent architecture that accelerates creativity

The Pantone team worked with Microsoft to develop the multi-agent architecture, which is made up of three connected agents. Using Microsoft Agent Framework, an open source development kit for building AI orchestration systems, the team found it straightforward to bring the agents together into one workflow. “The Microsoft team recommended Microsoft Agent Framework and when we tried it, we saw how it was extremely fast and easy to create architectural patterns,” says Kristijan Risteski, Solutions Architect at Pantone.
“With Microsoft Agent Framework, we can spin up a model in five lines of code to connect our agents.”

When a user types in a question, they interact with an orchestrator agent that routes prompts and coordinates the more specialized agents. Behind the scenes an additional agent retrieves contextually relevant insights from Pantone’s proprietary Color Insider dataset. Using Azure AI Search with vectorized data indexing, this agent interprets the semantics of a user’s query rather than relying solely on keywords. A third agent then applies rules from color science to assemble a balanced palette. This agent ensures the output is a color combination that meets harmony, contrast, and accessibility standards. The result is a set of Pantone-curated colors that match the emotional and aesthetic tone of the request. “All of this happens in seconds,” says Risteski.

To manage conversation flow and achieve long-term data persistence, Pantone uses Azure Cosmos DB, which stores user sessions, prompts, and results. The database not only enables designers to revisit past palette explorations but also provides Pantone with valuable usage intelligence to refine the system over time. “We use Azure Cosmos DB to track inputs and outputs,” says Risteski. “That data helps us fine-tune prompts, measure engagement, and plan how we’ll train future models.”

Improving accuracy and performance with Azure AI Search

With Azure AI Search, the Palette Generator can understand the nuance of color language. Instead of relying solely on keyword searches that might miss the complexity of words like “vibrant” or “muted,” Pantone’s team decided to use a vectorized index for more accurate palette results. Using the built-in vectorization capability of Azure AI Search, the team converted their color knowledge base, including text-based color psychology and trend articles, into numerical embeddings.
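To make the keyword-versus-vector distinction concrete, here is a minimal Python sketch. It does not call Azure AI Search or any embedding model: the three-dimensional vectors, the article titles, and the function names are all invented for illustration, and real embeddings have far more dimensions. It only shows why a literal keyword match can miss a word like “vibrant” while a similarity ranking over embeddings can still surface a related article.

```python
import math

# Toy 3-dimensional "embeddings" with hypothetical values, for illustration only.
DOCS = {
    "Bold saturated reds for sportswear": [0.9, 0.1, 0.2],
    "Soft dusty pastels for nurseries": [0.1, 0.9, 0.3],
}
QUERY_TEXT = "vibrant"
QUERY_VECTOR = [0.8, 0.2, 0.1]  # assumed embedding of the query text


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def keyword_search(query, docs):
    """Return titles that literally contain the query string."""
    return [title for title in docs if query.lower() in title.lower()]


def vector_search(query_vector, docs):
    """Return titles ordered by cosine similarity to the query vector."""
    return sorted(docs, key=lambda t: cosine(query_vector, docs[t]), reverse=True)


keyword_hits = keyword_search(QUERY_TEXT, DOCS)  # no title contains "vibrant"
vector_hits = vector_search(QUERY_VECTOR, DOCS)  # nearest embedding comes first
print(keyword_hits, vector_hits)
```

In this toy setup the keyword search returns nothing, while the vector ranking places the saturated-reds article first because its made-up embedding points in nearly the same direction as the query's.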
“Overall, vector search gave us better results because it could understand the intent of the prompt, not just the words,” says Risteski. “If someone types, ‘Show me colors that feel serene and oceanic,’ the system understands intent. It finds the right references across our color psychology and trend archives and delivers them instantly.”

The team also found ways to reduce latency as they evolved their proof of concept. Initially, they encountered slow inference times and performance lags when retrieving search results. By switching from GPT-4.1 to GPT-5, latency improved. And using Azure AI Search to manage ranking and filtering results helped reduce the number of calls to the large language model (LLM). “With Azure, we just get the articles, put them in a bucket, and say ‘index it now,’” says Risteski. “It takes one or two minutes, and that’s it. The results are so much better than traditional search.”

Moving from inspiration to palettes faster

The Palette Generator has transformed how designers and color enthusiasts interact with Pantone’s expertise. What once took weeks of research and review can now be done in seconds. “Typically, if someone wanted to develop a palette for a product launch, it might take many months of research,” says Jotshi. “Now, they can type one sentence to describe their inspiration, then immediately find Pantone-backed insight and options. Human curation will still be hugely important, but a strong set of starting options can significantly accelerate the palette development process.”

Expanding the palette: The next phase for Pantone’s design agent

Rapidly launching the Palette Generator in beta has redefined what the Pantone engineering team thought was possible. “We’re a small development team, but with Azure we built an enterprise-grade AI system in a matter of weeks,” says Risteski.
“That’s a huge win for us.”

Next up, the team plans to migrate the entire orchestration layer to Azure Functions, moving to a fully scalable, serverless deployment. This will allow Pantone to run its agents more efficiently, handle variable workloads automatically, and integrate seamlessly with other Azure products such as Microsoft Foundry and Azure Cosmos DB. At the same time, Pantone plans to expand its multi-agent system to include new specialized agents, including one focused on palette harmony and another focused on trend prediction.
478 Views, 1 like, 0 Comments
Building a Scalable Web Crawling and Indexing Pipeline with Azure storage and AI Search
In the ever-evolving world of data management, keeping search indexes up-to-date with dynamic data can be challenging. Traditional approaches, such as manual or scheduled indexing, are resource-intensive, delay-prone, and difficult to scale. Azure Blob Trigger combined with an AI Search Indexer offers a cutting-edge solution to overcome these challenges, enabling real-time, scalable, and enriched data indexing. This blog explores how Blob Trigger, integrated with Azure Cognitive Search, transforms the indexing process by automating workflows and enriching data with AI capabilities. It highlights the step-by-step process of configuring Blob Storage, creating Azure Functions for triggers, and seamlessly connecting with an AI-powered search index. The approach leverages Azure's event-driven architecture, ensuring efficient and cost-effective data management.
2.5K Views, 7 likes, 10 Comments
Building an AI-Powered ESG Consultant Using Azure AI Services: A Case Study
In today's corporate landscape, Environmental, Social, and Governance (ESG) compliance has become increasingly important for stakeholders. To address the challenges of analyzing vast amounts of ESG data efficiently, a comprehensive AI-powered solution called ESGai has been developed. This blog explores how Azure AI services were leveraged to create a sophisticated ESG consultant for publicly listed companies.
https://youtu.be/5-oBdge6Q78?si=Vb9aHx79xk3VGYAh

The Challenge: Making Sense of Complex ESG Data

Organizations face significant challenges when analyzing ESG compliance data. Manual analysis is time-consuming, prone to errors, and difficult to scale. ESGai was designed to address these pain points by creating an AI-powered virtual consultant that provides detailed insights based on publicly available ESG data.

Solution Architecture: The Three-Agent System

ESGai implements a sophisticated three-agent architecture, all powered by Azure's AI capabilities:
Manager Agent: Breaks down complex user queries into manageable sub-questions containing specific keywords that facilitate vector search retrieval. The system prompt includes generalized document headers from the vector database for context.
Worker Agent: Processes the sub-questions generated by the Manager, connects to the vector database to retrieve relevant text chunks, and provides answers to the sub-questions. Results are stored in Cosmos DB for later use.
Director Agent: Consolidates the answers from the Worker agent into a comprehensive final response tailored specifically to the user's original query.

It's important to note that while conceptually there are three agents, the Worker is actually a single agent that gets called multiple times, once for each sub-question generated by the Manager.
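The Manager, Worker, and Director flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every function below is a stub invented for this sketch, no LLM or vector database is called, the Manager's splitting rule is a placeholder for what would really be a model call, and the `stored_answers` list merely stands in for the Cosmos DB storage the post mentions.

```python
from typing import Callable, List

stored_answers: List[str] = []  # hypothetical stand-in for Cosmos DB storage


def manager(query: str) -> List[str]:
    """Stub Manager: split a query into sub-questions (a real one would use an LLM)."""
    return [part.strip() + "?" for part in query.rstrip("?").split(" and ")]


def fake_retrieve(sub_question: str) -> str:
    """Stub retriever: a real Worker would query the vector database here."""
    return f"[chunk for: {sub_question}]"


def worker(sub_question: str, retrieve: Callable[[str], str]) -> str:
    """Stub Worker: answer one sub-question from retrieved text and store the result."""
    answer = f"{sub_question} -> {retrieve(sub_question)}"
    stored_answers.append(answer)
    return answer


def director(query: str, answers: List[str]) -> str:
    """Stub Director: consolidate the Worker answers into one response."""
    return f"Answer to '{query}': " + " | ".join(answers)


def run_pipeline(query: str) -> str:
    sub_questions = manager(query)
    # The same Worker is called once per sub-question, as the post notes.
    answers = [worker(q, fake_retrieve) for q in sub_questions]
    return director(query, answers)


result = run_pipeline("What are the emissions targets and how is water usage reported?")
print(result)
```

The point of the sketch is the shape of the control flow: one fan-out step, a repeated call to a single Worker, and one consolidation step.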
Current Implementation State

The current MVP implementation has several limitations that are planned for expansion:
Limited Company Coverage: The vector database currently stores data for only 2 companies, with 3 documents per company (Sustainability Report, XBRL, and BRSR).
Single Model Deployment: Only one GPT-4o model is currently deployed to handle all agent functions.
Basic Storage Structure: The Blob container has a simple structure with a single directory. While Azure Blob storage doesn't natively support hierarchical folders, the team plans to implement virtual folders in the future.
Free Tier Limitations: Due to funding constraints, the AI Search service is using the free tier, which limits vector data storage to 50MB.
Simplified Vector Database: The current index stores all 6 files (3 documents × 2 companies) in a single vector database without filtering capabilities or schema definition.

Azure Services Powering ESGai

The implementation of ESGai leverages multiple Azure services for a robust and scalable architecture:
Azure AI Services: Provides pre-built APIs, SDKs, and services that incorporate AI capabilities without requiring extensive machine learning expertise. This includes access to 62 pre-trained models for chat completions through the AI Foundry portal.
Azure OpenAI: Hosts the GPT-4o model for generating responses and the Ada embedding model for vectorization. The service combines OpenAI's advanced language models with Azure's security and enterprise features.
Azure AI Foundry: Serves as an integrated platform for developing, deploying, and governing generative AI applications. It offers a centralized management centre that consolidates subscription information, connected resources, access privileges, and usage quotas.
Azure AI Search (formerly Cognitive Search): Provides both full-text and vector search capabilities using the OpenAI ada-002 embedding model for vectorization.
It's configured for hybrid search, with BM25 keyword ranking and vector results merged by Reciprocal Rank Fusion (RRF), for optimal chunk ranking.
Azure Storage Services: Utilizes Blob Storage for storing PDFs, Business Responsibility Sustainability Reports (BRSRs), and other essential documents. It integrates seamlessly with AI Search using indexers to track database changes.
Cosmos DB: Uses Cosmos DB through its MongoDB API as a NoSQL database for storing chat history between agents and users.
Azure App Services: Hosts the web application using a B3-tier plan optimized for cost efficiency, with GitHub Actions integrated for continuous deployment.

Project Evolution: From Concept to Deployment

The development of ESGai followed a structured approach through several phases:
Phase 1: Data Cleaning
  Extracted specific KPIs from XML/XBRL datasets and BRSR reports containing ESG data for 1,000 listed companies
  Cleaned and standardized data to ensure consistency and accuracy
Phase 2: RAG Framework Development
  Implemented Retrieval-Augmented Generation (RAG) to enhance responses by dynamically fetching relevant information
  Created a workflow that includes query processing, data retrieval, and response generation
Phase 3: Initial Deployment
  Deployed models locally using Docker and n8n automation tools for testing
  Identified the need for more scalable web services
Phase 4: Transition to Azure Services
  Migrated automation workflows from n8n to Azure AI Foundry services
  Leveraged Azure's comprehensive suite of AI services, storage solutions, and app hosting capabilities

Technical Implementation Details

Model Configurations: The GPT model is configured with:
  Model version: 2024-11-20
  Temperature: 0.7
  Max Response Tokens: 800
  Past Messages: 10
  Top-p: 0.95
  Frequency/Presence Penalties: 0
The embedding model is OpenAI text-embedding-ada-002 with 1536 dimensions, and retrieval uses hybrid semantic search (BM25 with RRF).
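For readers unfamiliar with the "BM25 RRF" shorthand: hybrid search produces one ranking from keyword scoring (BM25) and another from vector similarity, and Reciprocal Rank Fusion merges them by summing 1 / (k + rank) for each document across the rankings. The following is a minimal sketch of that formula, not ESGai's code or the service's internal implementation; the chunk IDs and the two input rankings are invented, and k = 60 is simply a commonly used constant assumed here.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists: each document scores sum(1 / (k + rank))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical rankings of text chunks for one query.
bm25_ranking = ["chunk_a", "chunk_b", "chunk_c"]    # keyword (BM25) order
vector_ranking = ["chunk_b", "chunk_d", "chunk_a"]  # vector-similarity order

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

Chunks that appear near the top of both lists accumulate the highest fused score, which is why `chunk_b` (second by keywords, first by vectors) ends up ahead of `chunk_a` (first by keywords, third by vectors).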
Cost Analysis and Efficiency

A detailed cost breakdown per user query reveals:
App Server: $390-400
AI Search: $5 per query
RAG Query Processing: $4.76 per query
Agent-specific costs:
  Manager: $0.05 (30 input tokens, 210 output tokens)
  Worker: $3.71 (1500 input tokens, 1500 output tokens)
  Director: $1.00 (600 input tokens, 600 output tokens)

Challenges and Solutions

The team faced several challenges during implementation:
Quota Limitations: Initial deployments encountered token quota restrictions, which were resolved through Azure support requests (typically granted within 24 hours).
Cost Optimization: High costs associated with vectorization required careful monitoring. The team addressed this by shutting down unused services and deploying on services with free tiers.
Integration Issues: GitHub Actions raised errors during deployment, which were resolved using GitHub's App Service Build Service.
Azure UI Complexity: The team noted that Azure AI service naming conventions were sometimes confusing, as the same name is used for both parent and child resources.
Free Tier Constraints: The AI Search service's free tier limitation of 50MB for vector data storage restricts the amount of company information that can be included in the current implementation.
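Returning briefly to the cost figures listed under Cost Analysis and Efficiency: the three agent-specific costs appear to add up to the stated RAG Query Processing figure. The short check below takes the numbers at face value (the post does not state the currency basis or billing period behind them, so no more is assumed) and also shows each agent's share of that total. The variable names are introduced here for illustration.

```python
# Agent-specific costs as listed in the post (taken at face value).
AGENT_COSTS = {"Manager": 0.05, "Worker": 3.71, "Director": 1.00}
STATED_RAG_COST = 4.76  # "RAG Query Processing" figure from the post

rag_total = round(sum(AGENT_COSTS.values()), 2)

# Share of the RAG total attributable to each agent, in percent.
shares = {name: round(100 * cost / rag_total, 1) for name, cost in AGENT_COSTS.items()}

print(rag_total, shares)
```

On these figures the Worker accounts for roughly 78 percent of the RAG cost, which is consistent with the roadmap's emphasis on optimizing token usage in query handling.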
Future Roadmap

The current implementation is an MVP with several areas for expansion:
Expand the database to include more publicly available sustainability reports beyond the current two companies
Optimize token usage by refining query handling processes
Research alternative embedding models to reduce costs while maintaining accuracy
Implement a more structured storage system with virtual folders in Blob storage
Upgrade from the free tier of AI Search to support larger data volumes
Develop a proper schema for the vector database to enable filtering and more targeted searches
Scale to multiple GPT model deployments for improved performance and redundancy

Conclusion

ESGai demonstrates how advanced AI techniques like Retrieval-Augmented Generation can transform data-intensive domains such as ESG consulting. By leveraging Azure's comprehensive suite of AI services alongside a robust agent-based architecture, this solution provides users with actionable insights while maintaining scalability and cost efficiency.
https://youtu.be/5-oBdge6Q78?si=Vb9aHx79xk3VGYAh
262 Views, 0 likes, 0 Comments
Azure OpenAI Content Filter Result is always content_filter_error
I'm exploring blocklists as a solution for OpenAI not detecting sensitive words (specifically "wrist-cutting" in my local language, Cantonese; to be fair, not even Chinese AIs know the word).

I have created a Blocklist with 1 entry:
Term: [鎅𰾛𠝹]手
Type: Regex

It can block inputs with ease:
{ "error": { "message": "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", "type": null, "param": "prompt", "code": "content_filter", "status": 400, "innererror": { "code": "ResponsibleAIPolicyViolation", "content_filter_result": { "custom_blocklists": { "details": [ { "filtered": true, "id": "ChineseBlockList" } ], "filtered": true }, "hate": { "filtered": false, "severity": "safe" }, "profanity": { "filtered": false, "detected": false }, "self_harm": { "filtered": false, "severity": "safe" }, "sexual": { "filtered": false, "severity": "safe" }, "violence": { "filtered": false, "severity": "safe" } } } } }

However, it cannot block outputs.
{ "choices": [ { "content_filter_result": { "error": { "code": "content_filter_error", "message": "The contents are not filtered" } }, "content_filter_results": {}, "finish_reason": "stop", "index": 0, "logprobs": null, "message": { "content": "𠝹手(也寫作“拍手”)是一種手部動作,通常是將雙手合攏並用力拍打在一起,發出聲音。這個動作常用於表達讚賞、鼓勵或慶祝,像是在演出結束後觀眾的掌聲,或是在某些活動中用來引起注意。𠝹手也可以用於節奏感的表達,像是在音樂中隨著節拍拍手。這個動作在許多文化中都有其獨特的意義和用途。", "refusal": null, "role": "assistant" } } ], "created": 1737702254, "id": "chatcmpl-At81eUTIzDkZPCKznSKr19YMJU1ud", "model": "gpt-4o-mini-2024-07-18", "object": "chat.completion", "prompt_filter_results": [ { "prompt_index": 0, "content_filter_results": { "custom_blocklists": { "filtered": false }, "hate": { "filtered": false, "severity": "safe" }, "profanity": { "filtered": false, "detected": false }, "self_harm": { "filtered": false, "severity": "safe" }, "sexual": { "filtered": false, "severity": "safe" }, "violence": { "filtered": false, "severity": "safe" } } } ], "system_fingerprint": "fp_5154047bf2", "usage": { "completion_tokens": 138, "completion_tokens_details": { "accepted_prediction_tokens": 0, "audio_tokens": 0, "reasoning_tokens": 0, "rejected_prediction_tokens": 0 }, "prompt_tokens": 34, "prompt_tokens_details": { "audio_tokens": 0, "cached_tokens": 0 }, "total_tokens": 172 } }
905 Views, 0 likes, 4 Comments
Dell APEX File Storage for Microsoft Azure brings a powerful new option to our customers
Dell PowerScale OneFS has been trusted by customers across all industries to provide performant, resilient, and scalable multiprotocol file storage for nearly two decades. At Ignite 2024 we announced a new Dell-managed variant that complements the existing offering to give you a powerful choice from a proven industry leader.
971 Views, 0 likes, 0 Comments
Search Selected Documents with Azure Open AI BYOD Search
I have an Azure Blob container with 3 documents (Doc 1, Doc 2, Doc 3) that are searched through Azure OpenAI and the Azure Search service, which works perfectly fine. My requirement is to restrict the search to only Doc 2 and Doc 3, but I'm unable to find any option for this. How can this be done? Below is the sample body:
{ "dataSources": [ { "type": "AzureCognitiveSearch", "parameters": { "endpoint": "https://@@@@@@@@@@@@@.search.windows.net", "key": "@@@@@@@@@@@@@", "indexName": "abcd-index", "semanticConfiguration": "", "queryType": "simple", "fieldsMapping": { "contentFieldsSeparator": "\n", "contentFields": [ "content" ], "filepathField": null, "titleField": "title", "urlField": null }, "inScope": true, "roleInformation": "You are an AI assistant that helps people find information." } } ], "messages": [ { "role": "user", "content": "tell me about solution versioning" } ] }
657 Views, 0 likes, 4 Comments