SubgenAI makes AI practical, scalable, and sustainable with Azure Database for PostgreSQL
Authors: Abe Omorogbe, Senior Program Manager at Microsoft, and Julia Schröder Langhaeuser, VP of Product, Serenity Star, at SubgenAI

AI agents are thriving in pilots and prototypes. However, scaling them across organizations is more difficult. A recent MIT report shows that 95 percent of projects fail to reach production. Long development cycles, lack of observability, and compliance hurdles leave enterprises struggling to deliver production-ready agents. SubgenAI, a European generative AI company that focuses on democratizing AI for businesses and governments, saw an opportunity to change this. Its flagship platform, Serenity Star, transforms AI agent development from a code-heavy, fragmented process into a streamlined, no-code experience. Built on Microsoft Azure Database for PostgreSQL, Semantic Kernel, and Microsoft Foundry, Serenity Star empowers organizations to deploy production-grade AI agents in minutes, not months.

SubgenAI’s mission is to make generative AI accessible, scalable, and secure for every organization. Whether you're a startup or a multinational, Serenity Star offers the tools to build intelligent agents tailored to your business logic, with full control over data and deployment.

“Many things must happen around it in the coming years. Serenity Star is designed to solve problems like data control, compliance, and decision ethics—so companies can unleash the full potential of generative AI without compromising trust or profitability.” - Lorenzo Serratosa

Simplifying complex AI agent development

Technical and operational challenges are inherent in enterprise-wide AI agent deployments. Examples include time-consuming iteration cycles, lack of observability and cost control, security concerns, and data sovereignty requirements. Serenity Star addresses these pain points by handling the entire AI agent lifecycle while providing enterprise-grade security and compliance features. Users can focus on defining their agent's purpose and behavior rather than wrestling with technical implementation details. Its framework focuses on four essentials for AI agents: the brain (underlying model), knowledge (accessible information), behavior (programmed responses), and tools (external system integrations). This framework directly influenced the technology stack choices for Serenity Star, with Azure Database for PostgreSQL powering the knowledge retrieval and Semantic Kernel enabling flexible model orchestration.

Real-world architecture in action

When a user query comes in, Serenity Star uses the vector capabilities of Azure Database for PostgreSQL to retrieve the most relevant knowledge. That context, combined with the user’s input, forms a complete prompt. Semantic Kernel then routes the request to the right large language model, ensuring the agent delivers accurate and context-aware responses. Serenity Star’s native connectors to platforms such as Microsoft Teams, WhatsApp, and Google Tag Manager are also part of this architecture, delivering answers directly in the collaboration and communication tools enterprises already use every day.

Figure 1: Serenity Star Architecture

This routing and orchestration architecture applies to both the multi-tenant SaaS deployments and the dedicated customer instances offered by Serenity Star. Azure Database for PostgreSQL provides native Row-Level Security (RLS) capabilities, a key advantage for securely managing multi-tenant environments.
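To make the multi-tenancy point concrete, here is a minimal sketch of the RLS pattern described above. It is not Serenity Star's actual schema: the table, columns, and connection string are illustrative placeholders.

```python
import psycopg

# Hypothetical schema: a shared "documents" table with a tenant_id column.
with psycopg.connect("postgresql://localhost/serenity") as conn:
    # Enable RLS and add a policy that scopes every query to one tenant.
    # (Policies apply to ordinary application roles, not the table owner.)
    conn.execute("ALTER TABLE documents ENABLE ROW LEVEL SECURITY")
    conn.execute(
        """
        CREATE POLICY tenant_isolation ON documents
        USING (tenant_id = current_setting('app.tenant_id')::uuid)
        """
    )
    # Each request sets its tenant before querying; the policy then
    # filters reads automatically, with no WHERE clause needed.
    conn.execute(
        "SELECT set_config('app.tenant_id', %s, false)",
        ("00000000-0000-0000-0000-000000000001",),
    )
    rows = conn.execute("SELECT id, title FROM documents").fetchall()
```

Because the policy is enforced by the database itself, an application bug cannot leak one tenant's rows into another tenant's session.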
Multi-tenant deployments allow organizations to get started quickly with lower overhead, while dedicated instances meet the needs of enterprises with strict compliance and data sovereignty requirements.

Optimizing for scale

The same architecture that powers retrieval, routing, and multi-channel delivery also provides a foundation for performance at scale. As adoption grows, the team continuously monitors query volume, response times, and resource efficiency across both multi-tenant and dedicated environments. To stay ahead of demand, SubgenAI actively experiments with new Azure Database for PostgreSQL features such as DiskANN for faster vector search. These optimizations keep latency low even as more users and connectors are added. The result is a platform that maintains sub-60-second response times for 99 percent of chat generations, regardless of deployment model or integration point. With this systematic approach to scaling, organizations can deploy fully functional AI agents connected to their preferred communication platforms in just 15 minutes instead of hours. For enterprises that have struggled with failed AI projects, Serenity Star offers not only a secure and compliant solution but also one proven to grow with their needs.

Why Azure Database for PostgreSQL is a cornerstone

The knowledge component of AI agents relies heavily on retrieval-augmented generation (RAG) systems that perform similarity searches against embedded content. This requires a database capable of handling efficient vector search while maintaining enterprise-grade reliability and security. SubgenAI evaluated multiple vector database options, and Azure Database for PostgreSQL with pgvector emerged as the clear winner, for several compelling reasons. One is its mature technology, which provides immediate credibility with enterprise customers. Two is the ability to scale GenAI use cases with features like DiskANN for accurate and scalable vector search. Three is the flexibility and appeal of using an open-source database with a vibrant and fast-moving community. As CPO Leandro Harillo explains: “When we tell them their data runs on Azure Database for PostgreSQL, it’s a relief. It's a well-known technology versus other options that were born with this new AI revolution.”

As an open-source relational database management system, Azure Database for PostgreSQL offers extensibility and seamless integration with Microsoft’s enterprise ecosystem. It has a trusted reputation that appeals to organizations with strict data sovereignty and compliance requirements, such as those in healthcare and insurance, where reliability and governance are non-negotiable. The integration with Azure's broader ecosystem also simplified implementation. With Serenity Star built entirely on Azure infrastructure, Azure Database for PostgreSQL provided seamless connectivity and consistent performance characteristics. The result is the fast response times necessary for real-time agent interactions, along with the reliability demanded by enterprise customers.

Semantic Kernel: Enabling model flexibility at scale

Enterprise AI success requires the ability to experiment with different models and adapt quickly as technology evolves. Semantic Kernel makes this possible, supporting over 300 LLMs and embedding models through a unified interface. With Serenity Star, organizations can make genuine choices about their AI implementations without vendor lock-in.
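As an illustration of that unified interface, here is a minimal sketch using the Semantic Kernel Python SDK. The service ids, deployment names, endpoint, and key are placeholders rather than Serenity Star's configuration.

```python
import asyncio

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory

kernel = Kernel()
# Register two interchangeable chat models behind stable service ids.
kernel.add_service(AzureChatCompletion(
    service_id="primary",
    deployment_name="gpt-4o",
    endpoint="https://example.openai.azure.com",
    api_key="<key>",
))
kernel.add_service(AzureChatCompletion(
    service_id="fallback",
    deployment_name="gpt-4o-mini",
    endpoint="https://example.openai.azure.com",
    api_key="<key>",
))

async def ask(service_id: str, prompt: str) -> str:
    # Swapping models is a one-string change: pick a different service id.
    chat = kernel.get_service(service_id)
    history = ChatHistory()
    history.add_user_message(prompt)
    reply = await chat.get_chat_message_content(
        chat_history=history,
        settings=OpenAIChatPromptExecutionSettings(),
    )
    return str(reply)

print(asyncio.run(ask("primary", "Summarize our return policy.")))
```

Because the application code only ever references a service id, moving to a new model is a configuration change, which is exactly the flexibility described above.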
Companies can use embedding models from OpenAI through Azure deployments, ensuring their information remains in their own infrastructure while accessing cutting-edge capabilities. If business requirements change or new models emerge, switching becomes a configuration change rather than a development project. Semantic Kernel's comprehensive connector ecosystem also accelerated SubgenAI's own development process. Interfaces for different vector databases enabled rapid prototyping and comparison during the evaluation phase. “Semantic Kernel helped us to be able to try the different ones and choose the one that fit better for us,” notes Julia Schröder Langhaeuser, VP of Product. The SubgenAI team has also extended Semantic Kernel to support more features in Azure Database for PostgreSQL, work made easier by how well-known and popular PostgreSQL is, and has contributed improvements back to the community. This collaborative approach ensures the platform benefits from the latest developments while helping advance the broader ecosystem.

Proven impact of Azure Database for PostgreSQL across industries

Organizations often fail to deliver production-ready agents because of long development cycles, lack of observability, and compliance hurdles. Against that backdrop, the effectiveness of Azure Database for PostgreSQL and other Azure services is reflected in deployment metrics and customer feedback. Production-ready agents typically require around 30 iterations for basic implementations, and complex use cases demand significantly more refinement. One GenAI customer in medical education required over 200 iterations to perfect an agent that evaluates medical students through complex case analysis. Azure Database for PostgreSQL and other Azure services support hour-long iteration cycles rather than week-long sprints, which made this level of refinement economically feasible. Cost efficiency is another significant advantage. SubgenAI provisions and configures models in Microsoft Foundry, which eliminates idle GPU resources while providing detailed cost breakdowns. Users can see exactly how tokens are consumed across prompt text, RAG context, and tool usage, enabling data-driven optimization decisions. Consulting partnerships validate the platform's market position. One consulting firm with 50,000 employees is delighted with the easier implementation, faster deployment, and reliable production performance.

Conclusion

The combination of Azure Database for PostgreSQL and Semantic Kernel has enabled SubgenAI to address the fundamental challenges that cause 95 percent of enterprise AI projects to fail. Organizations using Serenity Star bypass the traditional barriers of lengthy development cycles, limited observability, and compliance hurdles that typically derail AI initiatives. The platform's architecture delivers measurable results, including a 50 percent reduction in coding time, support for complex agents requiring 200+ iterations, and deployment capabilities that compress months-long projects into 15-minute implementations. Azure Database for PostgreSQL provides the enterprise-grade foundation that customers in regulated industries require, while Semantic Kernel ensures organizations retain flexibility as AI technology evolves. This technological partnership creates a reliable pathway for companies to deploy production-ready AI agents without sacrificing data sovereignty or operational control.
Through the reliability of Azure Database for PostgreSQL and the flexibility of Semantic Kernel, Serenity Star delivers an enterprise-ready foundation that makes AI practical, scalable, and sustainable.

Staying in the flow: SleekFlow and Azure turn customer conversations into conversions
A customer adds three items to their cart but never checks out. Another asks about shipping, gets stuck waiting eight minutes, only to drop the call. A lead responds to an offer but is never followed up with in time. Each of these moments represents lost revenue, and they happen to businesses every day. SleekFlow was founded in 2019 to help companies turn those almost-lost-customer moments into connection, retention, and growth. Today we serve more than 2,000 mid-market and enterprise organizations across industries including retail and e-commerce, financial services, healthcare, travel and hospitality, telecommunications, real estate, and professional services. In total, those customers rely on SleekFlow to orchestrate more than 600,000 daily customer interactions across WhatsApp, Instagram, web chat, email, and more.

Our name reflects what makes us different. Sleek is about unified, polished experiences—consolidating conversations into one intelligent, enterprise-ready platform. Flow is about orchestration—AI and human agents working together to move each conversation forward, from first inquiry to purchase to renewal.

The drive for enterprise-ready agentic AI

Enterprises today expect always-on, intelligent conversations—but delivering that at scale proved daunting. When we set out to build AgentFlow, our agentic AI platform, we quickly ran into familiar roadblocks: downtime that disrupted peak-hour interactions, vector search delays that hurt accuracy, and costs that ballooned under multi-tenant workloads. Development slowed from limited compatibility with other technologies, while customer onboarding stalled without clear compliance assurances. To move past these barriers, we needed a foundation that could deliver the performance, trust, and global scale enterprises demand.

The platform behind the flow: How Azure powers AgentFlow

We chose Azure because building AgentFlow required more than raw compute power. Chatbots built on a single-agent model often stall out. They struggle to retrieve the right context, they miss critical handoffs, and they return answers too slowly to keep a customer engaged. To fix that, we needed an ecosystem capable of supporting a team of specialized AI agents working together at enterprise scale. Azure Cosmos DB provides the backbone for memory and context, managing short-term interactions, long-term histories, and vector embeddings in containers that respond in 15–20 milliseconds. Our agents use Azure OpenAI models within Azure AI Foundry to understand and generate responses natively in multiple languages. Whether in English, Chinese, or Portuguese, the responses feel natural and aligned with the brand. Semantic Kernel acts as the conductor, orchestrating multiple agents, each of which retrieves the necessary knowledge and context, including chat histories, transactional data, and vector embeddings, directly from Azure Cosmos DB. For example, one agent could be retrieving pricing data, another summarizing it, and a third preparing it for a human handoff. The result is not just responsiveness but accuracy. A telecom provider can resolve a billing question while surfacing an upsell opportunity in the same dialogue. A financial advisor can walk into a call with a complete dossier prepared in seconds rather than hours. A retailer can save a purchase by offering an in-stock substitute before the shopper abandons the cart. Each of these conversations is different, yet the foundation is consistent on AgentFlow.
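A hedged sketch of that memory-and-retrieval pattern with the azure-cosmos Python SDK is shown below. The account, database, container, and embedding dimensions are placeholders, not SleekFlow's actual deployment.

```python
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient("https://<account>.documents.azure.com", credential="<key>")
db = client.get_database_client("agentflow")

# A container for long-term conversation memory with a DiskANN vector index.
container = db.create_container_if_not_exists(
    id="long_term_memory",
    partition_key=PartitionKey(path="/tenantId"),
    vector_embedding_policy={
        "vectorEmbeddings": [{
            "path": "/embedding",
            "dataType": "float32",
            "dimensions": 1536,
            "distanceFunction": "cosine",
        }]
    },
    indexing_policy={
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [{"path": "/embedding/*"}],
        "vectorIndexes": [{"path": "/embedding", "type": "diskANN"}],
    },
)

# Semantic lookup: the five stored snippets closest to a query embedding.
query_embedding = [0.01] * 1536  # placeholder; produced by an embedding model
results = container.query_items(
    query="""
        SELECT TOP 5 c.text, VectorDistance(c.embedding, @q) AS score
        FROM c
        ORDER BY VectorDistance(c.embedding, @q)
    """,
    parameters=[{"name": "@q", "value": query_embedding}],
    enable_cross_partition_query=True,
)
for item in results:
    print(item["score"], item["text"])
```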
Fast, fluent, and focused: Azure keeps conversations moving

Speed is the heartbeat of a good conversation. A delayed answer feels like a dropped call, and an irrelevant one breaks trust. For AgentFlow to keep customers engaged, every operation behind the scenes has to happen in milliseconds. A single interaction can involve dozens of steps. One agent pulls product information from embeddings, another checks it against structured policy data, and a third generates a concise, brand-aligned response. If any of these steps lag, the dialogue falters. On Azure, they don’t. Azure Cosmos DB manages conversational memory and agent state across dedicated containers for short-term exchanges, long-term history, and vector search. Sharded DiskANN indexing powers semantic lookups that resolve in the 15–20 millisecond range—fast enough that the customer never feels a pause. Microsoft’s Phi-4 model, along with Azure OpenAI in Foundry Models such as o3-mini and o4-mini, provides the reasoning, and Azure Container Apps scale elastically, so performance holds steady during event-driven bursts, such as campaign broadcasts that can push the platform from a few to thousands of conversations per minute, and during daily peak-hour surges. To support that level of responsiveness, we run Azure Container Apps on the Pay-As-You-Go consumption plan, using KEDA-based autoscaling to expand from five idle containers to more than 160 within seconds. Meanwhile, Microsoft Orleans coordinates lightweight in-memory clustering to keep conversations sleek and flowing. The results are tangible. Retrieval-augmented generation recall improved from 50 to 70 percent. Execution speed is about 50 percent faster. For SleekFlow’s customers, that means carts are recovered before they’re abandoned, leads are qualified in real time, and support inquiries move forward instead of stalling out. With Azure handling the complexity under the hood, conversations flow naturally on the surface—and that’s what keeps customers engaged.

Secure enough for enterprises, human enough for customers

AgentFlow was built with security-by-design as a first principle, giving businesses confidence that every interaction is private, compliant, and reliable. On Azure, every AI agent operates inside guardrails enterprises can depend on. Azure Cosmos DB enforces strict per-tenant isolation through logical partitioning, encryption, and role-based access control, ensuring chat histories, knowledge bases, and embeddings remain auditable and contained. Models deployed through Azure AI Foundry, including Azure OpenAI and Microsoft Phi, process data entirely within SleekFlow’s Azure environment, with a guarantee that it is never used to train public models and with activity logged for transparency. And Azure’s certifications—including ISO 27001, SOC 2, and GDPR—are backed by continuous monitoring and regional data residency options, proving compliance at global scale. But trust is more than a checklist of certifications. AgentFlow brings human-like fluency and empathy to every interaction, powered by Azure OpenAI running with high token-per-second throughput so responses feel natural in real time. Quality control isn’t left to chance. Human override workflows are orchestrated through Azure Container Apps and Azure App Service, ensuring AI agents can carry conversations confidently until they’re ready to hand off to human agents. Enterprises gain the confidence to let AI handle revenue-critical moments, knowing Azure provides the foundation and SleekFlow provides the human-centered design.
Shaping the next era of conversational AI on Azure

The benefits of Azure show up not only in customer conversations but also in the way our own teams work. Faster processing speeds and high token-per-second throughput reduce latency, so we spend less time debugging and more time building. Stable infrastructure minimizes downtime and troubleshooting, lowering operational costs. That same reliability and scalability have transformed the way we engineer AgentFlow. AgentFlow started as part of our monolithic system. Shipping new features used to take about a month of development and another week of heavy testing to make sure everything held together. After moving AgentFlow to a microservices architecture on Azure Container Apps, we can now deploy updates almost daily with no downtime or customer impact, thanks to native support for rolling updates and blue-green deployments. This agility is what excites us most about what's ahead. With Azure as our foundation, SleekFlow is not simply keeping pace with the evolution of conversational AI—we are shaping what comes next. Every interaction we refine, every second we save, and every workflow we streamline brings us closer to our mission: keeping conversations sleek, flowing, and valuable for enterprises everywhere.

Build Smarter with Azure HorizonDB
By: Maxim Lukiyanov, PhD, Principal PM Manager; Abe Omorogbe, Senior Product Manager; Shreya R. Aithal, Product Manager II; Swarathmika Kakivaya, Product Manager II

Today, at Microsoft Ignite, we are announcing a new PostgreSQL database service - Azure HorizonDB. You can read the announcement here, and in this blog you can learn more about HorizonDB’s AI features and development tools. Azure HorizonDB is designed for the full spectrum of modern database needs - from quickly building new AI applications, to scaling enterprise workloads to unprecedented levels of performance and availability, to managing your databases efficiently and securely. To help with building new AI applications, we are introducing three features: DiskANN Advanced Filtering, built-in AI model management, and integration with Microsoft Foundry. To help with database management, we are introducing a set of new capabilities in the PostgreSQL extension for Visual Studio Code, as well as announcing General Availability of the extension. Let’s dive into the AI features first.

DiskANN Advanced Filtering

We are excited to announce a new enhancement in Microsoft’s state-of-the-art vector indexing algorithm DiskANN – DiskANN Advanced Filtering. Advanced Filtering addresses a common problem in vector search – combining vector search with filtering. In real-world applications where queries often include constraints like price ranges, ratings, or categories, traditional vector search approaches, such as pgvector’s HNSW, rely on multi-step retrieval and post-filtering, which can make search extremely slow. DiskANN Advanced Filtering solves this by combining filter and search into one operation: while the graph of vectors is traversed during the vector search, each vector is also checked for a filter predicate match, ensuring that only the correct vectors are retrieved. Under the hood, it works in a three-step process: first creating a bitmap of relevant rows using indexes on attributes such as price or rating, then performing a filter-aware graph traversal against the bitmap, and finally validating and ordering the results for accuracy. This integrated approach delivers dramatically faster and more efficient filtered vector searches. Initial benchmarks show that enabling Advanced Filtering on DiskANN reduces query latency by up to 3x, depending on filter selectivity.

AI Model Management

Another exciting feature of HorizonDB is AI Model Management. This feature automates Microsoft Foundry model provisioning during database deployment and instantly activates database semantic operators. This eliminates dozens of setup and configuration steps and simplifies the development of new AI apps and agents. AI Model Management elevates the experience of using semantic operators within PostgreSQL. When activated, it provisions key models for embedding, semantic ranking, and generation via Foundry; installs and configures the azure_ai extension to enable the operators; establishes secure connections; and integrates model management, monitoring, and cost management within HorizonDB. What would otherwise require significant manual effort and context-switching between Foundry and PostgreSQL for configuration, management, and monitoring is now possible with just a few clicks, all without leaving the PostgreSQL environment. You can also continue to bring your own Foundry models, with a simplified and enhanced process for registering your custom model endpoints in the azure_ai extension.
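To ground these two features, here is a hedged sketch of the filtered-search pattern Advanced Filtering accelerates, written against the pg_diskann and azure_ai conventions of Azure Database for PostgreSQL. HorizonDB is in private preview, so its exact syntax may differ, and the table, columns, and embedding deployment name below are invented for illustration.

```python
import psycopg

with psycopg.connect("postgresql://localhost/catalog") as conn:
    # DiskANN index for similarity search, plus a btree index that the
    # engine can turn into the filter bitmap for graph traversal.
    conn.execute("CREATE EXTENSION IF NOT EXISTS pg_diskann CASCADE")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS products_emb_idx "
        "ON products USING diskann (embedding vector_cosine_ops)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS products_price_idx ON products (price)")

    # One statement that embeds the query text (via the azure_ai extension)
    # and runs a filtered search: candidates are checked against the price
    # predicate while the graph is walked, not post-filtered afterwards.
    rows = conn.execute(
        """
        WITH q AS (
            SELECT azure_openai.create_embeddings(
                       'text-embedding-3-small', %s)::vector AS v
        )
        SELECT p.id, p.name, p.price
        FROM products p, q
        WHERE p.price < %s
        ORDER BY p.embedding <=> q.v
        LIMIT 10
        """,
        ("lightweight waterproof hiking jacket", 100),
    ).fetchall()
```

The key point is that the attribute filter and the similarity ranking live in one query, so the engine can combine them during traversal instead of over-fetching and discarding results.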
Microsoft Foundry Integration

Microsoft Foundry offers a comprehensive technology stack for building AI apps and agents. But building modern agents capable of reasoning, acting, and collaborating is impossible without a connection to data. To facilitate that connection, we are excited to announce a new PostgreSQL connector in Microsoft Foundry. The connector is designed around a new standard in data connectivity, the Model Context Protocol (MCP). It enables Foundry agents to interact with HorizonDB securely and intelligently, using natural language instead of SQL, and leveraging Microsoft Entra ID to ensure a secure connection. In addition to HorizonDB, this connector also supports Azure Database for PostgreSQL (ADP). This integration allows Foundry agents to perform tasks like:

- Exploring database schemas
- Retrieving records and insights
- Performing analytical queries
- Executing vector similarity searches for semantic search use cases

All through natural language, without compromising enterprise security or compliance. To get started with Foundry Integration, follow these setup steps to deploy your own HorizonDB (requires participation in the Private Preview) or ADP and connect it to Foundry in just a few steps.

PostgreSQL extension for VS Code is Generally Available

We’re excited to announce that the PostgreSQL extension for Visual Studio Code is now Generally Available. This extension has garnered significant popularity within the PostgreSQL community since its preview in May 2025, reaching more than 200K installs. It is the easiest way to connect to a PostgreSQL database from your favorite editor, manage your databases, and take advantage of built-in AI capabilities without ever leaving VS Code. The extension works with any PostgreSQL database, whether it's on-premises or in the cloud, and also supports unique features of Azure HorizonDB and Azure Database for PostgreSQL (ADP). One of the key new capabilities is Metrics Intelligence, which uses Copilot and real-time telemetry of HorizonDB or ADP to help you diagnose and fix performance issues in seconds. Instead of digging through logs and query plans, you can open the Performance Dashboard, see a CPU spike, and ask Copilot to investigate. The extension sends a rich prompt that tells Copilot to analyze live metrics, identify the root cause, and propose an actionable fix. For example, Copilot might find a full table scan on a large table, recommend a composite index on the filter columns, create that index, and confirm the query plan now uses it. The result is dramatic: you can investigate and resolve the CPU spike in seconds, with no manual scripting or guesswork, and with no prior PostgreSQL expertise required. The extension also makes it easier to work with graph data. HorizonDB and ADP support the open-source graph extension Apache AGE, which turns these services into fully managed graph databases. You can run graph queries against HorizonDB and immediately visualize the results as an interactive graph inside VS Code. This helps you understand relationships in your data faster, whether you’re exploring customer journeys, network topologies, or knowledge graphs - all without switching tools.
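As a taste of what those graph queries look like, here is a minimal Apache AGE sketch that should run against any PostgreSQL server with the extension installed; the graph name and labels are invented for illustration.

```python
import psycopg

with psycopg.connect("postgresql://localhost/appdb") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS age")
    conn.execute("LOAD 'age'")
    conn.execute('SET search_path = ag_catalog, "$user", public')
    conn.execute("SELECT create_graph('customer_journeys')")

    # Create a small piece of the graph with openCypher.
    conn.execute("""
        SELECT * FROM cypher('customer_journeys', $$
            CREATE (:Customer {name: 'Ada'})-[:VIEWED]->(:Product {sku: 'X1'})
        $$) AS (v agtype)
    """)

    # Query relationships; in VS Code the same result can be rendered
    # as an interactive graph instead of rows.
    rows = conn.execute("""
        SELECT * FROM cypher('customer_journeys', $$
            MATCH (c:Customer)-[:VIEWED]->(p:Product)
            RETURN c.name, p.sku
        $$) AS (name agtype, sku agtype)
    """).fetchall()
    print(rows)
```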
In Conclusion

Azure HorizonDB brings together everything teams need to build, run, and manage modern, AI-powered applications on PostgreSQL. With DiskANN Advanced Filtering, you can deliver low-latency, filtered vector search at scale. With built-in AI Model Management and Microsoft Foundry integration, you can provision models, wire up semantic operators, and connect agents to your data with far fewer steps and far less complexity. And with the PostgreSQL extension for Visual Studio Code, you get an intuitive, AI-assisted experience for performance tuning and graph visualization, right inside the tools you already use. HorizonDB is now available in private preview. If you’re interested in building AI apps and agents on a fully managed, PostgreSQL-compatible service with built-in AI and rich developer tooling, sign up for the Private Preview: https://aka.ms/PreviewHorizonDB

Adopting Hybrid Search with Azure Cosmos DB
In an era where data accessibility and retrieval are crucial, Azure Cosmos DB introduces Hybrid Search, a cutting-edge feature that merges the capabilities of Vector Search and Full-Text Search. This integration enhances search relevance by combining semantic understanding with traditional keyword-based methods, making it ideal for diverse applications such as e-commerce, content management, and AI-driven chatbots. The blog provides a comprehensive guide on enabling and configuring Hybrid Search within Azure Cosmos DB, detailing the processes for setting up Vector Search and Full-Text Search. It also explores the underlying mechanics of Hybrid Search, which utilizes Reciprocal Rank Fusion (RRF) to combine multiple scoring functions for more accurate search results. Additionally, practical use cases and a step-by-step project example demonstrate how to implement an enterprise knowledge management system using Nest.js integrated with Azure Cosmos DB's Hybrid Search capabilities. This powerful combination offers developers and businesses the tools needed to create sophisticated, efficient, and intelligent search experiences within their applications.
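As a closing illustration, here is a hedged sketch of what such a hybrid query can look like with the azure-cosmos Python SDK, assuming a container already configured with full-text and vector index policies. The names, keyword list, and query embedding are placeholders, and the exact FullTextScore argument shape can vary by API version.

```python
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com", credential="<key>")
container = client.get_database_client("knowledge").get_container_client("articles")

query_embedding = [0.02] * 1536  # placeholder embedding of the user's question

# Reciprocal Rank Fusion merges the keyword ranking and the vector ranking
# into one relevance-ordered result set.
results = container.query_items(
    query="""
        SELECT TOP 10 c.id, c.title
        FROM c
        ORDER BY RANK RRF(
            FullTextScore(c.content, 'onboarding', 'policy'),
            VectorDistance(c.embedding, @q)
        )
    """,
    parameters=[{"name": "@q", "value": query_embedding}],
    enable_cross_partition_query=True,
)
for item in results:
    print(item["id"], item["title"])
```

RRF scores each document by the reciprocal of its rank in each list and sums them, so a document that ranks well on both keywords and semantics rises to the top even if it wins neither list outright.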