January 2026 Recap: Azure Database for PostgreSQL
We just dropped the January 2026 recap for Azure Database for PostgreSQL, and this one's all about developer velocity, resiliency, and production-ready upgrades:

• PostgreSQL 18 support via Terraform (create + upgrade)
• Premium SSD v2 (Preview) with HA, replicas, Geo-DR & MVU
• Latest PostgreSQL minor version releases
• Ansible module GA with latest REST API features
• Zone-redundant HA now configurable via Azure CLI
• SDKs GA (Go, Java, JS, .NET, Python) on stable APIs

Read the full January 2026 recap here and see what's new (and what's coming).

Supporting ChatGPT on PostgreSQL in Azure
Affan Dar, Vice President of Engineering, PostgreSQL at Microsoft; Adam Prout, Partner Architect, PostgreSQL at Microsoft; Panagiotis Antonopoulos, Distinguished Engineer, PostgreSQL at Microsoft

The OpenAI engineering team recently published a blog post describing how they scaled their databases by 10x over the past year to support 800 million monthly users. To do so, OpenAI relied on Azure Database for PostgreSQL to support important services like ChatGPT and the Developer API. Collaborating with a customer experiencing rapid user growth has been a remarkable journey. One key observation is that PostgreSQL works out of the box even at very large scale. As many in the public domain have noted, ChatGPT grew to 800M+ users before OpenAI started moving new and shardable workloads to Azure Cosmos DB. Nevertheless, supporting the growth of one of the largest Postgres deployments was a great learning experience for both of our teams. Our OpenAI friends did an incredible job of reacting fast and adjusting their systems to handle the growth. Similarly, the Postgres team at Azure worked to further tune the service to support the increasing OpenAI workload. The changes we made were not limited to OpenAI, so all our Azure Database for PostgreSQL customers with demanding workloads have benefited. A few of the enhancements, and the work that led to them, are described below.

Changing the network congestion protocol to reduce replication lag

Azure Database for PostgreSQL used the default CUBIC congestion control algorithm for replication traffic to replicas both within and outside the region. Leading up to one of the OpenAI launch events, we observed that several geo-distributed read replicas occasionally experienced replication lag. Replication from the primary server to the read replicas would typically operate without issues; however, at times, the replicas would unexpectedly begin falling behind the primary for reasons that were not immediately clear. This lag would not recover on its own and would grow to the point where, eventually, automation would restart the read replica. Once restarted, the read replica would catch up again, only to repeat the cycle within a day or less. After an extensive debugging effort, we traced the root cause to how the TCP congestion control algorithm handled a higher rate of packet drops. These drops were largely a result of high point-to-point traffic between the primary server and its replicas, compounded by the existing TCP window settings. Packet drops across regions are not unexpected; however, the default congestion control algorithm (CUBIC) treats packet loss as a sign of congestion and backs off aggressively. In comparison, the Bottleneck Bandwidth and Round-trip propagation time (BBR) congestion control algorithm is less sensitive to packet drops. Switching to BBR, adding SKU-specific TCP window settings, and switching to the fair queuing network discipline (which can control pacing of outgoing packets at the hardware level) resolved this issue, as sketched below. We'll also note that one of our seasoned PostgreSQL committers provided invaluable insights during this process, helping us pinpoint the issue more effectively.
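For readers who want to see what these knobs look like in practice, here is a minimal sketch of the generic Linux equivalents. The SKU-specific TCP window values Azure uses are not public; the sysctl keys below are the standard kernel settings for congestion control and queuing discipline, applied via Python purely for illustration (assumes a Linux host, root privileges, and a kernel with BBR available).

import subprocess

# Generic Linux equivalents of the changes described above (illustrative only;
# Azure's SKU-specific TCP window settings are not published).
SETTINGS = {
    "net.ipv4.tcp_congestion_control": "bbr",  # replace CUBIC with BBR
    "net.core.default_qdisc": "fq",            # fair queuing paces outgoing packets
}

for key, value in SETTINGS.items():
    subprocess.run(["sysctl", "-w", f"{key}={value}"], check=True)

# Confirm the active congestion control algorithm
out = subprocess.run(["sysctl", "net.ipv4.tcp_congestion_control"],
                     capture_output=True, text=True, check=True)
print(out.stdout.strip())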
Scaling out with Read replicas

PostgreSQL primaries, if configured properly, work amazingly well in supporting a large number of read replicas. In fact, as noted in the OpenAI engineering blog, a single primary has been able to power 50+ replicas across multiple regions. However, going beyond this increases the chance of impacting the primary. For this reason, we added cascading replica support to scale out reads even further. But this brings in a number of additional failure modes that need to be handled. The system must carefully orchestrate repairs around lagging and failing intermediary nodes, safely repointing replicas to new intermediary nodes while performing catch-up or rewind in a mission-critical setup. Furthermore, disaster recovery (DR) scenarios can require a fast rebuild of a replica, and as data movement across regions is a costly and time-consuming operation, we developed the ability to create a geo replica from a snapshot of another replica in the same region. This avoids the traditional full data copy process, which may take hours or even days depending on the size of the data, by leveraging data for that cluster that already exists in that region. This feature will soon be available to all our customers as well.

Scaling out Writes

These improvements solved the read replica lag problems and read scale but did not address the growing write scale for OpenAI. At some point, the balance tipped and it was obvious that the IOPS limits of a single PostgreSQL primary instance would not cut it anymore. As a result, OpenAI decided to move new and shardable workloads to Azure Cosmos DB, which is our default recommended NoSQL store for fully elastic workloads. However, some workloads, as noted in the OpenAI blog, are much harder to shard. While OpenAI is using Azure Database for PostgreSQL flexible server, several of the write scaling requirements that came up have been baked into our new Azure HorizonDB offering, which entered private preview in November 2025. Some of the architectural innovations are described in the following sections.

Azure HorizonDB scalability design

To better support more demanding workloads, Azure HorizonDB introduces a new storage layer for Postgres that delivers significant performance and reliability enhancements:

• More efficient read scale-out. Postgres read replicas no longer need to maintain their own copy of the data; they can read pages from the single copy maintained by the storage layer.
• Lower latency Write-Ahead Logging (WAL) writes and higher throughput page reads via two purpose-built storage services designed for WAL storage and Page storage.
• Durability and high availability responsibilities are shifted from the Postgres primary to the storage layer, allowing Postgres to dedicate more resources to executing transactions and queries.
• Postgres failovers are faster and more reliable.

To understand how Azure HorizonDB delivers these capabilities, let's look at its high-level architecture as shown in Figure 1. It follows a log-centric storage model, where the PostgreSQL write-ahead log (WAL) is the sole mechanism used to durably persist changes to storage. PostgreSQL compute nodes never write data pages to storage directly in Azure HorizonDB. Instead, pages and other on-disk structures are treated as derived state and are reconstructed and updated from WAL records by the data storage fleet. Azure HorizonDB storage uses two separate storage services for WAL and data pages. This separation allows each to be designed and optimized for the very different patterns of reads and writes PostgreSQL performs against WAL files in contrast to data pages. The WAL server is optimized for very low latency writes to the tail of a sequential WAL stream, and the Page server is designed for random reads and writes across potentially many terabytes of pages.
These two separate services work together to enable Postgres to handle IO-intensive OLTP workloads like OpenAI's. The WAL server can durably write a transaction across 3 availability zones using a single network hop. The typical PostgreSQL replication setup with a hot standby (Figure 2) requires 4 hops to do the same work. Each hop is a component that can potentially fail or slow down and delay a commit. The Azure HorizonDB page service can scale out page reads to many hundreds of thousands of IOPS for each Postgres instance. It does this by sharding the data in Postgres data files across a fleet of page servers, spreading the reads across many high-performance NVMe disks on each page server.

Figure 2 - WAL Writes in HorizonDB

Another key design principle for Azure HorizonDB was to move durability and high availability related work off PostgreSQL compute, allowing it to operate as a stateless compute engine for queries and transactions. This approach gives Postgres more CPU, disk, and network to run your application's business logic. Table 1 summarizes the different tasks that community PostgreSQL has to do which Azure HorizonDB moves to its storage layer. Work like dirty page writing and checkpointing is no longer done by a Postgres primary. The work of sending WAL files to read replicas is also moved off the primary and into the storage layer; having many read replicas puts no load on the Postgres primary in Azure HorizonDB. Backups are handled by Azure Storage via snapshots; Postgres isn't involved.

Task | Resource Savings | Postgres Process Moved
WAL sending to Postgres replicas | Disk IO, Network IO | walsender
WAL archiving to blob storage | Disk IO, Network IO | archiver
WAL filtering | CPU, Network IO | Shared storage specific (*)
Dirty page writing | Disk IO | background writer
Checkpointing | Disk IO | checkpointer
PostgreSQL WAL recovery | Disk IO, CPU | startup recovering
PostgreSQL read replica redo | Disk IO, CPU | startup recovering
PostgreSQL read replica shared storage | Disk IO | background writer, checkpointer
Backups | Disk IO | pg_dump, pg_basebackup, pg_backup_start, pg_backup_stop
Full page writes | Disk IO | backends doing WAL writing
Hot standby feedback | Vacuum accuracy | walreceiver

Table 1 - Summary of work that the Azure HorizonDB storage layer takes over from PostgreSQL

The shared storage architecture of Azure HorizonDB is the fundamental building block for delivering exceptional read scalability and elasticity, which are critical for many workloads. Users can spin up read replicas instantly without requiring any data copies. Page Servers are able to scale and serve requests from all replicas without any additional storage costs. Since WAL replication is entirely handled by the storage service, the primary's performance is not impacted as the number of replicas changes. Each read replica can scale independently to serve different workloads, allowing for workload isolation. Finally, this architecture allows Azure HorizonDB to substantially improve the overall experience around high availability (HA). HA replicas can now be added without any data copying or storage costs. Since the data is shared between the replicas and continuously updated by Page Servers, secondary replicas only replay a portion of the WAL and can easily keep up with the primary, reducing failover times. The shared storage also guarantees that there is a single source of truth and the old primary never diverges after a failover. This prevents the need for expensive reconciliation using pg_rewind or other techniques, and further improves availability.
Azure HorizonDB was designed from the ground up, with learnings from large-scale customers, to meet the requirements of the most demanding workloads. The improved performance, scalability, and availability of the Azure HorizonDB architecture make Azure a great destination for Postgres workloads.

From Oracle to Azure: How Quadrant Technologies accelerates migrations
This blog was authored by Manikyam Thukkapuram, Director, Alliances & Engineering at Quadrant Technologies; and Thiwagar Bhalaji, Migration Engineer and DevOps Architect at Quadrant Technologies

Over the past 20+ years, Quadrant Technologies has accelerated database modernization for hundreds of organizations. As momentum toward the cloud continues to grow, a major focus for our business has been migrating on-premises Oracle databases to Azure. We've found that landing customers in Azure Database for PostgreSQL has been the best option both in terms of cost savings and efficiency, and Azure Migrate is by far the best way to get them there. With Azure Migrate, we're able to streamline migrations that traditionally took months into weeks.

As a Microsoft solutions partner, we help customers migrate to Azure and develop Azure-based solutions. We're known as "the great modernization specialists" because many of our customers come to us with complex legacy footprints, outdated infrastructure, and monolithic applications that can be challenging to move to the cloud. But we excel at untangling these complex environments. And with our Q-Migrator tool, which is a wrapper around Azure Migrate, we're able to automate and accelerate these kinds of migrations.

Manual steps slowed down timelines

In general, each migration we lead includes a discovery phase, a compatibility assessment, and the migration execution. In discovery, we identify every server, database, and application in a customer's environment and map their interactions. Next, we assess each asset's readiness for Azure and plan for optimal cloud configurations. Finally, we bring the plan to life, integrating applications, moving workloads, and validating performance.

Before adopting Azure Migrate, each of these phases involved manual tasks for our team. During our discovery process we manually collected inventory and wrote custom scripts to track server relationships and database dependencies. Our engineers also had to dig through configuration files and use third-party assessment tools for aspects like VM utilization and Oracle schema. When we mapped compatibility, we worked from static data to predict cost estimates and sizing, as opposed to operating from real-time telemetry. By the time we reached the migration phase, fragmented tooling and inconsistent assessments made it difficult to maintain accuracy and efficiency. Hidden dependencies sometimes surfaced late in the process, causing unexpected rework and delays.

Streamlining migrations with Azure Migrate

To automate and streamline these manual tasks, we developed Q-Migrator, our in-house framework built around Azure Migrate. Now we can offer clients an efficient, agentless approach to discovery, assessment, and migration. As part of our on-premises database migration initiatives, we rely on Azure Migrate to seamlessly migrate a wide range of structured databases (including MySQL, Microsoft SQL Server, PostgreSQL, and Oracle) from on-premises environments to Azure IaaS and PaaS. For instance, for an on-premises PostgreSQL migration, we begin by setting up an Azure Migrate appliance in the client's environment to automatically discover servers, databases, and applications. That generates a complete inventory and dependency map that identifies every relationship between servers and databases. From there, we run an assessment through Azure Migrate to check compatibility, identify blockers, and right-size target environments for Azure Database for PostgreSQL.
By integrating Azure Database Migration Service (DMS), we can replicate data continuously until cutover, ensuring near-zero downtime. In addition, Azure DMS provides robust telemetry and analytics for deep visibility into every stage of the process. This unified and automated workflow not only replaces manual steps but also increases reliability and accelerates delivery. Teams benefit from a consolidated dashboard for planning, execution, and performance tracking, driving efficiency throughout the migration lifecycle.

75% faster deployment, 60% cost savings

Since implementing Azure Migrate, which now facilitates discovery and assessment for on-premises PostgreSQL workloads, we've accelerated deployment by 75% compared to traditional migration methods. We've also reduced costs for our clients by up to 60%. Automated discovery alone reduces that phase by nearly 40%, and dependency mapping now takes a fraction of the effort. With the integrated dashboard in Azure Migrate, we can also track progress across discovery, assessment, and migration in one place. This eliminates the need for multiple third-party tools. These efficiencies allow us to deliver complex migrations on tighter timelines without sacrificing quality or reliability.

Rounding out the modernization journey with AKS

As "the great modernization specialists," we're often asked which is the best database for landing Oracle workloads in the cloud. From our experience, Azure Database for PostgreSQL is ideal for enterprises seeking cost-efficient and secure PostgreSQL deployments. Its managed services reduce operational overhead while maintaining high availability, compliance, and scalability. Plus, seamless integration with Azure AI services allows us to innovate for clients and keep them ahead of the curve.

We also recognize that database migration is only the first step for many clients; modernizing the application layer delivers even greater scalability, security, and manageability. When clients come to Quadrant for a broader modernization strategy, we often use Azure Kubernetes Service (AKS) to containerize their applications and break monoliths into microservices. AKS delivers a cloud-native architecture alongside database modernization. This integration supports DevOps practices, simplifies deployments, and allows customers to take full advantage of elastic cloud infrastructure.

More innovation to come

Overall, Azure Migrate and Azure Database for PostgreSQL, Azure Database for MySQL, and Azure SQL Database have redefined how we deliver database modernization, and our close collaboration with Microsoft has made it possible. By engaging early with Microsoft, we can validate migration architectures and gain insights into best practices for high-performance and secure cloud deployments. Access to Microsoft experts helps us fine-tune our designs, optimize performance, and resolve complex issues quickly. We're also investing in AI-driven automation using Azure OpenAI in Foundry Models to analyze migration data, optimize queries, and predict performance outcomes. These innovations allow us to deliver more intelligent, adaptive solutions tailored to each customer's unique environment.

SubgenAI makes AI practical, scalable, and sustainable with Azure Database for PostgreSQL
Authors: Abe Omorogbe, Senior Program Manager at Microsoft, and Julia Schröder Langhaeuser, VP of Product, Serenity Star at SubgenAI

AI agents are thriving in pilots and prototypes. However, scaling them across organizations is more difficult. A recent MIT report shows that 95 percent of projects fail to reach production. Long development cycles, lack of observability, and compliance hurdles leave enterprises struggling to deliver production-ready agents. SubgenAI, a European generative AI company that focuses on democratizing AI for businesses and governments, saw an opportunity to change this. Its flagship platform, Serenity Star, transforms AI agent development from a code-heavy, fragmented process into a streamlined, no-code experience. Built on Microsoft Azure Database for PostgreSQL, Semantic Kernel, and Microsoft Foundry, Serenity Star empowers organizations to deploy production-grade AI agents in minutes, not months.

SubgenAI's mission is to make generative AI accessible, scalable, and secure for every organization. Whether you're a startup or a multinational, Serenity Star offers the tools to build intelligent agents tailored to your business logic, with full control over data and deployment.

"Many things must happen around it in the coming years. Serenity Star is designed to solve problems like data control, compliance, and decision ethics, so companies can unleash the full potential of generative AI without compromising trust or profitability." - Lorenzo Serratosa

Simplifying complex AI agent development

Technical and operational challenges are inherent in enterprise-wide AI agent deployments. Examples include time-consuming iteration cycles, lack of observability and cost control, security concerns, and data sovereignty requirements. Serenity Star addresses these pain points by handling the entire AI agent lifecycle while providing enterprise-grade security and compliance features. Users can focus on defining their agent's purpose and behavior rather than wrestling with technical implementation details. Its framework focuses on four essentials for AI agents: the brain (underlying model), knowledge (accessible information), behavior (programmed responses), and tools (external system integrations). This framework directly influenced the technology stack choices for Serenity Star, with Azure Database for PostgreSQL powering knowledge retrieval and Semantic Kernel enabling flexible model orchestration.

Real-world architecture in action

When a user query comes in, Serenity Star uses the vector capabilities of Azure Database for PostgreSQL to retrieve the most relevant knowledge. That context, combined with the user's input, forms a complete prompt. Semantic Kernel then routes the request to the right large language model, ensuring the agent delivers accurate and context-aware responses. Serenity Star's native connectors to platforms such as Microsoft Teams, WhatsApp, and Google Tag Manager are also part of this architecture, delivering answers directly in the collaboration and communication tools enterprises already use every day.

Figure 1: Serenity Star Architecture

This routing and orchestration architecture applies to both the multi-tenant SaaS deployments and the dedicated customer instances offered by Serenity Star. Azure Database for PostgreSQL provides native Row-Level Security (RLS) capabilities, a key advantage for securely managing multi-tenant environments, as the sketch below illustrates.
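As a concrete illustration of how RLS supports tenant isolation, here is a minimal sketch. The table, policy, and setting names are hypothetical, not SubgenAI's actual schema: each session declares its tenant, and a policy makes rows from other tenants invisible to every query.

import psycopg  # psycopg 3

conn = psycopg.connect("postgresql://user:pass@host/db")  # placeholder DSN

# Hypothetical table holding per-tenant agent knowledge
conn.execute("""
    CREATE TABLE IF NOT EXISTS agent_documents (
        id        bigserial PRIMARY KEY,
        tenant_id uuid NOT NULL,
        content   text NOT NULL
    )
""")
conn.execute("ALTER TABLE agent_documents ENABLE ROW LEVEL SECURITY")

# Rows are visible only when they match the tenant bound to the session.
# (RLS applies to non-owner roles; owners bypass it unless FORCE ROW LEVEL
# SECURITY is also set on the table.)
conn.execute("""
    CREATE POLICY tenant_isolation ON agent_documents
    USING (tenant_id = current_setting('app.tenant_id')::uuid)
""")

# At query time, scope the session to one tenant; every SELECT is filtered
conn.execute("SET app.tenant_id = '11111111-1111-1111-1111-111111111111'")
rows = conn.execute("SELECT id, content FROM agent_documents").fetchall()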
Multi-tenant deployments allow organizations to get started quickly with lower overhead, while dedicated instances meet the needs of enterprises with strict compliance and data sovereignty requirements.

Optimizing for scale

The same architecture that powers retrieval, routing, and multi-channel delivery also provides a foundation for performance at scale. As adoption grows, the team continuously monitors query volume, response times, and resource efficiency across both multi-tenant and dedicated environments. To stay ahead of demand, SubgenAI actively experiments with new Azure Database for PostgreSQL features such as DiskANN for faster vector search. These optimizations keep latency low even as more users and connectors are added. The result is a platform that maintains sub-60-second response times for 99 percent of chart generations, regardless of deployment model or integration point. With this systematic approach to scaling, organizations can deploy fully functional AI agents connected to their preferred communication platforms in just 15 minutes instead of hours. For enterprises that have struggled with failed AI projects, Serenity Star offers not only a secure and compliant solution but also one proven to grow with their needs.

Why Azure Database for PostgreSQL is a cornerstone

The knowledge component of AI agents relies heavily on retrieval-augmented generation (RAG) systems that perform similarity searches against embedded content. This requires a database capable of handling efficient vector search while maintaining enterprise-grade reliability and security. SubgenAI evaluated multiple vector database options; however, Azure Database for PostgreSQL with pgvector emerged as the clear winner, for several compelling reasons. One is its mature technology, which provides immediate credibility with enterprise customers. Two, the ability to scale GenAI use cases with features like DiskANN for accurate and scalable vector search. Three, the flexibility and appeal of using an open-source database with a vibrant and fast-moving community. As CPO Leandro Harillo explains: "When we tell them their data runs on Azure Database for PostgreSQL, it's a relief. It's a well-known technology versus other options that were born with this new AI revolution."

As an open-source relational database management system, PostgreSQL offers extensibility, and Azure Database for PostgreSQL adds seamless integration with Microsoft's enterprise ecosystem. It has a trusted reputation that appeals to organizations with strict data sovereignty and compliance requirements, such as those in healthcare and insurance, where reliability and governance are non-negotiable. The integration with Azure's broader ecosystem also simplified implementation. With Serenity Star built entirely on Azure infrastructure, Azure Database for PostgreSQL provided seamless connectivity and consistent performance characteristics. The result is the fast response times necessary for real-time agent interactions, along with the reliability demanded by enterprise customers.
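To make the similarity-search piece concrete, here is a minimal sketch of the kind of pgvector query a RAG retrieval step runs. The table and column names are illustrative, and the 1536-entry embedding dimension is an assumption tied to a hypothetical embedding model, not SubgenAI's actual configuration.

import psycopg  # psycopg 3

conn = psycopg.connect("postgresql://user:pass@host/db")  # placeholder DSN
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
conn.execute("""
    CREATE TABLE IF NOT EXISTS chunks (
        id        bigserial PRIMARY KEY,
        body      text,
        embedding vector(1536)  -- must match the embedding model's dimension
    )
""")

# In a real system this vector comes from an embedding model run on the query
query_embedding = [0.01] * 1536
vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"

# <=> is pgvector's cosine-distance operator; the closest chunks come first
rows = conn.execute(
    "SELECT body FROM chunks ORDER BY embedding <=> %s::vector LIMIT 3",
    (vector_literal,),
).fetchall()
for (body,) in rows:
    print(body)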
Semantic Kernel: Enabling model flexibility at scale

Enterprise AI success requires the ability to experiment with different models and adapt quickly as technology evolves. Semantic Kernel makes this possible, supporting over 300 LLMs and embedding models through a unified interface. With Serenity Star, organizations can make genuine choices about their AI implementations without vendor lock-in. Companies can use embedding models from OpenAI through Azure deployments, ensuring their information remains in their own infrastructure while accessing cutting-edge capabilities. If business requirements change or new models emerge, switching becomes a configuration change rather than a development project, as the sketch below suggests. Semantic Kernel's comprehensive connector ecosystem also accelerated SubgenAI's own development process. Interfaces for different vector databases enabled rapid prototyping and comparison during the evaluation phase. "Semantic Kernel helped us to be able to try the different ones and choose the one that fit better for us," notes Julia Schröder, VP of Product. The SubgenAI team has also extended Semantic Kernel to support more features in Azure Database for PostgreSQL, which is easier because of how well-known and popular PostgreSQL is. SubgenAI has also contributed improvements back to the community. This collaborative approach ensures the platform benefits from the latest developments while helping advance the broader ecosystem.
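Here is a hedged sketch of what "a configuration change rather than a development project" can look like with Semantic Kernel's Python SDK. The service id, deployment name, endpoint, and keys are all placeholders: swapping the registered connector changes the model, while prompts and plugins registered on the kernel stay untouched.

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatCompletion,
)

kernel = Kernel()

# Today: an Azure-hosted deployment keeps data in your own infrastructure
kernel.add_service(AzureChatCompletion(
    service_id="chat",
    deployment_name="gpt-4o",                     # placeholder deployment
    endpoint="https://example.openai.azure.com",  # placeholder endpoint
    api_key="<azure-key>",
))

# Tomorrow: swap the connector registration under the same service_id;
# the rest of the agent is unchanged.
# kernel.add_service(OpenAIChatCompletion(
#     service_id="chat", ai_model_id="gpt-4o-mini", api_key="<openai-key>"))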
Proven impact of Azure Database for PostgreSQL across industries

Because organizations struggle to deliver production-ready agents due to long development cycles, lack of observability, and compliance hurdles, the effectiveness of Azure Database for PostgreSQL and other Azure services is reflected in deployment metrics and customer feedback. Production-ready agents typically require around 30 iterations for basic implementations; complex use cases demand significantly more refinement. One GenAI customer in medical education required over 200 iterations to perfect an agent that evaluates medical students through complex case analysis. Azure Database for PostgreSQL and other Azure services support hour-long iteration cycles rather than week-long sprints, which made this level of refinement economically feasible.

Cost efficiency is another significant advantage. SubgenAI provisions and configures models in Microsoft Foundry, which eliminates idling GPU resources while providing detailed cost breakdowns. Users can see exactly how tokens are consumed across prompt text, RAG context, and tool usage, enabling data-driven optimization decisions. Consulting partnerships validate the platform's market position. One consulting firm with 50,000 employees is delighted with the easier implementation, faster deployment, and reliable production performance.

Conclusion

The combination of Azure Database for PostgreSQL and Semantic Kernel has enabled SubgenAI to address the fundamental challenges that cause 95 percent of enterprise AI projects to fail. Organizations using Serenity Star bypass the traditional barriers of lengthy development cycles, limited observability, and compliance hurdles that typically derail AI initiatives. The platform's architecture delivers measurable results, including a 50 percent reduction in coding time, support for complex agents requiring 200+ iterations, and deployment capabilities that compress months-long projects into 15-minute implementations. Azure Database for PostgreSQL provides the enterprise-grade foundation that customers in regulated industries require, while Semantic Kernel ensures organizations retain flexibility as AI technology evolves. This technological partnership creates a reliable pathway for companies to deploy production-ready AI agents without sacrificing data sovereignty or operational control. Through the reliability of Azure Database for PostgreSQL and the flexibility of Semantic Kernel, Serenity Star delivers an enterprise-ready foundation that makes AI practical, scalable, and sustainable.

PostgreSQL 18 is now GA on Azure Database for PostgreSQL
Excited to announce that Flexible Server now offers full general availability of #PostgreSQL18 - the fastest GA we've ever shipped after a community release. This means: worldwide region support, in-place major-version upgrades (PG11-PG17 → PG18), Microsoft Entra ID authentication, and Query Store with Index Tuning. Check out the full blog for a deep dive: https://techcommunity.microsoft.com/blog/adforpostgresql/postgresql-18-now-ga-on-azure-postgres-flexible-server/4469802 #Microsoft #Azure #Cloud #Database #Postgres #PG18

Announcing Azure HorizonDB
Affan Dar, Vice President of Engineering, PostgreSQL at Microsoft; Charles Feddersen, Partner Director of Program Management, PostgreSQL at Microsoft

Today at Microsoft Ignite, we're excited to unveil the preview of Azure HorizonDB, a fully managed Postgres-compatible database service designed to meet the needs of modern enterprise workloads. The cloud-native architecture of Azure HorizonDB delivers highly scalable shared storage, elastic scale-out compute, and a tiered cache optimized for running cloud applications of any scale.

Postgres is transforming industries worldwide and is emerging as the foundation of modern data solutions across all sectors at an unprecedented pace. For developers, it is the database of choice for building new applications with its rich set of extensions, open-source API, and expansive ecosystem of tools and libraries. At the same time, but at the opposite end of the workload spectrum, enterprises around the world are increasingly turning to Postgres to modernize their existing applications. Azure HorizonDB is designed to support applications across the entire workload spectrum, from the first line of code in a new app to the migration of large-scale, mission-critical solutions. Developers benefit from the robust Postgres ecosystem and seamless integration with Azure's advanced AI capabilities, while enterprises gain a secure, highly available, and performant cloud database to host their business applications. Whether you're building from scratch or transforming legacy infrastructure, Azure HorizonDB empowers you to innovate and scale with confidence, today and into the future.

Azure HorizonDB introduces new levels of performance and scalability to PostgreSQL. The scale-out compute architecture supports up to 3,072 vCores across primary and replica nodes, and the auto-scaling shared storage supports databases up to 128TB while providing sub-millisecond multi-zone commit latencies. This storage innovation enables Azure HorizonDB to deliver up to 3x more throughput than open-source Postgres for transactional workloads.

Azure HorizonDB is enterprise ready on day one. With native support for Entra ID, Private Endpoints, and data encryption, it provides compliance and security for sensitive data stored in the cloud. All data is replicated across availability zones by default, and maintenance operations are transparent with near-zero downtime. Backups are fully automated, and integration with Azure Defender for Cloud provides additional protection for highly sensitive data. All up, Azure HorizonDB offers enterprise-grade security, compliance, and reliability, making it ready for business use today.

Since the launch of ChatGPT, there has been an explosion of new AI apps being built, and Postgres has become the database of choice due in large part to its vector index support. Azure HorizonDB extends the AI capabilities of Postgres further with two key features. First, we are introducing advanced filtering capabilities to the DiskANN vector index, which enable query predicate pushdowns directly into the vector similarity search. This provides significant performance and scalability improvements over pgvector HNSW while maintaining accuracy, and is ideal for similarity search over transactional data in Postgres. The second feature is built-in AI model management that seamlessly integrates generative, embedding, and reranking models from Microsoft Foundry for developers to use in the database with zero configuration.
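To illustrate what predicate pushdown into a vector search means in practice, here is a hedged sketch. The schema is made up, and the query uses standard pgvector syntax rather than any HorizonDB-specific DDL (which this post does not show); the WHERE clause below is the kind of filter that, with the DiskANN improvements described above, can be evaluated inside the index scan rather than after the nearest-neighbor search returns.

import psycopg  # psycopg 3

conn = psycopg.connect("postgresql://user:pass@host/db")  # placeholder DSN

# Filtered similarity search: a relational predicate combined with vector
# ordering. Without filter-aware indexing, the engine must over-fetch
# neighbors and discard non-matching rows; pushdown avoids that.
rows = conn.execute(
    """
    SELECT id, description
    FROM products
    WHERE category = %s                -- relational filter
    ORDER BY embedding <=> %s::vector  -- nearest neighbors first
    LIMIT 10
    """,
    ("outdoor", "[0.12,0.05,0.33]"),  # hypothetical 3-dimensional embedding
).fetchall()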
In addition to enhanced vector indexing and simplified model management to build powerful new AI apps, we're also pleased to announce the general availability of Microsoft's PostgreSQL Extension for VS Code, which provides the tooling Postgres developers need to maximize their productivity. Using this extension, GitHub Copilot is context-aware of the Postgres database, which means less prompting and higher quality answers. In the Ignite release, we've added live monitoring with one-click GitHub Copilot debugging, where Agent mode can launch directly from the performance monitoring dashboard to diagnose Postgres performance issues and guide users to a fix.

Alpha Life Sciences is an existing Azure customer:

"I'm truly excited about how Azure HorizonDB empowers our AI development. Its seamless support for Vector DB, RAG, and Agentic AI allows us to build intelligent features directly on a reliable Postgres foundation. With Azure HorizonDB, I can focus on advancing AI capabilities instead of managing infrastructure complexities. It's a smart, forward-looking solution that perfectly aligns with how we design and deliver AI-powered applications." - Pengcheng Xu, CTO, Alpha Life Sciences

For enterprises that are modernizing their applications to Postgres in the cloud, the security and availability of Azure HorizonDB make it an ideal platform. However, these migrations are often complex and time consuming for large legacy codebase conversions. To simplify this and reduce the risk, we're pleased to announce the preview of GitHub Copilot-powered Oracle migration built into the PostgreSQL Extension for VS Code. Teams of engineers can work with GitHub Copilot to automate the end-to-end conversion of complex database code using rich code editing, version control, text authoring, and deployment in an integrated development environment.

Azure HorizonDB is the next generation of fully managed, cloud-native PostgreSQL database service. Built on the latest Azure infrastructure with state-of-the-art cloud architecture, Azure HorizonDB is ready for the most demanding application workloads.

In addition to our portfolio of managed Postgres services in Azure, Microsoft is deeply invested in the open-source Postgres project and is one of the top corporate upstream contributors and sponsors of the PostgreSQL project, with 19 Postgres project contributors employed by Microsoft. As a hyperscale Postgres vendor, it's critical to actively participate in the open-source project. It enables us to better support our customers down to the metal in Azure, and to contribute our learnings from running Postgres at scale back to the community. We're committed to continuing our investment to push the Postgres project forward, and the team is already active in making contributions to Postgres 19, to be released in 2026.

Ready to explore Azure HorizonDB?

Azure HorizonDB is initially available in the Central US, West US3, UK South, and Australia East regions. Customers are invited to apply for early preview access to Azure HorizonDB and get hands-on experience with this new service. Participation is limited; apply now at aka.ms/PreviewHorizonDB

Exciting things on the horizon for PostgreSQL fans @ Ignite 2025
If you're passionate about PostgreSQL or just curious about what's new, you'll want to join us at Microsoft Ignite 2025. We have a packed lineup, including sessions exploring cutting-edge features and exclusive giveaways at the PostgreSQL on Azure booth. Haven't registered yet? Now's the time: sign up for Microsoft Ignite and start building your schedule. Below are the must-see PostgreSQL on Azure activities, with highlights of what you'll learn at each. Add these to your agenda today. Sessions can fill up fast!

Theater sessions: get a first look, fast

I know from experience that attention spans can start to wane after hours-long keynotes, content-rich sessions, and conference socializing. Luckily, we have a couple of theater sessions that offer snackable but substantial information in less time than it will take to grab lunch. And they're located conveniently on the main conference floor.

• PostgreSQL on Azure: Your launchpad for intelligent apps and agents (THR705) - See how we're making PostgreSQL AI-aware for developers to drive app and agent innovation. Includes a demo of vector similarity search, semantic operators baked into Postgres, and more!
• Simplifying scale-out of PostgreSQL for performant multi-tenant apps (THR706) - Discover a smarter, simpler way to scale PostgreSQL using the new Elastic Clusters feature. If your app or service is growing fast (or you want it to!), add this session to learn how Azure makes it easier to scale Postgres and keep it reliable.

These talks are a great way to sample what's new and decide where to dive deeper. Plus, they're fun and demo-heavy, and who doesn't love a good demo?

Breakout sessions: a deep dive into Postgres innovations

Led by Azure product leaders and executives from organizations driving innovation backed by PostgreSQL, these breakout sessions will dive into the coolest new capabilities and real-world use cases. If you want rich, technical content and more live demos, these are for you.

• Build mission-critical apps that scale with PostgreSQL on Azure (BRK127) - Get a closer look at the next generation of PostgreSQL on Azure. Add this session if you're curious about how we're taking Postgres to the next level to support your mission-critical AI workloads.
• Modern data, modern apps: Innovation with Microsoft Databases (BRK134) - Gain insider knowledge on the latest innovations across open-source, SQL, and NoSQL databases, and understand how Microsoft's integrated database portfolio supports next-gen innovation.
• Nasdaq Boardvantage: AI-driven governance on PostgreSQL and AI Foundry (BRK137) - Discover how a Fortune 100 merges trust with cutting-edge AI leveraging Azure's AI-enriched and enterprise-ready solutions, including Azure Database for PostgreSQL, Azure Database for MySQL, Azure AI Foundry, Azure Kubernetes Service (AKS), and API Management.
• AI-assisted migration: The path to powerful performance on PostgreSQL (BRK123) - A before-and-after migration journey from Oracle to Azure Database for PostgreSQL. See how the new AI-assisted migration experience delivers conversion in a few clicks and minimal downtime.
• The blueprint for intelligent AI agents backed by PostgreSQL (BRK130) - If you're into AI development, this session will spark ideas on bridging the gap between raw data and AI reasoning. You'll leave with practical tips to turbocharge your AI agents with PostgreSQL.

Each breakout session is 45 minutes with live demos and Q&A, so you'll get plenty of detail and interaction with Postgres experts.
Hands-on lab: experience coding with Azure superpowers

Do you learn best by doing? Then our guided workshop, Build advanced AI agents with PostgreSQL (Lab515), is for you. In each 75-minute session, you'll get to create a fully functional AI-powered application backed by PostgreSQL on Azure, with step-by-step guidance and expert insight on the latest innovations enabling intelligent app development. All the tools and instructions you'll need are provided. Labs have limited capacity, so be sure to reserve your seat for any of the four labs in advance. This lab is a great way to understand how all the pieces come together on Azure. And you'll gain practical skills you can apply to your own projects, whether it's customer support bots, intelligent search in your app, or any scenario where PostgreSQL + AI collide.

Expert meet-up booth: meet the team, grab some swag

If you still want more Postgres (or a little Postgres souvenir), you can stop by the PostgreSQL on Azure Expert Meetup booth in the Ignite Hub. This will be our homebase on the show floor, where you can:

• Meet the team: I'll be there in person, along with engineers, program managers, cloud solution architects, and advocates from our team. Whether you have a burning technical question, want to share feedback, or need guidance for your specific use case, come chat with us.
• Get a quick demo re-run: Sometimes a 5-minute demo is worth a thousand words, especially after you've sat through all those words already in a keynote. The booth will have a monitor and a live environment so we can walk you through select use cases if you have questions - no appointment needed.
• Swag and giveaways: Ah yes, the goodies! We know conference swag is part of the fun, so we've got some special PostgreSQL-themed giveaways at the booth. I won't spoil all the surprises, but rumor has it there are some limited-edition items up for grabs.
• Network with peers: The expert meet-up area is also a magnet for PostgreSQL enthusiasts. You might bump into other attendees at the booth who are tackling similar projects or challenges. Ignite is about community as much as content, so come by and spark up a conversation.

Meet you there?

Ignite is our largest event of the year. We love sharing what we've been working on and, most of all, hearing from you, the community. So, on behalf of the Azure for PostgreSQL team, thank you for your interest and support. We can't wait to show you what's new and to help you continue to succeed with Postgres. See you in San Francisco!

Running Phi-4 Locally with Microsoft Foundry Local: A Step-by-Step Guide
In our previous post, we explored how Phi-4 represents a new frontier in AI efficiency, delivering performance comparable to models 5x its size while being small enough to run on your laptop. Today, we're taking the next step: getting Phi-4 up and running locally on your machine using Microsoft Foundry Local.

Whether you're a developer building AI-powered applications, an educator exploring AI capabilities, or simply curious about running state-of-the-art models without relying on cloud APIs, this guide will walk you through the entire process. Microsoft Foundry Local brings the power of Azure AI Foundry to your local device without requiring an Azure subscription, making local AI development more accessible than ever.

So why do you want to run Phi-4 locally?

Before we dive into the setup, let's quickly recap why running models locally matters:

Privacy and Control: Your data never leaves your machine. This is crucial for sensitive applications in healthcare, finance, or education where data privacy is paramount.
Cost Efficiency: No API costs, no rate limits. Once you have the model downloaded, inference is completely free.
Speed and Reliability: No network latency or dependency on external services. Your AI applications work even when you're offline.
Learning and Experimentation: Full control over model parameters, prompts, and fine-tuning opportunities without restrictions.

With Phi-4's compact size, these benefits are now accessible to anyone with a modern laptop - no expensive GPU required.

What You'll Need

Before we begin, make sure you have:

Operating System: Windows 10/11, macOS (Intel or Apple Silicon), or Linux
RAM: Minimum 16GB (32GB recommended for optimal performance)
Storage: At least 5-10GB of free disk space
Processor: Any modern CPU (GPU optional but provides faster inference)

Note: Phi-4 works remarkably well even on consumer hardware.

Step 1: Installing Microsoft Foundry Local

Microsoft Foundry Local is designed to make running AI models locally as simple as possible. It handles model downloads, manages memory efficiently, provides OpenAI-compatible APIs, and automatically optimizes for your hardware.

For Windows Users: Open PowerShell or Command Prompt and run:

winget install Microsoft.FoundryLocal

For macOS Users (Apple Silicon): Open Terminal and run:

brew install microsoft/foundrylocal/foundrylocal

Verify Installation: Run the following in your terminal; it should return the Microsoft Foundry Local version, confirming installation:

foundry --version

Step 2: Downloading Phi-4-Mini

For this tutorial, we'll use Phi-4-mini, the lightweight 3.8 billion parameter version that's perfect for learning and experimentation.
Open your terminal and run:

foundry model run phi-4-mini

You should see your download begin, with output similar to the image below.

Available Phi Models on Foundry Local

While we're using phi-4-mini for this guide, Foundry Local offers several Phi model variants (and other open-source models) optimized for different hardware and use cases:

Model | Hardware | Type | Size | Best For
phi-4-mini | GPU | chat-completion | 3.72 GB | Learning, fast responses, resource-constrained environments with GPU
phi-4-mini | CPU | chat-completion | 4.80 GB | Learning, fast responses, CPU-only systems
phi-4-mini-reasoning | GPU | chat-completion | 3.15 GB | Reasoning tasks with GPU acceleration
phi-4-mini-reasoning | CPU | chat-completion | 4.52 GB | Mathematical proofs, logic puzzles with lower resource requirements
phi-4 | GPU | chat-completion | 8.37 GB | Maximum reasoning performance, complex tasks with GPU
phi-4 | CPU | chat-completion | 10.16 GB | Maximum reasoning performance, CPU-only systems
phi-3.5-mini | GPU | chat-completion | 2.16 GB | Most lightweight option with GPU support
phi-3.5-mini | CPU | chat-completion | 2.53 GB | Most lightweight option, CPU-optimized
phi-3-mini-128k | GPU | chat-completion | 2.13 GB | Extended context (128k tokens), GPU-optimized
phi-3-mini-128k | CPU | chat-completion | 2.54 GB | Extended context (128k tokens), CPU-optimized
phi-3-mini-4k | GPU | chat-completion | 2.13 GB | Standard context (4k tokens), GPU-optimized
phi-3-mini-4k | CPU | chat-completion | 2.53 GB | Standard context (4k tokens), CPU-optimized

Note: Foundry Local automatically selects the best variant for your hardware. If you have an NVIDIA GPU, it will use the GPU-optimized version. Otherwise, it will use the CPU-optimized version. Run the command below to see the full list of models:

foundry model list

Step 3: Test It Out

Once the download completes, an interactive session will begin. Let's test Phi-4-mini's capabilities with a few different prompts:

Example 1: Explanation

Phi-4-mini provides a thorough, well-structured explanation! It starts with the basic definition, explains the process in biological systems, and gives real-world examples (plant cells, human blood cells). The response is detailed yet accessible.

Example 2: Mathematical Problem Solving

An excellent step-by-step solution! Phi-4-mini breaks down the problem methodically:

1. Distributes on the left side
2. Isolates the variable terms
3. Simplifies progressively
4. Arrives at the final answer: x = 11

The model shows its work clearly, making it easy to follow the logic - ideal for educational purposes.

Example 3: Code Generation

The model provides a concise Python function using string slicing ([::-1]) - the most Pythonic approach to reversing a string. It includes clear documentation with a docstring explaining the function's purpose, provides example usage demonstrating the output, and even explains how the slicing notation works under the hood. The response shows that the model understands not just how to write the code, but why this approach is preferred - noting that the [::-1] slice notation means "start at the end of the string and end at position 0, move with the step -1, negative one, which means one step backwards." This showcases the model's ability to generate production-ready code with proper documentation while being educational about Python idioms.

To exit the interactive session, type `/bye`
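Before moving on to tools, note that the interactive session isn't the only way to talk to the model. Foundry Local exposes an OpenAI-compatible API (as noted in Step 1), so you can also call it programmatically. Here is a minimal non-streaming sketch, assuming pip install openai foundry-local-sdk; the manager and model-info calls mirror the ones used in the full example in the next step, while the prompt is just an illustration.

from foundry_local import FoundryLocalManager
from openai import OpenAI

# Starts the Foundry Local service if needed and loads the model
manager = FoundryLocalManager("phi-4-mini")
model_info = manager.get_model_info("phi-4-mini")

# Point the OpenAI client at the local endpoint; no real key is required
client = OpenAI(base_url=manager.endpoint, api_key="not-needed-locally")
resp = client.chat.completions.create(
    model=model_info.id,
    messages=[{"role": "user", "content": "Explain osmosis in two sentences."}],
)
print(resp.choices[0].message.content)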
Step 4: Extending Phi-4 with Real-Time Tools

Understanding Phi-4's Knowledge Cutoff

Like all language models, Phi-4 has a knowledge cutoff date from its training data (typically several months old). This means it won't know about very recent events, current prices, or breaking news. For example, if you ask "Who won the 2024 NBA championship?" it might not have the answer. The good thing is, there's a powerful workaround. While Phi-4 is incredibly capable, connecting it to external tools like web search, databases, or APIs transforms it from a static knowledge base into a dynamic reasoning engine. This is where Microsoft Foundry's REST API comes in. Microsoft Foundry provides a simple API that lets you integrate Phi-4 into Python applications and connect it to real-time data sources. Here's a practical example: building a web-enhanced AI assistant.

Web-Enhanced AI Assistant

This simple application combines Phi-4's reasoning with real-time web search, allowing it to answer current questions accurately.

Prerequisites:

pip install foundry-local-sdk requests ddgs

Create phi4_web_assistant.py:

import requests
from foundry_local import FoundryLocalManager
from ddgs import DDGS
import json

def search_web(query):
    """Search the web and return top results"""
    try:
        results = list(DDGS().text(query, max_results=3))
        if not results:
            return "No search results found."
        search_summary = "\n\n".join([
            f"[Source {i+1}] {r['title']}\n{r['body'][:500]}"
            for i, r in enumerate(results)
        ])
        return search_summary
    except Exception as e:
        return f"Search failed: {e}"

def ask_phi4(endpoint, model_id, prompt):
    """Send a prompt to Phi-4 and stream the response"""
    response = requests.post(
        f"{endpoint}/chat/completions",
        json={
            "model": model_id,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True
        },
        stream=True,
        timeout=180
    )
    full_response = ""
    for line in response.iter_lines():
        if line:
            line_text = line.decode('utf-8')
            if line_text.startswith('data: '):
                line_text = line_text[6:]  # Remove 'data: ' prefix
            if line_text.strip() == '[DONE]':
                break
            try:
                data = json.loads(line_text)
                if 'choices' in data and len(data['choices']) > 0:
                    delta = data['choices'][0].get('delta', {})
                    if 'content' in delta:
                        chunk = delta['content']
                        print(chunk, end="", flush=True)
                        full_response += chunk
            except json.JSONDecodeError:
                continue
    print()
    return full_response

def web_enhanced_query(question):
    """Combine web search with Phi-4 reasoning"""
    # By using an alias, the most suitable model will be downloaded
    # to your device automatically
    alias = "phi-4-mini"

    # Create a FoundryLocalManager instance. This will start the Foundry
    # Local service if it is not already running and load the specified model.
    manager = FoundryLocalManager(alias)
    model_info = manager.get_model_info(alias)

    print("Searching the web...\n")
    search_results = search_web(question)

    prompt = f"""Here are recent search results:

{search_results}

Question: {question}

Using only the information above, give a clear answer with specific details."""

    print("Phi-4 Answer:\n")
    return ask_phi4(manager.endpoint, model_info.id, prompt)

if __name__ == "__main__":
    # Try different questions
    question = "Who won the 2024 NBA championship?"
    # question = "What is the latest iPhone model released in 2024?"
    # question = "What is the current price of Bitcoin?"
print(f"Question: {question}\n") print("=" * 60 + "\n") web_enhanced_query(question) print("\n" + "=" * 60) Run It: python phi4_web_assistant.py What Makes This Powerful By connecting Phi-4 to external tools, you create an intelligent system that: Accesses Real-Time Information: Get news, weather, sports scores, and breaking developments Verifies Facts: Cross-reference information with multiple sources Extends Capabilities: Connect to databases, APIs, file systems, or any other tool Enables Complex Applications: Build research assistants, customer support bots, educational tutors, and personal assistants This same pattern can be applied to connect Phi-4 to: Databases: Query your company's internal data APIs: Weather services, stock prices, translation services File Systems: Analyze documents and spreadsheets IoT Devices: Control smart home systems The possibilities are endless when you combine local AI reasoning with real-world data access. Troubleshooting Common Issues Service not running: Make sure Foundry Local is properly installed and the service is running. Try restarting with foundry --version to verify installation. Model downloads slowly: Check your internet connection and ensure you have enough disk space (5-10GB per model). Out of memory: Close other applications or try using a smaller model variant like phi-3.5-mini instead of the full phi-4. Connection issues: Verify that no other services are using the same ports. Foundry Local typically runs on http://localhost:5272. Model not found: Run foundry model list to see available models, then use foundry model run <model-name> to download and run a specific model. Your Next Steps with Foundry Local Congratulations! You now have Phi-4 running locally through Microsoft Foundry Local and understand how to extend it with external tools like web search. This combination of local AI reasoning with real-time data access opens up countless possibilities for building intelligent applications. Coming in Future Posts In the coming weeks, we'll explore advanced topics using Hugging Face: Fine-tuning Phi models on your own data for domain-specific applications Phi-4-multimodal: Analyze images, process audio, and combine multiple data types Advanced deployment patterns: RAG systems and multi-agent orchestration Resources to Explore EdgeAI for Beginners Course: Comprehensive 36-45 hour course covering Edge AI fundamentals, optimization, and production deployment Phi-4 Technical Report: Deep dive into architecture and benchmarks Phi Cookbook on GitHub: Practical examples and recipes Foundry Local Documentation: Complete technical documentation and API reference Module 08: Foundry Local Toolkit: 10 comprehensive samples including RAG applications and multi-agent systems Keep experimenting with Foundry Local, and stay tuned as we unlock the full potential of Edge AI! What will you build with Phi-4? Share your ideas and projects in the comments below!1.9KViews1like1Comment