sqlserverperformance
33 TopicsUnlocking Enterprise AI: SQL Server 2025 and NVIDIA Nemotron RAG Accelerate AI
Today, most of the world’s data still remains untapped, sitting in databases, documents, and systems across organizations. Enterprises are racing to unlock this data’s value by building the next wave of generative AI applications—solutions that can answer questions, summarize documents, and drive smarter decisions. At the heart of these innovations are retrieval-augmented generation (RAG) pipelines, which enable users to interactively engage with large amount of data that continuously evolves. Yet, as promising as RAG pipelines are, enterprises face real challenges in making them work at scale. Handling both structured and unstructured data, processing massive volumes efficiently, and ensuring privacy and security are just a few hurdles. This is where the integration between SQL Server 2025 and NVIDIA Nemotron RAG models, deployed as NVIDIA NIM microservices, comes in, offering a new approach that streamlines AI deployment and delivers enterprise-grade performance—whether you’re running workloads in the cloud or on-premises. “As AI becomes core to every enterprise, organizations need efficient and compliant ways to bring intelligence to their data,” said Joey Conway, Senior Director of Generative AI software at NVIDIA. “With SQL Server 2025’s built-in AI and NVIDIA Nemotron RAG, deployed as NIM microservices, enterprises can deploy and run AI models close to their data on premises or in the cloud without complex integration, accelerating innovation while maintaining data sovereignty and control.” Overcoming the complexity of generating embeddings at scale Customer challenge Building responsive AI applications using RAG requires converting SQL data into vector embeddings—a process that feeds huge amounts of text through complex neural networks. This is inherently parallel and compute-intensive, often creating performance bottlenecks that prevent real-time data indexing. The result? Slow applications and poor user experiences. Moreover, enterprises need flexibility. Different embedding models excel at different tasks—semantic search, recommendations, classification—and each comes with its own tradeoffs in accuracy, speed, and cost. Businesses want to mix and match models, balance premium performance with budget constraints, and stay resilient against model deprecation or API changes. Furthermore, rapid experimentation and adaptation are key to staying ahead and thus developers want models that offer flexible customization and full transparency. The Solution: SQL Server 2025 + NVIDIA Nemotron RAG SQL Server 2025 brings AI closer to your data, allowing you to natively and securely connect to any model hosted anywhere. You can generate embeddings directly in SQL using extensions to T-SQL —no need for new languages, frameworks, or third-party tools. By connecting SQL Server 2025 to the llama-nemotron-embed-1b-v2 embedding model from NVIDIA, you eliminate bottlenecks and deliver the massive throughput needed for real-time embedding generation. llama-nemotron-embed-1b-v2 is a best in class embedding model that offers multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage. This model is part of NVIDIA Nemotron RAG models, a collection of extraction, embedding, reranking models, fine-tuned with the Nemotron RAG datasets and scripts, to achieve the best accuracy. These models offer flexible customization, enabling easy fine-tuning and rapid experimentation. They also offer full transparency with open access to models, datasets, and scripts. Llama-nemotron-embed-1b-v2 is the model of choice for embedding workflows, but this high-speed inference pipeline is not limited to this model and can potentially call any optimized AI model as an NVIDIA NIM microservice, seamlessly powering every stage of the RAG pipeline. From multimodal data ingestion and advanced retrieval to reranking, all operations run directly on your data within SQL Server. Such RAG systems can be applied across a wide range of use cases, enabling intelligent, context-aware applications across industries. Customer Benefits: With GPU acceleration and built-in AI of SQL Server 2025, you can achieve optimal inference, ensuring performance that meets the demands of modern applications. Our flexible approach lets you mix and match models to suit different use cases, striking the right balance between accuracy and cost. And with open models that enable vendor flexibility and rapid adaptation, you gain resilience to stay ahead of the curve in an ever-changing AI landscape. Streamlining AI Model Deployment with Enterprise-Grade Confidence Customer Challenge Integrating advanced AI models into enterprise workflows has historically been slow and complex. Specialized teams must manage intricate software dependencies, configure infrastructure, and handle ongoing maintenance—all while navigating the risks of deploying unsupported models in mission-critical environments. This complexity slows innovation, drains engineering resources, and increases risk. The Solution: Simplified, Secure Model Deployment with NVIDIA NIM This collaboration simplifies and de-risks AI deployment. The llama-nemotron-embed-1b-v2 model is available as an NVIDIA NIM microservice for secure, reliable deployment across multiple Azure compute platforms. Prebuilt NIM containers for a broad spectrum of AI models and can be deployed with a single command for easy integration into enterprise-grade AI applications using built-in REST APIs of SQL Server 2025 and just a few lines of code, regardless where you run SQL Server workloads and NVIDIA NIM, on premises or in the cloud. NIM containers package the latest AI models together with the best inference technology from NVIDIA and the community and all dependencies into a ready-to-run container, abstracting away the complexity of environment setup so customers can spin up AI services quickly. Furthermore, NVIDIA NIM is enterprise-grade and is continuously managed by NVIDIA with dedicated software branches, rigorous validation processes, and support. As a result, developers can confidently integrate state-of-the-art AI into their data applications. This streamlined approach significantly reduces development overhead and provides the reliability needed for mission-critical enterprise systems. NVIDIA NIM containers are discoverable and deployable via Microsoft Azure AI Foundry’s model catalog. Customer Benefits Rapid deployment with minimal setup means you can start leveraging AI without specialized engineering, and SQL Server 2025 makes it even easier with built-in support for AI workloads and native REST APIs. Enterprise-grade security and monitoring ensure safe, reliable operations, while SQL Server’s integration with Entra ID and advanced compliance features provide added protection. Direct integration into SQL workflows reduces complexity and risk, and with SQL Server’s hybrid flexibility, you can run seamlessly across on-premises and cloud environments—simplifying modernization while maintaining control. Innovating Without Compromise on Security or Flexibility Customer Challenge Organizations in regulated industries often face a tough choice: adopt powerful AI or maintain strict data residency and compliance. Moving sensitive data to external services is often not an option, and many companies run AI inference workloads both in the cloud and on-premises to balance scalability, privacy, regulatory compliance, and low-latency requirements. The Solution: Flexible, Secure Integration—On-Premises and Cloud SQL Server 2025 enables organizations in regulated environments to securely integrate locally hosted AI models, ensuring data residency and compliance while minimizing network overhead. This architecture boosts throughput by keeping sensitive data on-premises and leveraging SQL Server’s native extensibility for direct model invocation. With SQL Server 2025 and Nemotron RAG, deployed as NVIDIA NIM microservices, you get the best of both worlds. This solution can be seamlessly deployed in the cloud with serverless NVIDIA GPUs on Azure Container Apps (ACA) or on-premises with NVIDIA GPUs on Azure Local. Sensitive data never leaves your secure environment, allowing you to harness the full power of Nemotron models while maintaining complete data sovereignty and meeting the strictest compliance mandates. Customer Benefits SQL Server 2025 helps you maintain compliance by supporting data residency and meeting regulatory standard requirements across regions. Sensitive data stays protected on-premises with enterprise-grade security, including consistent access controls, ledger support, and advanced encryption to minimize risk. At the same time, SQL Server’s hybrid flexibility lets you deploy AI workloads wherever they’re needed—on-premises, in the cloud, or across a hybrid environment—while leveraging built-in AI features like vector search and secure integration with locally hosted models for performance and control. Conclusion: Powering the Next Wave of Enterprise AI The collaboration between Microsoft and NVIDIA is more than a technical integration. It’s designed to help enterprises overcome the toughest challenges in AI deployment. By streamlining vector embedding and vector search, delivering enterprise-grade performance, and enabling secure, flexible integration across cloud and on-premises environments, this joint solution empowers organizations to unlock the full value of their data. Whether you’re building conversational AI, automating document analysis, or driving predictive insights, SQL Server 2025 and NVIDIA Nemotron RAG models, deployed as NIM, provide the tools you need to innovate with confidence. The future of enterprise AI is here and it’s flexible, secure, and built for real business impact. Get started today: Learn more about SQL Server 2025 and download it today Learn more about our joint solution from NVIDIA’s Technical Blog GitHub: Microsoft SQL Server 2025 and NVIDIA Nemotron RAG551Views1like0CommentsReimagining Data Excellence: SQL Server 2025 Accelerated by Pure Storage
SQL Server 2025 is a leap forward as enterprise AI-ready database, unifying analytics, modern AI application development, and mission-critical engine capabilities like security, high availability and performance from ground to cloud. Pure Storage’s all-Flash solutions are engineered to optimize SQL Server workloads, offering faster query performance, reduced latency, and simplified management. Together it helps customers accelerate the modernization of their data estate.432Views2likes1CommentAnnouncing SQLCon 2026: Better Together with FabCon!
We’re thrilled to unveil SQLCon 2026, the premier Microsoft SQL Community Conference, co-located with the Microsoft Fabric Community Conference (FabCon) from March 16–20, 2026! This year, we’re bringing the best of both worlds under one roof—uniting the vibrant SQL and Fabric communities for a truly next-level experience. Whether you’re passionate about SQL Server, Azure SQL, SQL in Fabric, SQL Tools, migration and modernization, database security, or building AI-powered apps with SQL, SQLCon 2026 has you covered. Dive into 50+ breakout sessions and 4 expert-led workshops designed to help you optimize, innovate, and connect. Why are SQLCon + FabCon better together? One registration, double the value: Register for either conference and get full access to both—mix and match sessions, keynotes, and community events to fit your interests. Shared spaces, shared energy: Enjoy the same expo hall, registration desk, conference app, and community lounge. Network with peers across the data platform spectrum. Unforgettable experiences: Join us for both keynotes at the State Farm Arena and celebrate at the legendary attendee party at the Georgia Aquarium. Our goal is to reignite the SQL Community spirit—restoring the robust networks, friendships, and career-building opportunities that make this ecosystem so special. SQLCon is just the beginning of a renewed commitment to connect at conferences, user groups, online, and at regional events. Early Access Pricing Extended! Register by November 14th and save $200 with code SQLCMTY200. Register Now! Want to share your expertise? The Call for Content is open until November 20th for both conferences! Let’s build the future of data—together. See you at SQLCon + FabCon!2.6KViews8likes0CommentsMicrosoft and PASS Summit are going on tour!
Microsoft and PASS Summit are going on tour! Join us for this two-day event Sep 15 - 16 in Dallas, TX, together with our partners Pure Storage. With keynotes, pre-cons, general sessions and more this event will cover all things SQL Server, Microsoft Fabric, Azure SQL and more. Connect with Microsoft and other community experts. Registertoday and take a peek into the agenda below: Pre-Con: Deep-Dive Workshop: Accelerating AI, Performance and Resilience with SQL Server 2025 SQL Server 2025 delivers powerful new capabilities for AI, developer productivity, and performance all ready to run at scale on modern infrastructure. In this 1/2 day deep dive workshop, you'll get a first look at what's new and how to put it to work immediately. Learn directly from Microsoft and Pure Storage experts how to harness SQL Server’s vector capabilities and walk through real-world demos covering semantic search, change event streaming, and using external tables to manage vector embeddings. You'll also see how new REST APIs securely expose SQL Server's internals for automation and observability including snapshots, performance monitoring, and backup management. The workshop wraps up with insights into core engine enhancements: optimized locking, faster backups using ZSTD compression all running on a modern Pure Storage foundation that brings scale and resilience to your data platform. Whether you're a DBA, developer, or architect, this session will equip you with practical strategies for harnessing Microsoft SQL Server 2025 and Pure Storage to accelerate your organization's AI and data goals. Day 2: Building an LLM Chatbot on Your Laptop - No Cloud Required! Want to build a chatbot that can answer questions using your own data? This session shows you how, and no cloud is required! In this session, you will learn how to build a Retrieval-Augmented Generation (RAG) chatbot using open-source tools combined with SQL Server for vector storage - all from your laptop. We will cover topics such as LLM fundamentals, embeddings, similarity search, and how to integrate LLMs with your own data. By the end of the session, you will have a working chatbot and practical knowledge of how AI can enhance your data platform and a new way to elevate your SQL Server skills with AI. Day 2: An Inside Look at Building SQL Server/Azure and Fabric Data Warehouse Have you ever wondered how Microsoft builds and releases its database engines (SQL Server, SQL Azure, Fabric Data Warehouse)? This talk goes through some of the behind-the-scenes details about how the engineering team works, how features come together, and how the engineers learn how to build the software you use to manage your data. Having the Regional Summit in Dallas gives us a unique opportunity to talk about our engineering office in Austin and how it contributed features and enhancements to recent releases. Come and learn about which SQL features are built right here in Texas! Day2: AI Ready Apps with SQL Database in Microsoft Fabric Explore how to build enterprise-grade Retrieval-Augmented Generation (RAG) systems by harnessing the power of SQL AI features, vector-based search, and Microsoft Fabric. This session delves into modern architectures that integrate structured data with large language model (LLM) capabilities to enable real-time, intelligent, and secure applications. And much more! Learn more: PASS Summit On Tour Dallas - PASS Data Community Summit Register today: PASS Summit On Tour293Views0likes0CommentsSQL Server 2025: introducing optimized Halloween protection
Executive summary Optimized Halloween protection, available in the public preview of SQL Server 2025 starting with the CTP 2.0 release, reduces tempdb space consumption and improves query performance by redesigning the way the database engine solves the Halloween problem. An example in the appendix shows CPU and elapsed time of a query reduced by about 50% while eliminating all tempdb space consumption. Update 2025-09-02 During public preview of SQL Server 2025, we identified a potential data integrity issue that might occur if optimized Halloween protection is enabled. While the probability of encountering this issue is low, we take data integrity seriously. Therefore, we temporarily removed optimized Halloween protection from SQL Server 2025, starting with the RC 0 release. The fix for this issue is in progress. In the coming months, we plan to make optimized Halloween protection available in Azure SQL Database and Azure SQL Managed Instance with the always-up-to-date update policy. Enabling optimized Halloween protection in a future SQL Server 2025 update is under consideration as well. The Halloween problem The Halloween problem, named so because it was discovered on Halloween in 1976, occurs when a data modification language (DML) statement changes data in such a way that the same statement unexpectedly processes the same row more than once. Traditionally, the SQL Server database engine protects DML statements from the Halloween problem by introducing a spool operator in the query plan, or by taking advantage of another blocking operator already present in the plan, such as a sort or a hash match. If a spool operator is used, it creates a temporary copy of the data to be modified before any modifications are made to the data in the table. While the protection spool avoids the Halloween problem, it comes with downsides: The spool requires extra resources: space in tempdb, disk I/O, memory, and CPU. Statement processing by the downstream query operators is blocked until the data is fully written into the spool. The spool adds query plan complexity that can cause the query optimizer to generate a less optimal plan. Optimized Halloween protection removes these downsides by making the spool operator unnecessary. How it works When accelerated database recovery (ADR) is enabled, each statement in a transaction obtains a unique statement identifier, known as nest ID. As each row is modified by a DML statement, it is stamped with the nest ID of the statement. This is required to provide the ACID transaction semantics with ADR. During DML statement processing, when the storage engine reads the data, it skips any row that has the same nest ID as the current DML statement. This means that the query processor doesn't see the rows already processed by the statement, therefore avoiding the Halloween problem. How to use optimized Halloween protection To enable optimized Halloween protection for a database, the following prerequisites are required: ADR must be enabled on the database. The database must use compatibility level 170. The OPTIMIZED_HALLOWEEN_PROTECTION database-scoped configuration must be enabled. The OPTIMIZED_HALLOWEEN_PROTECTION database-scoped configuration is enabled by default. This means that when you enable ADR for a database using compatibility level 170, it will use optimized Halloween protection. You can ensure that a database uses optimized Halloween protection by executing the following statements: ALTER DATABASE [<database-name-placeholder>] SET ACCELERATED_DATABASE_RECOVERY = ON WITH ROLLBACK IMMEDIATE; ALTER DATABASE [<database-name-placeholder>] SET COMPATIBILITY_LEVEL = 170; ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZED_HALLOWEEN_PROTECTION = ON; You can also enable and disable optimized Halloween protection at the query level by using the ENABLE_OPTIMIZED_HALLOWEEN_PROTECTION and DISABLE_OPTIMIZED_HALLOWEEN_PROTECTION query hints, either directly in the query, or via Query Store hints. These hints work under any compatibility level and take precedence over the OPTIMIZED_HALLOWEEN_PROTECTION database-scoped configuration. When optimized Halloween protection is used for an operator in the query plan, the OptimizedHalloweenProtectionUsed property of the operator in the XML query plan is set to True. For more details, see optimized Halloween protection in documentation. Conclusion Optimized Halloween protection is another Intelligent Query Processing feature that improves query performance and reduces resource consumption when you upgrade to SQL Server 2025, without having to make any changes to your query workloads. We are looking forward to your feedback about this and other features during the public preview of SQL Server 2025 and beyond. You can leave comments on this blog post, email us at intelligentqp@microsoft.com, or leave feedback at https://aka.ms/sqlfeedback. Appendix The following script shows how optimized Halloween protection removes the protection spool in the query plan, and reduces tempdb usage, CPU time, and duration when enabled. /* Requires the WideWorldImporters sample database. SQL Server backup: https://github.com/Microsoft/sql-server-samples/releases/download/wide-world-importers-v1.0/WideWorldImporters-Full.bak Bacpac: https://github.com/Microsoft/sql-server-samples/releases/download/wide-world-importers-v1.0/WideWorldImporters-Standard.bacpac */ /* Ensure that optimized Halloween protection prerequisites are in place */ ALTER DATABASE WideWorldImporters SET ACCELERATED_DATABASE_RECOVERY = ON WITH ROLLBACK IMMEDIATE; ALTER DATABASE WideWorldImporters SET COMPATIBILITY_LEVEL = 170; ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZED_HALLOWEEN_PROTECTION = ON; GO /* Validate configuration */ SELECT d.compatibility_level, d.is_accelerated_database_recovery_on, dsc.name, dsc.value FROM sys.database_scoped_configurations AS dsc CROSS JOIN sys.databases AS d WHERE dsc.name = 'OPTIMIZED_HALLOWEEN_PROTECTION' AND d.name = DB_NAME(); GO /* Create the test table and add data */ DROP TABLE IF EXISTS dbo.OptimizedHPDemo; BEGIN TRANSACTION; SELECT * INTO dbo.OptimizedHPDemo FROM Sales.Invoices ALTER TABLE dbo.OptimizedHPDemo ADD CONSTRAINT PK_OptimizedHPDemo PRIMARY KEY CLUSTERED (InvoiceID) ON USERDATA; COMMIT; GO /* Ensure that Query Store is enabled and is capturing all queries */ ALTER DATABASE WideWorldImporters SET QUERY_STORE = ON (OPERATION_MODE = READ_WRITE, QUERY_CAPTURE_MODE = ALL); /* Empty Query Store to start with a clean slate */ ALTER DATABASE WideWorldImporters SET QUERY_STORE CLEAR; GO /* Disable optimized Halloween protection as the baseline */ ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZED_HALLOWEEN_PROTECTION = OFF; GO /* Insert data selecting from the same table. This requires Halloween protection so that the same row cannot be selected and inserted repeatedly. */ BEGIN TRANSACTION; INSERT INTO dbo.OptimizedHPDemo ( InvoiceID, CustomerID, BillToCustomerID, OrderID, DeliveryMethodID, ContactPersonID, AccountsPersonID, SalespersonPersonID, PackedByPersonID, InvoiceDate, CustomerPurchaseOrderNumber, IsCreditNote, CreditNoteReason, Comments, DeliveryInstructions, InternalComments, TotalDryItems, TotalChillerItems, DeliveryRun, RunPosition, ReturnedDeliveryData, ConfirmedDeliveryTime, ConfirmedReceivedBy, LastEditedBy, LastEditedWhen ) SELECT InvoiceID + 1000000 AS InvoiceID, CustomerID, BillToCustomerID, OrderID, DeliveryMethodID, ContactPersonID, AccountsPersonID, SalespersonPersonID, PackedByPersonID, InvoiceDate, CustomerPurchaseOrderNumber, IsCreditNote, CreditNoteReason, Comments, DeliveryInstructions, InternalComments, TotalDryItems, TotalChillerItems, DeliveryRun, RunPosition, ReturnedDeliveryData, ConfirmedDeliveryTime, ConfirmedReceivedBy, LastEditedBy, LastEditedWhen FROM dbo.OptimizedHPDemo; ROLLBACK; GO /* Enable optimized Halloween protection. Execute the following statement in its own batch. */ ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZED_HALLOWEEN_PROTECTION = ON; GO /* Execute the same query again */ BEGIN TRANSACTION; INSERT INTO dbo.OptimizedHPDemo ( InvoiceID, CustomerID, BillToCustomerID, OrderID, DeliveryMethodID, ContactPersonID, AccountsPersonID, SalespersonPersonID, PackedByPersonID, InvoiceDate, CustomerPurchaseOrderNumber, IsCreditNote, CreditNoteReason, Comments, DeliveryInstructions, InternalComments, TotalDryItems, TotalChillerItems, DeliveryRun, RunPosition, ReturnedDeliveryData, ConfirmedDeliveryTime, ConfirmedReceivedBy, LastEditedBy, LastEditedWhen ) SELECT InvoiceID + 1000000 AS InvoiceID, CustomerID, BillToCustomerID, OrderID, DeliveryMethodID, ContactPersonID, AccountsPersonID, SalespersonPersonID, PackedByPersonID, InvoiceDate, CustomerPurchaseOrderNumber, IsCreditNote, CreditNoteReason, Comments, DeliveryInstructions, InternalComments, TotalDryItems, TotalChillerItems, DeliveryRun, RunPosition, ReturnedDeliveryData, ConfirmedDeliveryTime, ConfirmedReceivedBy, LastEditedBy, LastEditedWhen FROM dbo.OptimizedHPDemo; ROLLBACK; GO /* Examine query runtime statistics and plans for the two executions of the same query. */ SELECT q.query_id, q.query_hash, qt.query_sql_text, p.plan_id, rs.count_executions, rs.avg_tempdb_space_used * 8 / 1024. AS tempdb_space_mb, FORMAT(rs.avg_cpu_time / 1000., 'N0') AS avg_cpu_time_ms, FORMAT(rs.avg_duration / 1000., 'N0') AS avg_duration_ms, TRY_CAST(p.query_plan AS xml) AS xml_query_plan FROM sys.query_store_runtime_stats AS rs INNER JOIN sys.query_store_plan AS p ON rs.plan_id = p.plan_id INNER JOIN sys.query_store_query AS q ON p.query_id = q.query_id INNER JOIN sys.query_store_query_text AS qt ON q.query_text_id = qt.query_text_id WHERE q.query_hash = 0xC6ADB023512BBCCC; /* For the second execution with optimized Halloween protection: 1. tempdb space usage is zero 2. CPU time and duration are reduced by about 50% 3. The Clustered Index Insert operator in the query plan has the OptimizedHalloweenProtection property set to True */2.8KViews2likes0CommentsSQL Server on Linux Now Supports cgroup v2
Hello, Linux + SQL Server Fans! If you’re running SQL Server on Linux, here’s some great news - cgroup v2 is now supported in SQL Server 2025 preview and SQL Server 2022 CU 20. This enhancement brings more precise and reliable resource management, especially for containerized deployments in environments like Docker, Kubernetes, and OpenShift. Why cgroup v2 Matters In Linux, control groups (cgroups) are a kernel feature that allows you to allocate, prioritize, and limit system resources such as CPU and memory. With cgroup v2, these capabilities are more unified and robust, offering better enforcement and visibility compared to the older version. To know more please visit: Control Group v2 — The Linux Kernel documentation. How to Check Your cgroup Version Run this command: stat -fc %T /sys/fs/cgroup/ If it returns cgroup2fs, you're using cgroup v2. If it returns cgroup, you're on cgroup v1. How to switch to cgroup v2: The simplest path is choosing a distribution that supports cgroup v2 out of the box. To switch manually: Add to GRUB config: systemd.unified_cgroup_hierarchy=1 Run: sudo update-grub SQL Server and Cgroupv2: Before this update, users running SQL Server containers on Kubernetes clusters (e.g., Azure Kubernetes Service version 1.25 and above) reported that SQL Server did not respect memory limits set via container specs. This led to issues like Out of Memory (OOM) errors, even when limits were properly configured. Here is an example: - For a standard D4ds_v5 machine that has 4 CPUs and 16 GB of RAM as shown in below screenshot If you check the SQL Server errorlog before SQL Server 2022 CU 20: You would observe that SQL Server can see 80% (12792 MB) of the overall memory (16 GB) available on the worker node of the Kubernetes cluster, even though you have configured the 3 Gi memory limit. You ask why just 80% then learn more about the memory.memorylimit, which by default is configured to 80% of the physical memory, to prevent out of memory (OOM) errors. For details please refer: Configure SQL Server Settings on Linux - SQL Server | Microsoft Learn. Below is the errorlog snippet and the container configuration: “Microsoft SQL Server 2022 (RTM-CU19) (KB5054531) - 16.0.4195.2 (X64) Apr 18 2025 13:42:14 Copyright (C) 2022 Microsoft Corporation Developer Edition (64-bit) on Linux (Ubuntu 22.04.5 LTS) <X64> .... .... Detected 12792 MB of RAM, 12313 MB of available memory, 12313 MB of available page file. This is an informational message; no user action is required” - This was despite the container being configured with a 3Gi memory limit: kubectl get pod mssql-0 -n cgrouptest -o jsonpath="{.status.qosClass}`n{.spec.containers[*].resources.limits.memory}" Guaranteed 3Gi Even though users limited the memory for SQL Server containers to 3 GB, SQL Server was still able to see the entire physical memory on the host and tried using that ending up in OOM crashes. But, With the release of SQL Server 2025 preview and SQL Server 2022 CU 20, the memory limits are now correctly enforced. Here's what the error log looks like with cgroup v2 support: “Microsoft SQL Server 2022 (RTM-CU20) (KB5059390) - 16.0.4205.1 (X64) Jun 13 2025 13:38:45 Copyright (C) 2022 Microsoft Corporation Developer Edition (64-bit) on Linux (Ubuntu 22.04.5 LTS) <X64> .. .. Detected 2458 MB of RAM, 1932 MB of available memory, 1932 MB of available page file. This is an informational message; no user action is required” The limits are same as previous case with memory limited to 3 GB as shown below, SQL Server ends up with 80% of 3 GB as the limit that is 2458 MB as printed in the errorlog. Below is the container configuration with a 3Gi memory limit: kubectl get pod mssql-latest-0 -n cgrouptest -o jsonpath="{.status.qosClass}`n{.spec.containers[*].resources.limits.memory}" Guaranteed 3Gi Learn More SQL Server on Linux Overview SQL Server 2025 Release Notes Deploy a SQL Server Linux container to kubernetes Deploy SQL Server on OpenShift or Kubernetes Understanding Cgroup v2on Kubernetes Understanding Cgroups on RHEL Wrapping Up With the introduction of cgroup v2 support in SQL Server 2025 and SQL Server 2022 CU 20, Linux-based deployments gain a powerful tool for smarter resource management. Whether you're running SQL Server in containers or on bare metal, cgroup v2’s unified hierarchy, simplified configuration, and real-time pressure metrics offer a more predictable and efficient way to enforce Quality of Service. From isolating workloads in Kubernetes to dynamically tuning performance under contention, this enhancement empowers DBAs and platform engineers to deliver consistent service levels across diverse environments. As SQL Server continues to evolve on Linux, embracing cgroup v2 is a strategic step toward building resilient, high-performance data platforms. Thanks, Engineering: Andrew Carter (Lead), Nicolas Blais-Miko Product Manager: Attinder Pal Singh and Amit Khandelwal371Views0likes0CommentsSQL Server 2025: introducing tempdb space resource governance
An old problem Since the early days of SQL Server, DBAs had to contend with a common problem – running out of space in the tempdb database. It has always struck me as odd that all I need to cause an outage on an SQL Server instance is access to the server where I can create a temp table that fills up tempdb, and there is no permission to stop me. - Erland Sommarskog (website), an independent SQL Server consultant and a Data Platform MVP Because tempdb is used for a multitude of purposes, the problem can occur without any explicit user action such as creating a temporary table. For example, executing a reporting query that spills data to tempdb and fills it up can cause an outage for all workloads using that SQL Server instance. Over the years, many DBAs developed custom solutions that monitor tempdb space and take action, for example kill sessions that consume a large amount of tempdb space. But that comes with extra effort and complexity. I have spent more hours in my career than I can count building solutions to manage TempDB space. Even with immense time and effort, there were still quirks and caveats that came up that created challenges - especially in multi-tenant environments with lots of databases and the noisy-neighbor problem. - Edward Pollack (LinkedIn), Data Architect at Transfinder and a Data Platform MVP A new solution in the SQL Server engine SQL Server 2025 brings a new solution for this old problem, built directly into the database engine. Starting with the CTP 2.0 release, you can use resource governor, a feature available since SQL Server 2008, to enforce limits on tempdb space consumption. We rely on Resource Governor to isolate workloads on our SQL Server instances by controlling CPU and memory usage. It helps us ensure that the core of our trading systems remains stable and runs with predictable performance, even when other parts of the systems share the same servers. - Ola Hallengren (website), Chief Data Platforms Engineer at Saxo Bank and a Data Platform MVP Similarly, if you have multiple workloads running on your server, each workload can have its own tempdb limit, lower than the maximum available tempdb space. This way, even if one workload hits its limit, other workloads continue running. Here's an example that limits the total tempdb space consumption by queries in the default workload group to 17 GB, using just two T-SQL statements: ALTER WORKLOAD GROUP [default] WITH (GROUP_MAX_TEMPDB_DATA_MB = 17408); ALTER RESOURCE GOVERNOR RECONFIGURE; The default group is used for all queries that aren’t classified into another workload group. You can create workload groups for specific applications, users, etc. and set limits for each group. When a query attempts to increase tempdb space consumption beyond the workload group limit, it is aborted with error 1138, severity 17, Could not allocate a new page for database 'tempdb' because that would exceed the limit set for workload group 'workload-group-name'. All other queries on the server continue to execute. Setting the limits You might be asking, “How do I know the right limits for the different workloads on my servers?” No need to guess. Tempdb space usage is tracked for each workload group at all times and reported in the sys.dm_resource_governor_workload_groups DMV. Usage is tracked even if no limits are set for the workload groups. You can establish representative usage patterns for each workload over time, then set the right limits. You can reconfigure the limits over time if workload patterns change. For example, the following query lets you see the current tempdb space usage, peak usage, and the number of times queries were aborted because they would otherwise exceed the limit per workload group: SELECT group_id, name, tempdb_data_space_kb, peak_tempdb_data_space_kb, total_tempdb_data_limit_violation_count FROM sys.dm_resource_governor_workload_groups; Peak usage and the number of query aborts (limit violations) are tracked since server restart. You can reset these and other resource governor statistics to restart tracking at any time and without restarting the server by executing ALTER RESOURCE GOVERNOR RESET STATISTICS; What about the transaction log? The limits you set for each workload group apply to space in the tempdb data files. But what about the tempdb transaction log? Couldn’t a large transaction fill up the log and cause an outage? This is where another feature in SQL Server 2025 comes in. You can now enable accelerated database recovery (ADR) in tempdb to get the benefit of aggressive log truncation, and drastically reduce the possibility of running out of log space in tempdb. For more information, see ADR improvements in SQL Server 2025. Learn more For more information about tempdb space resource governance, including examples, best practices, and the details of how it works, see Tempdb space resource governance in documentation. If you haven’t used resource governor in SQL Server before, here’s a good starting point: Tutorial: Resource governor configuration examples and best practices. Conclusion SQL Server 2025 brings a new, built-in solution for the age-old problem of tempdb space management. You can now use resource governor to set limits on tempdb usage and avoid server-wide outages because tempdb ran out of space. We are looking forward to your feedback on this and other SQL Server features during the public preview of SQL Server 2025 and beyond. You can leave comments on this blog post, email us at sql-rg-feedback@microsoft.com, or leave feedback at https://aka.ms/sqlfeedback.1.3KViews4likes0Comments