In the ever-evolving world of database management, optimizing storage and improving performance are constant pursuits. One significant enhancement in this space is the introduction of XML compression in the MSSQL database engine. This feature is included in SQL Server 2022, and today we are announcing its general availability (GA) in Azure SQL Database and Azure SQL Managed Instance. This new feature brings with it an array of benefits, making data storage more efficient and queries faster. In this blog post, we will delve into the technical details of this feature and explore a practical example to showcase its potential.
Understanding XML compression
XML data has become ubiquitous in modern application development, as it offers a versatile way to structure and store information. However, XML documents can be quite verbose, leading to larger storage and memory requirements and slower query performance.
The XML compression feature addresses these concerns by providing an automatic mechanism to compress XML data. It uses a compression algorithm that significantly reduces the size of XML data. By doing so, it optimizes storage usage, reduces storage IO operations, and speeds up XML-related queries.
Advantages of XML compression Feature
Storage Savings: XML compression in Azure SQL Database and Managed Instance enables substantial storage savings, especially for databases that store large amounts of XML data.
Query Performance: With compressed XML data, the execution of XML queries becomes faster due to the reduced physical size of the data. Smaller data size means less time spent on data retrieval and, in turn, improved query performance.
Transparent Compression: One of the most significant advantages of this feature is its transparency. Users do not need to modify their existing queries or applications to utilize XML compression. The database engine automatically handles the compression and decompression processes, making it seamless and hassle-free.
Indexing and Statistics: XML compression extends its benefits to indexing and statistics as well. Compressed XML data can lead to smaller indexes, thereby reducing the time it takes to build and maintain them.
Example
To demonstrate the use of XML compression, you can use the script available at the bottom of this article. Here is the quick summary of results with table level compression and XML index level compression.
Size before compression |
Size after compression |
% Compression |
|
Clustered index |
872 KB |
216 KB |
75 |
Primary XML index |
720 KB |
504 KB |
30 |
To get an estimation about object size after compression, sp_estimate_data_compression_savings can use used which supports parameter @xml_compression.
Conclusion
To conclude, the XML compression feature in Azure SQL Database and Managed Instance is a powerful addition that brings efficiency and performance improvements to XML data management. By automatically compressing XML data and providing storage savings, it enhances the overall database experience without requiring any changes to existing applications. As organizations strive for optimized cloud-based solutions, this feature proves to be a valuable asset in the modern data landscape.
Ready to experience the benefits of XML compression? Try it out now and share your thoughts about the XML compression feature in the comments below.
Demo script
Below script is based on table present in AdventureWorksLT database. More information about the database can be found at AdventureWorks sample databases - SQL Server | Microsoft Learn
-- Start Script
-- drop demo schema and table, if exists
DROP TABLE IF EXISTS demo.ProductModelXMLDemo;
DROP SCHEMA IF EXISTS demo;
GO
-- create demo schema
CREATE SCHEMA demo;
GO
-- populate into a different table.
SELECT *
INTO demo.ProductModelXMLDemo
FROM SalesLT.ProductModel;
-- add primary key to demo table.
ALTER TABLE demo.ProductModelXMLDemo
ADD PRIMARY KEY (ProductModelID);
-- only few rows have XML data
SELECT *
FROM demo.ProductModelXMLDemo
WHERE CatalogDescription IS NOT NULL;
-- how many distinct values?
SELECT COUNT(*) 'number_of_rows'
,CONVERT(VARCHAR(8000), CatalogDescription) AS 'XML_text'
FROM demo.ProductModelXMLDemo
GROUP BY CONVERT(VARCHAR(8000), CatalogDescription);
-- since there are only few rows having XML data, below steps would duplicate them to more rows.
-- create a temporary table to store non-NULL CatalogDescription values with identity
CREATE TABLE #NonNullDescriptions (
IdentityInt INT IDENTITY(1,1) NOT NULL ,
CatalogDescription XML NOT NULL
);
-- insert non-NULL values into the temporary table
INSERT INTO #NonNullDescriptions (CatalogDescription)
SELECT CatalogDescription
FROM demo.ProductModelXMLDemo
WHERE CatalogDescription IS NOT NULL;
-- update NULL rows with non-NULL values
UPDATE p1
SET p1.CatalogDescription = nd.CatalogDescription
FROM demo.ProductModelXMLDemo p1, #NonNullDescriptions nd
WHERE p1.CatalogDescription IS NULL
AND p1.ProductModelID%6 = nd.IdentityInt;
-- drop the temporary table
DROP TABLE #NonNullDescriptions;
-- now we have table populated XML data in more rows.
-- look at space used
EXEC sp_spaceused 'demo.ProductModelXMLDemo';
-- name rows reserved data index_size unused
-- ------------------- ----- -------- ---- ---------- ------
-- ProductModelXMLDemo 128 976 KB 872 KB 16 KB 88 KB
-- create primary XML index.
CREATE PRIMARY XML INDEX idx_xml_catalog_desc ON demo.ProductModelXMLDemo (CatalogDescription);
GO
EXEC sp_spaceused 'demo.ProductModelXMLDemo';
-- name rows reserved data index_size unused
-- ------------------- ----- -------- ---- ---------- ------
-- ProductModelXMLDemo 128 1752 KB 872 KB 720 KB 160 KB
--=======================================================================================
-- Scenario:1 - Uncompressed clustered index and compressed XML index
-- We can enable XML compression on index by rebuilding
ALTER INDEX idx_xml_catalog_desc ON demo.ProductModelXMLDemo REBUILD WITH (XML_COMPRESSION = ON);
GO
EXEC sp_spaceused 'demo.ProductModelXMLDemo';
-- name rows reserved data index_size unused
-- ------------------- ----- -------- ---- ---------- ------
-- ProductModelXMLDemo 128 1560 KB 872 KB 504 KB 184 KB
--====================================================================================
-- Scenario:2 Compressed table and uncompresses index
-- Enable XML compression on table and rebuild.
-- Since we created compressed XML index in last step, we will drop and create without compression option.
ALTER TABLE demo.ProductModelXMLDemo REBUILD WITH (XML_COMPRESSION = ON);
GO
DROP INDEX IF EXISTS idx_xml_catalog_desc ON demo.ProductModelXMLDemo;
GO
CREATE PRIMARY XML INDEX idx_xml_catalog_desc ON demo.ProductModelXMLDemo (CatalogDescription);
GO
EXEC sp_spaceused 'demo.ProductModelXMLDemo';
-- name rows reserved data index_size unused
-- ------------------- ----- -------- ---- ---------- ------
-- ProductModelXMLDemo 128 1104 KB 216 KB 720 KB 168 KB
--====================================================================================
-- Scenario:3 - Compressed table and compressed index
-- Enable XML compression on index and rebuild. Table was already compressed in last step
ALTER INDEX idx_xml_catalog_desc ON demo.ProductModelXMLDemo REBUILD WITH (XML_COMPRESSION = ON);
GO
EXEC sp_spaceused 'demo.ProductModelXMLDemo';
-- name rows reserved data index_size unused
-- ------------------- ----- -------- ---- ---------- ------
-- ProductModelXMLDemo 128 912 KB 216 KB 504 KB 192 KB
-- End Script