Efficient Partitioning of Large Tables in PostgreSQL and SQL Server using the First Letter
Published Apr 17 2023 04:57 AM 2,477 Views
Microsoft

Partitioning by First Letter:

When partitioning by first letter, we can either create a computed column on the first letter or use a range partition. However, the most elegant solution is to use only the letter itself, as it follows a predictable order regardless of the length of the key field.

 

As a bonus, I’m showing here how you can load the data in parallel to different letter-based tables and then switch in those tables to unified tables. This will enable you to load data very fast, more details in another blog post that I wrote (Efficiently Generating and Loading 1 Billion Rows into a Relational Database Content in just an hour....

 

PostgreSQL Solution:

To implement partitioning by first letter in PostgreSQL, we can use the CREATE TABLE command with the PARTITION BY clause. Here's an example code snippet to create a table partitioned by first letter:

CREATE TABLE QR_A (a VARCHAR(10) NOT NULL, b INT NOT NULL);

CREATE TABLE QR_B (a VARCHAR(10) NOT NULL, b INT NOT NULL);

ALTER TABLE QR_A ADD CONSTRAINT StartWithA

   CHECK ( a >= 'a' AND a <  'b' );

ALTER TABLE QR_B ADD CONSTRAINT StartWithB

   CHECK ( a >= 'b' AND a <  'c' );

INSERT INTO QR_A values ('a1',1);

INSERT INTO QR_B values ('b1',1);

Next, we can create the full partitioned table:

CREATE TABLE QR(a VARCHAR(10) NOT NULL, b INT NOT NULL)

PARTITION BY RANGE (a);

Last, we will attach the letter-based tables into the full table.

ALTER TABLE QR ATTACH PARTITION QR_A

    FOR VALUES FROM ('a') TO ('b');

ALTER TABLE QR ATTACH PARTITION QR_B

    FOR VALUES FROM ('b') TO ('c');

 

SQL Server Solution:

To implement partitioning by first letter in SQL Server, we can use the CREATE PARTITION FUNCTION and CREATE PARTITION SCHEME commands. Here's an example code snippet to create a partition function and scheme for a table partitioned by first letter:

CREATE PARTITION FUNCTION qr_pf (varchar(12))

AS RANGE RIGHT FOR VALUES ('a', 'b', 'c');

CREATE PARTITION SCHEME qr_ps

AS PARTITION pfSalesRight

ALL TO ([Primary]);

Then, we are creating and populating the letter-based tables:

CREATE TABLE dbo.QR_A (a VARCHAR(12),  b INT) ;

CREATE TABLE dbo.QR_B (a VARCHAR(12),  b INT) ;

ALTER TABLE dbo.QR_A

WITH CHECK ADD CONSTRAINT ckLetterA

CHECK (a IS NOT NULL AND a >= 'a' and a < 'b');

ALTER TABLE dbo.QR_B

WITH CHECK ADD CONSTRAINT ckLetterB

CHECK (a IS NOT NULL AND a >= 'b' AND a < 'c');

INSERT QR_A values ('a',1),('ab',1);

INSERT QR_B values ('ba',1),('b',1);

Finally, we can create the partitioned table and switch-in the letter-based tables with the following code snippet:

CREATE TABLE dbo.QR (

  a VARCHAR(12),

  b INT

) ON qr_ps(a);

ALTER TABLE QR_A SWITCH TO QR PARTITION 2;

ALTER TABLE QR_B SWITCH TO QR PARTITION 3;

 

Conclusion:

Partitioning large tables based on the first letter of the key field is an effective way to improve query performance and simplify index maintenance. By using the appropriate commands in PostgreSQL or SQL Server, we can easily create a partitioned table that is optimized for our use case.

Co-Authors
Version history
Last update:
‎Apr 16 2023 07:56 AM
Updated by: