Azure Database for PostgreSQL Flexible Server provides powerful partitioning capabilities that allow organizations to manage large volumes of data effectively. By partitioning tables based on time intervals or other logical segments, we can improve query performance, automate archival processes, and ensure efficient data purging—all while maintaining referential integrity across complex schemas.
Introduction
As enterprises migrate mission-critical workloads from heterogeneous databases like Oracle to Azure Database for PostgreSQL Flexible Server, managing large datasets while ensuring compliance with strict data retention policies becomes a key priority. Industries such as retail, telecommunications, transportation and logistics industry, among others, enforce stringent data retention requirements to safeguard customer information, operational efficiency and maintain service reliability. Failure to meet these standards can lead to increased risks, including data loss, inefficiencies, and potential non-compliance issues. Implementing a robust data retention and partitioning strategy in PostgreSQL helps organizations efficiently manage and archive historical data while optimizing performance.
Azure Database for PostgreSQL Flexible Server provides powerful partitioning capabilities that allow organizations to manage large volumes of data effectively. By partitioning tables based on time intervals or other logical segments, we can improve query performance, automate archival processes, and ensure efficient data purging—all while maintaining referential integrity across complex schemas.
Migration Challenges
While both Azure Database for PostgreSQL Flexible Server and Oracle support partitioning, their approaches differ significantly. In Azure Database for PostgreSQL Flexible Server, partitions are not created automatically; each partition must be explicitly defined using the CREATE TABLE statement. This means that when setting up a partitioned table, each partition must be created separately, requiring careful planning and implementation.
This blog explores best practices for implementing range partitioning in Azure Database for PostgreSQL Flexible Server, maintaining referential integrity across multiple levels, and leveraging Azure Blob Storage through the azure storage extension to efficiently archive partitioned data.
Implementing Partitioning with Example
In this example, we demonstrate a partitioning strategy within the test_part schema, where a parent table logs and a child table child_logs are structured using range partitioning with a monthly interval. Depending on specific business requirements, the partitioning strategy can also be adjusted to quarterly or yearly intervals to optimize storage and query performance.
--Create and set schema for the session 
CREATE SCHEMA test_part;
SET SEARCH_PATH=test_part;
--Create a partitioned table
 CREATE TABLE logs ( 
    id integer not null, 
    log_date date not null, 
    message text  
) PARTITION BY RANGE (log_date); 
--Add primary key constraints in parent partition table
ALTER TABLE ONLY logs ADD primary key (id,log_date); 
--Define partition for each month
CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');  
CREATE TABLE logs_2024_02 PARTITION OF logs FOR VALUES FROM ('2024-02-01') TO ('2024-03-01'); 
 
--Create a Child partition table
CREATE TABLE logs_child ( 
    id integer, 
    log_date date, 
    message text,
logs_parent_id integer
) PARTITION BY RANGE (log_date); 
 
--Add constraints 
ALTER TABLE ONLY logs_child ADD primary key (id,log_date);  
ALTER TABLE logs_child add constraint logs_child_fk foreign key(logs_parent_id,log_date) references logs(id,log_date) ON DELETE CASCADE;
 --Define a partition for each month
CREATE TABLE logs_child_2024_01 PARTITION OF logs_child FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');  
CREATE TABLE logs_child_2024_02 PARTITION OF logs_child FOR VALUES FROM ('2024-02-01') TO ('2024-03-01'); 
 
--Insert data into the parent partition table :
INSERT INTO logs (id,log_date, message) VALUES (1,'2024-01-15', 'Log message 1'); 
INSERT INTO logs (id,log_date, message) VALUES (11,'2024-01-15', 'Log message 1'); 
 
INSERT INTO logs (id,log_date, message) VALUES (2,'2024-02-15', 'Log message 2'); 
INSERT INTO logs (id,log_date, message) VALUES (22,'2024-02-15', 'Log message 2'); 
--Insert data into child partition table:
INSERT INTO logs_child values (1,'2024-01-15', 'Log message 1',1);  
INSERT INTO logs_child values (2,'2024-01-15', 'Log message 1',1); 
INSERT INTO logs_child values (5,'2024-02-15', 'Log message 2',22); 
INSERT INTO logs_child values (6,'2024-02-15', 'Log message 2',2); 
--Review data using Select 
SELECT * FROM logs;
SELECT * FROM logs_2024_01;
SELECT * FROM logs_2024_02;
SELECT * FROM logs_child_2024_01;
SELECT * FROM logs_child_2024_02;
Detach the partition:
While detaching partition follow below order, first detach the child table partition, remove FK and then detach the parent table partition.
--Remove partitioning tables
ALTER TABLE logs_child DETACH PARTITION logs_child_2024_02;
ALTER TABLE logs_child_2024_02 DROP CONSTRAINT logs_child_fk;
ALTER TABLE logs DETACH PARTITION logs_2024_02; Archive the partition table in the Azure blob storage:
The following steps demonstrate how to restore removed partition data in Azure Blob Storage using Microsoft Entra ID for authorization.
Create an azure_storage extension by following steps provided in the link here.
--Create an extension
CREATE EXTENSION azure_storage;
SET search_path=Azure_storage;
--Add account (Entra id or Authentication keys steps provided in reference document link here)
SELECT * FROM azure_storage.account_options_managed_identity('shaystorage','blob');
SELECT * FROM azure_storage.account_add('{
  "auth_type": "managed-identity",
  "account_name": "shayrgstorage",
  "account_type": "blob"
}');SET SEARCH_PATH=test_part;
COPY test_part.logs_child_2024_02
TO 'https://shayrgstorage.blob.core.windows.net/pgtable/logs_child_2024_02.csv'
WITH (FORMAT 'csv', header);View or load the archived partitioned table
--After truncating data from the partition, view data from Azure storage .csv file. When archival data needed for ready only purpose
TRUNCATE TABLE test_part.logs_child_2024_02;
SELECT * FROM test_part.logs_child_2024_02;
SELECT * FROM azure_storage.blob_get
        ('shayrgstorage'
        ,'pgtable'
        ,'logs_child_2024_02.csv'
        ,NULL::test_part.logs_child_2024_02
        ,options := azure_storage.options_csv_get(delimiter := ',' , header := 'true')
        );--Load data from .csv file to Azure database for PostgreSQL flexible server table. When need to restore the data for update purpose
TRUNCATE TABLE test_part.logs_child_2024_02;
INSERT INTO test_part.logs_child_2024_02 (id,log_date,message,logs_parent_id)
SELECT * FROM azure_storage.blob_get
        ('shayrgstorage'
        ,'pgtable'
        ,'logs_child_2024_02.csv'
        ,NULL::test_part.logs_child_2024_02
        ,options := azure_storage.options_csv_get(delimiter := ',' , header := 'true')
        ); 
Attach the partition table
--Attach the partition table to view data in partition table for operation purpose
ALTER TABLE test_part.logs attach PARTITION test_part.logs_2024_02 for values from ('2024-02-01') TO ('2024-03-01'); 
ALTER TABLE test_part.logs_child attach PARTITION test_part.logs_child_2024_02 for values from ('2024-02-01') TO ('2024-03-01'); 
Alternative Data Archival Strategies Based on Business Requirements:
- Deploy a lower sku Azure database for PostgreSQL Server such as Burstable or General Purpose service tier. Utilize the postgres_fdw extension to move the data between tables resides in different PostgreSQL databases or instances. Burstable servers are available with up to 64 TB of space. Automate database start/stop processes to minimize expenses when loading or extracting data.
- If the database size is relatively small, consider removing a partition from a partitioned table using the ALTER TABLE DETACH PARTITION command, converting it into a standalone table for easier archival.
- Use LTR options to retain the database backups for up to 10 years, depending on the business requirement and restore it when needed. For more information review here
- Utilize Azure Data Factory (ADF) Pipeline to move data into Azure storage and restore it as needed using automation scripts.
Feedback and suggestions
If you have feedback or suggestions for improving this data migration asset, please contact the Databases SQL Customer Success Engineering (Ninja) Team (datasqlninja@microsoft.com). Thanks for your support!
Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.