Pre-requisites
This article expects that the viewer is aware of various Azure Data Store options and is considering to leverage the Azure PostgreSQL as a PaaS service. If you are still in the deciding phase whether to deploy PostgreSQL in a VM or as a service, I'd recommend you read this article.
Overview
Azure PostgreSQL(PaaS) as a service provides various options for deploying/migrating your database. However the choice between Single/Flexible/Hyperscale is critical prior to execution and should be made wisely. It determines the compatibility of your application after deployment/migration and crucial to meet your condition of success. At a broad level it is governed by your use case and features each of the options in PaaS mode that are available. Azure PostgreSQL PaaS currently offers the following modes of spinning PostgreSQL databases as a service.
This article is written with the intent to help you with this choice. However please note Azure is an ever evolving world and we highly recommend to reference any hyperlinks provided (especially the Important Check). This helps for you to be updated with the latest additions or features
|
Single Server |
Flexible Server |
Hyperscale (Citus) |
Version Support |
Supported versions: 10, and 11 |
Supported versions: 11, 12, 13, 14 |
Supported versions: 11, 12, 13 , 14 |
Architecture |
The architecture separates compute and storage. |
The architecture separates compute and storage. |
The architecture separates compute and storage in a cluster-oriented approached. |
Database Engine |
The database engine runs on a proprietary compute container(windows), while data files reside on Azure storage. |
The database engine runs on a container inside a Linux virtual machine, while data files reside on Azure storage. |
Citus cluster (Single Node) are spun up on Linux machines running RedHat OS internally. This is transparent to the user as they only work with server groups You can select the compute and storage settings independently for worker nodes and the coordinator node in a Hyperscale (Citus) server group. The storage size for provisioning refers to the capacity available to the coordinator and worker nodes in your Hyperscale (Citus) server group. The storage includes database files, temporary files, transaction logs, and the Postgres server logs |
Redundancy |
The storage maintains three locally redundant synchronous copies of the database files ensuring data durability. |
The storage maintains three locally redundant synchronous copies of the database files ensuring data durability.
If zone redundant high availability is configured, the service provisions and maintains a warm standby server across availability zone within the same Azure region. The data changes on the source server is synchronously replicated to the standby server to ensure zero data loss. (Standby replica cannot be used for read queries) |
Each Hyperscale (Citus) node has its own locally redundant storage (LRS) with three synchronous replicas maintained by Azure Storage service. |
Storage |
Underlying storage on General SSD |
Underlying storage on General SSD |
Underlying storage on General SSD |
Encryption for data at rest (FIPS 140-2) |
Yes |
Yes |
Yes |
Encryption for data in transit (TLS Support ) |
The service supports TLS versions 1.2, 1.1 and 1.0 with an ability to enforce minimum TLS version. You can choose to disable requiring TLS if your client application. |
The service enforces and supports TLS versions 1.2 only. You have an option to disable TLS\SSL for connections to Azure Database for PostgreSQL - Flexible Server |
The service supports TLS versions 1.3, 1.2, 1.1 and 1.0.The default is TLSv1.2. This cant be disabled. |
Supports Migration from Windows or Linux machines |
Yes |
Yes |
Yes |
Schema modifications |
Minimal changes only if you are using unsupported versions/features |
Minimal changes only if you are using unsupported versions/features |
Migrating an existing application to Citus sometimes requires adjusting the schema and queries for optimal performance. Citus cluster involves thinking about the data model, tooling, and choice of SQL features used. |
Migration Methodologies |
Use pg_dump and pg_restore to move a database to a server created with the new engine version. Use Azure Database Migration service for doing online upgrades. |
Use pg_dump and pg_restore to move a database to a server created with the new engine version. Use Azure Database Migration service for doing online upgrades. |
Use pg_dump and pg_restore to move small database after the database structures are validated for compatibility. Use Azure Database Migration service for doing online upgrades Big Database migration through replication. ( For this process we strongly recommend contacting us by opening a support request) |
SLA's |
At least 99.99% of the time customers will have connectivity between their Microsoft Azure Database for PostgreSQL Server- Single Server and our Internet gateway. Subject to SLA |
At least 99.9% of the time customers will have connectivity to their Microsoft Azure Database for PostgreSQL – Flexible server configured without Zone Redundant High Availability.
At least 99.99% of the time customers will have connectivity to their Microsoft Azure Database for PostgreSQL – Flexible server configured with Zone Redundant High Availability. Subject to SLA |
At least 99.95% of the time customers will have connectivity to their Microsoft Azure Database for PostgreSQL – Hyperscale (Citus) High Availability Nodes Subject to SLA
|
Connection Pooling |
The connection pooling needs to be configured outside the service. Recommended option is to leverage a proxy connection pooler service like PgBouncer or Pgpool running outside the application and connecting to the database server. |
The connection pooler is built-in with flexible server. You can optionally enable it |
The connection pooler is built-in for server groups
|
Connection Port |
5432 for database server |
6432 if connection pooling enabled 5432 for database server |
6432 for connection pooling 5432 for coordinator node's port |
Max Connections |
Max user connections : 1982 |
Max user connections : 5000 |
Hyperscale (Citus) offers a managed pgBouncer connection pooler configured for up to 2,000 simultaneous client connections. Maximum connections per node 300 for 0-3 vCores, 500 for 4-15 vCores, 1000 for 16+ vCore |
Supports AAD Auth |
Yes |
No |
No |
Domain Name |
The domain name postgres.database.azure.com is appended to the server name that you provide. |
The domain name postgres.database.azure.com is appended to the server name you provide. |
The server group name will determine the DNS name your applications use to connect, in the form server-group-name.postgres.database.azure.com. |
Server Admin |
Server admin username cant start with pg_ The admin username not allowed : azure_superuser, azure_pg_admin, admin, administrator, root, guest, or public. |
Server admin username cant start with pg_ The admin username not allowed : azure_superuser, azure_pg_admin, admin, administrator, root, guest, or public. |
For limited administrative access, Hyperscale (Citus) provides the citus role.
Server admin username cant start with pg_ The admin username not allowed : azure_superuser, azure_pg_admin, admin, administrator, root, guest, or public. |
Connection strings |
Servername needs to be mentioned in the username field For example psql --host=mydemoserver.postgres.database.azure.com --port=5432 --username=myadmin@mydemoserver --dbname=postgres |
Servername is not required for the username field. For example psql --host=mydemoserver-pg.postgres.database.azure.com --port=5432 --username=myadmin --dbname=postgres
|
Servername is not required for username field
For Example "host=c.servergroup.postgres.database.azure.com port=5432 dbname=citus user=citus password={your_password} sslmode=require"
|
DNS Zone |
Private DNS Zone can be configured |
Flexible service creates DNS records such as serername.postgres.database.azure.com in the selected private DNS zone for the server with a private endpoint Private DNS Zone can be configured |
Hyperscale (Citus) service creates DNS records such as c.privatelink.mygroup01.postgres.database.azure.com in the selected private DNS zone for each node with a private endpoint. Private DNS Zone can be configured |
Public Access can be configured |
Yes |
Yes |
Yes |
Private Access |
Private Link allows you to create private endpoints for Azure Database for PostgreSQL - Single server to bring it inside your Virtual Network (VNet). |
With Private access (VNet Integration), you can deploy your flexible server into your own Virtual Network (VNet) Note : When you create an Azure Database for PostgreSQL - Flexible Server instance (a flexible server), you must choose one of the following networking options: Private access (VNet integration) or Public access (allowed IP addresses). You can't change your networking option after the server is created. |
Private access allows resources in an Azure virtual network to connect securely and privately to nodes in a Hyperscale (Citus) server group.
|
In-built databases |
During server creation An empty database, postgreSQL, is created. You can optionally also create more databases. You will also find the below database created by default
|
During server creation An empty database, postgreSQL, is created. You will also find the below database created by default
|
The Azure portal provides credentials to connect to exactly one database per Hyperscale (Citus) server group, the citus database. |
Cross-region replicas |
Cross region replicas can be configured which are updated asynchronously to provide a read-only server You can replicate from the primary server to up to five replicas |
Flexible server does not support cross-region read replicas. Depending on the type of workload, you may choose to use logical replication feature for cross-region disaster recovery (DR) purpose. |
Cross region replicas can be configured which are updated updated asynchronously to provide a read-only server group. You can replicate from the primary server to an unlimited number of replicas. |
Start and Stop Server/Server-groups |
No |
Yes You can manually stop the server. Stopped servers will automatically start again after seven days |
No |
Restarts |
Yes Server restarts can be performed from portal. The time required to complete a restart depends on the PostgreSQL recovery process. |
Yes Server restarts can be performed from portal. When configured with high availability, both the primary and the standby servers are restarted at the same time. The time required to complete a restart depends on the PostgreSQL recovery process. |
Yes Hyperscale(Citus) server group can be restarted from the portal. Restarting the server group applies to all nodes; you can't selectively restart individual nodes. The restart applies to the PostgreSQL server processes in the nodes. The time required to complete a restart depends on the PostgreSQL recovery process. |
Scaling |
You cannot scale from Basic to GP(General Purpose) or MO(Memory Optimized). You can scale in GP and MO tiers |
You can scale across all tiers. |
You can scale from Basic to Standard but not the other way around. |
Decreasing Storage |
No |
No |
No Storage on coordinator and worker nodes can be scaled up (increased) but can't be scaled down (decreased). |
Maximum Storage |
Max. Storage size : 1 TB (Basic), 4 TB or 16 TB (GP,MO). Note: Not all regions support 16 TB. |
Max storage size : 16 TB |
Max storage size could go up to 2TB per coordinator and worker node. You can go up to 20 worker nodes (20*2TB = 40TB's) |
Automated migration to new versions of database engine |
Alternative : Take a dump and restore it to a server that was created with the new engine version. |
Alternative : Take a dump and restore it to a server that was created with the new engine version. |
|
Backups Types |
Azure Database for PostgreSQL takes backups of the data files and the transaction log. Azure Database for PostgreSQL automatically creates server backups and stores them in user configured locally redundant or geo-redundant storage. |
Flexible Server takes snapshot backups of data files and stores them securely in zone-redundant storage or locally redundant storage, depending on the region. (General Purpose and Memory Optimized tiers). The server also backs up transaction logs when the write-ahead log (WAL) file is ready to be archived. |
Hyperscale takes snapshot backups of data files and the database transaction log. It automatically creates backups of each node and stores them in locally redundant storage. In Azure regions that support availability zones, backup snapshots are stored in three availability zones. As long as at least one availability zone is online, the Hyperscale (Citus) server group is restorable. |
Backup Frequency |
Servers with up to 4-TB storage Full backups occur once every week. Differential backups occur twice a day. Transaction log backups occur every five minutes.
Servers with up to 16-TB storage The first full snapshot backup is scheduled immediately after a server is created. Subsequent snapshot backups are differential backups only. Differential snapshot backups do not occur on a fixed schedule. In a day, three differential snapshot backups are performed. Transaction log backups occur every five minutes. |
The first snapshot backup is scheduled immediately after a server is created. Snapshot backups are currently taken once daily.
Transaction log backups happen at varied frequencies, depending on the workload and when the WAL file is filled and ready to be archived. In general, the delay (recovery point objective, or RPO) can be up to 15 minutes. |
The first snapshot backup is scheduled immediately after a server is created. Snapshot backups are currently taken once daily.
|
Backup Files access |
These backup files cannot be exported. The backups can only be used for restore operations in Azure Database for PostgreSQL. You can use pg_dump to copy a database. |
These backup files cannot be exported. The backups can only be used for restore operations in Azure Database for PostgreSQL. You can use pg_dump to copy a database. |
These backup files cannot be exported. . They may only be used for restore operations in Azure Database for PostgreSQL.
|
Backup Encryption |
All backups are encrypted using AES 256-bit encryption. |
All backups are encrypted using AES 256-bit encryption. |
All backups are encrypted using AES 256-bit encryption. |
Retention |
You can select a retention period of 7 to 35 days. The default retention period is 7 days |
You can select a retention period of 7 to 35 days. The default retention period is 7 days |
The retention period is currently 35 days for all server groups. |
Point-in-time Restore |
Azure Database for PostgreSQL allows you to restore the server back to a point-in-time and into to a new copy of the server in the same region as your original server. |
Azure Database for PostgreSQL allows you to restore the server back to a point-in-time and into to a new copy of the server. It gives you the flexibility to choose the availability zone and region |
Hyperscale (Citus) server groups can be restored to either the earliest backup or to a custom restore point within your retention period in the same region as your original server. |
Restore Fields |
The new server created during a restore does not have the firewall rules or VNet service endpoints that existed on the original server. The subscription, location and resource group cannot be changed. These fields are non-editable |
The new server created during a restore will be subject to the NSG rules over the subnet on which it is restored to. The subscription, location and resource group cannot be changed. These fields are non-editable
|
The new server group does not have the firewall rules or VNet service endpoints that existed on the original server group.The subscription, location and resource group cannot be changed. These fields are non-editable. The server group has the original's configuration: the same number of nodes, number of vCores, storage size, user roles, PostgreSQL version, and version of the Citus extension. |
Geo-Restores |
Yes, Preview |
No. The model currently does not support GRS backups. Hence the Geo-restores are currently not supported |
|
USE CASES |
|
|
(Database size > 100GB or Growth Rate of Data = 10X annually/quarterly/monthly) |
|
Basic - Workloads that require light compute and I/O performance. Examples include servers used for development or testing or small-scale infrequently used applications. |
Burstable - Best for workloads that don’t need the full CPU continuously.
|
Citus is inappropriate for workloads that don’t need a powerful distributed database and can't incorporate features like sharding, a distributed SQL engine, reference tables, and distributed tables Basic Tier - It allows you to run Hyperscale (Citus) on a single node. With Basic tier, you are scale-out ready. Basic tier also gives you an easy way to try Hyperscale (Citus). Basic (2 to 64 vCores, up to 256 GiB memory), coordinator and worker node unified - Best for starting out, or dev/test |
|
General Purpose- Most business workloads that require balanced compute and memory with scalable I/O throughput. Examples include servers for hosting web and mobile apps and other enterprise applications. |
General Purpose- Most business workloads that require balanced compute and memory with scalable I/O throughput. Examples include servers for hosting web and mobile apps and other enterprise applications. |
Standard Tier - Standard tier for Hyperscale (Citus) gives you a distributed Postgres cluster (called a server group) with 1 coordinator and 2 or more worker nodes. You can easily add more worker nodes with zero downtime. Best suited for Multi-Tenant Database B2B applications that already have the notion of a tenant, customer, or account built into their data model. Good fit for Real-Time Analytics which heavily rely on large datasets |
|
Memory Optimized - High-performance database workloads that require in-memory performance for faster transaction processing and higher concurrency. Examples include servers for processing real-time data and high-performance transactional or analytical apps. |
Memory Optimized - High-performance database workloads that require in-memory performance for faster transaction processing and higher concurrency. Examples include servers for processing real-time data and high-performance transactional or analytical apps. |
Citus currently supports all SQL clauses except:
From < https://docs.citusdata.com/en/v11.0-beta/faq/faq.html> |
Important Checks |
|
|
|
|
|
||
|
|||
|
Credit: Thanks Nathan Widdup, Samrendra Panda, Miho Yamamoto for reviews and guidance
FastTrack for Azure: Move to Azure efficiently with customized guidance from Azure engineering.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.