Comparison Azure PostgreSQL (Single Server, Flexible Server, Hyperscale(Citus)

Published Jun 08 2022 02:16 PM 1,013 Views
Microsoft

Pre-requisites

This article expects that the viewer is aware of various Azure Data Store options and is considering to leverage the Azure PostgreSQL as a PaaS service. If you are still in the deciding phase whether to deploy PostgreSQL in a VM or as a service, I'd recommend you read this article.

 

Overview

Azure PostgreSQL(PaaS) as a service provides various options for deploying/migrating your database. However the choice between Single/Flexible/Hyperscale is critical prior to execution and should be made wisely. It determines the compatibility of your application after deployment/migration and crucial to meet your condition of success. At a broad level it is governed by your use case and features each of the options in PaaS mode that are available. Azure PostgreSQL PaaS currently offers the following modes of spinning PostgreSQL databases as a service.

This article is written with the intent to help you with this choice. However please note Azure is an ever evolving world and we highly recommend to reference any hyperlinks provided (especially the Important Check). This helps for you to be updated with the latest additions or features

 

 

Single Server

Flexible Server

Hyperscale (Citus)

Version Support

Supported versions: 10, and 11

Supported versions: 11, 12, 13

Supported versions: 11, preview: 12, 13 , 14

Architecture

The architecture separates compute and storage.

The architecture separates compute and storage.

The architecture separates compute and storage in a cluster-oriented approached.

Database Engine

The database engine runs on a proprietary compute container(windows), while data files reside on Azure storage.

The database engine runs on a container inside a Linux virtual machine, while data files reside on Azure storage.

Citus cluster (Single Node) are spun up on Linux machines running RedHat OS internally. This is transparent to the user as they only work with server groups

You can select the compute and storage settings independently for worker nodes and the coordinator node in a Hyperscale (Citus) server group. The storage size for provisioning refers to the capacity available to the coordinator and worker nodes in your Hyperscale (Citus) server group. The storage includes database files, temporary files, transaction logs, and the Postgres server logs

Redundancy

The storage maintains three locally redundant synchronous copies of the database files ensuring data durability.

The storage maintains three locally redundant synchronous copies of the database files ensuring data durability.

 

If zone redundant high availability is configured, the service provisions and maintains a warm standby server across availability zone within the same Azure region. The data changes on the source server is synchronously replicated to the standby server to ensure zero data loss. (Standby replica cannot be used for read queries)

Each Hyperscale (Citus) node has its own locally redundant storage (LRS) with three synchronous replicas maintained by Azure Storage service.

Storage

Underlying storage on General SSD

Underlying storage on General SSD

Underlying storage on General SSD

Encryption for data at rest (FIPS 140-2)

Yes

Yes

Yes

Encryption for data in transit (TLS Support )

The service supports TLS versions 1.2, 1.1 and 1.0 with an ability to enforce minimum TLS version. You can choose to disable requiring TLS if your client application.

The service enforces and supports TLS versions 1.2 only. You have an option to disable TLS\SSL for connections to Azure Database for PostgreSQL - Flexible Server

The service supports TLS versions 1.3, 1.2, 1.1 and 1.0.The default is TLSv1.2. This cant be disabled.

Supports Migration from  Windows or Linux machines

Yes

Yes

Yes

Schema modifications

Minimal changes only if you are using unsupported versions/features

Minimal changes only if you are using unsupported versions/features

Migrating an existing application to Citus sometimes requires adjusting the schema and queries for optimal performance. Citus cluster involves thinking about the data model, tooling, and choice of SQL features used.

Migration Methodologies

Use pg_dump and pg_restore to move a database to a server created with the new engine version.

Use Azure Database Migration service for doing online upgrades.

Use pg_dump and pg_restore to move a database to a server created with the new engine version. 

Use Azure Database Migration service for doing online upgrades.

Use pg_dump and pg_restore to move small database after the database structures are validated for compatibility.

Use Azure Database Migration service for doing online upgrades

Big Database migration through replication. ( For this process we strongly recommend contacting us by opening a support request)

SLA's

At least 99.99% of the time customers will have connectivity between their Microsoft Azure Database for PostgreSQL Server- Single Server and our Internet gateway. Subject to SLA

At least 99.9% of the time customers will have connectivity to their Microsoft Azure Database for PostgreSQL – Flexible server configured without Zone Redundant High Availability.

 

At least 99.99% of the time customers will have connectivity to their Microsoft Azure Database for PostgreSQL – Flexible server configured with Zone Redundant High Availability. Subject to SLA

At least 99.95% of the time customers will have connectivity to their Microsoft Azure Database for PostgreSQL – Hyperscale (Citus) High Availability Nodes

Subject to SLA

 

 

Connection Pooling

The connection pooling needs to be configured outside the service. Recommended option is to leverage a proxy connection pooler service like PgBouncer or Pgpool running outside the application and connecting to the database server.

The connection pooler is built-in with flexible server. You can optionally enable it

The connection pooler is built-in  for server groups

 

 

 

Connection Port

5432 for database server

6432 if connection pooling enabled

5432 for database server

6432 for connection pooling

5432 for coordinator node's port

Max Connections

Max user connections : 1982

Max user connections : 5000

Hyperscale (Citus) offers a managed pgBouncer connection pooler configured for up to 2,000 simultaneous client connections.

Maximum connections per node

300 for 0-3 vCores,

500 for 4-15 vCores,

1000 for 16+ vCore

Supports AAD Auth

Yes

No

No

Domain Name

The domain name postgres.database.azure.com is appended to the server name that you provide.

The domain name postgres.database.azure.com is appended to the server name you provide.

The server group name will determine the DNS name your applications use to connect, in the form server-group-name.postgres.database.azure.com.

Server Admin

Server admin username cant 

start with pg_ 

 The admin username not allowed : 

azure_superuser,  azure_pg_admin,  

admin,  administrator, 

root,  guest, or public.

Server admin username cant 

start with pg_ 

The admin username not allowed : 

azure_superuser,  azure_pg_admin,  

admin,  administrator, 

root,  guest, or public.

For limited administrative access, Hyperscale (Citus) provides the citus role.

 

Server admin username cant 

start with pg_ 

 The admin username not allowed : 

azure_superuser,  azure_pg_admin,  

admin,  administrator, 

root,  guest, or public.

Connection strings

Servername needs to be mentioned in the username field

For example

psql --host=mydemoserver.postgres.database.azure.com --port=5432 --username=myadmin@mydemoserver --dbname=postgres

Servername is not required for the username field.

For example

psql --host=mydemoserver-pg.postgres.database.azure.com --port=5432 --username=myadmin --dbname=postgres

 

Servername is not required for username field

 

For Example

"host=c.servergroup.postgres.database.azure.com port=5432 dbname=citus user=citus password={your_password} sslmode=require"

 

 

DNS Zone

Private DNS Zone can be configured

Flexible service creates DNS records such as serername.postgres.database.azure.com in the selected private DNS zone for the server with a private endpoint

Private DNS Zone can be configured

Hyperscale (Citus) service creates DNS records such as c.privatelink.mygroup01.postgres.database.azure.com in the selected private DNS zone for each node with a private endpoint. Private DNS Zone can be configured

Public Access can be configured

Yes

Yes

Yes

Private Access

Private Link allows you to create private endpoints for Azure Database for PostgreSQL - Single server to bring it inside your Virtual Network (VNet).

With Private access (VNet Integration), you can deploy your flexible server into your own Virtual Network (VNet)

Note :

When you create an Azure Database for PostgreSQL - Flexible Server instance (a flexible server), you must choose one of the following networking options: Private access (VNet integration) or Public access (allowed IP addresses). You can't change your networking option after the server is created.

Private access allows resources in an Azure virtual network to connect securely and privately to nodes in a Hyperscale (Citus) server group. 

 

 

In-built databases

During server creation An empty database, postgreSQL, is created. You can optionally also create more databases. You will also find the below database created by default

  • azure_maintenance - This database is used to separate the processes that provide the managed service from user actions. You do not have access to this database.
  • azure_sys - A database for the Query Store. This database does not accumulate data when Query Store is off; this is the default setting.

During server creation An empty database, postgreSQL, is created. You will also find the below database created by default

  • azure_maintenance - This database is used to separate the processes that provide the managed service from user actions. You do not have access to this database.

The Azure portal provides credentials to connect to exactly one database per Hyperscale (Citus) server group, the citus database.

Cross-region replicas

Cross region replicas can be configured which are updated asynchronously to provide a read-only server

You can replicate from the primary server to up to five replicas

Flexible server does not support cross-region read replicas. Depending on the type of workload, you may choose to use logical replication feature for cross-region disaster recovery (DR) purpose.

Cross region replicas can be configured which are updated

updated asynchronously to provide a read-only server group.

You can replicate from the primary server to an unlimited number of replicas.

Start and Stop Server/Server-groups

No

Yes

You can manually stop the server. Stopped servers will automatically start again after seven days

No

Restarts

Yes

Server restarts can be performed from portal. The time required to complete a restart depends on the PostgreSQL recovery process.

Yes

Server restarts can be performed from portal. When configured with high availability, both the primary and the standby servers are restarted at the same time. The time required to complete a restart depends on the PostgreSQL recovery process. 

Yes

Hyperscale(Citus) server group can be restarted from the portal. Restarting the server group applies to all nodes; you can't selectively restart individual nodes. The restart applies to the PostgreSQL server processes in the nodes.  

The time required to complete a restart depends on the PostgreSQL recovery process.

Scaling

You cannot scale from Basic to GP(General Purpose) or MO(Memory Optimized). You can scale in GP and MO tiers

You can scale across all tiers.

You can scale from Basic to Standard but not the other way around.

Decreasing Storage

No

No

No

Storage on coordinator and worker nodes can be scaled up (increased) but can't be scaled down (decreased).

Maximum Storage

Max. Storage size : 1 TB (Basic), 4 TB or 16 TB (GP,MO). Note: Not all regions support 16 TB.

Max storage size : 16 TB

Max storage size could go up to 2TB per coordinator and worker node. You can go up to 20 worker nodes (20*2TB = 40TB's)

Automated migration to new versions of database engine

No

Alternative : Take a dump and restore it to a server that was created with the new engine version.

No

Alternative :

Take a dump and restore it to a server that was created with the new engine version.

Yes

Backups Types

Azure Database for PostgreSQL takes backups of the data files and the transaction log. Azure Database for PostgreSQL automatically creates server backups and stores them in user configured locally redundant or geo-redundant storage. 

Flexible Server takes snapshot backups of data files and stores them securely in zone-redundant storage or locally redundant storage, depending on the region. (General Purpose and Memory Optimized tiers). The server also backs up transaction logs when the write-ahead log (WAL) file is ready to be archived. 

Hyperscale takes snapshot backups of data files and the database transaction log. It automatically creates backups of each node and stores them in locally redundant storage. In Azure regions that support availability zones, backup snapshots are stored in three availability zones. As long as at least one availability zone is online, the Hyperscale (Citus) server group is restorable.

Backup Frequency

Servers with up to 4-TB storage

Full backups occur once every week. Differential backups occur twice a day. Transaction log backups occur every five minutes.

 

Servers with up to 16-TB storage

The first full snapshot backup is scheduled immediately after a server is created. Subsequent snapshot backups are differential backups only. Differential snapshot backups do not occur on a fixed schedule. In a day, three differential snapshot backups are performed. Transaction log backups occur every five minutes.

The first snapshot backup is scheduled immediately after a server is created. Snapshot backups are currently taken once daily.

 

Transaction log backups happen at varied frequencies, depending on the workload and when the WAL file is filled and ready to be archived. In general, the delay (recovery point objective, or RPO) can be up to 15 minutes.

The first snapshot backup is scheduled immediately after a server is created. Snapshot backups are currently taken once daily.

 

 

 

Backup Files access

These backup files cannot be exported. The backups can only be used for restore operations in Azure Database for PostgreSQL. You can use pg_dump to copy a database.

These backup files cannot be exported. The backups can only be used for restore operations in Azure Database for PostgreSQL. You can use pg_dump to copy a database.

These backup files cannot be exported. . They may only be used for restore operations in Azure Database for PostgreSQL.

 

 

Backup Encryption

All backups are encrypted using AES 256-bit encryption.

All backups are encrypted using AES 256-bit encryption.

All backups are encrypted using AES 256-bit encryption.

Retention

You can select a retention period of 7 to 35 days. The default retention period is 7 days

You can select a retention period of 7 to 35 days. The default retention period is 7 days

The retention period is currently 35 days for all server groups.

Point-in-time Restore

Azure Database for PostgreSQL allows you to restore the server back to a point-in-time and into to a new copy of the server in the same region as your original server.

Azure Database for PostgreSQL allows you to restore the server back to a point-in-time and into to a new copy of the server. It gives you the flexibility to choose the availability zone and region

Hyperscale (Citus) server groups can be restored to either the earliest backup or to a custom restore point within your retention period in the same region as your original server.

Restore Fields

The new server created during a restore does not have the firewall rules or VNet service endpoints that existed on the original server.

The subscription, location and resource group cannot be changed. These fields are non-editable

The new server created during a restore does not have the firewall rules or VNet service endpoints that existed on the original server.The subscription, location and resource group cannot be changed. These fields are non-editable

 

 

The new server group does not have the firewall rules or VNet service endpoints that existed on the original server group.The subscription, location and resource group cannot be changed. These fields are non-editable. The server group has the original's configuration: the same number of nodes, number of vCores, storage size, user roles, PostgreSQL version, and version of the Citus extension.

Geo-Restores

Yes

Yes, Preview

No. The model currently does not support GRS backups. Hence the Geo-restores are currently not supported

USE CASES

 

 

(Database size > 100GB or Growth Rate of Data = 10X annually/quarterly/monthly)

 

Basic - 

Workloads that require light compute and I/O performance. Examples include servers used for development or testing or small-scale infrequently used applications.

Burstable -

Best for workloads that don’t need the full CPU continuously.

 

Citus is inappropriate for workloads that don’t need a powerful distributed database and can't incorporate features like sharding, a distributed SQL engine, reference tables, and distributed tables

Basic Tier -

It allows you to run Hyperscale (Citus) on a single node. With Basic tier, you are scale-out ready. Basic tier also gives you an easy way to try Hyperscale (Citus). Basic (2 to 64 vCores, up to 256 GiB memory), coordinator and worker node unified - Best for starting out, or dev/test

 

General Purpose-

Most business workloads that require balanced compute and memory with scalable I/O throughput. Examples include servers for hosting web and mobile apps and other enterprise applications.

General Purpose-

Most business workloads that require balanced compute and memory with scalable I/O throughput. Examples include servers for hosting web and mobile apps and other enterprise applications.

Standard Tier -

Standard tier for Hyperscale (Citus) gives you a distributed Postgres cluster (called a server group) with 1 coordinator and 2 or more worker nodes. You can easily add more worker nodes with zero downtime. Best suited for Multi-Tenant Database B2B applications that already have the notion of a tenant, customer, or account built into their data model. Good fit for Real-Time Analytics which heavily rely on large datasets

 

Memory Optimized -

High-performance database workloads that require in-memory performance for faster transaction processing and higher concurrency. Examples include servers for processing real-time data and high-performance transactional or analytical apps.

Memory Optimized -

High-performance database workloads that require in-memory performance for faster transaction processing and higher concurrency. Examples include servers for processing real-time data and high-performance transactional or analytical apps.

Citus currently supports all SQL clauses except:

  • Correlated subqueries
  • Recursive CTEs
  • Table sample
  • SELECT … FOR UPDATE
  • Grouping sets

From < https://docs.citusdata.com/en/v11.0-beta/faq/faq.html>

Important Checks

 

 

 

 

Limitations

Limitations

 

Limitations

 

Update Feed

Update Feed

Update Feed

 

Extensions

Extensions

Extensions

 

Do not forget to share a KMehta_0-1655349786232.jpeg if this helps

Credit: Thanks Nathan Widdup, Samrendra Panda, Miho Yamamoto for reviews and guidance

FastTrack for Azure:  Move to Azure efficiently with customized guidance from Azure engineering.

FastTrack for Azure – Benefits and FAQ | Microsoft Azure 

Co-Authors
Version history
Last update:
‎Jun 29 2022 01:17 AM
Updated by: