Update in October 2022: Citus has a new home on Azure! The Citus database is now available as a managed service in the cloud as Azure Cosmos DB for PostgreSQL. Azure documentation links have been updated throughout the post, to point to the new Azure docs.
I recently gave a talk about the Citus extension to Postgres at the Warsaw PostgreSQL Users Group. Unfortunately, I did not get to go in person to beautiful Warsaw, but it was still a nice way to interact with the global Postgres community and talk about what Citus is, how it works, and what it can do for you.
If you are already familiar with Postgres, then this talk should be a good introduction to all the powerful capabilities that Citus gives you. The tl;dr is this: Citus is an open source extension to Postgres that transforms Postgres into a distributed database. Citus uses sharding and replication to distribute your data and your Postgres queries across the nodes of a database cluster.
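To make the sharding idea concrete, here is a minimal conceptual sketch in Python. It is not how Citus is implemented: the CRC32 hash, the round-robin placement, and the names `shard_for`, `place_shards`, and `route` are all assumptions made up for illustration. It only shows the general pattern of hashing a distribution column to pick a shard, and mapping shards to worker nodes.

```python
import zlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a distribution-column value to a shard index using a stable hash.

    CRC32 is an illustrative stand-in, not the hash function Citus uses.
    """
    return zlib.crc32(key.encode("utf-8")) % shard_count


def place_shards(shard_count: int, workers: list[str]) -> dict[int, str]:
    """Assign each shard to a worker node round-robin (hypothetical policy)."""
    return {shard: workers[shard % len(workers)] for shard in range(shard_count)}


def route(key: str, shard_count: int, placement: dict[int, str]) -> str:
    """Return the worker that holds the shard for this key."""
    return placement[shard_for(key, shard_count)]


# Hypothetical two-worker cluster with 8 shards.
workers = ["worker-1", "worker-2"]
placement = place_shards(8, workers)
for repo in ("citusdata/citus", "postgres/postgres"):
    print(repo, "->", route(repo, 8, placement))
```

Because the hash is stable, every row with the same key lands on the same shard, which is what lets a query that filters on that key go to a single node while a large analytical query can fan out across all of them.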
Shining a light on the performance speedups of Citus (via demo)
Every so often, I try to rethink how I talk about Citus, as Postgres and the needs of applications evolve. One thing we have not done very much is talk directly about the performance improvements in Citus. Sometimes Citus is actually slower than a single Postgres server, but at scale it can be a lot faster. So in this talk, I introduced every Citus feature with some benchmarks that show how its performance compares to a (large) Postgres server.
The talk is also worth watching for the demo (the demo starts at 46:52) where I compare the performance of Hyperscale (Citus) on Azure Database for PostgreSQL against a single Postgres server. For the demo, I use GitHub archive data in an analytics use case, and the demo shows >250x speedups for analytical queries with Citus!
Click here to watch the video of my talk at Warsaw PostgreSQL Users Group, on Citus: PostgreSQL at any Scale. The demo starts at 46:52, but the introductory discussion should be useful, too.
Props to the organizers of the Warsaw PostgreSQL Users Group, especially Alicja Kucharczyk, for the time they spend organizing Postgres talks for their community, and for inviting me to give a talk to their Postgres users group. I really appreciated all the good questions, too.
If this is your first intro to Citus & you want to learn more
Here are a few of the getting-started next steps I usually recommend to developers:
- Download Citus packages locally: Citus is open source, so it’s easy to download and try out.
- Try Citus on Azure: In the months since Microsoft acquired Citus Data last year, we have also integrated Citus into our managed Postgres service on Azure. Citus is now available as Hyperscale (Citus), a built-in deployment option in Azure Database for PostgreSQL, so you can also try out Citus on Azure.
- Read the Citus open source docs: docs.citusdata.com has tutorials for multi-tenant SaaS applications and real-time analytics dashboards, a use case guide for time series data, details on pretty much every Citus feature, and installation instructions for setting Citus up locally on a single server or across multiple servers. The Citus docs are quite useful.