As more applications go cloud native, with Kubernetes as a normalizing layer, portability between public clouds becomes easier. In theory, an application can now move easily between AWS, Azure, and GCP.
However, there are a couple of questions that you still have to consider:
This decision point really lies between implementing a self-managed or a fully managed database.
Before proceeding, let’s clarify the definitions of “self-managed” and “fully managed” within the context of this blog.
There are multiple factors to consider when making this decision. I’ll cover a few (but not all) of them and review how they apply to a sample application.
Some applications depend on specific database types and/or database versions that a managed service cannot provide. A perfect example is a financial application that uses an older DB such as Ingres DB, which isn’t even available as a managed service. The operations team would have to roll out Ingres DB on Azure compute nodes and manage the entire lifecycle. As another example, a healthcare application may require an older version of Postgres that cannot be upgraded simply because developers (and/or funds) are not available to update the application code for newer versions of Postgres. In these circumstances, the easy and obvious decision is to roll your own DB in the public cloud environment.
Database redundancy and high availability (HA) are another factor. While potentially simple for small applications, the configuration can become onerous for an application running at scale. Hence, in production environments running mission-critical workloads with strict availability requirements, using a managed service such as Azure Database for PostgreSQL simplifies deployments and significantly reduces time to production, because HA is easier to configure on Azure. Configuring anything similar in a self-managed database requires advanced knowledge and skills.
Whether the goal is to save primary (tier 1) storage space or to ensure disaster recovery, the ability to consistently back up the database and restore it when needed is generally a business requirement. Building proper backup and restore mechanisms for a self-managed database requires full knowledge of the backend and storage. Several parts of the operation need to be designed, including:
Hence, this is complex, but it can be built and maintained at substantial operational and capital cost. With a managed service such as Azure Database for PostgreSQL, however, these capabilities are readily available as configuration options. Not only can data be backed up, but the backup retention period, local versus geo redundancy, and the type of backup storage can all be selected. In addition, features such as encryption are built-in defaults in the product. Restoring is also easy, and you can select the precise point in time to restore to.
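To make the relationship between the retention period and point-in-time restore concrete, here is a minimal Python sketch. It is not tied to any Azure API: the function names and the seven-day retention used in the usage note are assumptions for illustration. It only checks whether a requested restore timestamp falls inside the window that a given retention period would cover.

```python
from datetime import datetime, timedelta, timezone


def restore_window(now: datetime, retention_days: int) -> tuple[datetime, datetime]:
    """Return the (earliest, latest) timestamps a point-in-time restore could target.

    Illustrative helper only; retention_days is whatever retention period
    has been configured for the backups.
    """
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return now - timedelta(days=retention_days), now


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """True if the requested restore point lies inside the retention window."""
    earliest, latest = restore_window(now, retention_days)
    return earliest <= target <= latest
```

With a hypothetical seven-day retention, a restore point three days back would be accepted, while one eight days back (or in the future) would be rejected.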
Another major concern with databases is compliance. Standard compliance requirements such as PCI and HIPAA can be met with either a self-managed or a fully managed database, because they call for specific implementations of tenancy, encryption, redundancy, etc. However, issues such as geo-bounding (storing data in specific geos) are not truly database-centric but rather application-centric: the application is essentially required to “route” the data to the proper geo-storage. Many businesses seriously consider a self-managed (mainly on-prem) implementation because it gives them obvious control over the implementation and over how the requirements are met. However, a fully managed service on a public cloud (such as Azure Database for PostgreSQL) generally comes with a long list of certifications that ensure the storage meets PCI, HIPAA, and similar requirements, which helps you meet your compliance needs faster and more easily.
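The application-side “routing” described above can be sketched in a few lines of Python. Everything named here is hypothetical: the region keys, the host names, and the `host_for_region` helper are assumptions for illustration, not part of any real service. The point is only that the application, not the database, decides which geo a record lands in.

```python
# Hypothetical mapping from a customer's data-residency region to the
# database host that stores data for that region.
REGION_HOSTS = {
    "eu": "orders-eu.example.internal",
    "us": "orders-us.example.internal",
}


def host_for_region(region: str) -> str:
    """Pick the database host for a region, normalizing case and whitespace."""
    key = region.strip().lower()
    if key not in REGION_HOSTS:
        # Fail closed: never fall back to a default geo, since writing data
        # to the wrong region is exactly what geo-bounding must prevent.
        raise LookupError(f"no database configured for region {region!r}")
    return REGION_HOSTS[key]
```

Raising on an unknown region, rather than defaulting to one, is a deliberate choice in this sketch: a silent fallback could place regulated data in the wrong geo.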
Operational expertise is a big factor in this decision, and using Kubernetes adds even more complexity to the puzzle. That complexity comes from the need to manage Kubernetes itself and its interconnections with storage. While the K8S community is working on things like operators to simplify this, you still have to determine how to set up and manage the storage component that K8S connects to. Hence, there is still a need for significant infrastructure expertise. In contrast, a fully managed service with secure connections reduces the need for both database expertise (Kubernetes or not) and infrastructure expertise.
Finally, scale is a large consideration when it comes to self-managed vs fully managed databases. Scaling requires a more complex implementation, with more nodes, storage, security, etc. Many businesses have the expertise to manage that scale, but many do not. Fully managed services such as Hyperscale (Citus) on Azure Database for PostgreSQL allow incremental increases in database capacity by adding new nodes when needed, with a range of options for cost, cores, storage, and connections.
While I covered a few of the more common factors above, there are many more. The following table gives a more complete list:
| Factor | Self-managed | Fully managed |
| --- | --- | --- |
| Specific SW dependencies needed for Application? | Yes | Potentially |
| Potentially Control Compliance (PCI, GDPR, etc.) | Yes – managed via the app | Yes – managed via the app |
| Operations cost reduction (reduce DBA expertise, SW licenses, etc.) | No | Yes |
| Simple self-service for Devs | Potentially | Yes |
| Built-in Disaster Recovery | No | Yes |
| Built-in SLA | No | Yes |
| Simplified Scaling | No | Yes |
| Built in end-to-end security | No | Yes |
While the table points to a fully managed database as the obvious choice, the decision truly depends on business requirements and cost. In many cases self-managed databases are chosen for political and business reasons, but on the factors above the stronger choice is a public cloud database service, such as Hyperscale (Citus) on Azure Database for PostgreSQL.
As stated earlier in this blog, many applications are now orchestrated via Kubernetes. While a Kubernetes-based application can migrate from cloud to cloud, its database can be implemented, secured, scaled, and maintained in one cloud and reached over a secure, potentially low-latency connection.
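As a minimal sketch of what such a secure connection can look like from the application side, the Python snippet below builds a PostgreSQL connection URI that requires TLS. It assumes the connection settings arrive through the standard libpq environment variables (`PGHOST`, `PGUSER`, `PGPASSWORD`, and so on), which in Kubernetes would typically be injected from a Secret. The `build_postgres_uri` helper and the host name in the usage note are hypothetical.

```python
import os
from urllib.parse import quote, urlencode


def build_postgres_uri(env=os.environ) -> str:
    """Build a libpq-style connection URI that enforces an encrypted connection.

    The variable names follow libpq's environment variables; the defaults
    chosen here are assumptions for this sketch.
    """
    # Percent-encode credentials so characters such as '@' or '/' cannot
    # be misread as URI delimiters.
    user = quote(env["PGUSER"], safe="")
    password = quote(env["PGPASSWORD"], safe="")
    host = env["PGHOST"]
    port = env.get("PGPORT", "5432")
    dbname = quote(env.get("PGDATABASE", "postgres"), safe="")
    # sslmode=require makes the client refuse unencrypted connections.
    query = urlencode({"sslmode": env.get("PGSSLMODE", "require")})
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}?{query}"
```

Because the application only sees a host name and credentials, the same container image can run in any cloud while the database stays where it is; moving the application means changing the injected values, not the code.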
We built a sample application, called Acme Fitness, which is an e-commerce application that has several services and databases and is deployed with Kubernetes.
Let’s review the application’s databases, what was used and why from a DevOps perspective.
The databases all have different requirements and different implementations: two are self-managed and two are fully managed. As the application evolves, and hopefully as the business grows, these requirements might change. Monitoring and understanding these needs is an important part of operations, ensuring that the proper database implementation is available for the application.
In the next blog we will explore in detail how the Orders database was: