Welcome to part 2 of the four-part blog series on navigating Oracle to Postgres migrations. I am super stoked to let you know the first post in the series—announcing the new Oracle to Postgres migration guide for Azure—had thousands of views and was featured in the 389th edition of the Postgres Weekly newsletter. I'm happy that many of you personally messaged me to let me know that the migration guide is helping your migration projects. Thank you for all the feedback.
Oracle to Postgres migrations are hard, multi-step projects—especially for complex applications. From my experience working with those of you who have already done an Oracle to Postgres migration—specifically migrations to our Azure Database for PostgreSQL managed service—many of you spend more than 60% of the entire migration project in the conversion (step 4) and migration (step 6) phases. In the first post in this Oracle to Postgres migration series, I dove deep into the different steps of an Oracle to Postgres migration. In the same spirit, let’s dive into some practical examples of how our Oracle to Postgres migration guide can help you migrate your database storage objects.
Any object in a relational database that defines how data is stored, represented, and accessed is a database storage object. Database storage objects include: Schemas, Tables, Types, Constraints, Indexes, Sequences, Views, Synonyms, Database Links, etc. While each of these objects may mean different things in Postgres (compared to Oracle), it is important for you to know the best practices of migrating each of these objects from your Oracle database to Postgres.
In this post, you will learn about best practices for migrating database storage objects from Oracle to Postgres.
The good news is that converting/migrating database storage objects from Oracle to Postgres is relatively easy compared to more complex DB objects such as packages, functions, and procedures (typically known as DB code objects). One of the open source tools that we employ in Oracle to Postgres migrations is the wonderful Ora2Pg utility. If you’re not yet familiar, Ora2Pg is a free, open source tool used to migrate an Oracle database to a PostgreSQL-compatible schema. Ora2Pg does a great job of converting DB storage objects automatically and has improved a lot over the years to cater to different combinations of these storage objects. The key to a successful migration lies in understanding how to convert these database storage objects, knowing why Ora2Pg does conversions in a certain way, and ensuring that your converted workload works optimally in Postgres.
Let’s dive into an example right away. In Oracle, users and schemas are essentially the same thing. When you create a user (using the “create user” command), Oracle also creates a schema container to hold all objects that user creates. In Postgres, by contrast, all objects created by a user need to be explicitly created in a schema within a database, and all Postgres users (roles) are global to the entire server rather than belonging to any one database. This blog post on schemas in Oracle vs. Postgres depicts this difference well.
You’re essentially going from this in Oracle:

```
|                     Oracle server                     |
|---------------------------|---------------------------|
|        Database 1         |        Database 2         |  DATABASES
|-------------|-------------|-------------|-------------|
|   User 1    |   User 2    |   User 1    |   User 2    |  USERS/SCHEMAS
|-------------|-------------|-------------|-------------|
|  t1,t2,t3   |  t1,t2,t3   |  t1,t2,t3   |  t1,t2,t3   |  TABLES
```

to this in PostgreSQL, where users sit outside any one database:

```
|                  PostgreSQL instance                  |   |
|-------------------------------------------------------| U |
|        Database 1         |        Database 2         | S |
|-------------|-------------|-------------|-------------| E |
|  Schema 1   |  Schema 2   |  Schema 1   |  Schema 2   | R |
|-------------|-------------|-------------|-------------| S |
|  t1,t2,t3   |  t1,t2,t3   |  t1,t2,t3   |  t1,t2,t3   |   |
```
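To make the user/schema mapping concrete, here is a minimal sketch of how an Oracle-style `CREATE USER` might translate to Postgres. The `app_user` role name and password are hypothetical, and this is only one possible layout:

```sql
-- Oracle: CREATE USER app_user ... creates both the login and a
-- same-named schema that holds that user's objects.

-- A Postgres sketch of the same idea (app_user is a hypothetical name):
CREATE ROLE app_user LOGIN PASSWORD 'changeme';

-- Schemas are created explicitly inside a database; a role can own a
-- schema named after it to mirror the Oracle layout:
CREATE SCHEMA app_user AUTHORIZATION app_user;

-- Note: the role is global to the whole Postgres server; only the
-- schema and its objects live inside this particular database.
```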
There are several things to consider when going through database-level changes such as the schema example above.
Bottom line, the schema-based semantics are quite different between Oracle and Postgres. So, you have to figure out whether your app is going to work as expected with a different schema, or whether you need to make some adjustments in your application. Don’t worry, though—catching these differences and implementing them are very easy once you get a hang of it.
Given that schemas and users are the same thing in Oracle, there are multiple gotchas to take care of. All of these gotchas, with examples like the one above, are covered in the chapter on "Migrating from Oracle Security Model" on page 61 of the Oracle to Postgres migration guide.
When it comes to migrating tables, data type differences between Oracle and PostgreSQL take up a lot of mindshare. Before I tell you where to find specifics on table partitioning, migrating index organized tables, etc. from Oracle, let me walk you through a practical experience with data types to show you how important testing each conversion is.
Oracle has a very rich set of number-related data types (think INT, DOUBLE, FLOAT, etc.) with the ability to store scale and precision. However, when you query the Oracle dictionary tables to get details on the data types of each table column, you will find that Oracle stores each of these numeric subtypes as a NUMBER with the appropriate scale and precision. As a result, irrespective of what the subtype is, these columns can all get converted to numeric during the conversion process.
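One way to spot integer-like columns before conversion is to query the Oracle dictionary yourself. A review query (a sketch against Oracle's `ALL_TAB_COLUMNS` view) might look like this:

```sql
-- Find NUMBER columns with zero scale: candidates for an integer/bigint
-- mapping in Postgres rather than numeric. A starting point for review,
-- not a hard rule -- precision still matters for the final choice.
SELECT owner, table_name, column_name, data_precision, data_scale
FROM   all_tab_columns
WHERE  data_type = 'NUMBER'
AND    NVL(data_scale, 0) = 0
ORDER  BY owner, table_name, column_name;
```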
Well, here is the kicker. In Oracle, the largest number data type holds 38 digits. In Postgres, the largest numeric datatype can hold 131,072 digits. If you don't follow the tips and best practices in the Oracle to Postgres migration guide—and if the primary key index on this number column in Oracle is translated to a numeric in Postgres—you will pay a performance penalty in Postgres. The good news is, the data type migration best practices in the “Data type conversions” section of our Oracle to Postgres migration guide will help you avoid gotchas like this.
Let’s do an interesting experiment to see the impact of creating an index on a numeric column vs. a BIGINT column in Postgres. In the example below, I am creating 2 two-column tables, one with the ID set to numeric and the other with ID set to BIGINT. I am inserting 1M rows in each and comparing how long it takes to do a self-join count on both.
```sql
CREATE TABLE public.arthiaga_numeric(id numeric NOT NULL, name character varying(200));

-- Insert 1M rows
INSERT INTO public.arthiaga_numeric(id, name)
SELECT generate_series(1,1000000), md5(random()::text);
-- INSERT 0 1000000
-- Query returned successfully in 3 secs 585 msec.

-- Add primary key
ALTER TABLE public.arthiaga_numeric ADD PRIMARY KEY(id);
-- ALTER TABLE
-- Query returned successfully in 1 secs 262 msec.

-- Do a self-join count
SELECT count(*)
FROM public.arthiaga_numeric a, public.arthiaga_numeric b
WHERE a.id = b.id;
-- Successfully run. Total query runtime: 1 secs 123 msec.
```
```sql
CREATE TABLE public.arthiaga_bigint(id bigint NOT NULL, name character varying(200));

-- Insert 1M rows
INSERT INTO public.arthiaga_bigint(id, name)
SELECT generate_series(1,1000000), md5(random()::text);
-- INSERT 0 1000000
-- Query returned successfully in 3 secs 710 msec.

-- Add primary key
ALTER TABLE public.arthiaga_bigint ADD PRIMARY KEY(id);
-- ALTER TABLE
-- Query returned successfully in 1 secs 107 msec.

-- Do a self-join count
SELECT count(*)
FROM public.arthiaga_bigint a, public.arthiaga_bigint b
WHERE a.id = b.id;
-- Successfully run. Total query runtime: 1 secs 3 msec.
```
For a million-row table with two basic columns, you already see an ~11% improvement in query response time when the index is on a BIGINT column as opposed to a numeric column in PostgreSQL:

- Count query with BIGINT for a 1M row table: completes in 1.003 seconds
- Count query with NUMERIC for a 1M row table: completes in 1.123 seconds
Imagine the performance penalties you will pay for larger and more complex table structures, and the end-to-end impact when query response times are high across multiple tables in a workload. I would encourage you to try this experiment on larger and more complex tables to see bigger performance gains yourself. For certain workloads, we have measured very high improvements in query response times (north of 80%) just by switching to the right data type in Postgres.
The key thing to know is, from a table migration standpoint, you should keep an eye on data types, partitioning schemes, indexes, constraints, LOBs, temporary tables, sequences etc. when migrating from Oracle to Postgres. You can learn about all of these aspects in the Oracle to Postgres migration guide.
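If you are using Ora2Pg, data type mappings like the NUMBER-to-bigint example above can be overridden in `ora2pg.conf` via its `DATA_TYPE` directive. A hedged sketch is below; please verify the exact syntax against the documentation for your Ora2Pg version:

```
# Sketch: map integer-like NUMBER columns to bigint instead of numeric.
# In Ora2Pg's DATA_TYPE directive, commas inside the Oracle type
# specification must be escaped with a backslash.
DATA_TYPE	NUMBER(*\,0):bigint
```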
Next up, let’s look at a couple more of my favorite learnings from migrating DB storage objects in Oracle to Postgres migrations.
Postgres handles NULLs in strings in a completely different way compared to Oracle—period! I highly recommend paying attention to and updating your SQL queries to cater to subtle differences in the treatment of NULLs. To avoid last-minute issues during cutover phases, you’ll definitely want to test (and test, and test) earlier in the process. If you’re migrating from a typical Oracle workload, there is a high chance you will hit this issue in the way Oracle treats NULLs in strings vs. Postgres. Let’s look at an example.
Let’s use the same table created above for this experiment. I will just remove the primary key and add a composite unique constraint to one of the tables above to keep the explanation simple.
```sql
CREATE TABLE public.arthiaga_bigint(id bigint NOT NULL, name character varying(200));

-- Add composite unique constraint across columns
ALTER TABLE public.arthiaga_bigint
  ADD CONSTRAINT uniq_bigint UNIQUE (id, name);

-- Insert more rows
INSERT INTO public.arthiaga_bigint(id, name) VALUES (1, 'Arun');
INSERT INTO public.arthiaga_bigint(id) VALUES (2);
INSERT INTO public.arthiaga_bigint(id) VALUES (2);
INSERT INTO public.arthiaga_bigint(id) VALUES (3);
INSERT INTO public.arthiaga_bigint(id) VALUES (3);

-- Table select
SELECT * FROM public.arthiaga_bigint;
```
Guess what the select returns, despite the unique constraint? All five rows come back. Postgres unique constraints treat NULLs as distinct values, so the duplicate (2, NULL) and (3, NULL) rows do not violate the composite constraint. Oracle, by contrast, considers two rows duplicates when the non-NULL parts of a composite unique key match, so the second inserts of id 2 and id 3 would fail there. As a result, when you’re migrating to Postgres, you will need to handle such cases in your application. And if your app relied on the database raising a unique-violation error here, you are in for a lot of surprises.
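If your application depended on Oracle rejecting those duplicates, one option is to make the constraint itself stricter on the Postgres side. A sketch, assuming the table from the experiment above (note that `NULLS NOT DISTINCT` requires Postgres 15 or later, and the index name here is illustrative):

```sql
-- Postgres 15+: make the unique constraint treat NULLs as equal,
-- which is closer to Oracle's behavior for partially-NULL keys.
ALTER TABLE public.arthiaga_bigint
  DROP CONSTRAINT uniq_bigint,
  ADD CONSTRAINT uniq_bigint UNIQUE NULLS NOT DISTINCT (id, name);

-- Older versions: a unique expression index can approximate the same
-- behavior -- with the caveat that a NULL name and an empty-string
-- name will now collide.
CREATE UNIQUE INDEX uniq_bigint_nulls
  ON public.arthiaga_bigint (id, COALESCE(name, ''));
```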
A similar caution applies to NUL characters within strings: Postgres does not allow the NUL character (a zero byte) inside string values, whereas Oracle does.
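A quick illustration of these string/NULL differences (a sketch; the literals are illustrative):

```sql
-- Postgres rejects a zero byte inside a string value:
-- SELECT E'abc\x00def';   -- fails with an invalid-byte error
-- Oracle, by contrast, accepts CHR(0) embedded in a VARCHAR2 value.

-- Relatedly, Oracle treats the empty string as NULL; Postgres does not:
SELECT '' IS NULL;   -- false in Postgres, true in Oracle
```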
Oracle shares a sequence cache across multiple user sessions in the database. In PostgreSQL, the sequence cache is not shared across sessions; each session maintains its own cache.
This blog post on explaining sequence caching behavior differences in Oracle vs. Postgres does a great job on showing this behavior via an example. In the example, any new session in Postgres will get its own set of cached values for a particular sequence range. Whereas in Oracle, sequences are shared across sessions. If your application depends on shared cache values for sequences, it would be good to think about changing the design in the app itself given this difference.
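The difference can be sketched like this (the sequence name is hypothetical):

```sql
CREATE SEQUENCE order_id_seq CACHE 20;

-- Postgres: each session pre-allocates its own block of 20 values.
-- Session A: SELECT nextval('order_id_seq');  -- e.g. 1 (caches 1..20)
-- Session B: SELECT nextval('order_id_seq');  -- e.g. 21 (caches 21..40)
--
-- Oracle shares one cache across sessions, so concurrent sessions draw
-- interleaved values from the same cached range. In Postgres, expect
-- gaps and out-of-order values across sessions whenever CACHE > 1.
```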
While these are just a few examples based on my experience, the Oracle to Azure Database for PostgreSQL migration guide covers database storage object conversion topics in depth.
This cheat sheet on where to find details about database storage objects in our Oracle to Postgres migration guide should be useful:
As you can see, this guide has a LOT of content and is improved frequently. While these page numbers can change in a future version of this guide, you can still search for the chapters listed above.
Now that you have seen how to approach migrating database storage objects from Oracle to Postgres, you might want to also check out the Oracle to Azure Database for PostgreSQL migration guide. Let me know if it’s useful. And especially let me know if you see ways to make it even more useful.
If you want to chat about migrations to PostgreSQL in general—or have questions about our PostgreSQL managed service on Azure—you can always reach our product team via email at Ask AzureDB for PostgreSQL. I also love all the updates on our Twitter handle (@AzureDBPostgres) and recommend following us for more Postgres news and updates. We would love to hear from you.