The key reason for migrating from Oracle to Postgres is to reduce the cost of the database management system by moving to an open-source system with sufficient scalability, security and customization capabilities. In general, a database migration includes six basic phases:

  1. Find all Oracle-specific methods of storing and processing data in the source database and determine their scope of use (investigation and planning phase)
  2. Select the appropriate tools for schema migration and migrate the schema
  3. Choose the most suitable method of data migration to minimize Oracle system downtime
  4. Run the data migration, handling all transformations required by the PostgreSQL DBMS
  5. Convert all PL/SQL code into PostgreSQL format (using tools for partial automation, followed by manual post-processing)
  6. Run performance and functional tests and fine-tune the resulting database

Migration of Table Definitions from Oracle to Postgres

Some Oracle data types have no direct equivalent in PostgreSQL. For example, the Oracle DATE type contains both date and time parts. Therefore, the specialist responsible for the database migration may choose one of the following options to map it into PostgreSQL:

  • date – pure date without time part
  • time – pure time without date part, with optional time zone specification
  • timestamp – date and time, with optional time zone specification

There are two practical options for mapping dates from Oracle to Postgres: either use TIMESTAMP, or set up the orafce extension to use an Oracle-style date type.

Spatial types also require special attention. Oracle supports spatial data (SDO_GEOMETRY) by default, while PostgreSQL requires installation of the PostGIS extension to work with spatial data.

The generic Oracle numeric type NUMBER can store numbers with different precision and scale, so it is important to understand how each numeric column is used. When the priority is preserving the accuracy of calculations, NUMBER types must be mapped to NUMERIC with the same precision and scale during Oracle to Postgres migration. If the top priority is calculation speed, the best mapping would be REAL or DOUBLE PRECISION.
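
The mapping rules described in this section can be sketched as a small helper. This is only an illustration under stated assumptions: the function name map_oracle_type and the prefer_speed flag are hypothetical, and only DATE and NUMBER with non-negative scale are handled.

```python
import re

def map_oracle_type(oracle_type: str, prefer_speed: bool = False) -> str:
    """Return a PostgreSQL type for a small subset of Oracle column types.

    Hypothetical helper for illustration; it covers only DATE and NUMBER.
    """
    t = oracle_type.strip().upper()
    if t == "DATE":
        # Oracle DATE carries a time part, so plain "date" would lose it.
        return "timestamp"
    # Accept NUMBER, NUMBER(p) and NUMBER(p,s) with a non-negative scale.
    m = re.fullmatch(r"NUMBER(?:\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\))?", t)
    if m is None:
        raise ValueError(f"unsupported type in this sketch: {oracle_type}")
    if prefer_speed:
        # Faster arithmetic, but binary floating point is not exact.
        return "double precision"
    precision, scale = m.group(1), m.group(2)
    if precision is None:
        return "numeric"
    if scale is None:
        return f"numeric({precision})"
    return f"numeric({precision},{scale})"

print(map_oracle_type("NUMBER(10,2)"))                     # numeric(10,2)
print(map_oracle_type("NUMBER(10,2)", prefer_speed=True))  # double precision
print(map_oracle_type("DATE"))                             # timestamp
```

Whether accuracy or speed matters more is a per-column decision, so in practice such a flag would be set column by column rather than for the whole schema.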

Data Migration from Oracle to Postgres

Since data migration may take a long time for large databases, it is extremely important to choose the right strategy and tools for this step. There are three common approaches to data migration:

  • snapshot – migrate all data in one step
  • piecewise snapshot – migrate data in chunks using parallel threads or processes
  • change data replication – continuously load data by tracking incremental changes
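
The piecewise snapshot approach can be sketched as follows. This is a minimal illustration that assumes the table has a numeric key whose range is known in advance; key_ranges, migrate_in_chunks and the copy_chunk callback are hypothetical names, and the callback stands in for whatever code actually copies rows between the two databases.

```python
from concurrent.futures import ThreadPoolExecutor

def key_ranges(min_id: int, max_id: int, chunk_size: int):
    """Split the inclusive key range [min_id, max_id] into inclusive chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    start = min_id
    while start <= max_id:
        end = min(start + chunk_size - 1, max_id)
        ranges.append((start, end))
        start = end + 1
    return ranges

def migrate_in_chunks(copy_chunk, min_id, max_id, chunk_size, workers=4):
    """Run copy_chunk(start, end) for every chunk in a thread pool.

    copy_chunk is a hypothetical callable supplied by the caller; it is
    expected to copy the rows whose key lies in [start, end] and return
    the number of rows copied.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda r: copy_chunk(*r),
                          key_ranges(min_id, max_id, chunk_size))
        return sum(counts)

# Stand-in callback that only counts the keys in each chunk.
print(key_ranges(1, 10, 4))                                   # [(1, 4), (5, 8), (9, 10)]
print(migrate_in_chunks(lambda s, e: e - s + 1, 1, 10, 4))    # 10
```

Splitting by key range keeps the chunks independent of each other, so a failed chunk can be retried without repeating the whole snapshot.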

Among the most significant bottlenecks of Oracle to Postgres migration are source data that have no direct equivalent in the target DBMS, and external data.

BYTEA was recommended above as the optimal PostgreSQL data type for handling binary data. Nevertheless, caution is advised when migrating large binary values (with an average field size of at least 10MB) into BYTEA. A BYTEA value can only be retrieved as a single fragment, so incremental reading is not possible, and this may significantly increase RAM usage. PostgreSQL's LARGE OBJECT type solves this problem. All LARGE OBJECT values are stored in the system table pg_largeobject, which is a built-in part of every database. This table can hold up to 4 billion rows. In addition, a LARGE OBJECT can be up to 4TB in size and supports piecewise reading.
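
The memory benefit of piecewise reading can be illustrated with a short sketch. It assumes the large object is exposed as a file-like handle with a read(size) method; here an in-memory io.BytesIO stands in for that handle, and the names iter_chunks and copy_stream are hypothetical.

```python
import io

def iter_chunks(reader, chunk_size=1024 * 1024):
    """Yield successive byte chunks from any object with a read(size) method."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        yield chunk

def copy_stream(reader, writer, chunk_size=1024 * 1024):
    """Copy reader to writer chunk by chunk and return the bytes copied.

    Only one chunk is held in memory at a time, unlike reading a whole
    value in a single fragment.
    """
    total = 0
    for chunk in iter_chunks(reader, chunk_size):
        writer.write(chunk)
        total += len(chunk)
    return total

# io.BytesIO is a stand-in for a real large object handle.
source = io.BytesIO(b"0123456789")
print([len(c) for c in iter_chunks(source, 4)])  # [4, 4, 2]
```

With this pattern, peak memory use is bounded by the chunk size rather than by the size of the largest binary value.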

Another crucial issue of Oracle to Postgres migration is the proper mapping of ROWID, a pseudo-column that identifies the physical location of a record within a table. Although PostgreSQL has a comparable system field known as ctid, it does not directly correspond to ROWID. The PostgreSQL documentation notes that a row's ctid changes when the row is updated or moved by VACUUM FULL, so it cannot serve as a long-term row identifier.

Basic methods of emulating ROWID functionality for Oracle to Postgres migration include:

  • Use an existing primary key (or create a new one) to identify rows instead of ROWID
  • Add a serial or bigserial column with auto-generated values and make it a primary/unique key to replace ROWID functionality
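
The bigserial approach can be sketched as a helper that generates the required statements. This is an illustration only: the function names, the default column name row_id and the constraint naming scheme are assumptions, not part of any migration tool.

```python
from typing import List

def quote_ident(name: str) -> str:
    """Quote an identifier PostgreSQL-style, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'

def rowid_replacement_ddl(table: str, column: str = "row_id") -> List[str]:
    """Build DDL that adds an auto-generated unique column to a table.

    The column name and constraint name are hypothetical defaults.
    """
    t, c = quote_ident(table), quote_ident(column)
    constraint = quote_ident(f"{table}_{column}_key")
    return [
        # bigserial creates a sequence and fills existing rows with values.
        f"ALTER TABLE {t} ADD COLUMN {c} bigserial;",
        f"ALTER TABLE {t} ADD CONSTRAINT {constraint} UNIQUE ({c});",
    ]

for statement in rowid_replacement_ddl("orders"):
    print(statement)
```

Note that quoted identifiers are case-sensitive in PostgreSQL, so the table name passed in must match the stored name exactly.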

At this step it is reasonable to use special software that simplifies the database migration, such as the Oracle to Postgres converter developed by Intelligent Converters. The tool can automate the migration of table definitions, indexes, and constraints with just a few mouse clicks. It maps Oracle types to the most suitable PostgreSQL equivalents and also lets you customize specific type mappings as needed for the project.