wiki:OracleToPg
Last modified 10 years ago Last modified on 03/26/14 13:03:31

Porting from an Oracle database to a Postgres database

The task of porting an Oracle database to a Postgres database is best done using the perl-based ora2pg tool, which at the time of writing is available from http://ora2pg.darold.net/install.html. Installation is straightforward and initial configuration involves logging in as the superuser and editing ora2pg's configuration file as follows :

cd /etc/ora2pg/
cp ora2pg.conf.dist ora2pg.conf
vi ora2pg.conf

This has already been done on the machine vso. It is documented here for other users.

The file is a list of key/value pairs, with the key and value separated by a tab. For instance the line in /etc/ora2pg/ora2pg.conf :

ORACLE_HOME  /usr/local/oracle/10g

would set the ORACLE_HOME value to "/usr/local/oracle/10g".

Typically the following settings need to be changed :

  1. ORACLE_HOME - While I'm not 100% certain that this needs to be set, it can't hurt, and it's easy to do - go to the machine on which Oracle is installed and look at the file /etc/oratab
  2. ORACLE_DSN - Typically set to "dbi:Oracle:"
  3. ORACLE_USER needs to be set to the user who accesses the database
  4. ORACLE_PWD needs to be set to their password
  5. The line reading "#WHERE 1=1" needs to be un-commented so it reads "WHERE 1=1" (ie. delete the "#")
  6. Depending on your version of postgres, you may need to set PG_SUPPORTS_MVIEW to "0", although this may not be very relevant for our work

Once ora2pg is configured, it is theoretically possible to dump all the contents of an Oracle database, but in practice it's probably best done one part at a time (incrementally) because that allows for more stringent checks to be done. It also allows for tables that are continually being updated in Oracle to be moved to postgres as the means to update them in postgres becomes available. This is done with the --allow argument when running ora2pg (see the bash script below). This specifies a comma separated list of entities to export.

For example, in Oracle there is a table named "NOAA_ACTIVE_REGION_LOCATIONS". There are also three Oracle procedures that are used to update this table, somewhat unimaginatively named "UPDATE_ACTIVE_REGION", "UPDATE_ACTIVE_REGION2" and "UPDATE_ACTIVE_REGION3". To extract this into the postgres database, we use ora2pg to extract the SQL to create the table, then the SQL to populate the table using INSERT statements (which can be a large SQL file) and finally the three procedures. This can be done in a bash script as follows :

#!/bin/bash

echo Dumping table scheme
ora2pg --debug --allow NOAA_ACTIVE_REGION_LOCATIONS --out activeRegionSchema.psql --type TABLE
echo

echo Dumping table data
ora2pg --debug --allow NOAA_ACTIVE_REGION_LOCATIONS --out activeRegionData.psql --type INSERT
echo

echo Dumping procedures
ora2pg --debug --plsql --allow UPDATE_ACTIVE_REGION  --out activeRegionProc1.psql --type PROCEDURE
ora2pg --debug --plsql --allow UPDATE_ACTIVE_REGION2 --out activeRegionProc2.psql --type PROCEDURE
ora2pg --debug --plsql --allow UPDATE_ACTIVE_REGION3 --out activeRegionProc3.psql --type PROCEDURE
echo

exit

This writes a set of five files with the .psql extension. We can then read them into postgres with a bash script as follows :

#!/bin/bash

dropdb export_test
createdb export_test
createlang -d export_test plpgsql

cat activeRegionSchema.psql | psql export_test
cat activeRegionData.psql | psql export_test

cat activeRegionProc1.psql | psql export_test
cat activeRegionProc2.psql | psql export_test
cat activeRegionProc3.psql | psql export_test

exit

This creates the table in postgres. Any cron-driven scripts that were updating that table in realtime - "spiders" - must also be modified so that they use the postgres database rather than the Oracle one.

It is also possible, for INSERT SQL exports, to skip the generation of intermediate .psql files used in the examples above and connect directly to a postgres database, although I have not tried this. This involves setting the PG_DSN, PG_USER and PG_PWD settings in the ora2pg.conf file. NOTE that you can use an alternate configuration file with the --conf argument on the command line, so you can try this without disturbing the default configuration.

ora2pg has a man page and is documented somewhat at its web site. Help is also available by running ora2pg --help and the file /etc/ora2pg/ora2pg.conf is pretty well documented.