= Porting from an Oracle database to a Postgres database = The task of porting an Oracle database to a Postgres database is best done using the perl-based ora2pg tool, which at the time of writing is available from [http://ora2pg.darold.net/install.html]. Installation is straightforward and initial configuration involves logging in as the superuser and editing ora2pg's configuration file as follows : {{{ cd /etc/ora2pg/ cp ora2pg.conf.dist ora2pg.conf vi ora2pg.conf }}} '''This has already been done on the machine vso.''' It is documented here for other users. The file is a list of key/value pairs, with the key and value separated by a tab. For instance the line in /etc/ora2pg/ora2pg.conf : {{{ ORACLE_HOME /usr/local/oracle/10g }}} would set the ORACLE_HOME value to "/usr/local/oracle/10g". Typically the following settings need to be changed : 1. '''ORACLE_HOME''' - While I'm not 100% certain that this needs to be set, it can't hurt, and it's easy to do - go to the machine on which Oracle is installed and look at the file '''/etc/oratab''' 1. '''ORACLE_DSN''' - Typically set to "'''dbi:Oracle:'''" 1. '''ORACLE_USER''' needs to be set to the user who accesses the database 1. '''ORACLE_PWD''' needs to be set to their password 1. The line reading "'''#WHERE 1=1'''" needs to be un-commented so it reads "'''WHERE 1=1'''" (ie. delete the "'''#'''") 1. Depending on your version of postgres, you may need to set '''PG_SUPPORTS_MVIEW''' to "0", although this may not be very relevant for our work Once ora2pg is configured, it is theoretically possible to dump all the contents of an Oracle database, but in practice it's probably best done one part at a time (incrementally) because that allows for more stringent checks to be done. It also allows for tables that are continually being updated in Oracle to be moved to postgres as the means to update them in postgres becomes available. This is done with the '''--allow''' argument when running ora2pg (see the bash script below). This specifies a comma separated list of entities to export. For example, in Oracle there is a table named "'''NOAA_ACTIVE_REGION_LOCATIONS'''". There are also three Oracle procedures that are used to update this table, somewhat unimaginatively named "'''UPDATE_ACTIVE_REGION'''", "'''UPDATE_ACTIVE_REGION2'''" and "'''UPDATE_ACTIVE_REGION3'''". To extract this into the postgres database, we use ora2pg to extract the SQL to create the table, then the SQL to populate the table using '''INSERT''' statements (which can be a large SQL file) and finally the three procedures. This can be done in a bash script as follows : {{{ #!/bin/bash echo Dumping table scheme ora2pg --debug --allow NOAA_ACTIVE_REGION_LOCATIONS --out activeRegionSchema.psql --type TABLE echo echo Dumping table data ora2pg --debug --allow NOAA_ACTIVE_REGION_LOCATIONS --out activeRegionData.psql --type INSERT echo echo Dumping procedures ora2pg --debug --plsql --allow UPDATE_ACTIVE_REGION --out activeRegionProc1.psql --type PROCEDURE ora2pg --debug --plsql --allow UPDATE_ACTIVE_REGION2 --out activeRegionProc2.psql --type PROCEDURE ora2pg --debug --plsql --allow UPDATE_ACTIVE_REGION3 --out activeRegionProc3.psql --type PROCEDURE echo exit }}} This writes a set of five files with the .psql extension. We can then read them into postgres with a bash script as follows : {{{ #!/bin/bash dropdb export_test createdb export_test createlang -d export_test plpgsql cat activeRegionSchema.psql | psql export_test cat activeRegionData.psql | psql export_test cat activeRegionProc1.psql | psql export_test cat activeRegionProc2.psql | psql export_test cat activeRegionProc3.psql | psql export_test exit }}} This creates the table in postgres. Any cron-driven scripts that were updating that table in realtime - "spiders" - must also be modified so that they use the postgres database rather than the Oracle one. It is also possible, for '''INSERT''' SQL exports, to skip the generation of intermediate .psql files used in the examples above and connect directly to a postgres database, although I have not tried this. This involves setting the '''PG_DSN''', '''PG_USER''' and '''PG_PWD''' settings in the ora2pg.conf file. '''NOTE''' that you can use an alternate configuration file with the '''--conf''' argument on the command line, so you can try this without disturbing the default configuration. ora2pg has a man page and is documented somewhat at its web site. Help is also available by running '''ora2pg --help''' and the file /etc/ora2pg/ora2pg.conf is pretty well documented.