The first Oracle to PostgreSQL trial

2017-12-24 - Progress - Tony Finch

I have used ora2pg to do a quick export of the IP Register database from Oracle to PostgreSQL. This export included an automatic conversion of the table structure, and the contents of the tables. It did not include the more interesting parts of the schema such as the views, triggers, and stored procedures.

Oracle Instant Client

Before installing ora2pg, I had to install the Oracle client libraries. These are not available in Debian, but Debian's ora2pg package is set up to work with the following installation process.

  • Get the Oracle Instant Client RPMs

    from Oracle's web site. This is a free download, but you will need to create an Oracle account.

    I got the basiclite RPM - it's about half the size of the basic RPM and I didn't need full i18n. I also got the sqlplus RPM so I can talk to Jackdaw directly from my dev VMs.

    The libdbd-oracle-perl package in Debian 9 (Stretch) requires Oracle Instant Client 12.1. I matched the version installed on Jackdaw, which is 12.1.0.2.0.

  • Convert the RPMs to debs (I did this on my workstation)

    $ fakeroot alien oracle-instantclient12.1-basiclite-12.1.0.2.0-1.x86_64.rpm
    $ fakeroot alien oracle-instantclient12.1-sqlplus-12.1.0.2.0-1.x86_64.rpm
    
  • Those packages can be installed on the dev VM, with libaio1 (which is required by Oracle Instant Client but does not appear in the package dependencies), and libdbd-oracle-perl and ora2pg.

  • sqlplus needs a wrapper script that sets environment variables so that it can find its libraries and configuration files. After some debugging I foud that although the documentation claims that glogin.sql is loaded from $ORACLE_HOME/sqlplus/admin/ in fact it is loaded from $SQLPATH.

    To configure connections to Jackdaw, I copied tnsnames.ora and sqlnet.ora from ent.

Running ora2pg

By default, ora2pg exports the table definitions of the schema we are interested in (i.e. ipreg). For the real conversion I intend to port the schema manually, but ora2pg's automatic conversion is handy for a quick trial, and it will probably be a useful guide to translating the data type names.

The commands I ran were:

$ ora2pg --debug
$ mv output.sql tables.sql
$ ora2pg --debug --type copy
$ mv output.sql rows.sql

$ table-fixup.pl <tables.sql >fixed.sql
$ psql -1 -f functions.sql
$ psql -1 -f fixed.sql
$ psql -1 -f rows.sql

The fixup script and SQL functions were necessary to fill in some gaps in ora2pg's conversion, detailed below.

Compatibility problems

  • Oracle treats the empty string as equivalent to NULL but PostgreSQL does not.

    This affects constraints on the lan and mzone tables.

  • The Oracle substr function supports negative offsets which index from the right end of the string, but PostgreSQL does not.

    This affects subdomain constraints on the unique_name, maildom, and service tables. These constraints should be replaced by function calls rather than copies.

  • The ipreg schema uses raw columns for IP addresses and prefixes; ora2pg converted these to bytea.

    The v6_prefix table has a constraint that relies on implicit conversion from raw to a hex string. PostgreSQL is stricter about types, so this expression needs to work on bytea directly.

  • There are a number of cases where ora2pg represented named unique constraints as unnamed constraints with named indexes. This unnecessarily exposes an implementation detail.

  • There were a number of Oracle functions which PostgreSQL doesn't support (even with orafce), so I implemented them in the functions.sql file.

    • regexp_instr()
    • regexp_like()
    • vzise()

Other gotchas

  • The mzone_co, areader, and registrar tables reference the pers table in the jdawadm schema. These foreign key constraints need to be removed.

  • There is a weird bug in ora2pg which mangles the regex [[:cntrl:]] into [[cntrl:]]

    This is used several times in the ipreg schema to ensure that various fields are plain text. The regex is correct in the schema source and in the ALL_CONSTRAINTS table on Jackdaw, which is why I think it is an ora2pg bug.

  • There's another weird bug where a regexp_like(string,regex,flags) expression is converted to string ~ regex, flags which is nonsense.

    There are other calls to regexp_like() in the schema which do not get mangled in this way, but they have non-trivial string expressions whereas the broken one just has a column name.

Performance

The export of the data from Oracle and the import to PostgreSQL took an uncomfortably long time. The SQL dump file is only 2GB so it should be possible to speed up the import considerably.