Since late last year, Microsoft has made the Attunity connectors to Oracle databases available to Enterprise Edition customers. We all recognized right away that these connectors were faster than the pre-existing options from either Microsoft or Oracle, when moving data into or out of an Oracle database. It wasn’t immediately obvious what speeds we could expect from the connectors, so I did some experimenting to see. This posting summarizes some findings from those experiments done earlier this year, but rather than report all the details I will then zero in on the key lessons and one big surprise that came out of the work.
Before getting in to my findings, let me give a little anecdote heard from a colleague: There is at least one SSIS customer that now uses SSIS and the Attunity connectors to move data from an Oracle database to an Oracle database , because SSIS with Attunity outperforms any of the Oracle options! While I can’t corroborate that, the information to follow is from my own measurements.
To do this work, I used two machines each with 24 cores (64-bit, 2.4 GHz), one for SSIS and one for Oracle. The machines were practically an embarrassment of riches for this simple benchmark. The SSIS machine had flat files to read (when loading data into Oracle) or write (when extracting data from Oracle). The SSIS packages were super simple, just a flat file source and an Oracle destination, or an Oracle source and a flat file destination. The data was 75 million rows, 133 bytes per row, of the LINEITEM table from the TPC-H benchmark, as generated by the DBGEN utility.
Some basic findings (remember, you mileage will vary):
The idea that performance with string data would be so different from performance with natural data types was a big surprise. The difference was especially pronounced when extracting data from Oracle. Now let’s face it, we would prefer to see data extracted from Oracle and placed in SQL Server databases! Given the big speed disparity and the fact that most real-world data needs to be in natural data types, I wondered if the same thing would happen if data was cast to string types in the query that SSIS issues against Oracle. So instead of having SSIS simply read the table, I gave it this query to run:
select
TO_CHAR(L_SHIPDATE),
TO_CHAR(L_ORDERKEY),
TO_CHAR(L_DISCOUNT),
TO_CHAR(L_EXTENDEDPRICE),
TO_CHAR(L_SUPPKEY),
TO_CHAR(L_QUANTITY),
L_RETURNFLAG,
TO_CHAR(L_PARTKEY),
L_LINESTATUS,
TO_CHAR(L_TAX),
TO_CHAR(L_COMMITDATE),
TO_CHAR(L_RECEIPTDATE),
L_SHIPMODE,
TO_CHAR(L_LINENUMBER),
L_SHIPINSTRUCT,
L_COMMENT
from ATTUSER.LINEITEM
Then before inserting the data into SQL Server using the SQL Server destination, I put in a data conversion task to get all the data into the correct types.
At this point you must be thinking, “Surely converting the data twice can’t be the fastest way!” Well, here are the results: The first run below read the mixed data types using the Attunity Oracle source with default settings, converted to SQL Server types, then wrote to the SQL Server destination. The second run was like the first, with the addition of setting the batch size larger. The third run was like the first, but on reading from Oracle all the columns were converted to text as discussed above. The last test was like the third, with the addition of setting the batch size larger. Using the string conversion and larger batches, the run was over two times faster than the obvious out-of-the-box configuration.
Overall, the Attunity connectors for Oracle really were fast, as expected. In doing this work a few lessons turned up that hopefully help you get optimal performance.
- Len Wyatt
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.