10 Cool things about the COMPARE_SYNC package

@thatjeffsmith recently recommended an article about making your blog more popular. The article said “lists of 10 things” were great ways to get more readers. Hey, if that’s all it takes…

COMPARE_SYNC is a package that generates SQL to compare data or synchronize tables. Here are 10 good reasons to use it.

1. Efficiently compare tables, views or query results

The COMPARE SQL uses Tom Kyte’s GROUP BY method. When comparing two tables, there is only one full scan of each table. The “new” source can be a table, view or query in parentheses. The “old” source must be a table or a view.

2. Fully synchronize an “old” table with a “new” source

The SYNC SQL compares “old” and “new”, then applies the differences to the “old” table using MERGE. The end result is that the “old” table is identical to the “new” source.

3. UPSERT synchronize the “old” table from a “new” source

If you want to apply changes to the “old” table, but without doing any DELETEs, then MERGE can compare and change with only one full scan of each table!

4. Apply Change Data Capture (CDC) input to the “old” table

All you need is a column in the input that contains ‘D’ when the row is to be deleted. One MERGE statement will do the rest.

5. Synchronize data with or without primary keys

With primary keys, the MERGE statement combines updates, deletes and inserts to do the job. Without primary keys, I generate a MERGE statement that uses only deletes and inserts. (The UPSERT and CDC methods require primary keys.)

6. Do no unnecessary changes

All of the above methods change rows only when they need to be changed! If the data is already in sync, you will see the message “0 rows merged”.

7. Correctly handle remote data, virtual columns and invisible columns

Yes, the “old” table can be remote: you can “push” the changes to the remote database. The generated SQL automatically excludes virtual columns and includes invisible columns, so all real data is synchronized.

8. Customize the SQL to be generated

  • You can explicitly list the columns to be compared or synchronized.
  • You can explicitly list the “primary key” columns, if the “old” target does not have a primary key but there is a unique, non-null column or column list you can use.

9. Avoid SQL injection

The package runs with the privileges of the current user, so there is no risk of “privilege escalation”. Besides, the package doesn’t do anything to your data! It just generates SQL for you to review and execute if you choose.

10. Tested over 27000 times!

I generated code to call the package with every possible combination of input parameters, then I called the package, and then I ran the SQL that was generated. In case you’re wondering, yes I did find (and correct) some bugs.

I hope this package proves useful to some of you out there.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s