In my “Advanced Row Pattern Matching” presentation, I demonstrate using MATCH_RECOGNIZE to compare tables. Kim Berg Hansen asked me to compare this technique with others. I did some quick tests and here are the results with some comments.
|Group by (HASH)||1|
|Group by (SORT)||1.4|
The “Full join” technique only works when we have a primary or unique key that is shared by both tables. I prefer the GROUP BY technique popularized by Tom Kyte, even though it may be a bit slower. When testing, I noticed that the HASH GROUP BY algorithm performs better than SORT GROUP BY, as others have written.
If either of the tables contains duplicate rows (which may happen if we don’t compare all of the columns, or if there is no primary key), then GROUP BY will output one row. This may be a problem if we want data (such as the ROWID) that was not included in the comparison. In that case, we could use analytic functions or the MATCH_RECOGNIZE clause to compare and output all the rows and columns of interest. As you can see, the analytic function is more than twice as slow but it easily beats the MATCH_RECOGNIZE clause.
I use the output from table comparisons to synchronize the tables, so capturing the ROWID is important to me even when a primary or unique key is not available. For that use case, I will prefer analytic functions from now on.