The GROUP
BY
method for comparing table data just turned 18. To celebrate, let’s make it a macro.
My second ever blog post described a method, popularised by Tom Kyte, for comparing the contents of tables or queries. As part of my COMP_SYNC package, the COMPARE_SQL
function returns a SELECT
statement that uses that method. With a macro, instead of returning the statement I can just execute it and compare tables, views and / or named subqueries.
[UPDATE 2022-02-25 : there is now an optional parameter to indicate a primary or unique key.]
- I call the two data sources “rowsets”: one is “old”, the other is “new”.
- Optional excluded columns are excluded from the comparison and the output.
- Optional key columns must identify all rows uniquely.
- With key columns, output column “Z#_OP” indicates the type of change:
- I: Insert into old (present in new, not old)
- D: Delete from old (present in old, not new)
- U: Updated data (as present in new)
- O: Old data (as present in old)
- Without key columns, the output contains only ‘D’ and ‘I’ rows, and column “Z#_CNT” shows the number of rows to be deleted or inserted.
- With key columns, output column “Z#_OP” indicates the type of change:
- The output contains the values used for the comparison, which may differ from the actual values of the data if the data types do not allow direct comparison (LOBs for example).
The code makes heavy use of my SQM_UTIL package to configure the SELECT
template. Depending on the data type of each column, an expression is applied to allow comparisons. For example, LOB column content is replaced by a hash and user-defined types are replaced by JSON.
create or replace function compare_rowsets(
p_old_table in dbms_tf.table_t,
p_new_table in dbms_tf.table_t,
p_key_cols in dbms_tf.columns_t default null,
p_exclude_cols in dbms_tf.columns_t default null
) return clob sql_macro is
/*
Compares tables, views or named subqueries; one is "old", one is "new".
Optional excluded columns are excluded from the comparison and the output.
The output contains the values used for the comparison, which may differ
from the actual values of the data if the data types do not allow
direct comparison (LOBs for example).
Column "Z#_OP" indicates the type of change:
- I: Insert into old (present in new, not old)
- D: Delete from old (present in old, not new)
- U: Updated data (as present in new)
- O: Old data (as present in old)
If present, key column(s) must identify all rows uniquely.
Without key columns, the output contains only 'D' and 'I' rows,
and column "Z#_CNT" shows the number of rows to be deleted or inserted.
*/
l_col_column_names_old long;
l_col_comparables_old long;
l_col_comparables_new long;
l_col_keys long;
l_sql clob;
begin
sqm_util.col_column_names(p_old_table, l_col_column_names_old, p_exclude_cols);
sqm_util.col_comparables(p_old_table, l_col_comparables_old, p_exclude_cols);
sqm_util.col_comparables(p_new_table, l_col_comparables_new, p_exclude_cols);
if p_key_cols is null then
l_sql :=
'select /*+ qb_name(COMPARE) */
decode(sign(sum(Z#_NEW_CNT)), 1, ''I'', ''D'') Z#_OP,
abs(sum(Z#_NEW_CNT)) Z#_CNT,
'|| l_col_column_names_old ||'
FROM (
select /*+ qb_name(old) */
'|| l_col_comparables_old ||'
, -1 Z#_NEW_CNT
from p_old_table O
union all
select /*+ qb_name(new) */
'|| l_col_comparables_new ||'
, 1 Z#_NEW_CNT
from p_new_table N
)
group by
'|| l_col_column_names_old ||'
having sum(Z#_NEW_CNT) != 0';
else
sqm_util.list_columns(p_key_cols, l_col_keys);
l_sql :=
'select /*+ qb_name(COMPARE) */
case count(*) over(partition by
'|| l_col_keys ||'
) - Z#_NEW_CNT
when 0 then ''I''
when 1 then ''U''
when 2 then ''D''
when 3 then ''O''
end Z#_OP,
'|| l_col_column_names_old ||'
FROM (
select
'|| l_col_column_names_old ||',
sum(Z#_NEW_CNT) Z#_NEW_CNT
FROM (
select /*+ qb_name(old) */
'|| l_col_comparables_old ||',
-1 Z#_NEW_CNT
from p_old_table O
union all
select /*+ qb_name(new) */
'|| l_col_comparables_new ||',
1 Z#_NEW_CNT
from p_new_table N
)
group by
'|| l_col_column_names_old ||'
having sum(Z#_NEW_CNT) != 0
)';
end if;
--dbms_output.put_line(l_sql);
return l_sql;
end compare_rowsets;
/
drop table emp2 purge;
create table emp2 as select * from emp;
update emp2 set ename = ename||' KONG' where rownum = 1;
insert into emp2
select EMPNO+1000, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO
from emp2 where rownum <= 4 and job != 'PRESIDENT';
select * from compare_rowsets(
emp, emp2,
p_exclude_cols => columns(comm)
)
order by empno, z#_op;
Z#_OP | Z#_CNT | EMPNO | ENAME | JOB | MGR | HIREDATE | SAL | DEPTNO |
---|---|---|---|---|---|---|---|---|
D | 1 | 7839 | KING | PRESIDENT | 1981-11-17 00:00:00 | 5000 | 10 | |
I | 1 | 7839 | KING KONG | PRESIDENT | 1981-11-17 00:00:00 | 5000 | 10 | |
I | 1 | 8566 | JONES | MANAGER | 7839 | 1981-04-02 00:00:00 | 2975 | 20 |
I | 1 | 8698 | BLAKE | MANAGER | 7839 | 1981-05-01 00:00:00 | 2850 | 30 |
I | 1 | 8782 | CLARK | MANAGER | 7839 | 1981-06-09 00:00:00 | 2450 | 10 |
I | 1 | 8788 | SCOTT | ANALYST | 7566 | 1987-04-19 00:00:00 | 3000 | 20 |
Posts on SQL table macros
- SQL table macros 1: a moving target?
- SQL table macros 2: “polymorphic” is not “dynamic”
- SQL table macros 3: parameter strangeness
- SQL table macros 4: distribute into equal parts
- SQL table macros 5: dynamic but dangerous!
- SQL table macros 6: SQM_UTIL package
- SQL table macros 7: Select excluding
- SQL table macros 8: Print table
- SQL table macros 9: just pivot!
- SQL table macros 10: JUST_PIVOT examples
- SQL table macros 11: JUST_PIVOT for 21c (and 19c)
- SQL table macros 12: compare tables or queries
- SQL table macros 13: compare with primary key