I recently needed to gather multiple columns from multiple rows into one string, then split them out again. JSON_TABLE
did both splits at the same time!
The situation
I want to find the shortest “foreign key” path between two tables in the OBE
schema: OEHR_ORDER_ITEMS
and OEHR_DEPARTMENTS
. I also want the intermediate tables and the constraints used to connect the tables. As a starting point for today, I’ll create a little table with the data for the tables in the OBE
schema that start with ‘OEHR’:
drop table t purge; create table t as with search_for (owner, table_name) as ( select 'OBE', 'OEHR_DEPARTMENTS' from dual union all select 'OBE', 'OEHR_ORDER_ITEMS' from dual ) , pur as ( select OWNER, CONSTRAINT_NAME, TABLE_NAME, R_OWNER, R_CONSTRAINT_NAME from dba_constraints where constraint_type in('P','U','R') and owner = 'OBE' and table_name like 'OEHR%' ) , relations as ( select a.table_name, a.constraint_name, b.table_name r_table_name, a.r_constraint_name from pur a join pur b on (a.R_OWNER, a.R_CONSTRAINT_NAME) = ((b.OWNER, b.CONSTRAINT_NAME)) and (a.OWNER, a.table_name) != ((b.OWNER, b.table_name)) ) select * from relations;
Now I’ll use a hierarchical query to find paths from one table to the other:
select rownum rn, level lvl, TABLE_NAME, CONSTRAINT_NAME, R_TABLE_NAME, R_CONSTRAINT_NAME from t a start with table_name = 'OEHR_ORDER_ITEMS' connect by nocycle table_name = prior r_table_name and table_name != 'OEHR_DEPARTMENTS';
RN | LVL | TABLE_NAME | CONSTRAINT_NAME | R_TABLE_NAME | R_CONSTRAINT_NAME |
---|---|---|---|---|---|
1 | 1 | OEHR_ORDER_ITEMS | OEHR_ORDER_ITEMS_ORDER_ID_FK | OEHR_ORDERS | OEHR_ORDER_PK |
2 | 2 | OEHR_ORDERS | OEHR_ORDERS_CUSTOMER_ID_FK | OEHR_CUSTOMERS | OEHR_CUSTOMERS_PK |
3 | 3 | OEHR_CUSTOMERS | OEHR_CUST_ACCT_MGR_FK | OEHR_EMPLOYEES | OEHR_EMP_EMP_ID_PK |
4 | 4 | OEHR_EMPLOYEES | OEHR_EMP_DEPT_FK | OEHR_DEPARTMENTS | OEHR_DEPT_ID_PK |
5 | 4 | OEHR_EMPLOYEES | OEHR_EMP_JOB_FK | OEHR_JOBS | OEHR_JOB_ID_PK |
6 | 2 | OEHR_ORDERS | OEHR_ORDERS_SALES_REP_FK | OEHR_EMPLOYEES | OEHR_EMP_EMP_ID_PK |
7 | 3 | OEHR_EMPLOYEES | OEHR_EMP_DEPT_FK | OEHR_DEPARTMENTS | OEHR_DEPT_ID_PK |
8 | 3 | OEHR_EMPLOYEES | OEHR_EMP_JOB_FK | OEHR_JOBS | OEHR_JOB_ID_PK |
9 | 1 | OEHR_ORDER_ITEMS | OEHR_ORDER_ITEMS_PRODUCT_ID_FK | OEHR_PRODUCT_INFORMATION | OEHR_PRODUCT_INFO_PK |
Just looking at the data myself, I can see that the shortest path will include rows 1, 6 and 7. Rows 1 through 4 will work too, but the path is longer.
Getting the data for the whole path
The problem is, how can I identify those three rows I want and just query them? The only way I see is to use SYS_CONNECT_BY_PATH
(or the equivalent with a CTE) to string together all the data from all the rows I want – which means I have to split that data out again later.
To simplify what follows, I’m going to manually delete the rows that don’t matter:
delete from t where table_name not in ('OEHR_ORDER_ITEMS','OEHR_ORDERS','OEHR_EMPLOYEES'); delete from t where r_table_name not in ('OEHR_DEPARTMENTS','OEHR_ORDERS','OEHR_EMPLOYEES'); commit;
The rows that are left contain the data I want:
TABLE_NAME | CONSTRAINT_NAME | R_TABLE_NAME | R_CONSTRAINT_NAME |
---|---|---|---|
OEHR_ORDER_ITEMS | OEHR_ORDER_ITEMS_ORDER_ID_FK | OEHR_ORDERS | OEHR_ORDER_PK |
OEHR_ORDERS | OEHR_ORDERS_SALES_REP_FK | OEHR_EMPLOYEES | OEHR_EMP_EMP_ID_PK |
OEHR_EMPLOYEES | OEHR_EMP_DEPT_FK | OEHR_DEPARTMENTS | OEHR_DEPT_ID_PK |
Now I’ll string that data together:
select sys_connect_by_path( table_name ||'/'|| constraint_name||'/'|| r_table_name||'/'||r_constraint_name ,'<' ) path from t a where r_table_name = 'OEHR_DEPARTMENTS' start with table_name = 'OEHR_ORDER_ITEMS' connect by table_name = prior r_table_name; <OEHR_ORDER_ITEMS/OEHR_ORDER_ITEMS_ORDER_ID_FK/OEHR_ORDERS/OEHR_ORDER_PK <OEHR_ORDERS/OEHR_ORDERS_SALES_REP_FK/OEHR_EMPLOYEES/OEHR_EMP_EMP_ID_PK <OEHR_EMPLOYEES/OEHR_EMP_DEPT_FK/OEHR_DEPARTMENTS/OEHR_DEPT_ID_PK
I split the data into three lines for legibility. Note that I have to use different literals in the two parameters of SYS_CONNECT_BY_PATH
, otherwise I get an exception.
Splitting out the data
Now we’re finally getting to the point. I can use JSON_ARRAY to string together the data for one row. Once I have all the data for all the rows, I can turn that into an array of arrays. Finally, JSON_TABLE will split everything out for me. Let’s start with JSON_ARRAY:
select sys_connect_by_path( json_array(table_name, constraint_name, r_table_name, r_constraint_name) , ' ' ) path from t a where r_table_name = 'OEHR_DEPARTMENTS' start with table_name = 'OEHR_ORDER_ITEMS' connect by table_name = prior r_table_name; ["OEHR_ORDER_ITEMS","OEHR_ORDER_ITEMS_ORDER_ID_FK","OEHR_ORDERS","OEHR_ORDER_PK"] ["OEHR_ORDERS","OEHR_ORDERS_SALES_REP_FK","OEHR_EMPLOYEES","OEHR_EMP_EMP_ID_PK"] ["OEHR_EMPLOYEES","OEHR_EMP_DEPT_FK","OEHR_DEPARTMENTS","OEHR_DEPT_ID_PK"]
Now I’ll turn that data into an array of arrays:
with data as ( select sys_connect_by_path( json_array(table_name, constraint_name, r_table_name, r_constraint_name) , '/' ) path from t a where r_table_name = 'OEHR_DEPARTMENTS' start with table_name = 'OEHR_ORDER_ITEMS' connect by table_name = prior r_table_name ) select '['||substr(replace(path,'/',','),2) || ']' path from data;
[ [ "OEHR_ORDER_ITEMS", "OEHR_ORDER_ITEMS_ORDER_ID_FK", "OEHR_ORDERS", "OEHR_ORDER_PK" ], [ "OEHR_ORDERS", "OEHR_ORDERS_SALES_REP_FK", "OEHR_EMPLOYEES", "OEHR_EMP_EMP_ID_PK" ], [ "OEHR_EMPLOYEES", "OEHR_EMP_DEPT_FK", "OEHR_DEPARTMENTS", "OEHR_DEPT_ID_PK" ] ]
Now JSON_TABLE will break down the top array into rows, just by putting ‘$[*]’ in the second parameter. The COLUMNS will do the rest, splitting the data in the inner arrays into different columns.
with data as ( select sys_connect_by_path( json_array(table_name, constraint_name, r_table_name, r_constraint_name) , '/' ) path from t a where r_table_name = 'OEHR_DEPARTMENTS' start with table_name = 'OEHR_ORDER_ITEMS' connect by table_name = prior r_table_name ) select a.* from data, json_table( '['||substr(replace(path,'/',','),2) || ']', '$[*]' columns( rn for ordinality, table_name varchar2(128) path '$[0]', constraint_name varchar2(128) path '$[1]', r_table_name varchar2(128) path '$[2]', r_constraint_name varchar2(128) path '$[3]' ) ) a;
RN | TABLE_NAME | CONSTRAINT_NAME | R_TABLE_NAME | R_CONSTRAINT_NAME |
---|---|---|---|---|
1 | OEHR_ORDER_ITEMS | OEHR_ORDER_ITEMS_ORDER_ID_FK | OEHR_ORDERS | OEHR_ORDER_PK |
2 | OEHR_ORDERS | OEHR_ORDERS_SALES_REP_FK | OEHR_EMPLOYEES | OEHR_EMP_EMP_ID_PK |
3 | OEHR_EMPLOYEES | OEHR_EMP_DEPT_FK | OEHR_DEPARTMENTS | OEHR_DEPT_ID_PK |
Whew!
P.S. If you think a CTE solution is cleaner:
with data( TABLE_NAME, R_TABLE_NAME, path ) as ( select n.TABLE_NAME, n.R_TABLE_NAME, json_array(n.TABLE_NAME, n.CONSTRAINT_NAME, n.R_TABLE_NAME, n.R_CONSTRAINT_NAME) from t n where table_name = 'OEHR_ORDER_ITEMS' union all select n.TABLE_NAME, n.R_TABLE_NAME, o.path || ',' || json_array(n.TABLE_NAME, n.CONSTRAINT_NAME, n.R_TABLE_NAME, n.R_CONSTRAINT_NAME) from t n join data o on (n.table_name = o.r_table_name) and n.table_name != 'OEHR_DEPARTMENTS' ) select b.* from data a, json_table( '['|| path || ']', '$[*]' columns( rn for ordinality, table_name varchar2(128) path '$[0]', constraint_name varchar2(128) path '$[1]', r_table_name varchar2(128) path '$[2]', r_constraint_name varchar2(128) path '$[3]' ) ) b where a.r_table_name = 'OEHR_DEPARTMENTS';