Split a string again and again

I recently needed to gather multiple columns from multiple rows into one string, then split them out again. JSON_TABLE did both splits at the same time!

The situation

I want to find the shortest “foreign key” path between two tables in the OBE schema: OEHR_ORDER_ITEMS and OEHR_DEPARTMENTS. I also want the intermediate tables and the constraints used to connect the tables. As a starting point for today, I’ll create a little table with the data for the tables in the OBE schema that start with ‘OEHR’:

drop table t purge;
create table t as
with search_for (owner, table_name) as (
  select 'OBE', 'OEHR_DEPARTMENTS' from dual union all
  select 'OBE', 'OEHR_ORDER_ITEMS' from dual
)
, pur as (
  select OWNER, CONSTRAINT_NAME, TABLE_NAME,
    R_OWNER, R_CONSTRAINT_NAME
  from dba_constraints
  where constraint_type in('P','U','R')
  and owner = 'OBE'
  and table_name like 'OEHR%'
)
, relations as (
  select a.table_name, a.constraint_name,
    b.table_name r_table_name, a.r_constraint_name
  from pur a join pur b
  on (a.R_OWNER, a.R_CONSTRAINT_NAME) = ((b.OWNER, b.CONSTRAINT_NAME))
  and (a.OWNER, a.table_name) != ((b.OWNER, b.table_name))
)
select * from relations;

Now I’ll use a hierarchical query to find paths from one table to the other:

select rownum rn, level lvl, TABLE_NAME, CONSTRAINT_NAME, R_TABLE_NAME, R_CONSTRAINT_NAME
from t a
start with table_name = 'OEHR_ORDER_ITEMS'
connect by nocycle table_name = prior r_table_name
and table_name != 'OEHR_DEPARTMENTS';
RN LVL TABLE_NAME CONSTRAINT_NAME R_TABLE_NAME R_CONSTRAINT_NAME
1 1 OEHR_ORDER_ITEMS OEHR_ORDER_ITEMS_ORDER_ID_FK OEHR_ORDERS OEHR_ORDER_PK
2 2 OEHR_ORDERS OEHR_ORDERS_CUSTOMER_ID_FK OEHR_CUSTOMERS OEHR_CUSTOMERS_PK
3 3 OEHR_CUSTOMERS OEHR_CUST_ACCT_MGR_FK OEHR_EMPLOYEES OEHR_EMP_EMP_ID_PK
4 4 OEHR_EMPLOYEES OEHR_EMP_DEPT_FK OEHR_DEPARTMENTS OEHR_DEPT_ID_PK
5 4 OEHR_EMPLOYEES OEHR_EMP_JOB_FK OEHR_JOBS OEHR_JOB_ID_PK
6 2 OEHR_ORDERS OEHR_ORDERS_SALES_REP_FK OEHR_EMPLOYEES OEHR_EMP_EMP_ID_PK
7 3 OEHR_EMPLOYEES OEHR_EMP_DEPT_FK OEHR_DEPARTMENTS OEHR_DEPT_ID_PK
8 3 OEHR_EMPLOYEES OEHR_EMP_JOB_FK OEHR_JOBS OEHR_JOB_ID_PK
9 1 OEHR_ORDER_ITEMS OEHR_ORDER_ITEMS_PRODUCT_ID_FK OEHR_PRODUCT_INFORMATION OEHR_PRODUCT_INFO_PK

 

Just looking at the data myself, I can see that the shortest path will include rows 1, 6 and 7. Rows 1 through 4 will work too, but the path is longer.

Getting the data for the whole path

The problem is, how can I identify those three rows I want and just query them? The only way I see is to use SYS_CONNECT_BY_PATH (or the equivalent with a CTE) to string together all the data from all the rows I want – which means I have to split that data out again later.

To simplify what follows, I’m going to manually delete the rows that don’t matter:

delete from t where table_name not in ('OEHR_ORDER_ITEMS','OEHR_ORDERS','OEHR_EMPLOYEES');
delete from t where r_table_name not in ('OEHR_DEPARTMENTS','OEHR_ORDERS','OEHR_EMPLOYEES');
commit;

The rows that are left contain the data I want:

TABLE_NAME CONSTRAINT_NAME R_TABLE_NAME R_CONSTRAINT_NAME
OEHR_ORDER_ITEMS OEHR_ORDER_ITEMS_ORDER_ID_FK OEHR_ORDERS OEHR_ORDER_PK
OEHR_ORDERS OEHR_ORDERS_SALES_REP_FK OEHR_EMPLOYEES OEHR_EMP_EMP_ID_PK
OEHR_EMPLOYEES OEHR_EMP_DEPT_FK OEHR_DEPARTMENTS OEHR_DEPT_ID_PK

 

Now I’ll string that data together:

select sys_connect_by_path(
  table_name  ||'/'||  constraint_name||'/'||
  r_table_name||'/'||r_constraint_name
  ,'<'
) path
from t a
where r_table_name = 'OEHR_DEPARTMENTS'
start with table_name = 'OEHR_ORDER_ITEMS'
connect by table_name = prior r_table_name;

<OEHR_ORDER_ITEMS/OEHR_ORDER_ITEMS_ORDER_ID_FK/OEHR_ORDERS/OEHR_ORDER_PK
<OEHR_ORDERS/OEHR_ORDERS_SALES_REP_FK/OEHR_EMPLOYEES/OEHR_EMP_EMP_ID_PK
<OEHR_EMPLOYEES/OEHR_EMP_DEPT_FK/OEHR_DEPARTMENTS/OEHR_DEPT_ID_PK

I split the data into three lines for legibility. Note that I have to use different literals in the two parameters of SYS_CONNECT_BY_PATH, otherwise I get an exception.

Splitting out the data

Now we’re finally getting to the point. I can use JSON_ARRAY to string together the data for one row. Once I have all the data for all the rows, I can turn that into an array of arrays. Finally, JSON_TABLE will split everything out for me. Let’s start with JSON_ARRAY:

select sys_connect_by_path(
  json_array(table_name, constraint_name, r_table_name, r_constraint_name)
  , '
'
) path
from t a
where r_table_name = 'OEHR_DEPARTMENTS'
start with table_name = 'OEHR_ORDER_ITEMS'
connect by table_name = prior r_table_name;

["OEHR_ORDER_ITEMS","OEHR_ORDER_ITEMS_ORDER_ID_FK","OEHR_ORDERS","OEHR_ORDER_PK"]
["OEHR_ORDERS","OEHR_ORDERS_SALES_REP_FK","OEHR_EMPLOYEES","OEHR_EMP_EMP_ID_PK"]
["OEHR_EMPLOYEES","OEHR_EMP_DEPT_FK","OEHR_DEPARTMENTS","OEHR_DEPT_ID_PK"]

Now I’ll turn that data into an array of arrays:

with data as (
  select sys_connect_by_path(
    json_array(table_name, constraint_name, r_table_name, r_constraint_name)
    , '/'
  ) path
  from t a
  where r_table_name = 'OEHR_DEPARTMENTS'
  start with table_name = 'OEHR_ORDER_ITEMS'
  connect by table_name = prior r_table_name
)
select '['||substr(replace(path,'/',','),2) || ']' path
from data;
[
  [
    "OEHR_ORDER_ITEMS",
    "OEHR_ORDER_ITEMS_ORDER_ID_FK",
    "OEHR_ORDERS",
    "OEHR_ORDER_PK"
  ],
  [
    "OEHR_ORDERS",
    "OEHR_ORDERS_SALES_REP_FK",
    "OEHR_EMPLOYEES",
    "OEHR_EMP_EMP_ID_PK"
  ],
  [
    "OEHR_EMPLOYEES",
    "OEHR_EMP_DEPT_FK",
    "OEHR_DEPARTMENTS",
    "OEHR_DEPT_ID_PK"
  ]
]

Now JSON_TABLE will break down the top array into rows, just by putting ‘$[*]’ in the second parameter. The COLUMNS will do the rest, splitting the data in the inner arrays into different columns.

with data as (
  select sys_connect_by_path(
    json_array(table_name, constraint_name, r_table_name, r_constraint_name)
    , '/'
  ) path
  from t a
  where r_table_name = 'OEHR_DEPARTMENTS'
  start with table_name = 'OEHR_ORDER_ITEMS'
  connect by table_name = prior r_table_name
)
select a.* from data, json_table(
  '['||substr(replace(path,'/',','),2) || ']',
  '$[*]' columns(
    rn for ordinality,
    table_name varchar2(128) path '$[0]',
    constraint_name varchar2(128) path '$[1]',
    r_table_name varchar2(128) path '$[2]',
    r_constraint_name varchar2(128) path '$[3]'
  )
) a;
RN TABLE_NAME CONSTRAINT_NAME R_TABLE_NAME R_CONSTRAINT_NAME
1 OEHR_ORDER_ITEMS OEHR_ORDER_ITEMS_ORDER_ID_FK OEHR_ORDERS OEHR_ORDER_PK
2 OEHR_ORDERS OEHR_ORDERS_SALES_REP_FK OEHR_EMPLOYEES OEHR_EMP_EMP_ID_PK
3 OEHR_EMPLOYEES OEHR_EMP_DEPT_FK OEHR_DEPARTMENTS OEHR_DEPT_ID_PK

 

Whew!

P.S. If you think a CTE solution is cleaner:

with data( TABLE_NAME, R_TABLE_NAME, path ) as (
  select n.TABLE_NAME, n.R_TABLE_NAME,
    json_array(n.TABLE_NAME, n.CONSTRAINT_NAME, n.R_TABLE_NAME, n.R_CONSTRAINT_NAME)
  from t n
  where table_name = 'OEHR_ORDER_ITEMS'
  union all
  select n.TABLE_NAME, n.R_TABLE_NAME,
    o.path || ',' ||
    json_array(n.TABLE_NAME, n.CONSTRAINT_NAME, n.R_TABLE_NAME, n.R_CONSTRAINT_NAME)
  from t n
  join data o on (n.table_name = o.r_table_name)
    and n.table_name != 'OEHR_DEPARTMENTS'
)
select b.* from data a, json_table(
  '['|| path || ']',
  '$[*]' columns(
    rn for ordinality,
    table_name varchar2(128) path '$[0]',
    constraint_name varchar2(128) path '$[1]',
    r_table_name varchar2(128) path '$[2]',
    r_constraint_name varchar2(128) path '$[3]'
  )
) b
where a.r_table_name = 'OEHR_DEPARTMENTS';
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s