At last, a use case where PL/SQL is faster than SQL : avoiding unnecessary ordering !
In the last few years, different solutions have been found to represent hierarchical data in JSON format :
- In 2018, Marc Bleron provided a user-defined aggregate function called JSONNest ;
- In 2020, “Paulzip” and I proposed very similar SQL solutions in the Oracle SQL & PL/SQL forum, which he combined in a blog post ;
- Recently (January 2022), “mathguy” proposed a recursive solution written in PL/SQL in that Oracle forum ;
- The discussion started by mathguy led me to propose a very different PL/SQL solution, improved by suggestions from mathguy and Paulzip.
I won’t go into the SQL solutions in detail. Basically there are three stages :
- Get the data in hierarchical order and return the order, the level, and the data as a JSON object.
- Using the
LEAD()
andLAG()
functions, determine whether the current JSON object is a child, a parent or a sibling and apply the appropriate JSON “glue”. - Aggregate the result into a CLOB, preserving the original order.
Notice that stages 2 and 3 require the data to be in the order established by stage 1. In the SQL solutions, the execution plan shows that unnecessary sorts were done, sometimes requiring the use of TEMP
space : yuck !
I finally realised that a PL/SQL function could read the output from stage 1, and for each input row add the JSON “glue” and append the result to the CLOB that the function returns. No extra sorting, and it runs about 40% faster than the most optimal SQL solution. Let me explain a few details :
- The main input is a
CURSOR
expression : this is not a literal, but a real liveSELECT
statement that can include bind variables. It returns only the level and a JSON object with the desired data, so it is quite generic. TheORDER SIBLINGS BY
clause is optional. - The input parameter is a strongly typed
REF
CURSOR
, which allows it to be fetched into a table of P/SQL records. - The input is fetched using
BULK
COLLECT
withLIMIT
, which uses little memory. Special handling is needed, because I need to know the levels of the rows immediately preceding and following the row being processed. - If a JSON object is a parent, I add a name / value pair where the value is an array. The name is by default ‘children’ but it is configurable by an optional second parameter.
- Concatenating text to a CLOB is a costly operation. To speed it up, I fill up a VARCHAR2 buffer, then concatenate the buffer to the CLOB.
- Use
PLSQL_OPTIMIZE_LEVEL=3
in order to “inline” the calls toAPPEND_JSO
. This shaves off about .2 seconds per million rows processed.
create or replace package hier_to_clob is
type t_hier_rec is record(
lvl pls_integer,
jso varchar2(4000)
);
type tt_hier_rec is table of t_hier_rec;
TYPE t_hier_rc IS REF CURSOR RETURN t_hier_rec;
function get(
p_hier_rc in t_hier_rc,
p_array_name varchar2 default 'children',
p_limit integer default 500
) return clob;
end hier_to_clob;
/
create or replace package body hier_to_clob is
function get(
p_hier_rc in t_hier_rc,
p_array_name varchar2 default 'children',
p_limit integer default 500
) return clob is
l_array_name varchar2(130) := ',"'||p_array_name||'":[';
lt_hier_rec tt_hier_rec;
l_hier_rec_prev t_hier_rec := t_hier_rec(0, null);
l_lvl_prev2 pls_integer := 0;
l_clob clob;
l_buffer varchar2(32767) := null;
l_buflen pls_integer := 0;
do_prev boolean:= false;
procedure append_jso(
p_jso varchar2,
p_lvl_prev pls_integer,
p_lvl pls_integer,
p_lvl_next pls_integer
) is
l_jso varchar2(4000);
begin
l_jso :=
case
when p_lvl_prev = 0 then null
when p_lvl - p_lvl_prev = 1 then l_array_name
when p_lvl > 1 then ','
end ||
rtrim(p_jso, '}') ||
rpad('}', (p_lvl - p_lvl_next) * 2 + 1, ']}');
if l_buflen + lengthb(l_jso) > 32767 then
l_clob := l_clob || l_buffer;
l_buffer := l_jso;
l_buflen := lengthb(l_buffer);
else
l_buffer := l_buffer || l_jso;
l_buflen := l_buflen + lengthb(l_jso);
end if;
end append_jso;
begin
loop
fetch p_hier_rc bulk collect into lt_hier_rec limit p_limit;
if do_prev then
append_jso(
l_hier_rec_prev.jso,
l_lvl_prev2,
l_hier_rec_prev.lvl,
case when lt_hier_rec.count > 0 then lt_hier_rec(1).lvl else 1 end
);
do_prev := false;
end if;
for i in 1..lt_hier_rec.count-1 loop
append_jso(
lt_hier_rec(i).jso,
case
when i = 1 then l_hier_rec_prev.lvl
else lt_hier_rec(i-1).lvl
end,
lt_hier_rec(i).lvl,
lt_hier_rec(i+1).lvl
);
end loop;
if lt_hier_rec.count > 0 then
l_lvl_prev2 :=
case lt_hier_rec.count
when 1 then l_hier_rec_prev.lvl
else lt_hier_rec(lt_hier_rec.count-1).lvl
end;
l_hier_rec_prev := lt_hier_rec(lt_hier_rec.count);
do_prev := true;
end if;
exit when p_hier_rc%notfound;
end loop;
if do_prev then
append_jso(
l_hier_rec_prev.jso,
l_lvl_prev2,
l_hier_rec_prev.lvl,
1
);
end if;
if l_buflen > 0 then
l_clob := l_clob || l_buffer;
end if;
return l_clob;
end get;
end hier_to_clob;
/
select hier_to_clob.get(
cursor(
select level, json_object(empno, ename)
from emp
start with empno = 7566
connect by mgr = prior empno
),
'grunts'
)
from dual;
/* pretty printing done outside of the function */
{
"empno" : 7566,
"ename" : "JONES",
"grunts" :
[
{
"empno" : 7788,
"ename" : "SCOTT",
"grunts" :
[
{
"empno" : 7876,
"ename" : "ADAMS"
}
]
},
{
"empno" : 7902,
"ename" : "FORD",
"grunts" :
[
{
"empno" : 7369,
"ename" : "SMITH"
}
]
}
]
}