Make public package variables available to SQL

I recently posted an experimental approach to accessing package constants from SQL. Here is a more robust approach that includes all variables.

The problem

Within PL/SQL, SQL statements can be either static or dynamic. Static SQL can access any PL/SQL variables that are within scope, including variables defined in package specifications. PL/SQL rewrites the statement, putting a bind variable where the PL/SQL variable was and binding the value before execution.

Dynamic SQL, whether submitted from PL/SQL or any other language, cannot directly access package variables, even though they are part of the published specification. This frustrates developers, since access to functions has no such restriction. Getter functions are a workaround, but they are no fun.
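As an illustration of the getter-function workaround, here is roughly what such a wrapper looks like (the package and function names are my own invention for this example):

```sql
-- Hypothetical wrapper package: expose the constant DBMS_LOB.CALL to SQL
create or replace package lob_getters as
  function get_call return dbms_lob.call%type;
end lob_getters;
/
create or replace package body lob_getters as
  function get_call return dbms_lob.call%type is
  begin
    return dbms_lob.call; -- legal here: a static PL/SQL reference
  end get_call;
end lob_getters;
/
-- Dynamic SQL can call the function, even though it cannot read the constant
select lob_getters.get_call from dual;
```

One getter per variable, maintained by hand: that is the "no fun" part.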

Possible solutions

  • Table functions: whether pipelined or not, table functions can submit static SQL but be accessed by dynamic SQL. This approach is not very flexible, but when it fits the bill it can be used in production today.
  • Literals: for constants only, rewrite the SQL and replace package constants by their values, as found in DBA_SOURCE. My previous post tried this.
  • Bind variables: this would mimic what PL/SQL does for static SQL. To be a full solution, the values would have to be looked up and bound before executing the rewritten statement.
  • Getter functions.
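To illustrate the table function idea, here is a sketch (the package MY_PKG, its public variable G_SAL_LIMIT and the EMP table are all assumptions for the example): the static SQL inside the function references the package variable, and dynamic SQL only ever sees the function.

```sql
-- Sketch: assumes a package MY_PKG with a public variable G_SAL_LIMIT
create or replace function emps_above_limit return sys.odcinumberlist is
  l_empnos sys.odcinumberlist;
begin
  -- static SQL: PL/SQL binds MY_PKG.G_SAL_LIMIT automatically
  select empno bulk collect into l_empnos
    from emp
   where sal > my_pkg.g_sal_limit;
  return l_empnos;
end emps_above_limit;
/
-- any client can now submit this dynamically
select column_value empno from table(emps_above_limit);
```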

Today’s choice

Starting with database version 12.1, we can declare PL/SQL functions in the WITH clause of a SELECT statement, so that is where I’m going to put my getter functions.

Take for example the statement:

select dbms_lob.call c from dual

Add a WITH clause containing a function, then call the function instead of accessing the package constant:

WITH function WF1_dbms_lob_call return dbms_lob.call%type deterministic is 
begin return dbms_lob.call; end;
select WF1_dbms_lob_call c from dual
  • By declaring the function to be deterministic, it is called once per execution and not once per row – at least in my tests using 18c. There appears to be no need to use scalar subquery caching.
  • Unlike the “literal” solution, any package variable that is valid in static SQL will be valid in the rewritten SQL.
  • There is no need to access the values of the variables in order to bind them.
  • The %type attribute ensures that the function returns a value of the same data type as the package variable.

Finding the package variables

Before making and using a getter function, I have to know that my SQL is trying to access a package variable! Instead of trying to reinvent parsing, I use the DBMS_SQL.PARSE procedure.

Let’s assume a SELECT statement is perfectly valid, except for a reference to a package variable. When I call DBMS_SQL.PARSE, I will get one of two errors:

  • ORA-06553: PLS-221: '<variable>' is not a procedure or is undefined
  • ORA-00904: "<package>"."<record/object>"."<member>": invalid identifier

I can also access the “last error position”, which gives me the location of the package variable within the SQL text. I now have enough information to extract the package variable text from the SELECT, generate a getter function and call that function instead of referencing the variable directly. After that, I go through the same parse / replace operations until the statement parses without error.

Please note that if there truly is an invalid identifier in the statement, the generated getter function will cause the next parse to fail with a different error message, which is enough to raise an appropriate exception.
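To see the raw material this utility relies on, here is a small demo block of my own (the exact error wording may vary by version):

```sql
declare
  l_cursor_id integer := dbms_sql.open_cursor;
begin
  dbms_sql.parse(l_cursor_id, 'select dbms_lob.call from dual', dbms_sql.native);
  dbms_sql.close_cursor(l_cursor_id);
exception when others then
  -- e.g. ORA-06553: PLS-221: 'CALL' is not a procedure or is undefined
  dbms_output.put_line('error    : '||sqlerrm);
  -- byte offset of the offending identifier within the SQL text
  dbms_output.put_line('position : '||dbms_sql.last_error_position);
  dbms_sql.close_cursor(l_cursor_id);
end;
/
```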

Known limitations

  • Requires database version 12.1 or later. Tested in 18c.
  • This version is restricted to SELECT statements. There may or may not be a preexisting WITH clause.
  • The SQL statement, including the generated getter functions, cannot exceed 32767 bytes in length. When I try to parse anything longer, DBMS_SQL gives me the wrong last error position.
  • At this time, I do not handle quoted identifiers.
  • In theory, comments are allowed between members and periods:
    select dbms_lob  /*WEIRD*/ . call from dual;
    I can handle the spaces, but not the comments.
  • If you want to access a boolean in SQL, this utility can’t help you.

No SQL Translation Framework yet

The code here just rewrites the SELECT; it doesn’t use the SQL Translation Framework to execute it directly. Maybe next time?

create or replace package translate_pkg_vars authid current_user as
  procedure translate_sql(
    sql_text in CLOB, 
    translated_text out nocopy CLOB
  );
end translate_pkg_vars;
/
create or replace package body translate_pkg_vars as

  -- Standard prefix of functions in the generated WITH clause
  wf constant varchar2(8) := 'WF';
  
  procedure translate_sql(
    sql_text in CLOB, 
    translated_text out nocopy CLOB
  ) is
    not_a_procedure exception;
    pragma exception_init(not_a_procedure, -6553);
    invalid_identifier exception;
    pragma exception_init(invalid_identifier, -904);
    
    g_with_clause varchar2(32767);
    g_select_sql varchar2(32767);
    l_cursor_id integer;
    g_last_error_position integer;
    g_f_num integer := 0;
    
    procedure check_input is
      l_first_word varchar2(256);
      clob_too_long exception;
      pragma exception_init(clob_too_long, -6502);
      l_clob_too_long_msg constant varchar2(999) :=
        'Cannot parse statement longer than 32767 bytes: '
        || 'DBMS_SQL might return erroneous error position.';
    begin
      -- initialize with clause and select sql. Make sure input is 32k max.
      select lower(regexp_substr(sql_text,'(\s+|(/\*[^/*]*\*/)|(--.*
))*(\S+)',1,1,'im',4)) into l_first_word from dual;
      if l_first_word = 'select' then
        g_with_clause := null;
        g_select_sql := sql_text;
      elsif l_first_word = 'with' then
        g_with_clause := 'WITH
';
        g_select_sql := regexp_replace(sql_text,'(\s+|(/\*[^/*]*\*/)|(--.*
))*(\S+)','\1',1,1,'im');
      else
        raise_application_error(-20000, 'Input must be a SELECT statement.');
      end if;
    exception when clob_too_long then
      raise_application_error(-20001, l_clob_too_long_msg);
    end check_input;
    
    procedure add_function is
      l_errm varchar2(4000);
      l_member varchar2(128);
      l_identifier varchar2(4000);
      l_func_name varchar2(30 BYTE);
    begin
      -- change last error position from byte 0-offset to character 1-offset
      -- find last member of identifier
      -- find entire identifier
      -- generate function name
      -- add function definition to with clause
      -- replace identifier by function call.
      g_last_error_position :=
        length(
          substrb(g_with_clause||g_select_sql,1,g_last_error_position)
        ) 
        - nvl(lengthb(g_with_clause), 0)
        + 1;
      l_errm := sqlerrm;
      if sqlcode = -6553 then
        if instr(l_errm, 'PLS-221') = 0 then
          raise_application_error(
            -20002,
            'Identifier is invalid even within PL/SQL: '||l_errm
          );
        else
          select regexp_substr(l_errm, '''(.+)''',1,1,null,1) into l_member
          from dual;
        end if;
      else
        select regexp_substr(l_errm, '"([^"]+)"[^"]*$',1,1,null,1) into l_member
        from dual;
      end if;
      select regexp_substr(
        g_select_sql, 
        '([[:alpha:]][[:alpha:]0-9_#$]*\s*[.]\s*)+'||l_member,
        g_last_error_position,1,'im'
      )
      into l_identifier from dual;
      g_f_num := g_f_num + 1;
      l_func_name := substrb(
        WF||g_f_num||'_'||regexp_replace(l_identifier,'\s|[.]','_'),
        1,30
      );
      g_with_clause := nvl(g_with_clause, 'WITH
')
        || 'function '||l_func_name||' return '
        ||l_identifier||'%type deterministic is begin return '
        ||l_identifier||'; end;
';
      g_select_sql := regexp_replace(
        g_select_sql, l_identifier, l_func_name, g_last_error_position, 1
      );

    end add_function;
    
  begin
    check_input;
    loop
      begin
        l_cursor_id := dbms_sql.open_cursor();
        dbms_sql.parse(l_cursor_id,g_with_clause||g_select_sql,1);
        dbms_sql.close_cursor(l_cursor_id);
        -- If all is well, we are done
        exit;
      exception when not_a_procedure or invalid_identifier then
        g_last_error_position := dbms_sql.last_error_position;
        dbms_sql.close_cursor(l_cursor_id);
        add_function;
      when others then
        if l_cursor_id is not null then
          dbms_sql.close_cursor(l_cursor_id);
        end if;
        raise;
      end;
    end loop;
    
    translated_text := g_with_clause||g_select_sql;
    
  end translate_sql;
  
end translate_pkg_vars;
/
declare
  sql_text  CLOB := 'select dbms_lob.call c from dual';
  translated_text CLOB;
begin
  translate_pkg_vars.translate_sql(sql_text, translated_text);
  dbms_output.put_line('output:
'||translated_text);
end;
/
output:
WITH
function WF1_dbms_lob_call return dbms_lob.call%type deterministic is begin return dbms_lob.call; end;
select WF1_dbms_lob_call c from dual

Make public package constants available to SQL

This is an enhancement request on ODC and a recent Twitter topic. Here is a homemade solution using the SQL Translation Framework.

UPDATE 2018-12-31: this solution became obsolete pretty quickly! Please see this post for a much better solution.

When executing static SQL inside PL/SQL, the developer can directly reference package variables because PL/SQL will automatically treat them as bind variables and bind the correct values. If the developer wants to test the SQL statement outside of PL/SQL, he/she discovers that the package variable is interpreted as a package function, which of course does not exist.

The package I wrote can create a translation profile, enable it, disable it and drop it. It also contains the TRANSLATE_SQL procedure that does the work. It is meant strictly for development environments as a transparent workaround.

The procedure parses the statement, and if it finds something that can be a reference to a CONSTANT variable defined in a package, it substitutes the value found in DBA_SOURCE.

create or replace package translate_constants as
  g_profile_name constant varchar2(30) := 'CONSTANTS';
  procedure create_profile;
  procedure enable_trans;
  procedure translate_sql(
    sql_text in CLOB, 
    translated_text out nocopy CLOB
  );
  procedure translate_error(
    error_code in BINARY_INTEGER,
    translated_code out BINARY_INTEGER,
    translated_sqlstate out nocopy VARCHAR2
  );
  procedure disable_trans;
  procedure drop_profile;
end translate_constants;
/
create or replace package body translate_constants as
  type t_parm_val is record (
    parm varchar2(512),
    val varchar2(4000)
  );
  type tt_parm_val is table of t_parm_val;
  lt_parm_val tt_parm_val := null;
  
  procedure do(p_sql in varchar2) is
  begin
    dbms_output.put_line(p_sql);
    execute immediate p_sql;
  end do;
  
  procedure create_profile is
    name_in_use exception;
    pragma exception_init(name_in_use, -955);
  begin
    dbms_sql_translator.create_profile(g_profile_name);
    DBMS_SQL_TRANSLATOR.SET_ATTRIBUTE(
      profile_name     =>  g_profile_name,
      attribute_name   =>  DBMS_SQL_TRANSLATOR.ATTR_TRANSLATOR,
      attribute_value  =>  $$PLSQL_UNIT);
  exception when name_in_use then null;
  end create_profile;
  
  procedure enable_trans is
  begin
    do('alter session set sql_translation_profile = '|| g_profile_name);
    do('alter session set events = ''10601 trace name context forever, level 32''');
  end enable_trans;
  
  procedure translate_sql(
    sql_text in CLOB, 
    translated_text out nocopy CLOB
  ) is
    l_first_word varchar2(256);
    l_cursor_id integer;
    not_a_procedure exception;
    pragma exception_init(not_a_procedure, -6553);
    l_errm varchar2(4000);
    l_constant varchar2(4000);
    l_identifier varchar2(4000);
    l_test_sql varchar2(4000);
    l_text varchar2(4000);
  begin
    select lower(regexp_substr(sql_text,'\w+')) into l_first_word from dual;
    if l_first_word not in ('select','with','insert','update','delete','merge') then
      translated_text := null;
      return;
    end if;
    l_errm := 'start';
    translated_text := sql_text;
    while l_errm is not null loop
      l_errm := null;
      begin
        l_cursor_id := dbms_sql.open_cursor();
        dbms_output.put_line(translated_text);
        dbms_sql.parse(l_cursor_id,translated_text,1);
      exception when not_a_procedure then
        l_errm := sqlerrm;
        if instr(l_errm, 'PLS-221:') = 0 then
          raise;
        end if;
        dbms_output.put_line(l_errm);
        select regexp_substr(l_errm, q'§'(.+)'§',1,1,null,1) into l_constant from dual;
        dbms_output.put_line(l_constant);
        select regexp_substr(
          translated_text, 
          '([A-Z][A-Z0-9#_$]*[.])?[A-Z][A-Z0-9#_$]*[.]'||l_constant
          ,1,1,'i'
        )
        into l_identifier from dual;
        dbms_output.put_line(l_identifier);        
        dbms_utility.expand_sql_text(
          'select ' || l_identifier || ' from dual',
          l_test_sql
        );
        dbms_output.put_line(l_test_sql);
        dbms_output.put_line(regexp_substr(l_test_sql,'"(.*?)"',1,1,null,1));
        dbms_output.put_line(regexp_substr(l_test_sql,'"(.*?)"',1,2,null,1));
        select regexp_substr(text, ':=\s*?(\S.*?)\s*;',1,1,null,1)
        into l_text from dba_source
        where owner = regexp_substr(l_test_sql,'"(.*?)"',1,1,null,1)
          and name = regexp_substr(l_test_sql,'"(.*?)"',1,2,null,1) 
          and type = 'PACKAGE' 
          and regexp_like(text, l_constant||'\s+constant\s+','i');
        
        dbms_output.put_line('>'||l_text);
        translated_text := regexp_replace(translated_text,'([^.])'||l_identifier,'\1'||l_text,1,0,'m');
      end;
      dbms_sql.close_cursor(l_cursor_id);
    end loop;
    if l_cursor_id is not null then
      dbms_sql.close_cursor(l_cursor_id);
    end if;
  exception when others then
    if l_cursor_id is not null then
      dbms_sql.close_cursor(l_cursor_id);
    end if;
    raise;

  end translate_sql;
  
  procedure translate_error(
    error_code in BINARY_INTEGER,
    translated_code out BINARY_INTEGER,
    translated_sqlstate out nocopy VARCHAR2
  ) is
  begin
    translated_sqlstate := null;
  end translate_error;
  
  procedure disable_trans is
  begin
    do('alter session set sql_translation_profile = null');
    do('alter session set events = ''10601 trace name context off''');
  end disable_trans;
  
  procedure drop_profile is
    profile_not_found exception;
    pragma exception_init(profile_not_found, -24252);
  begin
    dbms_sql_translator.drop_profile(g_profile_name);
  exception when profile_not_found then null;
  end drop_profile;
  
end translate_constants;
/

And a small test:

exec translate_constants.create_profile;
exec translate_constants.enable_trans;

select dbms_lob.lobmaxsize maxsize,
sys.dbms_lob.lobmaxsize samething from dual;

   MAXSIZE  SAMETHING
---------- ----------
1.8447E+19 1.8447E+19
          
exec translate_constants.disable_trans;
exec translate_constants.drop_profile;

Remember, this is only to help developers. It is not something to ever deploy in a test or production environment.

SQL Templates and multiple REPLACE

I frequently use what I call “SQL templates”, where I replace multiple placeholders by calculated values. Here are two methods that don’t require a stored procedure.

What do I mean by “template”?

SQL statements have to be parsed before they can be executed. The standard way to minimise parsing is to use bind variables. If we use a bind variable instead of a literal, the same statement can be parsed once and executed many times with different values assigned to the variable.

We cannot put the names of tables or columns into bind variables: these are data objects that have to be known at parse time.

What I mean by “template” is a SQL statement with a specific logic, but without the names of the data objects. I get the names I need from input parameters and queries, then I put them in the statement where the placeholders were. If I do this twice with different parameters and the resulting text is different, the database will of course parse each resulting statement separately before execution.

So: don’t confuse this technique with bind variables.

Please keep in mind that I never use this technique in production. I generate the SQL, read it, test it and then deploy it. This technique is meant to make SQL development more productive, not replace the developer.

If you ever want to generate SQL in production, work very hard to avoid SQL injection bugs. For example, use DBMS_ASSERT or data dictionary lookups to make sure input parameters are really data objects.
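For example, DBMS_ASSERT.SQL_OBJECT_NAME raises an exception if its argument does not name an existing object, so it can vet a table name before the name goes into generated SQL:

```sql
declare
  l_table varchar2(128);
begin
  -- raises ORA-44002 (invalid object name) if EMP does not exist
  l_table := dbms_assert.sql_object_name('EMP');
  dbms_output.put_line('safe to substitute: '||l_table);
end;
/
```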

Examples

select * from emp where ename = 'KING';

This statement is not a template and it doesn’t use bind variables.

select * from emp where ename = :ename;

This statement is not a template, but it uses a bind variable.

select /*+ qb_name(COMPARE) */
  #COL_LIST#,
  decode(sign(sum(NEW_CNT)), 1, 'I', 'D') OP,
  abs(sum(NEW_CNT)) CNT
FROM (
  select /*+ qb_name(old) */
  #COL_LIST#
    , -1 NEW_CNT
  from #OLD_TABLE# O
  union all
  select /*+ qb_name(new) */
  #COL_LIST#
    , 1 NEW_CNT
  from #NEW_TABLE# N
)
group by
  #COL_LIST#
having sum(NEW_CNT) != 0
order by 1, OP;

This is a “template” as I use the word here. It shows the logic for comparing two tables using the GROUP BY method. To get a usable SQL statement, I need to replace all the placeholders (enclosed in #) by real values.

Model clause technique

Here is one way to do multiple replacements using pure SQL:

with template(sql_text) as (select cast(
q'§<template>§'
as varchar2(4000)) from dual
)
, parms(v_from, v_to) as (
  <parms>
)
select sql_text from template
model
REFERENCE P ON (select rownum p_dim, v_from, v_to from parms)
  DIMENSION BY (p_dim) measures (v_from, v_to) 
main M
dimension by (1 m_dim)
measures(sql_text)
rules iterate(99) until P.v_from[iteration_number+1] is null (
  sql_text[1] = replace(sql_text[1],P.v_from[iteration_number+1], P.v_to[iteration_number+1])
);
  1. Put the template text in the TEMPLATE subquery.
  2. In the PARMS subquery, create one row per replacement: V_FROM is the placeholder and V_TO is the value.
  3. The MODEL clause will iterate through the reference model P and do each replacement in turn.

Here is an example that generates a comparison. It’s a silly example because I am comparing EMP to itself, but it’s just for illustration!

with template(sql_text) as (select cast(
q'§select /*+ qb_name(COMPARE) */
  #COL_LIST#,
  decode(sign(sum(NEW_CNT)), 1, 'I', 'D') OP,
  abs(sum(NEW_CNT)) CNT
FROM (
  select /*+ qb_name(old) */
  #COL_LIST#
    , -1 NEW_CNT
  from #OLD_TABLE# O
  union all
  select /*+ qb_name(new) */
  #COL_LIST#
    , 1 NEW_CNT
  from #NEW_TABLE# N
)
group by
  #COL_LIST#
having sum(NEW_CNT) != 0
order by 1, OP;§'
as varchar2(4000)) from dual
)
, parms(v_from, v_to) as (
  select '#NEW_TABLE#', 'EMP' from dual
  union all
  select '#OLD_TABLE#', 'EMP' from dual
  union all
  select '#COL_LIST#',
    listagg(column_name, ',')  within group(order by internal_column_id)
  from user_tab_cols
  where table_name = 'EMP'
  and virtual_column = 'NO'
)
select sql_text from template
model
REFERENCE P ON (select rownum p_dim, v_from, v_to from parms)
  DIMENSION BY (p_dim) measures (v_from, v_to) 
main M
dimension by (1 m_dim)
measures(sql_text)
rules iterate(99) until P.v_from[iteration_number+1] is null (
  sql_text[1] = replace(sql_text[1],P.v_from[iteration_number+1], P.v_to[iteration_number+1])
);
select /*+ qb_name(COMPARE) */
  EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO,
  decode(sign(sum(NEW_CNT)), 1, 'I', 'D') OP,
  abs(sum(NEW_CNT)) CNT
FROM (
  select /*+ qb_name(old) */
  EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO
    , -1 NEW_CNT
  from EMP O
  union all
  select /*+ qb_name(new) */
  EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO
    , 1 NEW_CNT
  from EMP N
)
group by
  EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO
having sum(NEW_CNT) != 0
order by 1, OP;

Using new stuff: JSON

The problem with the MODEL technique is that the output cannot be a CLOB. To get a CLOB, suppose we use a bit of PL/SQL in a WITH clause? Now we have to pass the placeholder / value pairs as parameters. I decided that a JSON object would be a reasonable implementation. Warning: this looks nice but requires version 12.2 or later.

with  function multi_replace(
    p_template in varchar2,
    p_parms in varchar2
  ) return clob is
    l_out_text clob := p_template;
    l_json JSON_OBJECT_T := JSON_OBJECT_T.parse(p_parms);
    l_keys JSON_KEY_LIST;
  begin
    l_keys := l_json.get_keys;
    for i in 1..l_keys.count loop
      l_out_text := replace(l_out_text, l_keys(i), l_json.get_string(l_keys(i)));
    end loop;
    return l_out_text;
  end multi_replace;
input(template, parms) as (select
q'§select /*+ qb_name(COMPARE) */
  #COL_LIST#,
  decode(sign(sum(NEW_CNT)), 1, 'I', 'D') OP,
  abs(sum(NEW_CNT)) CNT
FROM (
  select /*+ qb_name(old) */
  #COL_LIST#
    , -1 NEW_CNT
  from #OLD_TABLE# O
  union all
  select /*+ qb_name(new) */
  #COL_LIST#
    , 1 NEW_CNT
  from #NEW_TABLE# N
)
group by
  #COL_LIST#
having sum(NEW_CNT) != 0
order by 1, OP;§',
  json_object(
    '#NEW_TABLE#' is 'EMP',
    '#OLD_TABLE#' is 'EMP',
    '#COL_LIST#'  is (
      select listagg(column_name, ',')  within group(order by internal_column_id)
      from user_tab_cols
      where table_name = 'EMP' and virtual_column = 'NO'
    )
  )
  from dual
)
select multi_replace(template, parms) sql_text
from input;
  • The PL/SQL function parses the input JSON object, gets all the keys (the placeholders), then uses them to get the values one by one and apply them to the template.
  • The JSON_OBJECT function in SQL turns each placeholder / replacement pair into a JSON name:value pair.

I like this approach because it shows each placeholder and its replacement value side by side. It’s just easier for me to read (and write!).

Ranges with NULLs 07: Swiss Army Knife

I came up with this name a few years ago for a problem that I couldn’t solve then: analyse a series of ranges and say whether and how they overlap. It turns out the solution is not that hard.


The Problem

I want to order a set of ranges by “from” and “to”, then determine the relations between each range and the ranges that follow it. The only limit is with the “precedes” relation: I know that the first range may precede any number of following ranges, so I only want relations where one range immediately precedes the next, with no intervening range.

MATCH_RECOGNIZE to the rescue

The MATCH_RECOGNIZE clause (introduced in 12c) is a good fit here, for several reasons:

  • I can PARTITION BY different test cases and ORDER BY the “from” and “to” values.
  • I can say AFTER MATCH SKIP TO NEXT ROW to use each and every range as a starting point.
  • I can use the pipe character | for alternation. Alternation matches a single regular expression from a list of several possible regular expressions. Alternatives are preferred in the order they are specified.

Reminders (skip if you are reading the whole series)

  • Ranges have exclusive end points, which allows ranges that “meet” without gaps or overlaps.
  • Three categories:
    1. Neither “To” nor “From” may be NULL
    2. “To” may be NULL, but not “From”
    3. Both “From” and “To” may be NULL.
  • My data consists of an object and a “from” and “to” range.
  • I propose solutions that require database version 12.1 or later.

No NULLs

This is the same data as in my previous post, Ranges with NULLs 03: Gaps.

create table t(
  obj_id varchar2(9), 
  f int, t int, af int, at int, bf int, bt int
);

Insert into T values ('1p    0',1,2,1,2,3,4);
Insert into T values ('1p    0',3,4,1,2,3,4);
Insert into T values ('2m    0',1,2,1,2,2,3);
Insert into T values ('2m    0',2,3,1,2,2,3);
Insert into T values ('3o    0',1,3,1,3,2,4);
Insert into T values ('3o    0',2,4,1,3,2,4);
Insert into T values ('4f    0',1,3,1,3,2,3);
Insert into T values ('4f    0',2,3,1,3,2,3);
Insert into T values ('5c    0',1,4,1,4,2,3);
Insert into T values ('5c    0',2,3,1,4,2,3);
Insert into T values ('6s    0',1,2,1,2,1,3);
Insert into T values ('6s    0',1,3,1,2,1,3);
Insert into T values ('7e    0',1,2,1,2,1,2);
Insert into T values ('7e    0',1,2,1,2,1,2);
  1. There are two rows = two ranges for each test case.
  2. OBJ_ID indicates the relation between the ranges: “precedes, meets, overlaps, finished by, contains, starts, equals”. The 0 at the end means there are no nulls.
  3. F is the “from” value.
  4. T is the “to” value (exclusive).
  5. So we can check our results more easily, each row contains the values for both ranges:
    1. AF and AT for the first range,
    2. BF and BT for the second range.
The code:
  • line 4: I partition by the test case and order by “from” and “to”.
  • line 5: the classifier() function returns the name of the condition that was met, which is also the name of the relation I am looking for.
  • line 7: I check each row in turn.
  • line 8: A is the first row (which I do not return), and the following rows belong to one of the other conditions.
  • I check the relations “backwards” so that some checks are implicit:
    • equals: between the first row and the current row, “from” and “to” are the same.
    • starts: I don’t explicitly check that the first “to” is less than the current “to”. At this point they cannot be equal and the ORDER BY guarantees that the first “to” is not greater than the current “to”.
    • after the “starts” condition, the first “from” must be less than the current “to”, so I don’t have to check it explicitly.
    • I don’t care about “precedes” if there are intervening rows.
select OBJ_ID, PRIOR_F, PRIOR_T, relation, F, T
from t
match_recognize(
  partition by obj_id order by f, t
  measures a.f prior_f, a.t prior_t, classifier() relation
  all rows per match
  after match skip to next row
  pattern( {-a-} (equals|starts|finished_by|contains|overlaps|meets|precedes)+ )
  define equals as a.f = f and a.t = t,
    starts as a.f = f, -- if true, a.t must be < t because of order by
    -- from now on a.f must be < f because of order by
    contains as a.t > t,
    finished_by as a.t = t,
    overlaps as a.t > f and a.t < t,
    meets as a.t = f,
    -- at this point a.t must be < f
    precedes as count(*) = 2    
);
OBJ_ID    PRIOR_F  PRIOR_T  RELATION     F  T
1p    0         1        2  PRECEDES     3  4
2m    0         1        2  MEETS        2  3
3o    0         1        3  OVERLAPS     2  4
4f    0         1        3  FINISHED_BY  2  3
5c    0         1        4  CONTAINS     2  3
6s    0         1        2  STARTS       1  3
7e    0         1        2  EQUALS       1  2

“To” NULL

delete from t;
Insert into T values ('1p    1',1,2,1,2,3,null);
Insert into T values ('1p    1',3,null,1,2,3,null);
Insert into T values ('2m    1',1,2,1,2,2,null);
Insert into T values ('2m    1',2,null,1,2,2,null);
Insert into T values ('3o    1',1,3,1,3,2,null);
Insert into T values ('3o    1',2,null,1,3,2,null);
Insert into T values ('4f    1',1,null,1,null,2,null);
Insert into T values ('4f    1',2,null,1,null,2,null);
Insert into T values ('5c    1',1,null,1,null,2,3);
Insert into T values ('5c    1',2,3,1,null,2,3);
Insert into T values ('6s    1',1,2,1,2,1,null);
Insert into T values ('6s    1',1,null,1,2,1,null);
Insert into T values ('7e    1',1,null,1,null,1,null);
Insert into T values ('7e    1',1,null,1,null,1,null);
  • lines 9 and 12: with equality comparisons, I use DECODE() to treat NULL values as equal.
  • lines 11 and 13: with inequality comparisons, I check explicitly for NULL when necessary.
select OBJ_ID, PRIOR_F, PRIOR_T, relation, F, T
from t
match_recognize(
  partition by obj_id order by f, t
  measures a.f prior_f, a.t prior_t, classifier() relation
  all rows per match
  after match skip to next row
  pattern( {-a-} (equals|starts|finished_by|contains|overlaps|meets|precedes)+ )
  define equals as a.f = f and decode(a.t,t,0,1) = 0,
    starts as a.f = f,
    contains as (a.t > t or a.t is null),
    finished_by as decode(a.t,t,0,1) = 0,
    overlaps as a.t > f and (a.t < t or t is null),
    meets as a.t = f,
    precedes as count(*) = 2    
);
OBJ_ID    PRIOR_F  PRIOR_T  RELATION     F  T
1p    1         1        2  PRECEDES     3
2m    1         1        2  MEETS        2
3o    1         1        3  OVERLAPS     2
4f    1         1           FINISHED_BY  2
5c    1         1           CONTAINS     2  3
6s    1         1        2  STARTS       1
7e    1         1           EQUALS       1

With “From” and “To” NULLs

delete from t;
Insert into T values ('1p    2',3,null,null,2,3,null);
Insert into T values ('1p    2',null,2,null,2,3,null);
Insert into T values ('2m    2',2,null,null,2,2,null);
Insert into T values ('2m    2',null,2,null,2,2,null);
Insert into T values ('3o    2',2,null,null,3,2,null);
Insert into T values ('3o    2',null,3,null,3,2,null);
Insert into T values ('4f    2',2,null,null,null,2,null);
Insert into T values ('4f    2',null,null,null,null,2,null);
Insert into T values ('5c    2',2,3,null,null,2,3);
Insert into T values ('5c    2',null,null,null,null,2,3);
Insert into T values ('6s    2',null,2,null,2,null,null);
Insert into T values ('6s    2',null,null,null,2,null,null);
Insert into T values ('7e    2',null,null,null,null,null,null);
Insert into T values ('7e    2',null,null,null,null,null,null);
select OBJ_ID, PRIOR_F, PRIOR_T, relation, F, T
from t
match_recognize(
  partition by obj_id order by f nulls first, t
  measures a.f prior_f, a.t prior_t, classifier() relation
  all rows per match
  after match skip to next row
  pattern( {-a-} (equals|starts|finished_by|contains|overlaps|meets|precedes)+ )
  define equals as decode(a.f,f,0,1) = 0 and decode(a.t,t,0,1) = 0,
    starts as decode(a.f,f,0,1) = 0,
    contains as a.t > t or a.t is null,
    finished_by as decode(a.t,t,0,1) = 0,
    overlaps as a.t > f and (a.t < t or t is null),
    meets as a.t = f,
    precedes as count(*) = 2    
);
OBJ_ID    PRIOR_F  PRIOR_T  RELATION     F  T
1p    2                  2  PRECEDES     3
2m    2                  2  MEETS        2
3o    2                  3  OVERLAPS     2
4f    2                     FINISHED_BY  2
5c    2                     CONTAINS     2  3
6s    2                  2  STARTS
7e    2                     EQUALS

Ranges with NULLs 06: Overlaps with Conflicting Data

In 2014 I attacked the problem of “Overlapping ranges with priorities”. This time I’ll deal with NULLs and propose an improved solution.


Requirement

Assume overlapping ranges with one attribute. I call the overlapping parts “segments”.

  • Segments: break each range down into segments, including the range’s attribute.
  • Priority: when two or more segments with the same “from” and “to” have different values, the minimum value has priority.
  • Merge: once all the segments have proper values assigned, merge contiguous segments having the same values.

Reminders

  • In my world, ranges have exclusive end points: this is called the “closed-open” type. It allows ranges that “meet” without gaps or overlaps.
  • I classify ranges in three categories:
    1. Neither “To” nor “From” may be NULL
    2. “To” may be NULL, but not “From”
    3. Both “From” and “To” may be NULL.
  • I propose solutions that require database version 12.1 or later.

Test data with no NULLs allowed

This isn’t a full test harness, just the minimum to illustrate the solution.

CREATE TABLE T (
OBJ_ID    VARCHAR2 ( 20 BYTE ),
F         NUMBER,
T         NUMBER,
ATTR_ID   NUMBER,
AB_ATTR   VARCHAR2 ( 90 BYTE )
);
Insert into T values ('3o    01','1','3','1','1:1=1');
Insert into T values ('3o    01','2','4','1','1:1=1');
Insert into T values ('3o    02','1','3','1','2:1<2');
Insert into T values ('3o    02','2','4','2','2:1<2');
Insert into T values ('3o    03','1','3','2','3:2>1');
Insert into T values ('3o    03','2','4','1','3:2>1');
Insert into T values ('3o    04','1','3',null,'4:N-1');
Insert into T values ('3o    04','2','4','1','4:N-1');
Insert into T values ('3o    05','1','3','1','5:1-N');
Insert into T values ('3o    05','2','4',null,'5:1-N');
Insert into T values ('3o    06','1','3',null,'6:N-N');
Insert into T values ('3o    06','2','4',null,'6:N-N');
Insert into T values ('5c    01','1','4','1','1:1=1');
Insert into T values ('5c    01','2','3','1','1:1=1');
Insert into T values ('5c    02','1','4','1','2:1<2');
Insert into T values ('5c    02','2','3','2','2:1<2');
Insert into T values ('5c    03','1','4','2','3:2>1');
Insert into T values ('5c    03','2','3','1','3:2>1');
Insert into T values ('5c    04','1','4',null,'4:N-1');
Insert into T values ('5c    04','2','3','1','4:N-1');
Insert into T values ('5c    05','1','4','1','5:1-N');
Insert into T values ('5c    05','2','3',null,'5:1-N');
Insert into T values ('5c    06','1','4',null,'6:N-N');
Insert into T values ('5c    06','2','3',null,'6:N-N');
OBJ_ID F T ATTR_ID AB_ATTR
3o    01 1 3 1 1:1=1
3o    01 2 4 1 1:1=1
3o    02 1 3 1 2:1<2
3o    02 2 4 2 2:1<2
3o    03 1 3 2 3:2>1
3o    03 2 4 1 3:2>1
3o    04 1 3 4:N-1
3o    04 2 4 1 4:N-1
3o    05 1 3 1 5:1-N
3o    05 2 4 5:1-N
3o    06 1 3 6:N-N
3o    06 2 4 6:N-N
5c    01 1 4 1 1:1=1
5c    01 2 3 1 1:1=1
5c    02 1 4 1 2:1<2
5c    02 2 3 2 2:1<2
5c    03 1 4 2 3:2>1
5c    03 2 3 1 3:2>1
5c    04 1 4 4:N-1
5c    04 2 3 1 4:N-1
5c    05 1 4 1 5:1-N
5c    05 2 3 5:1-N
5c    06 1 4 6:N-N
5c    06 2 3 6:N-N

 

  1. There are two rows = two ranges for each test case.
  2. OBJ_ID indicates the relation between the ranges: “overlaps, contains”. The 0 in position 6 means there are no nulls.
  3. F is the “from” value.
  4. T is the “to” value (exclusive).
  5. So that we can check our results more easily, each row contains the relationship between the ATTR_ID values. There are six such relationships, identified by the last digit of OBJ_ID.

Non-null ranges

  • First I UNPIVOT F and T into one column called FT that contains all the boundary points, and I create a column called SEG_START that is -1 when the source is F and 1 when the source is T. I also retain the original value of T in every new row.
  • Then I use MATCH_RECOGNIZE to identify, starting from each row where SEG_START = -1, all the boundary points up to and including the original T. I output each segment with the original ATTR_ID value.
  • Then I use GROUP BY to reduce duplicate segments to one, retaining the minimum ATTR_ID.
  • Finally, I use MATCH_RECOGNIZE again to merge the contiguous segments with the same ATTR_ID value.
with unpivoted as (
  select * from t
  unpivot ( (ft, t) for seg_start in ((f,t) as -1, (t,t) as 1))
)
, segmented as (
  select obj_id, ab_attr, seg_f, seg_t, range_attr_id
  from unpivoted
  match_recognize(
    partition by obj_id order by ft, seg_start, t
    measures prev(ft) seg_f, ft seg_t, first(attr_id) range_attr_id
    all rows per match
    after match skip to next row
    pattern({-a-} b+)
    define a as seg_start = -1,
      b as ft <= first(t)
  )
  where seg_f < seg_t
)
, grouped as (
  select OBJ_ID, AB_ATTR, SEG_F, SEG_T, min(RANGE_ATTR_ID) attr_id
  from segmented
  group by OBJ_ID, AB_ATTR, SEG_F, SEG_T
)
select * from grouped
match_recognize(
  partition by obj_id, ab_attr, attr_id order by seg_f, seg_t
  measures first(seg_f) f, last(seg_t) t
  pattern(a b*)
  define b as seg_f = prev(seg_t)
)
order by obj_id, f, t;
OBJ_ID AB_ATTR ATTR_ID F T
3o    01 1:1=1 1 1 4
3o    02 2:1<2 1 1 3
3o    02 2:1<2 2 3 4
3o    03 3:2>1 2 1 2
3o    03 3:2>1 1 2 4
3o    04 4:N-1 1 2
3o    04 4:N-1 1 2 4
3o    05 5:1-N 1 1 3
3o    05 5:1-N 3 4
3o    06 6:N-N 1 4
5c    01 1:1=1 1 1 4
5c    02 2:1<2 1 1 4
5c    03 3:2>1 2 1 2
5c    03 3:2>1 1 2 3
5c    03 3:2>1 2 3 4
5c    04 4:N-1 1 2
5c    04 4:N-1 1 2 3
5c    04 4:N-1 3 4
5c    05 5:1-N 1 1 4
5c    06 6:N-N 1 4
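To make the four steps above easier to follow outside of SQL, here is a hedged Python sketch of the same pipeline for non-null ranges. The function name and helpers are mine; this illustrates the logic, it is not the production solution.

```python
# Sketch of: segment, assign attributes, keep the minimum, merge.
# Ranges are closed-open (f, t, attr_id) tuples with non-null f and t;
# attr_id may be None, mimicking a NULL ATTR_ID.

def merge_with_priority(ranges):
    # 1. Segment: each pair of adjacent boundary points is a candidate.
    points = sorted({p for f, t, _ in ranges for p in (f, t)})
    segments = list(zip(points, points[1:]))
    # 2-3. Keep segments covered by at least one range; assign the
    # minimum non-null attribute of the covering ranges (like SQL MIN).
    kept = []
    for f, t in segments:
        attrs = [a for rf, rt, a in ranges if rf <= f and t <= rt]
        if attrs:
            non_null = [a for a in attrs if a is not None]
            kept.append((f, t, min(non_null) if non_null else None))
    # 4. Merge contiguous segments carrying the same attribute.
    merged = []
    for f, t, a in kept:
        if merged and merged[-1][1] == f and merged[-1][2] == a:
            merged[-1] = (merged[-1][0], t, a)
        else:
            merged.append((f, t, a))
    return merged

# Test case '3o    02': attribute 1 has priority over 2 in the overlap.
print(merge_with_priority([(1, 3, 1), (2, 4, 2)]))
# [(1, 3, 1), (3, 4, 2)]
```

Running it on test case '5c    03' gives three segments, matching the output above: the containing range's attribute 2 loses to 1 only where the ranges overlap.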

 

Segments with “To” NULL

Insert into T values ('3o    11','1','3','1','1:1=1');
Insert into T values ('3o    11','2',null,'1','1:1=1');
Insert into T values ('3o    12','1','3','1','2:1<2');
Insert into T values ('3o    12','2',null,'2','2:1<2');
Insert into T values ('3o    13','1','3','2','3:2>1');
Insert into T values ('3o    13','2',null,'1','3:2>1');
Insert into T values ('3o    14','1','3',null,'4:N-1');
Insert into T values ('3o    14','2',null,'1','4:N-1');
Insert into T values ('3o    15','1','3','1','5:1-N');
Insert into T values ('3o    15','2',null,null,'5:1-N');
Insert into T values ('3o    16','1','3',null,'6:N-N');
Insert into T values ('3o    16','2',null,null,'6:N-N');
Insert into T values ('5c    11','1',null,'1','1:1=1');
Insert into T values ('5c    11','2','3','1','1:1=1');
Insert into T values ('5c    12','1',null,'1','2:1<2');
Insert into T values ('5c    12','2','3','2','2:1<2');
Insert into T values ('5c    13','1',null,'2','3:2>1');
Insert into T values ('5c    13','2','3','1','3:2>1');
Insert into T values ('5c    14','1',null,null,'4:N-1');
Insert into T values ('5c    14','2','3','1','4:N-1');
Insert into T values ('5c    15','1',null,'1','5:1-N');
Insert into T values ('5c    15','2','3',null,'5:1-N');
Insert into T values ('5c    16','1',null,null,'6:N-N');
Insert into T values ('5c    16','2','3',null,'6:N-N');
  • In the UNPIVOT clause, add the INCLUDE NULLS option to include T when it is NULL.
  • In the DEFINE condition for B and in the WHERE clause, include segments that end with NULL.
with unpivoted as (
  select * from t
  unpivot INCLUDE NULLS ( (ft, t) for seg_start in ((f,t) as -1, (t,t) as 1))
)
, segmented as (
  select obj_id, ab_attr, seg_f, seg_t, range_attr_id
  from unpivoted
  match_recognize(
    partition by obj_id order by ft, seg_start, t
    measures prev(ft) seg_f, ft seg_t, first(attr_id) range_attr_id
    all rows per match
    after match skip to next row
    pattern({-a-} b+)
    define a as seg_start = -1,
      b as ft <= first(t) OR FIRST(T) IS NULL
  )
  where seg_f < seg_t OR SEG_T IS NULL
)
, grouped as (
  select OBJ_ID, AB_ATTR, SEG_F, SEG_T, min(RANGE_ATTR_ID) attr_id
  from segmented
  group by OBJ_ID, AB_ATTR, SEG_F, SEG_T
)
select * from grouped
match_recognize(
  partition by obj_id, ab_attr, attr_id order by seg_f, seg_t
  measures first(seg_f) f, last(seg_t) t
  pattern(a b*)
  define b as seg_f = prev(seg_t)
)
order by obj_id, f, t;
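The changed conditions boil down to treating a NULL "to" as an end point beyond every boundary point. A tiny Python sketch of that comparison, with None standing in for NULL (the helper name is mine):

```python
# Hedged sketch of the idea behind "ft <= first(t) OR first(t) IS NULL":
# a NULL "to" behaves like plus infinity, so every boundary point falls
# inside a range that never ends.

def le_to(ft, t):
    """ft <= t, where t is None means the range has no upper bound."""
    return t is None or ft <= t

print(le_to(3, 4))      # True
print(le_to(5, 4))      # False
print(le_to(99, None))  # True: NULL "to" is treated as unbounded
```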

Just so your mouse won’t get too tired from scrolling, I will spare you the output. There are 20 additional rows for the new test cases.

With “From” and “To” NULLs

Insert into T values ('3o    21',null,'3','1','1:1=1');
Insert into T values ('3o    21','2',null,'1','1:1=1');
Insert into T values ('3o    22',null,'3','1','2:1<2');
Insert into T values ('3o    22','2',null,'2','2:1<2');
Insert into T values ('3o    23',null,'3','2','3:2>1');
Insert into T values ('3o    23','2',null,'1','3:2>1');
Insert into T values ('3o    24',null,'3',null,'4:N-1');
Insert into T values ('3o    24','2',null,'1','4:N-1');
Insert into T values ('3o    25',null,'3','1','5:1-N');
Insert into T values ('3o    25','2',null,null,'5:1-N');
Insert into T values ('3o    26',null,'3',null,'6:N-N');
Insert into T values ('3o    26','2',null,null,'6:N-N');
Insert into T values ('5c    21',null,null,'1','1:1=1');
Insert into T values ('5c    21','2','3','1','1:1=1');
Insert into T values ('5c    22',null,null,'1','2:1<2');
Insert into T values ('5c    22','2','3','2','2:1<2');
Insert into T values ('5c    23',null,null,'2','3:2>1');
Insert into T values ('5c    23','2','3','1','3:2>1');
Insert into T values ('5c    24',null,null,null,'4:N-1');
Insert into T values ('5c    24','2','3','1','4:N-1');
Insert into T values ('5c    25',null,null,'1','5:1-N');
Insert into T values ('5c    25','2','3',null,'5:1-N');
Insert into T values ('5c    26',null,null,null,'6:N-N');
Insert into T values ('5c    26','2','3',null,'6:N-N');
  • In the unpivoted subquery, as explained in my previous post, I create a new column called NULL_ORDER so that NULL “from” values come first.
  • In segmented_and_grouped, I combine segmenting and grouping in one SELECT just to show that it can be done: the MATCH_RECOGNIZE clause is executed before the GROUP BY. The NULL_ORDER column is used in ordering and comparisons.
  • In the final MATCH_RECOGNIZE, I order by SEG_F NULLS FIRST and it all works!
with unpivoted as (
  select u.*,
    case when seg_start = -1 and ft is null then -1
         when seg_start = 1  and ft is null then  1
         else 0
    end null_order
  from t
  unpivot include nulls ( (ft, t) for seg_start in ((f,t) as -1, (t,t) as 1)) u
)
, segmented_and_grouped as (
  select obj_id, ab_attr, seg_f, seg_t, min(range_attr_id) attr_id
  from unpivoted
  match_recognize(
    partition by obj_id order by null_order, ft, seg_start, t
    measures prev(ft) seg_f, ft seg_t, first(attr_id) range_attr_id, 
      prev(null_order) null_order_f, null_order null_order_t
    all rows per match
    after match skip to next row
    pattern({-a-} b+)
    define a as seg_start = -1,
      b as null_order = -1 or ft <= first(t) or first(t) is null
  )
  where null_order_f < null_order_t or seg_f < seg_t
  group by obj_id, ab_attr, seg_f, seg_t
)
select * from segmented_and_grouped
match_recognize(
  partition by obj_id, ab_attr, attr_id order by seg_f nulls first
  measures first(seg_f) f, last(seg_t) t
  pattern(a+)
  define a as count(*) = 1 or seg_f = prev(seg_t)
)
order by obj_id, f nulls first, t;

Next up: Swiss Army knife.

Ranges with NULLs 05: Segments

Take two overlapping ranges: “1 to 3” and “2 to 4”. We can divide them into three smaller ranges: “1 to 2”, “2 to 3” and “3 to 4”. I’ll call these smaller ranges segments. We sometimes need these segments to solve advanced range problems.

Current list of posts: 

Requirement

We know that ranges “1 to 3” and “2 to 4” overlap, and the overlapping part is “2 to 3”. I call that part a segment; a segment has the following properties:

  • Its starting point is a boundary point (start or end) of an original range;
  • Its ending point is the next boundary point, either of that range or some overlapping range;
  • It is entirely covered by at least one original range (so there is no segment corresponding to a “gap”);

The requirement here is to find all the segments belonging to a group of ranges. In this example we should find “1 to 2”, “2 to 3” and “3 to 4”. As a bonus, I’ll determine for each segment how many original ranges it belongs to: here “2 to 3” belongs to 2 ranges and the others belong to only one.

Reminders (skip if you are reading the whole series)

  • In my world, ranges have exclusive end points: this is called the “closed-open” type. It allows ranges that “meet” without gaps or overlaps.
  • I classify ranges in three categories:
    1. Neither “To” nor “From” may be NULL
    2. “To” may be NULL, but not “From”
    3. Both “From” and “To” may be NULL.
  • My data consists of an object and a “from” and “to” range.
  • I propose solutions that require database version 12.1 or later.

Test data with no NULLs allowed

This isn’t a full test harness, just the minimum to illustrate the solution. This is the same data as in my previous post Ranges with NULLs 03: Gaps

create table t(
  obj_id varchar2(9), 
  f int, t int, af int, at int, bf int, bt int
);

Insert into T values ('1p    0',1,2,1,2,3,4);
Insert into T values ('1p    0',3,4,1,2,3,4);
Insert into T values ('2m    0',1,2,1,2,2,3);
Insert into T values ('2m    0',2,3,1,2,2,3);
Insert into T values ('3o    0',1,3,1,3,2,4);
Insert into T values ('3o    0',2,4,1,3,2,4);
Insert into T values ('4f    0',1,3,1,3,2,3);
Insert into T values ('4f    0',2,3,1,3,2,3);
Insert into T values ('5c    0',1,4,1,4,2,3);
Insert into T values ('5c    0',2,3,1,4,2,3);
Insert into T values ('6s    0',1,2,1,2,1,3);
Insert into T values ('6s    0',1,3,1,2,1,3);
Insert into T values ('7e    0',1,2,1,2,1,2);
Insert into T values ('7e    0',1,2,1,2,1,2);
  1. There are two rows = two ranges for each test case.
  2. OBJ_ID indicates the relation between the ranges: “precedes, meets, overlaps, finished by, contains, starts, equals”. The 0 at the end means there are no nulls.
  3. F is the “from” value.
  4. T is the “to” value (exclusive).
  5. So that we can check our results more easily, each row contains the values for both ranges:
    1. AF and AT for the first range,
    2. BF and BT for the second range.

Segments in non-null ranges

  • First I UNPIVOT F and T into one column (called F for convenience) that contains all the boundary points, and I create a column called SEG_START that is 1 when the source is F and -1 when the source is T.
  • Then I use the analytic LEAD function to pair each boundary point with the next boundary point, while also doing a running sum of SEG_START.
  • When the running sum is 0, I have a “gap”, so I filter out that row. I also filter out rows where F and T are equal.
with unpivoted as (
  select * from t
  unpivot(f for seg_start in(f as 1, t as -1))
)
select OBJ_ID, F, T, NUM_SEGS, AF, AT, BF, BT
from (
  select u.*,
  lead(f) over(partition by obj_id order by f) t,
  sum(seg_start) over(partition by obj_id order by f) num_segs
  from unpivoted u
)
where num_segs > 0 and f < t;
OBJ_ID F T NUM_SEGS AF AT BF BT
1p 0 1 2 1 1 2 3 4
1p 0 3 4 1 1 2 3 4
2m 0 1 2 1 1 2 2 3
2m 0 2 3 1 1 2 2 3
3o 0 1 2 1 1 3 2 4
3o 0 2 3 2 1 3 2 4
3o 0 3 4 1 1 3 2 4
4f 0 1 2 1 1 3 2 3
4f 0 2 3 2 1 3 2 3
5c 0 1 2 1 1 4 2 3
5c 0 2 3 2 1 4 2 3
5c 0 3 4 1 1 4 2 3
6s 0 1 2 2 1 2 1 3
6s 0 2 3 1 1 2 1 3
7e 0 1 2 2 1 2 1 2
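The three steps above amount to a sweep over the boundary points. Here is a hedged Python sketch of the same logic for one OBJ_ID (the function name is mine; it illustrates the SQL, it is not a proof of equivalence):

```python
# Sketch of: unpivot each range into (boundary, seg_start) events,
# pair each boundary with the next one (LEAD), and keep a running
# sum of seg_start (NUM_SEGS). Ranges are non-null and closed-open.

def segments(ranges):
    # One event per F (+1) and per T (-1); ties sort ends before starts.
    events = sorted([(f, 1) for f, t in ranges] +
                    [(t, -1) for f, t in ranges])
    out, running = [], 0
    for (f, s), (nxt, _) in zip(events, events[1:]):
        running += s
        if running > 0 and f < nxt:  # drop gaps and empty segments
            out.append((f, nxt, running))
    return out

# Test case '3o    0': ranges 1-3 and 2-4 overlap on 2-3.
print(segments([(1, 3), (2, 4)]))
# [(1, 2, 1), (2, 3, 2), (3, 4, 1)]
```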

 

Segments with “To” NULL

Insert into T values ('1p    1',1,2,1,2,3,null);
Insert into T values ('1p    1',3,null,1,2,3,null);
Insert into T values ('2m    1',1,2,1,2,2,null);
Insert into T values ('2m    1',2,null,1,2,2,null);
Insert into T values ('3o    1',1,3,1,3,2,null);
Insert into T values ('3o    1',2,null,1,3,2,null);
Insert into T values ('4f    1',1,null,1,null,2,null);
Insert into T values ('4f    1',2,null,1,null,2,null);
Insert into T values ('5c    1',1,null,1,null,2,3);
Insert into T values ('5c    1',2,3,1,null,2,3);
Insert into T values ('6s    1',1,2,1,2,1,null);
Insert into T values ('6s    1',1,null,1,2,1,null);
Insert into T values ('7e    1',1,null,1,null,1,null);
Insert into T values ('7e    1',1,null,1,null,1,null);
commit;
  • In the UNPIVOT clause, add the INCLUDE NULLS option to include T when it is NULL.
  • In the WHERE clause, include segments that end with NULL.
with unpivoted as (
  select * from t
  unpivot include nulls (f for seg_start in(f as 1, t as -1))
)
select OBJ_ID, F, T, NUM_SEGS, AF, AT, BF, BT
from (
  select u.*,
  lead(f) over(partition by obj_id order by f) t,
  sum(seg_start) over(partition by obj_id order by f) num_segs
  from unpivoted u
)
where num_segs > 0 and (f < t or t is null);
OBJ_ID F T NUM_SEGS AF AT BF BT
1p 0 1 2 1 1 2 3 4
1p 0 3 4 1 1 2 3 4
1p 1 1 2 1 1 2 3
1p 1 3 1 1 2 3
2m 0 1 2 1 1 2 2 3
2m 0 2 3 1 1 2 2 3
2m 1 1 2 1 1 2 2
2m 1 2 1 1 2 2
3o 0 1 2 1 1 3 2 4
3o 0 2 3 2 1 3 2 4
3o 0 3 4 1 1 3 2 4
3o 1 1 2 1 1 3 2
3o 1 2 3 2 1 3 2
3o 1 3 1 1 3 2
4f 0 1 2 1 1 3 2 3
4f 0 2 3 2 1 3 2 3
4f 1 1 2 1 1 2
4f 1 2 2 1 2
5c 0 1 2 1 1 4 2 3
5c 0 2 3 2 1 4 2 3
5c 0 3 4 1 1 4 2 3
5c 1 1 2 1 1 2 3
5c 1 2 3 2 1 2 3
5c 1 3 1 1 2 3
6s 0 1 2 2 1 2 1 3
6s 0 2 3 1 1 2 1 3
6s 1 1 2 2 1 2 1
6s 1 2 1 1 2 1
7e 0 1 2 2 1 2 1 2
7e 1 1 2 1 1

 

With “From” and “To” NULLs

Insert into T values ('1p    2',3,null,null,2,3,null);
Insert into T values ('1p    2',null,2,null,2,3,null);
Insert into T values ('2m    2',2,null,null,2,2,null);
Insert into T values ('2m    2',null,2,null,2,2,null);
Insert into T values ('3o    2',2,null,null,3,2,null);
Insert into T values ('3o    2',null,3,null,3,2,null);
Insert into T values ('4f    2',2,null,null,null,2,null);
Insert into T values ('4f    2',null,null,null,null,2,null);
Insert into T values ('5c    2',2,3,null,null,2,3);
Insert into T values ('5c    2',null,null,null,null,2,3);
Insert into T values ('6s    2',null,2,null,2,null,null);
Insert into T values ('6s    2',null,null,null,2,null,null);
Insert into T values ('7e    2',null,null,null,null,null,null);
Insert into T values ('7e    2',null,null,null,null,null,null);
commit;

This is why I don’t like to allow NULL in “from” columns. After we UNPIVOT, we have to sort the NULL “from” values first and the NULL “to” values last! As a workaround, I create a column called NULL_ORDER that is -1 when a NULL boundary point comes from F, 1 when it comes from T, and 0 otherwise. I then test or order by this column when necessary.
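The NULL_ORDER trick can be sketched as a sort key. In this hedged Python illustration (None stands in for NULL; the helper name is mine), a NULL that comes from F sorts before everything and a NULL that comes from T sorts after everything:

```python
# Boundary points are (value, seg_start) pairs after the unpivot:
# seg_start is 1 when the source is F and -1 when the source is T.

def null_order(value, seg_start):
    """-1: NULL from F (sorts first); 1: NULL from T (sorts last)."""
    if value is None:
        return -1 if seg_start == 1 else 1
    return 0

# Boundary points of ranges (NULL, 3) and (2, NULL),
# sorted by (null_order, value).
points = [(3, -1), (None, 1), (None, -1), (2, 1)]
ordered = sorted(points, key=lambda p: (null_order(p[0], p[1]), p[0] or 0))
print(ordered)
# [(None, 1), (2, 1), (3, -1), (None, -1)]
```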

with unpivoted as (
  select u.*,
    case when seg_start =  1 and f is null then -1
         when seg_start = -1 and f is null then  1
         else 0
    end null_order
  from t
  unpivot include nulls (f for seg_start in(f as 1, t as -1)) u
)
select OBJ_ID, F, T, NUM_SEGS, AF, AT, BF, BT
from (
  select u.*,
  lead(null_order) over(partition by obj_id order by null_order, f) t_null_order,
  lead(f) over(partition by obj_id order by null_order, f) t,
  sum(seg_start) over(partition by obj_id order by null_order, f) num_segs
  from unpivoted u
)
where num_segs > 0 and (null_order < t_null_order or f < t);
OBJ_ID F T NUM_SEGS AF AT BF BT
1p 0 1 2 1 1 2 3 4
1p 0 3 4 1 1 2 3 4
1p 1 1 2 1 1 2 3  
1p 1 3   1 1 2 3  
1p 2   2 1   2 3  
1p 2 3   1   2 3  
2m 0 1 2 1 1 2 2 3
2m 0 2 3 1 1 2 2 3
2m 1 1 2 1 1 2 2  
2m 1 2   1 1 2 2  
2m 2   2 1   2 2  
2m 2 2   1   2 2  
3o 0 1 2 1 1 3 2 4
3o 0 2 3 2 1 3 2 4
3o 0 3 4 1 1 3 2 4
3o 1 1 2 1 1 3 2  
3o 1 2 3 2 1 3 2  
3o 1 3   1 1 3 2  
3o 2   2 1   3 2  
3o 2 2 3 2   3 2  
3o 2 3   1   3 2  
4f 0 1 2 1 1 3 2 3
4f 0 2 3 2 1 3 2 3
4f 1 1 2 1 1   2  
4f 1 2   2 1   2  
4f 2   2 1     2  
4f 2 2   2     2  
5c 0 1 2 1 1 4 2 3
5c 0 2 3 2 1 4 2 3
5c 0 3 4 1 1 4 2 3
5c 1 1 2 1 1   2 3
5c 1 2 3 2 1   2 3
5c 1 3   1 1   2 3
5c 2   2 1     2 3
5c 2 2 3 2     2 3
5c 2 3   1     2 3
6s 0 1 2 2 1 2 1 3
6s 0 2 3 1 1 2 1 3
6s 1 1 2 2 1 2 1  
6s 1 2   1 1 2 1  
6s 2   2 2   2    
6s 2 2   1   2    
7e 0 1 2 2 1 2 1 2
7e 1 1   2 1   1  
7e 2     2        

 
Next up: “joining” each segment to the original ranges.

#DOAG2018: my presentations

I had a great time at DOAG 2018, both during the conference and in the evenings. Thanks to everyone who participated in my two sessions: it was a pleasure to share a bit of knowledge with you!

Here are links to my two presentations on Slideshare. Please download the files so you can see the animations.

Best regards,
Stew

https://www.slideshare.net/stewashton/json-in-18c-json-as-it-was-meant-to-be

https://www.slideshare.net/stewashton/make-your-data-dance-pivot-and-group-by-in-oracle-sql