This is my third post about temporal tables that don’t allow overlaps. I’m going to add support for temporal gaps, which will also allow objects to stop existing at some point.
Test Data
I’m going to add a NOT NULL numeric field called START_GAP, which can be 0 or 1. When START_GAP = 0, the row contains data that is valid starting at FROM_DATE. When START_GAP = 1, then FROM_DATE is actually the end date of the previous row.
The data can now contain all the relations between time intervals that I listed previously.
alter session set nls_date_format='dd';
drop table a purge;
create table a (
case_id varchar2(32),
from_date date,
start_gap number(1,0) not null check(start_gap in (0,1)),
a_string varchar2(32),
primary key(case_id, from_date)
);
Insert into A values ('01-precedes',to_date('2016-01-01','yyyy-mm-dd'),0,'A01-precedes');
Insert into A values ('01-precedes',to_date('2016-01-02','yyyy-mm-dd'),1,'A01-precedes');
Insert into A values ('02-meets',to_date('2016-01-01','yyyy-mm-dd'),0,'A02-meets');
Insert into A values ('02-meets',to_date('2016-01-02','yyyy-mm-dd'),1,'A02-meets');
Insert into A values ('03-overlaps',to_date('2016-01-01','yyyy-mm-dd'),0,'A03-overlaps');
Insert into A values ('03-overlaps',to_date('2016-01-03','yyyy-mm-dd'),1,'A03-overlaps');
Insert into A values ('04-finished by',to_date('2016-01-01','yyyy-mm-dd'),0,'A04-finished by');
Insert into A values ('04-finished by',to_date('2016-01-03','yyyy-mm-dd'),1,'A04-finished by');
Insert into A values ('05-contains',to_date('2016-01-01','yyyy-mm-dd'),0,'A05-contains');
Insert into A values ('05-contains',to_date('2016-01-04','yyyy-mm-dd'),1,'A05-contains');
Insert into A values ('06-starts',to_date('2016-01-01','yyyy-mm-dd'),0,'A06-starts');
Insert into A values ('06-starts',to_date('2016-01-02','yyyy-mm-dd'),1,'A06-starts');
Insert into A values ('07-equals',to_date('2016-01-01','yyyy-mm-dd'),0,'A07-equals');
Insert into A values ('07-equals',to_date('2016-01-02','yyyy-mm-dd'),1,'A07-equals');
drop table b purge;
create table b (
case_id varchar2(32),
from_date date,
start_gap number(1,0) not null check(start_gap in (0,1)),
b_string varchar2(32),
primary key(case_id, from_date)
);
Insert into B values ('01-precedes',to_date('2016-01-03','yyyy-mm-dd'),0,'B01-precedes');
Insert into B values ('01-precedes',to_date('2016-01-04','yyyy-mm-dd'),1,'B01-precedes');
Insert into B values ('02-meets',to_date('2016-01-02','yyyy-mm-dd'),0,'B02-meets');
Insert into B values ('02-meets',to_date('2016-01-03','yyyy-mm-dd'),1,'B02-meets');
Insert into B values ('03-overlaps',to_date('2016-01-02','yyyy-mm-dd'),0,'B03-overlaps');
Insert into B values ('03-overlaps',to_date('2016-01-04','yyyy-mm-dd'),1,'B03-overlaps');
Insert into B values ('04-finished by',to_date('2016-01-02','yyyy-mm-dd'),0,'B04-finished by');
Insert into B values ('04-finished by',to_date('2016-01-03','yyyy-mm-dd'),1,'B04-finished by');
Insert into B values ('05-contains',to_date('2016-01-02','yyyy-mm-dd'),0,'B05-contains');
Insert into B values ('05-contains',to_date('2016-01-03','yyyy-mm-dd'),1,'B05-contains');
Insert into B values ('06-starts',to_date('2016-01-01','yyyy-mm-dd'),0,'B06-starts');
Insert into B values ('06-starts',to_date('2016-01-03','yyyy-mm-dd'),1,'B06-starts');
Insert into B values ('07-equals',to_date('2016-01-01','yyyy-mm-dd'),0,'B07-equals');
Insert into B values ('07-equals',to_date('2016-01-02','yyyy-mm-dd'),1,'B07-equals');
commit;
select * from a
union all
select * from b
order by 1,2,4;
CASE_ID |
FROM_DATE |
START_GAP |
A_STRING or B_STRING |
01-precedes |
01 |
0 |
A01-precedes |
01-precedes |
02 |
1 |
A01-precedes |
01-precedes |
03 |
0 |
B01-precedes |
01-precedes |
04 |
1 |
B01-precedes |
02-meets |
01 |
0 |
A02-meets |
02-meets |
02 |
1 |
A02-meets |
02-meets |
02 |
0 |
B02-meets |
02-meets |
03 |
1 |
B02-meets |
03-overlaps |
01 |
0 |
A03-overlaps |
03-overlaps |
02 |
0 |
B03-overlaps |
03-overlaps |
03 |
1 |
A03-overlaps |
03-overlaps |
04 |
1 |
B03-overlaps |
04-finished by |
01 |
0 |
A04-finished by |
04-finished by |
02 |
0 |
B04-finished by |
04-finished by |
03 |
1 |
A04-finished by |
04-finished by |
03 |
1 |
B04-finished by |
05-contains |
01 |
0 |
A05-contains |
05-contains |
02 |
0 |
B05-contains |
05-contains |
03 |
1 |
B05-contains |
05-contains |
04 |
1 |
A05-contains |
06-starts |
01 |
0 |
A06-starts |
06-starts |
01 |
0 |
B06-starts |
06-starts |
02 |
1 |
A06-starts |
06-starts |
03 |
1 |
B06-starts |
07-equals |
01 |
0 |
A07-equals |
07-equals |
01 |
0 |
B07-equals |
07-equals |
02 |
1 |
A07-equals |
07-equals |
02 |
1 |
B07-equals |
What’s the idea?
As before, we need to carry down values and eliminate certain rows. How do we manage that for the new rows where START_GAP = 1?
- For table A, if the most recent START_GAP is 1 then A_STRING should be NULL. Same for B.
- If the most recent START_GAP is 1 for both tables, then that row should be eliminated.
- If the first row is from A, then B.START_GAP will be NULL. In that case B.START_GAP should be made = 1, since the B object did not exist at that point.
- As before, TO_DATE must be NULL or greater than FROM_DATE, otherwise the row is eliminated.
The MATCH_RECOGNIZE solution
Note that I don’t create a ROWTYPE column anymore: since START_GAP is defined as NOT NULL, I can use A.START_GAP to determine whether the current row comes from table A.
select case_id, from_date, to_date, a_str a_string, b_str b_string
from (
select case_id, from_date, start_gap agap, a_string,
null bgap, null b_string
from a
union all
select case_id, from_date, null, null, start_gap bgap, b_string
from b
)
match_recognize(
partition by case_id order by from_date
measures next(from_date) to_date,
nvl(a.agap,1) + nvl(b.bgap,1) abgap,
case a.agap when 0 then a.a_string end a_str,
case b.bgap when 0 then b.b_string end b_str
all rows per match
pattern( (a|b)+ )
define a as agap is not null
)
where (to_date > from_date or to_date is null)
and abgap < 2;
CASE_ID |
FROM_DATE |
TO_DATE |
A_STRING |
B_STRING |
01-precedes |
01 |
02 |
A01-precedes |
|
01-precedes |
03 |
04 |
|
B01-precedes |
02-meets |
01 |
02 |
A02-meets |
|
02-meets |
02 |
03 |
|
B02-meets |
03-overlaps |
01 |
02 |
A03-overlaps |
|
03-overlaps |
02 |
03 |
A03-overlaps |
B03-overlaps |
03-overlaps |
03 |
04 |
|
B03-overlaps |
04-finished by |
01 |
02 |
A04-finished by |
|
04-finished by |
02 |
03 |
A04-finished by |
B04-finished by |
05-contains |
01 |
02 |
A05-contains |
|
05-contains |
02 |
03 |
A05-contains |
B05-contains |
05-contains |
03 |
04 |
A05-contains |
|
06-starts |
01 |
02 |
A06-starts |
B06-starts |
06-starts |
02 |
03 |
|
B06-starts |
07-equals |
01 |
02 |
A07-equals |
B07-equals |
The Analytic solution
This one took me some time to figure out. I broke it down into 3 logical steps using WITH clauses.
- ALL_ROWS just does the UNION ALL, while transforming NULL values to CHR(0)
- ANALYTIC_ROWS applies all the analytic functions
- The main SELECT transforms CHR(0) to NULL, makes a value NULL if the most recent START_GAP = 1, and eliminates excess rows.
with all_rows as (
select case_id, from_date, start_gap agap, nvl(a_string, chr(0)) a_string,
null bgap, null b_string
from a
union all
select case_id, from_date, null, null,
start_gap bgap, nvl(b_string, chr(0)) b_string
from b
)
, analyzed_rows as (
select case_id, from_date,
lead(from_date) over(partition by case_id order by from_date) to_date,
last_value(agap) ignore nulls
over(partition by case_id order by from_date) agap,
last_value(bgap) ignore nulls
over(partition by case_id order by from_date) bgap,
last_value(a_string) ignore nulls
over(partition by case_id order by from_date) a_string,
last_value(b_string) ignore nulls
over(partition by case_id order by from_date) b_string
from all_rows
)
select case_id, from_date, to_date,
case when agap = 0 then nullif(a_string, chr(0)) end a_string,
case when bgap = 0 then nullif(b_string, chr(0)) end b_string
from analyzed_rows
where (to_date > from_date or to_date is null)
and nvl(agap,1) + nvl(bgap,1) < 2;
There’s more?
Yes indeed, I’m not through yet.
At this point I can define temporal tables that allow gaps but not overlaps. The attributes can be NULL but FROM_DATE can’t. The TO_DATE is not stored in the same row, but it can be displayed that way. I can also “join” two such tables in an efficient manner.
I only have one problem: your tables probably aren’t built that way. You have FROM_DATE and TO_DATE in each row, and if you say you don’t have overlapping ranges then you want me to believe you.
In my next post I’ll try to “join” tables that store FROM_DATE and TO_DATE, assuming there are no overlaps.