LISTAGG() as a Cumulative Function

LISTAGG() can be used as an analytic function, but cannot be cumulative. What does that mean, and is there a workaround?

I’ll demonstrate with the boring old DEPT and EMP tables, just because they have so few rows, and I’ll keep only the department and employee names:

create or replace view denames as
select dname, ename from dept join emp using(deptno);

select * from denames
order by dname, ename;
DNAME ENAME
ACCOUNTING CLARK
ACCOUNTING KING
ACCOUNTING MILLER
RESEARCH ADAMS
RESEARCH FORD
RESEARCH JONES
RESEARCH SCOTT
RESEARCH SMITH
SALES ALLEN
SALES BLAKE
SALES JAMES
SALES MARTIN
SALES TURNER
SALES WARD

 

Reminder: aggregate versus analytic

“Aggregate functions return a single result row based on groups of rows, rather than on single rows.” (19c doc)

select dname,
  listagg(ename, ',') within group(order by ename) enames
from denames
group by dname
order by dname;
DNAME ENAMES
ACCOUNTING CLARK,KING,MILLER
RESEARCH ADAMS,FORD,JONES,SCOTT,SMITH
SALES ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

By contrast, “analytic functions…differ from aggregate functions in that they return multiple rows for each group…All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed.” (19c). In other words, the output rows are already determined and any analytic function simply adds a column to each existing row. The “analytic clause”, starting with the keyword OVER(), says what rows provide data for the function.

select de.*,
  listagg(ename, ',') within group(order by ename)
    over(partition by dname)
  enames
from denames de;
DNAME ENAME ENAMES
ACCOUNTING CLARK CLARK,KING,MILLER
ACCOUNTING KING CLARK,KING,MILLER
ACCOUNTING MILLER CLARK,KING,MILLER
RESEARCH ADAMS ADAMS,FORD,JONES,SCOTT,SMITH
RESEARCH FORD ADAMS,FORD,JONES,SCOTT,SMITH
RESEARCH JONES ADAMS,FORD,JONES,SCOTT,SMITH
RESEARCH SCOTT ADAMS,FORD,JONES,SCOTT,SMITH
RESEARCH SMITH ADAMS,FORD,JONES,SCOTT,SMITH
SALES ALLEN ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES BLAKE ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES JAMES ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES MARTIN ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES TURNER ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
SALES WARD ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

LISTAGG() Limitations

The “analytic clause” has three parts:

  1. PARTITION BY: like GROUP BY, separates the data into chunks based on one or more values;
  2. ORDER BY: within the “partition”, orders the rows and sets up the next part;
  3. Windowing clause: says what rows within the ordered partition are concerned by the function.

Some analytic functions allow only the first or second parts, and LISTAGG() is one of them. Let’s try to use it as a cumulative function:

select de.*,
  listagg(ename, ',') within group(order by ename)
    over(partition by dname order by ename)
  enames
from denames de;

SQL Error: ORA-30487: ORDER BY not allowed here
30487. 00000 -  "ORDER BY not allowed here"
*Cause:    DISTINCT functions and RATIO_TO_REPORT cannot have an ORDER BY

The error message is a bit strange, but the conclusion is: it doesn’t work.

The workaround

The idea is simple: take what LISTAGG() gives us and do a SUBSTR() based on the row number:

select dname, ename,
  substr(enames, 1, instr(enames, ',', 1, rn) - 1) enames
from (
  select de.*,
    listagg(ename, ',') within group(order by ename)
      over(partition by dname)
    || ','
    enames,
    row_number() over(partition by dname order by ename) rn
  from denames de
);
DNAME ENAME ENAMES
ACCOUNTING CLARK CLARK
ACCOUNTING KING CLARK,KING
ACCOUNTING MILLER CLARK,KING,MILLER
RESEARCH ADAMS ADAMS
RESEARCH FORD ADAMS,FORD
RESEARCH JONES ADAMS,FORD,JONES
RESEARCH SCOTT ADAMS,FORD,JONES,SCOTT
RESEARCH SMITH ADAMS,FORD,JONES,SCOTT,SMITH
SALES ALLEN ALLEN
SALES BLAKE ALLEN,BLAKE
SALES JAMES ALLEN,BLAKE,JAMES
SALES MARTIN ALLEN,BLAKE,JAMES,MARTIN
SALES TURNER ALLEN,BLAKE,JAMES,MARTIN,TURNER
SALES WARD ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

Caveats

This solution seems fine, as long as the source values are NOT NULL. If there is a NULL value, the “cumulative value” becomes NULL. Look at the last row of the output.

update emp set ename = null where ename = 'ALLEN';

(repeat the query)
DNAME ENAME ENAMES
ACCOUNTING CLARK CLARK
ACCOUNTING KING CLARK,KING
ACCOUNTING MILLER CLARK,KING,MILLER
RESEARCH ADAMS ADAMS
RESEARCH FORD ADAMS,FORD
RESEARCH JONES ADAMS,FORD,JONES
RESEARCH SCOTT ADAMS,FORD,JONES,SCOTT
RESEARCH SMITH ADAMS,FORD,JONES,SCOTT,SMITH
SALES BLAKE BLAKE
SALES JAMES BLAKE,JAMES
SALES MARTIN BLAKE,JAMES,MARTIN
SALES TURNER BLAKE,JAMES,MARTIN,TURNER
SALES WARD BLAKE,JAMES,MARTIN,TURNER,WARD
SALES

 

Also, with version 19c LISTAGG() allows the DISTINCT keyword. If you want to use DISTINCT, then instead of ROW_NUMBER() use DENSE_RANK()!

update emp set ename = 'MARTIN' where ename is null;

select dname, ename,
  substr(enames, 1, instr(enames, ',', 1, rn) - 1) enames
from (
  select de.*,
    listagg(DISTINCT ename, ',') within group(order by ename)
      over(partition by dname)
    || ','
    enames,
    DENSE_RANK() over(partition by dname order by ename) rn
  from denames de
);
DNAME ENAME ENAMES
ACCOUNTING CLARK CLARK
ACCOUNTING KING CLARK,KING
ACCOUNTING MILLER CLARK,KING,MILLER
RESEARCH ADAMS ADAMS
RESEARCH FORD ADAMS,FORD
RESEARCH JONES ADAMS,FORD,JONES
RESEARCH SCOTT ADAMS,FORD,JONES,SCOTT
RESEARCH SMITH ADAMS,FORD,JONES,SCOTT,SMITH
SALES BLAKE BLAKE
SALES JAMES BLAKE,JAMES
SALES MARTIN BLAKE,JAMES,MARTIN
SALES MARTIN BLAKE,JAMES,MARTIN
SALES TURNER BLAKE,JAMES,MARTIN,TURNER
SALES WARD BLAKE,JAMES,MARTIN,TURNER,WARD

 
With these precautions, why not use LISTAGG() as a cumulative function if you want? Hope this helps…

2 thoughts on “LISTAGG() as a Cumulative Function

  1. Excellent as usual, and very inventive, but can I speculate that you have combined 2 test runs here (in the final scenario), because having set ALLEN to null, here we set null to MARTIN, and in the results ALLEN and MARTIN are both present, but BLAKE is not

  2. Hi Andy,

    Thanks for pointing that out! I have corrected the output: MARTIN appears twice, BLAKE once and ALLEN is gone.

    Best regards, Stew

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s