PIVOT Function with Totals

I have updated the ADVANCED_PIVOT function to support row and/or column totals. It’s neat, but you need to know what you’re doing.

No totals

Let’s start from an example that has no totals. If we didn’t want to pivot, we could do a simple GROUP BY:

SQL> select dname, job, sum(sal)
  2  from emp join dept using(deptno)
  3  group by dname, job
  4  order by 1,2;

DNAME          JOB         SUM(SAL)
-------------- --------- ----------
ACCOUNTING     CLERK           1300
ACCOUNTING     MANAGER         2450
ACCOUNTING     PRESIDENT       5000
RESEARCH       ANALYST         3000
RESEARCH       CLERK           1900
RESEARCH       MANAGER         2975
SALES          CLERK            950
SALES          MANAGER         2850
SALES          SALESMAN        5600

If we do want to pivot, the PIVOT clause will do the aggregation for us:

SQL> var rc refcursor

SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select dname, job, sal
  5  from emp join dept using(deptno)
  6  )@',
  7  p_aggfuncs => 'sum(sal)',
  8  p_pivotfor => 'job'
  9  );
 10  end;
 11  /

print :rc

DNAME             ANALYST      CLERK    MANAGER  PRESIDENT   SALESMAN
-------------- ---------- ---------- ---------- ---------- ----------
ACCOUNTING                      1300       2450       5000           
RESEARCH             3000       1900       2975                      
SALES                            950       2850                  5600

Horizontal total

Now suppose we want an extra column that shows the total for each row. Is that a “row total” or a “column total”? To avoid confusion, I’m going to say “horizontal total” when we get an extra column, and “vertical total” when we get an extra row.

The first thing I realized is that, if we want any totals, we have to compute them beforehand: the PIVOT clause will only be used to create the pivot columns and move the data to the right places. Here is a query to do that:

SQL> select dname, 
  2    job,
  3    sum(sal) sal
  4  from emp join dept using(deptno)
  5  group by dname, rollup(job);

DNAME          JOB              SAL
-------------- --------- ----------
SALES          CLERK            950
SALES          MANAGER         2850
SALES          SALESMAN        5600
SALES                          9400
RESEARCH       CLERK           1900
RESEARCH       ANALYST         3000
RESEARCH       MANAGER         2975
RESEARCH                       7875
ACCOUNTING     CLERK           1300
ACCOUNTING     MANAGER         2450
ACCOUNTING     PRESIDENT       5000
ACCOUNTING                     8750

Notice the use of ROLLUP(JOB) to get totals for all the jobs, but always at the DNAME level.

Notice also that the total rows have a JOB value of NULL. PIVOT will not allow us to pivot data based on a NULL column value. So we use the GROUPING() function to find out when there is a total of all jobs, and replace the NULL with a literal.

SQL> select dname, 
  2    decode(grouping(job),1,'(Total)',job) job,
  3    sum(sal) sal
  4  from emp join dept using(deptno)
  5  group by dname, rollup(job);

DNAME          JOB              SAL
-------------- --------- ----------
SALES          CLERK            950
SALES          MANAGER         2850
SALES          SALESMAN        5600
SALES          (Total)         9400
RESEARCH       CLERK           1900
RESEARCH       ANALYST         3000
RESEARCH       MANAGER         2975
RESEARCH       (Total)         7875
ACCOUNTING     CLERK           1300
ACCOUNTING     MANAGER         2450
ACCOUNTING     PRESIDENT       5000
ACCOUNTING     (Total)         8750

Notice I always put the literal ‘(Total)’ before the value. If the value is not a string, Oracle will implicitly convert it. If we said decode(grouping(<column>),0,<column>,'(Total)') and <column> was a date or a number, we would get a conversion exception.

Now that we have the totals with a valid label, let’s plug that into my function:

SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select dname, 
  5    decode(grouping(job),1,'(Total)',job) job,
  6    sum(sal) sal
  7  from emp join dept using(deptno)
  8  group by dname, rollup(job)
  9  )@',
 10  p_aggfuncs => 'sum(sal)',
 11  p_pivotfor => 'job'
 12  );
 13  end;
 14  /

DNAME             (Total)    ANALYST      CLERK    MANAGER  PRESIDENT   SALESMAN
-------------- ---------- ---------- ---------- ---------- ---------- ----------
ACCOUNTING           8750                  1300       2450       5000           
RESEARCH             7875       3000       1900       2975                      
SALES                9400                   950       2850                  5600

Oops! The Total column is in the wrong place. To fix that, I added an optional parameter p_total_label. Whenever we order by a column, I will make sure that the content of that parameter goes last. Let’s try it:

SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select dname, 
  5    decode(grouping(job),1,'(Total)',job) job,
  6    sum(sal) sal
  7  from emp join dept using(deptno)
  8  group by dname, rollup(job)
  9  )@',
 10  p_aggfuncs => 'sum(sal)',
 11  p_pivotfor => 'job',
 12  p_total_label => '(Total)'
 13  );
 14  end;
 15  /

DNAME             ANALYST      CLERK    MANAGER  PRESIDENT   SALESMAN    (Total)
-------------- ---------- ---------- ---------- ---------- ---------- ----------
ACCOUNTING                      1300       2450       5000                  8750
RESEARCH             3000       1900       2975                             7875
SALES                            950       2850                  5600       9400

Warnings:

  1. Make sure you use the same literal value for p_total_label and within the p_source query! The function will never try to check for coherence, so it’s up to you.
  2. To use something like ‘(Total)’ instead of NULL, dates and numbers must be converted to strings. If that happens, the function can only order by the string value, not the original date or number value!

Vertical total

Let’s try to ROLLUP by DNAME rather than JOB, in order to get an extra row with the totals for each job.

SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select decode(grouping(dname),1,'(Total)',dname) dname, 
  5    job,
  6    sum(sal) sal
  7  from emp join dept using(deptno)
  8  group by rollup(dname), job
  9  )@',
 10  p_aggfuncs => 'sum(sal)',
 11  p_pivotfor => 'job',
 12  p_total_label => '(Total)'
 13  );
 14  end;
 15  /

DNAME             ANALYST      CLERK    MANAGER  PRESIDENT   SALESMAN
-------------- ---------- ---------- ---------- ---------- ----------
(Total)              3000       4150       8275       5000       5600
ACCOUNTING                      1300       2450       5000           
RESEARCH             3000       1900       2975                      
SALES                            950       2850                  5600

We still have a problem with sorting the rows. The function has no idea that the DNAME column has a ‘(Total)’ row that needs to go last. To fix that, I added another optional parameter¬†p_ordercol. The function will order the final query by the column listed here, making sure that the total line goes last.

SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select decode(grouping(dname),1,'(Total)',dname) dname, 
  5    job,
  6    sum(sal) sal
  7  from emp join dept using(deptno)
  8  group by rollup(dname), job
  9  )@',
 10  p_aggfuncs => 'sum(sal)',
 11  p_pivotfor => 'job',
 12  p_total_label => '(Total)',
 13  p_ordercol => 'dname'
 14  );
 15  end;
 16  /

DNAME             ANALYST      CLERK    MANAGER  PRESIDENT   SALESMAN
-------------- ---------- ---------- ---------- ---------- ----------
ACCOUNTING                      1300       2450       5000           
RESEARCH             3000       1900       2975                      
SALES                            950       2850                  5600
(Total)              3000       4150       8275       5000       5600

Horizontal and vertical totals

To get both totals, we just need to change the source query. We could combine two ROLLUPs, but in this case the CUBE function is more concise:

SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select decode(grouping(dname),1,'(Total)',dname) dname, 
  5    decode(grouping(job),1,'(Total)',job) job,
  6    sum(sal) sal
  7  from emp join dept using(deptno)
  8  group by cube(dname,job)
  9  )@',
 10  p_aggfuncs => 'sum(sal)',
 11  p_pivotfor => 'job',
 12  p_total_label => '(Total)',
 13  p_ordercol => 'dname'
 14  );
 15  end;
 16  /

DNAME             ANALYST      CLERK    MANAGER  PRESIDENT   SALESMAN    (Total)
-------------- ---------- ---------- ---------- ---------- ---------- ----------
ACCOUNTING                      1300       2450       5000                  8750
RESEARCH             3000       1900       2975                             7875
SALES                            950       2850                  5600       9400
(Total)              3000       4150       8275       5000       5600      26025

Two vertical totals

Now we’re getting fancy! Suppose we want totals by department, but also subtotals within each department by the year someone was hired?

  • row 10: get the hiring year
  • row 13: instead of ROLLUP on DNAME only, we ROLLUP on DNAME and HIREYEAR
  • row 5: HIREYEAR should say ‘(Total)’ only when we are on a subtotal line, not on the grand total.
  • row 18: the p_ordercol parameter can be a list of column names
SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select decode(grouping(dname),1,'(Total)',dname) dname,
  5    decode(grouping_id(dname,hireyear),1,'(Total)',hireyear) hireyear,
  6    job,
  7    sum(sal) sal
  8  from (
  9    select dname, job, sal,
 10      extract(year from hiredate) hireyear
 11    from dept join emp using(deptno)
 12  )
 13  group by rollup(dname, hireyear), job
 14  )@',
 15  p_aggfuncs => 'sum(sal)',
 16  p_pivotfor => 'job',
 17  p_total_label => '(Total)',
 18  p_ordercol => 'dname,hireyear'
 19  );
 20  end;
 21  /

DNAME          HIREYEAR     ANALYST      CLERK    MANAGER  PRESIDENT   SALESMAN
-------------- --------- ---------- ---------- ---------- ---------- ----------
ACCOUNTING     1981                                  2450       5000           
ACCOUNTING     1982                       1300                                 
ACCOUNTING     (Total)                    1300       2450       5000           
RESEARCH       1980                        800                                 
RESEARCH       1981            3000                  2975                      
RESEARCH       1987                       1100                                 
RESEARCH       (Total)         3000       1900       2975                      
SALES          1981                        950       2850                  5600
SALES          (Total)                     950       2850                  5600
(Total)                        3000       4150       8275       5000       5600

Two vertical totals, one horizontal

  • row 13: we ROLLUP on the vertical axis (DNAME, HIREYEAR) and on the horizontal (JOB)
  • row 6: JOB must never have a NULL value, else the PIVOT will fail.
SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select decode(grouping(dname),1,'(Total)',dname) dname,
  5    decode(grouping_id(dname,hireyear),1,'(Total)',hireyear) hireyear,
  6    decode(grouping(job),1,'(Total)',job) job,
  7    sum(sal) sal
  8  from (
  9    select dname, job, sal,
 10      extract(year from hiredate) hireyear
 11    from dept join emp using(deptno)
 12  )
 13  group by rollup(dname, hireyear), rollup(job)
 14  )@',
 15  p_aggfuncs => 'sum(sal)',
 16  p_pivotfor => 'job',
 17  p_total_label => '(Total)',
 18  p_ordercol => 'dname,hireyear'
 19  );
 20  end;
 21  /

DNAME          HIREYEAR     ANALYST      CLERK    MANAGER  PRESIDENT   SALESMAN    (Total)
-------------- --------- ---------- ---------- ---------- ---------- ---------- ----------
ACCOUNTING     1981                                  2450       5000                  7450
ACCOUNTING     1982                       1300                                        1300
ACCOUNTING     (Total)                    1300       2450       5000                  8750
RESEARCH       1980                        800                                         800
RESEARCH       1981            3000                  2975                             5975
RESEARCH       1987                       1100                                        1100
RESEARCH       (Total)         3000       1900       2975                             7875
SALES          1981                        950       2850                  5600       9400
SALES          (Total)                     950       2850                  5600       9400
(Total)                        3000       4150       8275       5000       5600      26025

One vertical, two horizontal

  • row 5: HIREYEAR is now a source for a pivot column, so it must never be NULL.
  • row 16: the p_pivotfor parameter can be a list of column names
SQL> begin
  2  :rc := advanced_pivot(
  3  p_source => q'@(
  4  select decode(grouping(dname),1,'(Total)',dname) dname,
  5    decode(grouping(hireyear),1,'(Total)',hireyear) hireyear,
  6    decode(grouping(job),1,'(Total)',job) job,
  7    sum(sal) sal
  8  from (
  9    select dname, job, sal,
 10      extract(year from hiredate) hireyear
 11    from dept join emp using(deptno)
 12  )
 13  group by rollup(dname), rollup(hireyear, job)
 14  )@',
 15  p_aggfuncs => 'sum(sal)',
 16  p_pivotfor => 'hireyear,job',
 17  p_total_label => '(Total)',
 18  p_ordercol => 'dname'
 19  );
 20  end;
 21  /

DNAME          1980_CLERK 1980_(Total) 1981_ANALYST 1981_CLERK 1981_MANAGER 1981_PRESIDENT 1981_SALESMAN 1981_(Total) 1982_CLERK 1982_(Total) 1987_CLERK 1987_(Total) (Total)_(Total)
-------------- ---------- ------------ ------------ ---------- ------------ -------------- ------------- ------------ ---------- ------------ ---------- ------------ ---------------
ACCOUNTING                                                             2450           5000                       7450       1300         1300                                    8750
RESEARCH              800          800         3000                    2975                                      5975                               1100         1100            7875
SALES                                                      950         2850                         5600         9400                                                            9400
(Total)               800          800         3000        950         8275           5000          5600        22825       1300         1300       1100         1100           26025

Conclusion

Did you really read down to here? If so, congratulations! Here is what I wish you would take away from this post:

  • The ROLLUP and CUBE extensions to GROUP BY are very powerful means to get various levels of subtotals, totals and grand totals.
  • The PIVOT clause can manage vertical and / or horizontal totals very cleanly.
  • The source query should do one ROLLUP on the vertical columns and another on the horizontal columns, except when there is exactly one of each: then just use CUBE.
  • The ADVANCED_PIVOT function generates the appropriate PIVOT clause based on the data, and can handle multiple levels of vertical and / or horizontal totals.
      • The horizontal “p_pivotfor” columns must never be NULL.
      • The vertical “p_ordercol” columns can be NULL or labeled, your choice. If there are dates or numbers, NULL allows for sorting by the original values.
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s