Optimistic ORA_ROWSCN 2: the SCN

In my previous post, I covered lost updates. Before saying how ORA_ROWSCN can help avoid them, I need to talk about the SCN. It’s hard to be simple and correct about such a central element of the Oracle database, but I’ll try.

Aside from the documentation, I am relying on Oracle Core by Jonathan Lewis, chapter 3, along with this blog post and Jonathan’s answer to my question on OTN.

The Commit SCN

The documentation defines the SCN, or “System Change Number”, as a “database ordering primitive. The value of an SCN is the logical point in time at which changes are made to a database.” It also calls it a “stamp that defines a committed version of a database at a point in time. Oracle assigns every committed transaction a unique SCN.”

In other words:

  • Every change to data in the database is made as part of a transaction.
  • The changes in a transaction become effective when the transaction is committed.
  • Every committed transaction gets a unique SCN.
  • The SCN increments at each commit, which allows it to “order the database” in time.
  • Every “commit SCN” “defines a committed version of the database”, so the SCN is like a version number that applies to the entire database.

The “commit SCN” of a transaction is precisely the SCN that was assigned to that transaction on commit.

[UPDATE 2017-07-24: Jonathan Lewis just pointed me to a blog post by Frits Hoogland where he showed that it is possible for two commits to share the same SCN. To quote Frits: “both commits … have the same SCN. At this point I think it’s fitting to theorise why this is happening. I think this is happening and makes sense because the table is protected for consistency at another level in the database, which is the locking mechanisms (TX locks for rows, TM locks for segments), which guarantees that two transactions at the same time do not compromise data consistency. And because of that, it’s not a problem to batch them with the same SCN.”

Frits seems to have shown an error, or at least a simplification, in the documentation. Either that or “unique” means every transaction has just one “commit SCN”, even if one “commit SCN” can be associated with two simultaneous transactions.

Fortunately for me, this refinement doesn’t invalidate the rest of this post.]

The SCN and Read Consistency

In Oracle, queries are “read consistent”, which means “The data returned by a query is committed and consistent for a single point in time.” This point in time is defined by – you guessed it – an SCN. I’ll call that SCN the “read-consistent SCN”. By default Oracle uses the SCN that was current when the query started, but within read-only or serializable transactions it uses the SCN when the transaction began.
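As a sketch (T1 and T2 stand for any two tables we want to read consistently), a read-only transaction pins the read-consistent SCN for every query inside it:

set transaction read only;

-- Both queries see the database as of the SCN at which the
-- transaction began, even if other sessions commit in between.
select * from T1;
select * from T2;

commit; -- ends the read-only transaction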

Now comes the hard part.

When we submit a query, Oracle gets the data blocks it needs to read the rows of interest. The problem is that each data block may contain uncommitted data, or data that was committed after the “read-consistent SCN”. To determine that, Oracle consults the ITL (Interested Transaction List) in the block header, and if necessary the UNDO that the ITL entries point to.

There are three cases to consider:

  1. All the ITL entries show committed transactions having “commit SCNs” that are less than or equal to the “read-consistent” SCN.
  2. All the ITL entries show committed transactions, but there is at least one “commit SCN” greater than the “read-consistent” SCN.
  3. At least one ITL entry shows a transaction that has not committed.

Case 1. is the most frequent and the simplest: none of the data is more recent than the read-consistent SCN, so the query may use the block as is.

Case 2. is not as simple as you might think. The “commit SCN” in an ITL entry may be an upper bound: it cannot be less than the real “commit SCN”, but it could be greater. In this case Oracle cross-checks with the “transaction table slot” in the UNDO segment header. If it finds that the transaction was actually committed before the read-consistent SCN, it will use the block in the query, and it will lower the “commit SCN” stored in the ITL entry.

Case 3. is similar to case 2. The ITL entry says “transaction not committed”, but that may be because Oracle did not have time to update the ITL entry before the block was flushed from memory. Oracle must cross-check with the “transaction table slot” in the UNDO segment header. If it says “actually, the transaction has committed”, then the ITL entry is updated with an upper bound “commit SCN”. This is known as “delayed block cleanout”.

We can see case 3. and case 2. at work in the following example, using two independent sessions:

  • Session 1
create table T1 (a number, ts timestamp, val number);
create table T2 (a number, ts timestamp, val number);

begin
    insert into T1 values (1, localtimestamp, 0);
    insert into T1 values (2, localtimestamp, 0);
    commit;
end;
/
begin
    insert into T2 values (3, localtimestamp, 0);
    commit;
end;
/
select dbms_flashback.get_system_change_number current_scn from dual;

CURRENT_SCN
-----------
   20031509

update T1 set ts = localtimestamp, val = 1 where a = 2;
alter system flush buffer_cache; -- (1)
commit;

begin
    for i in 1 .. 5000 loop
        update T2 set val = val + 1;
        commit; -- (2)
    end loop;
end;
/
select dbms_flashback.get_system_change_number current_scn from dual;

CURRENT_SCN
-----------
   20036606

(1) Notice that we update T1, then flush the block from memory before the COMMIT. This prevents Oracle from doing the “commit cleanout” in memory that would normally put the real “commit SCN” in the ITL entry.

(2) After the update to T1, we do 5000 transactions to T2 in order to increment the SCN.

  • Session 2
set transaction read only; -- This simulates a long-running query
  • Session 1
begin
    for i in 1 .. 5000 loop
        update T2 set val = val + 1;
        commit;
    end loop;
end;
/
select dbms_flashback.get_system_change_number current_scn,
ora_rowscn from T1 t;

CURRENT_SCN ORA_ROWSCN
----------- ----------
   20041709   20041359
   20041709   20041359

We use the pseudocolumn ORA_ROWSCN to query the “last change SCN” of the block, which corresponds to the highest SCN in the ITL. Notice the value is the same for both rows, even though we updated one row later than the other. Notice also that ORA_ROWSCN shows an SCN value that is clearly less than the current SCN (so we know the data is read consistent with the query) but much greater than the real commit SCN.

What happened here was what I called case 3. above. The data block was flushed from memory before the commit, so Oracle had to go to the UNDO to find out what happened. It discovered the transaction was committed some time ago, and updated the ITL entry to an SCN it could find efficiently and that was “good enough” for the query being executed. This SCN is still almost 10,000 more than the real “commit SCN”.

  • Session 2
select dbms_flashback.get_system_change_number current_scn,
ora_rowscn from T1 t;

CURRENT_SCN ORA_ROWSCN
----------- ----------
   20041714   20036604
   20041714   20036604

This is an example of what I called case 2. above. Oracle got the data block, found an SCN that was too recent for the query, and went to the UNDO to find out if the real commit SCN was lower. Once it found an SCN that was not greater than the read-consistent SCN (at the beginning of the transaction), it updated the ITL entry (together with the “last change SCN”) and used the data block in the query.

  • Session 1
select dbms_flashback.get_system_change_number current_scn,
ora_rowscn from T1 t;

CURRENT_SCN ORA_ROWSCN
----------- ----------
   20041714   20036604
   20041714   20036604

This is just to show that the ITL entry was well and truly changed for all sessions, not just for Session 2.

To summarize what we have seen: whenever a query accesses a data block, Oracle has to determine whether the block has changed earlier or later than the read-consistent SCN of the query. It may have to go to the UNDO to find that out, but if it does it updates one or more ITL entries so the next query doesn’t have to do the same work over again. At the end of the query, all the data blocks that were read consistent now have ORA_ROWSCN values no greater than the read-consistent SCN.

By the way, flashback queries work the same way. Using AS OF SCN nnnn, I can lower an ORA_ROWSCN several times in a row and get pretty close to the real commit SCN.
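For example (the SCN value is illustrative, chosen just above the real commit SCN), each AS OF query with a smaller SCN can force Oracle to cross-check the UNDO again and lower the upper-bound SCN stored in the ITL:

select ora_rowscn from T1 as of scn 20036600 t;

-- Afterwards, a normal query shows the lowered value:
select ora_rowscn from T1 t;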

ROWDEPENDENCIES

I said the pseudocolumn ORA_ROWSCN tells us the most recent SCN in the ITL. That is true if the table was created with NOROWDEPENDENCIES, which is the default. If we create the table using ROWDEPENDENCIES, Oracle makes room for an SCN for each row. For this kind of table, ORA_ROWSCN shows us the SCN stored with each row, not the overall SCN we discussed up until now. I’ll have more to say about this in a later blog post.

Summary

  • Oracle orders all transactions in time by assigning each one a unique SCN on commit: this is called the “commit SCN”. [UPDATE 2017-07-24: it may be possible for two transactions to share the same “commit SCN” (and therefore be simultaneous), but no transaction will ever have more than one “commit SCN”.]
  • Every query returns results that were consistent as of an SCN that I call the “read-consistent SCN”: it corresponds to the start of the query (or the start of the transaction).
  • During query processing, Oracle compares the two SCNs and returns results where the “commit SCN” is never greater than the “read-consistent SCN”.

Sometimes a data block has incomplete information in its ITL, and Oracle consults the UNDO to determine the real “commit SCN”. If that SCN turns out to be no greater than the “read-consistent SCN”, Oracle will update the ITL to save work for the next query.

The end result is that, after the query finishes, every ITL entry for eligible blocks will contain an SCN less than or equal to the “read-consistent SCN” of the query. This is crucial for doing optimistic locking with ORA_ROWSCN.

More later…

Optimistic ORA_ROWSCN 1: Lost Updates

I’ve gotten lots of feedback about my proposal to use ORA_ROWSCN for optimistic locking. I’m going to start over in more detail and try to leave less scope for misinterpretation. Thanks to all for your comments, especially Jonathan Lewis who saved me from some technical doubts.

“Optimistic locking” is a type of solution to the problem of “lost updates”. So what do we mean by “lost update”, anyway?

Same row, same column

The Oracle documentation defines a lost update as: “A data integrity problem in which one writer of data overwrites the changes of a different writer modifying the same data.” The Concepts document gives an example where two sessions update the same column in the same row of the same table:

  • Session 1 updates, but does not commit yet
  • Session 2 selects the row, but does not see the change made by Session 1 because it is not committed yet.
  • Session 2 updates the same column in the same row. In fact, it has to wait on Session 1.
  • Session 1 commits.
  • Session 2 is no longer blocked by Session 1, so the Session 2 update overwrites the Session 1 update.
  • Session 2 commits.
  • Session 1 queries the row and discovers its update is “lost”.

Why say this update is “lost”? Simply because Session 2 did not see it when it queried the data. Session 2 overwrote the update without ever seeing it. The documentation also calls this a “conflicting write”.

I think the following scenario is also a “lost update”.

  • Session 2 selects a row.
  • Session 1 updates a column in the row and commits.
  • Session 2 updates the same column in the same row and commits.
  • Session 1 queries the row and discovers its update is “lost”.

In both scenarios, the Session 1 update is “lost” because Session 2 (that is, the user behind Session 2) did not see the change before overwriting it.

Why should we care?

In real life, what we have here is a failure to communicate. Behind the two sessions are two users who want to put different values into the same place in the database. In collaborative software such as Lotus Notes, Session 1 would create a document and Session 2 would create another document marked as a “conflict”: after the fact, human beings would work together to resolve the conflict.

In the database, we can’t store both values simultaneously, so we generally communicate the existence of a conflict by refusing to execute the second update. Whoever gets the error can query the row again, see the change made by the first update, then decide what to do next.

Update vs. Changed Value

An “update” to a column does not necessarily mean that the value of the column changed!

If I say UPDATE EMP SET SAL = SAL, Oracle will perform an update, executing the same triggers and creating UNDO as it does when the SAL value changes. (There is one difference: if the column is indexed and the value does not change, Oracle doesn’t touch the index.)

Updating columns to the same value may mean two different things: either

  1. the application uses generic UPDATE code and always updates the same columns, whether the user wanted to update all of them or just some of them; or
  2. a user deliberately updates a column with the same value it had previously, as a way of confirming that the value is still valid.

I suspect the first case is much more common than the second, but the second cannot be completely ruled out.

Same row, different columns

In both of the scenarios above, suppose Session 1 updates one column and Session 2 updates another column in the same row. As long as Session 2 leaves the first column alone, there is no conflict, no “lost update”.

Multiple rows and columns

I recently learned of a project where the “lost update” scenario has been extended to a group of rows. Since the rows are related functionally, the developers want to avoid applying changes to one row if any of the others in the group have changed. I’ll call this requirement “detecting changed context” instead of “detecting lost update”.

“Lost updates”, and beyond

Optimistic locking basically means detecting updates or changes that were committed between the time the user queried the data and the time (s)he updated it.

But: different applications may want to detect different things. For example:

  • Detect only changed values in columns the user really wants to change.
  • Detect changed values in any updatable column of the row the user wants to change.
  • Detect changed values in the row, even if the user cannot update them.
  • Detect any update to the row, whether values changed or not.
  • Detect changes in the context, meaning all rows related to the target row(s), even those the user does not intend to update.

Mechanisms for detecting changes / updates

In the old “client / server” architectures, each user kept the same session for querying the data and updating it. To avoid conflicts at the row level, it was sufficient to do SELECT FOR UPDATE instead of a simple SELECT. That way, the current user saw all previous changes and the later users would see all the current changes. “Lost updates” were purely and simply avoided. This is known as “pessimistic locking”, since we assume conflicts will occur and take steps to avoid them.

In stateless architectures today, a user may obtain one session for querying the data and another for updating it. No session will remember anything about what happened when the user accessed the database previously. Therefore, SELECT FOR UPDATE is out of the question. This led to various “optimistic locking” strategies, where the data is to some extent queried again and compared to the values that were originally shown to the user. Some methods are:

  • column compare (all columns or only those of interest)
  • hash compare (hashing all or some columns)
  • timestamp compare (a row has been updated if the timestamp is different)
  • version compare (a row has been updated if the version number has incremented).

Note that for efficiency, conflicts within a row can be detected directly in the UPDATE statement – a prior SELECT FOR UPDATE is not needed.
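As a sketch of the column-compare method (EMP column names as in the examples below; the :old_* binds hold the values originally shown to the user), the WHERE clause re-checks the “old” value, so a conflicting committed change makes the update touch zero rows:

update emp
set    sal   = :new_sal
where  empno = :empno
and    sal   = :old_sal;

-- If SQL%ROWCOUNT = 0, someone else changed SAL in the meantime:
-- report a conflict instead of overwriting.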

Where do I fit in?

My proposal is inspired by Tom Kyte (even though he gave up on ORA_ROWSCN himself). I want to use ORA_ROWSCN to detect conflicting updates to a row, whether any values changed or not. We’ll see in the following posts how this works, what requirements it fits and how much we can rely on it.

Optimistic Locking with one Timestamp?

A reader called chris asked me a good question about optimistic locking: why not use a “user-defined” SCN (e.g. a timestamp) as an extra column to achieve the same goal as ORA_ROWSCN? Well, chris, you’d have to use one per row, not one per transaction.

The Context

In this series of posts, I am assuming a two-phase API:

  1. The first API queries data and sends it to the user.
  2. The second API receives changes from the user to be applied to the database.

The goal is to avoid “lost updates”, by refusing to apply the changes if they would wipe out changes made by others between step 1. and step 2.

Update time and commit time

To really change a row in the database, we have to update it and commit. If we roll back instead, it’s as if the update never happened. What’s more, the update is one statement and the commit is another, so they do not occur at exactly the same moment. In the example I have been using to demonstrate, there is even an overlap: two sessions try to update the same row at about the same time, and the first commit only happens after both UPDATE statements have been issued.

What this means is that “update time” is not the same as “commit time”.

Which time can we control?

Let’s say we add a column called CHANGE_TS to our table, and we set it to SYSTIMESTAMP every time we update a row. That column will contain the “update time”, not the “commit time”. On the other hand, when Oracle updates ORA_ROWSCN the change has been committed, and the SCN is equal to (or possibly greater than) the SCN of the transaction. In other words, ORA_ROWSCN reflects “commit time”.
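One common way to maintain such a column, sketched here with illustrative trigger and table names, is a trigger that stamps every update:

create or replace trigger emp_ts_change_trg
before update on emp_ts
for each row
begin
    -- Records the update time, not the commit time.
    :new.change_ts := systimestamp;
end;
/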

One SCN vs. one timestamp

The SCN is “a stamp that defines a committed version of a database at a point in time. Oracle assigns every committed transaction a unique SCN.” To repeat: the SCN is a point-in-time version number for all committed data in the database.

If we send read-consistent data to the user, one SCN is enough to identify the “version” of that data. If that same SCN comes back with data to be updated, it is enough to determine whether a row has been updated by someone else in the meantime. Of course, we don’t know which column(s) have been updated, or whether the values have really changed – but the SCN is enough to double-check if we want.

  • We have the data the user wants to change.
  • We can get the current committed data with a normal query.
  • We can get the data we originally sent to the user with a flashback query “as of” that same SCN.

By comparing the data of interest, we can determine what values, if any, have actually changed in the meantime. If we want to go this far, we do have to write and test the code, but it will only execute for the rows whose SCN has actually changed. Most of the time, the SCN will not have changed and the update will take place efficiently.
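A sketch of that double-check (bind names are illustrative; :scn is the SCN originally sent to the user), joining the current row to the version the user saw:

-- Rows returned here have values that really changed in the meantime.
select cur.empno, cur.sal as current_sal, old.sal as old_sal
from   emp cur
join   emp as of scn :scn old
on     cur.empno = old.empno
where  cur.empno = :empno
and    cur.sal  <> old.sal;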

If we use a TIMESTAMP column, we cannot send just one “current” timestamp to the user.

  • At the time of our query, there could be uncommitted data with slightly older timestamps, and our query would not see that data.
  • While our user is pondering his screen, that data gets committed.
  • When our user sends back new data together with that timestamp, the check against that one timestamp would not detect any of the changes that were updated before our query but committed afterwards.

Here is a scenario to illustrate.

drop table emp_ts purge;
create table emp_ts as select e.*, systimestamp change_ts from emp e;

-- E1: BATCH: SMITH submits a batch job for 10% raises

update emp_ts set sal = sal * 1.1, change_ts = systimestamp;

-- E2: MILLER queries SMITH and sees committed data

select empno, sal, deptno, systimestamp current_ts from emp_ts
where ename = 'SMITH';

EMPNO  SAL  DEPTNO  CURRENT_TS
 7369  800      20  2017-07-02 05:21:02.640271 -04:00

-- E3: MILLER reassigns SMITH: the update waits for the batch job to commit

update emp_ts set sal = 800, deptno = 30, change_ts = systimestamp
where empno = 7369
and change_ts <= '2017-07-02 05:21:02.640271 -04:00';

-- E4: BATCH: the batch job commits

commit;

-- MILLER's application now sees "1 row updated."

-- E5: MILLER commits

commit;

-- E6: SMITH discovers he was the only one who didn't get a raise!

select empno, sal, deptno from emp_ts
where ename = 'SMITH';

EMPNO  SAL  DEPTNO
 7369  800      30

Conclusion: one timestamp is not enough

We have seen that comparing user-defined timestamps to just one overall “current” timestamp does not always work. If you use timestamps, the “query” API must send a timestamp for every table row involved in the query, and the user must send back a timestamp for every row to be updated. The user-defined timestamp becomes a version number for that row.

The SCN, on the other hand, is a version number for the entire database, so passing just one back and forth is enough to signal any number of “lost updates”. We can provide APIs (using JSON for example) that completely hide the table / row organisation of the data in the database!

Remember, I am assuming a two-phase API:

  • The first API queries data and sends it to the user.
    • With timestamps, every row sent to the user has to include its timestamp.
    • With SCNs, only one SCN needs to be sent.
  • The second API receives changes from the user to be applied to the database.
    • With timestamps, the user sends back the timestamp of each row to be updated, and for each row the API compares the returned timestamp to the current timestamp.
    • With SCNs, the user sends back one SCN, and for each row the API compares its current SCN to that one SCN.
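The per-row check with SCNs can be folded directly into the UPDATE, as in this sketch (:scn is the single SCN the user sends back; bind names are illustrative):

update dept
set    dname = :new_dname, loc = :new_loc
where  deptno = :deptno
and    ora_rowscn <= :scn;

-- SQL%ROWCOUNT = 0 means the row's ORA_ROWSCN is now greater than
-- :scn: it was committed after our query, so we report a conflict.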

More on Optimistic Locking with ORA_ROWSCN

Thanks to comments by Tony Hasler and pingbacks from Jeff Kemp, here’s more detail on how optimistic locking works with SCNs, especially with respect to “restarts” during update.

(I assume you have already read Avoiding Lost Updates with ORA_ROWSCN and Optimistic Locking: One SCN to rule them all before reading this post.)

Jeff’s post lists several methods for dealing with lost updates.

  • One of them, the “column compare” method, is different from the others. Suppose application A updates the DNAME column in DEPT, and application B updates the LOC column. If A checks only DNAME for a conflicting change, and B checks only LOC, then both applications can modify the same row independently.
  • The other methods, based on hashes, timestamps or version numbers, stop two sessions from updating the same row even if the columns involved are different.

So how does the “SCN” method behave? Well, it depends on whether there is a “restart”.

Again, suppose A in one session wants to update DEPT.DNAME and B in another session wants to update DEPT.LOC.

Scenario 1

  1. A: query DEPT where DEPTNO=10 and get current SCN
  2. B: query DEPT where DEPTNO=10 and get current SCN
  3. B: update LOC
  4. B: commit
  5. A: update DNAME, referring to the DNAME column and checking the SCN
    Result: since the ORA_ROWSCN has changed, the update is refused.

Scenario 2

  1. A: query DEPT where DEPTNO=10 and get current SCN
  2. B: query DEPT where DEPTNO=10 and get current SCN
  3. B: update LOC
  4. A: update DNAME, referring to the DNAME column and checking the SCN, which appears OK.
    The update will be blocked waiting for B to commit.
  5. B: COMMIT
  6. A: the update is unblocked and Oracle checks to see if the DNAME column has been updated. Since it has not, there is no restart and the ORA_ROWSCN pseudocolumn is not re-evaluated.
    Result: the update is allowed!

So, the behavior of the SCN method is inconsistent: when a restart is involved, applications can update different columns in the same row, but when no restart is involved they cannot.

Scenario 3

This scenario is just to emphasize that there is no “restart” if B rolls back.

  1. A: query DEPT where DEPTNO=10 and get current SCN
  2. B: query DEPT where DEPTNO=10 and get current SCN
  3. B: update LOC and DNAME
  4. A: update DNAME, referring to the DNAME column and checking the SCN, which appears OK.
    The update will be blocked waiting for B to commit.
  5. B: ROLLBACK
  6. A: the update is unblocked and Oracle checks to see if the DNAME column has been changed. Since it has not (because of the rollback), there is no restart.
    Result: the update is allowed.

Dealing with “false positives”

If you want to allow independent updates to different columns in the same row, the SCN method will not work – most of the time. It will only work in a restart scenario, since you can decide which columns will trigger a restart. Even then, you know a column has been updated but you don’t know if its value is different: someone may have updated a column to the same value.

If that is truly a problem for you, our famous SCN can come to the rescue. When a row is not updated, do a flashback query (“AS OF SCN <your SCN>”) on that row. You can then do a “column compare” to see whether the changes concern you or not.

By the way, if you use the SCN method on a table without ROWDEPENDENCIES you risk getting “false positives” even when the row of interest has not been updated. Again, you can double-check with a flashback query and a “column compare”.

Conclusion

The SCN method will warn of a “lost update”:

  • If the table does not have ROWDEPENDENCIES: when any row in the block has been updated, deleted or inserted.
  • If the table has ROWDEPENDENCIES and no restart is involved: when any column in the same row has been updated.
  • If the table has ROWDEPENDENCIES and a restart is involved: when the columns listed in the WHERE clause of the update have been updated.

To make sure the warning is not a “false positive”, use the SCN in a flashback query, compare the columns of interest and then decide whether to redo the update.

With that extra code, the SCN method can be used with confidence, even on tables without ROWDEPENDENCIES.

If your table has ROWDEPENDENCIES and you accept that two sessions cannot update the same row at the same time no matter what, then you don’t need that extra code.

 

Optimistic Locking: One SCN to rule them all

Previously I showed how to avoid lost updates with ORA_ROWSCN. Now let’s create an API that avoids lost updates with just one SCN.

What kind of API?

A transaction consists of one or more changes to data that should happen together: either all should happen or none. When the transaction commits, all the changes happen; when the transaction rolls back, none of them happen. This is the “atomic” part of ACID, the set of properties of database transactions.

A “statement”, be it SQL or a call to the PL/SQL engine, is also atomic:

  • When a “client” calls the database and the call returns with an exception, then any changes made during the call are automatically rolled back. All the data is exactly what it was before the call was made.
  • Of course, when everything goes well the changes are still there.
  • In both cases, the client must either COMMIT or ROLLBACK whatever changes remain that are within the scope of the transaction.

This “statement level atomicity” is guaranteed as long as no COMMITs or ROLLBACKs are executed during the call, and as long as the exception is allowed to return to the client.

So there are two kinds of atomicity we can leverage: at the statement level and the transaction level.

Either way, when a client calls an “atomic API”, it must provide all the input necessary to make all the changes that form a logical unit. If the client has to make more than one call, then the API is obviously not “atomic”.

How does optimistic locking fit in?

In order to do optimistic locking, there is a two-step process:

  1. The application gets data from the “old” rows.
    Since the rows must be checked later for changes, row-based version numbers or checksums may accompany the data.
  2. The application sends the “new” data back, together with whatever “old” data is needed to check the rows for changes, and the checking is executed within the UPDATE statement.

I suggest that for each API that updates the database, we should provide one API that provides the “old” data in the first place.

Getting the “old” data – and one SCN

Ideally, the “old” data should be read consistent: all the data should have existed at the same point in time in the database. Oracle guarantees this consistency within one SQL query, but it does not normally guarantee it across queries. To get read consistency across queries, use a “read-only transaction” or flashback queries.

However we do it, if we get read consistent data we can also get the SCN used by our query or queries. If any of the rows change afterwards, the ORA_ROWSCN of those rows must be greater than that SCN. We can use the SCN like a universal version number to check for any changes to any of the rows we queried.

Example: getting the data

I’m going to demonstrate using the 12cR2 SQL/JSON functions, because they allow me to get all the data I need for one logical unit in a single SQL statement, even if the rows involved are not related and have different columns.

In my stupid example, I assume a user wants to manually update the departments in the DEPT table: the DNAME and LOC columns should resemble the DEPARTMENT_NAME and CITY columns from the DEPARTMENTS and LOCATIONS tables in the HR schema.

with json_departments as(
    select json_arrayagg(
        json_object(
            'DEPARTMENT_ID' value DEPARTMENT_ID,
            'DEPARTMENT_NAME' value DEPARTMENT_NAME,
            'CITY' value CITY,
            'STATE_PROVINCE' value STATE_PROVINCE,
            'COUNTRY_ID' value COUNTRY_ID
        )
        order by DEPARTMENT_ID
    ) departments
    from hr.departments join hr.locations using(location_id)
    where DEPARTMENT_ID <= 40
)
, json_depts as (
    select json_arrayagg(
        json_object(
            'DEPTNO' value DEPTNO,
            'DNAME' value DNAME,
            'LOC' value LOC
        )
        order by deptno
    ) depts
    from dept
)
select json_object(
    'SCN' value timestamp_to_scn(systimestamp),
    'DEPARTMENTS' value departments,
    'DEPTS' value depts
) json_output
from json_departments, json_depts;
{
  "SCN": 14080746,
  "DEPARTMENTS": [
    {
      "DEPARTMENT_ID": 10,
      "DEPARTMENT_NAME": "Administration",
      "CITY": "Seattle",
      "STATE_PROVINCE": "Washington",
      "COUNTRY_ID": "US"
    },
    {
      "DEPARTMENT_ID": 20,
      "DEPARTMENT_NAME": "Marketing",
      "CITY": "Toronto",
      "STATE_PROVINCE": "Ontario",
      "COUNTRY_ID": "CA"
    },
    {
      "DEPARTMENT_ID": 30,
      "DEPARTMENT_NAME": "Purchasing",
      "CITY": "Seattle",
      "STATE_PROVINCE": "Washington",
      "COUNTRY_ID": "US"
    },
    {
      "DEPARTMENT_ID": 40,
      "DEPARTMENT_NAME": "Human Resources",
      "CITY": "London",
      "STATE_PROVINCE": null,
      "COUNTRY_ID": "UK"
    }
  ],
  "DEPTS": [
    {
      "DEPTNO": 10,
      "DNAME": "ACCOUNTING",
      "LOC": "NEW YORK"
    },
    {
      "DEPTNO": 20,
      "DNAME": "RESEARCH",
      "LOC": "DALLAS"
    },
    {
      "DEPTNO": 30,
      "DNAME": "SALES",
      "LOC": "CHICAGO"
    },
    {
      "DEPTNO": 40,
      "DNAME": "OPERATIONS",
      "LOC": "BOSTON"
    }
  ]
}

I used timestamp_to_scn(systimestamp) instead of accessing V$DATABASE or using DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER because the latter two require privileges to be granted. For real code, I suggest granting the necessary privileges, since TIMESTAMP_TO_SCN returns only an approximate SCN. Notice that the data from the two tables are decorrelated.
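For reference, a sketch of the grant that lets the API get the exact current SCN (run as a DBA; the schema name APP_USER is illustrative):

grant execute on sys.dbms_flashback to app_user;

-- Then, in the query API:
select sys.dbms_flashback.get_system_change_number from dual;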

Example: the update API

Now that the user has decided on the changes, the application calls the API with more JSON:

{"Action":"Update","SCN":14080746,"DEPTS":[
            {"DEPTNO":10,"DNAME":"Admin","LOC":"Seattle"},
            {"DEPTNO":20,"DNAME":"Marketing","LOC":"Toronto"},
            {"DEPTNO":30,"DNAME":"Purchasing","LOC":"Seattle"},
            {"DEPTNO":40,"DNAME":"HR","LOC":"London"}
            ]}

Notice that the only extra data required for optimistic locking is that one SCN.

Now here is the API that converts the JSON to relational data and updates DEPT:

create or replace procedure dept_xapi(p_json_input varchar2) as
    -- this cursor is never opened: it just provides the record type,
    -- with a numeric SCN column alongside the DEPT columns
    cursor c_input is select 0 SCN, deptno, dname, loc from dept;
    type tt_dept is table of c_input%rowtype;
    lt_dept tt_dept;
begin
    select SCN, deptno, dname, loc bulk collect into lt_dept
    from json_table(p_json_input, '$'
        columns(
            action path '$.Action',
            scn number path '$.SCN',
            nested path '$.DEPTS[*]'
            columns(
                deptno number path '$.DEPTNO',
                dname path '$.DNAME',
                loc path '$.LOC'
            )
        )
    )
    where action = 'Update';
    forall i in 1..lt_dept.count
        update dept set dname = lt_dept(i).dname, loc = lt_dept(i).loc
            where deptno = lt_dept(i).deptno
            and ora_rowscn <= lt_dept(i).scn
            and coalesce('@',dname||loc) is not null;
    for i in 1..lt_dept.count loop
        if sql%bulk_rowcount(i) = 0 then
            raise_application_error(
                -20001,
                'DEPT row with DEPTNO = ' || lt_dept(i).deptno||' already changed by another user. No updates have been made.');
        end if;
    end loop;
end dept_xapi;
/

…and to call the API:

begin
    dept_xapi('{"Action":"Update","SCN":14080746,"DEPTS":[
            {"DEPTNO":10,"DNAME":"Admin","LOC":"Seattle"},
            {"DEPTNO":20,"DNAME":"Marketing","LOC":"Toronto"},
            {"DEPTNO":30,"DNAME":"Purchasing","LOC":"Seattle"},
            {"DEPTNO":40,"DNAME":"HR","LOC":"London"}
            ]}');
end;
/

Conclusion

The API as I have coded it is atomic: it either does all the updates or does none of them – as long as it is called directly from a client. If it is called from other PL/SQL code that issues explicit COMMITs or ROLLBACKs, or that swallows the exception instead of letting it reach the client, then that atomicity no longer holds.

In order to fit into the “SmartDB” paradigm as promoted by Toon Koppelaars and Bryn Llewellyn, this code would not be called directly from the client. Rather, it would be called by another API that would explicitly COMMIT or ROLLBACK before returning to the client.
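As a sketch of such an outer API (the name and the exception handling here are mine, not part of the original code):

```sql
-- hypothetical outer API: makes the transaction boundary explicit
create or replace procedure dept_xapi_txn(p_json_input varchar2) as
begin
    dept_xapi(p_json_input);  -- do all the updates, or raise
    commit;                   -- all updates succeeded
exception
    when others then
        rollback;             -- undo any partial work
        raise;                -- let the client see the error
end dept_xapi_txn;
/
```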

Whichever path you choose, the client must receive read-consistent data and provide all the data for the related updates in one call. To avoid lost updates, just add one SCN.

Avoiding Lost Updates with ORA_ROWSCN

Applications today use “optimistic locking” to avoid lost updates. Tom Kyte thought of using ORA_ROWSCN for optimistic locking, then decided it didn’t work (see ORA_ROWSCN for optimistic locking). I now think it does!

If you don’t know what an SCN is and how it helps with read consistency, this post may go over your head; sorry, I can’t take the time to explain all that here…

Test Data and Use Case

By default, Oracle keeps one ORA_ROWSCN per block. However, we can use ROWDEPENDENCIES when creating the table to make room for one SCN per row.

I’m going to create the venerable EMP table with ROWDEPENDENCIES:

CREATE TABLE EMP (
    EMPNO NUMBER(4) CONSTRAINT PK_EMP PRIMARY KEY,
    ENAME VARCHAR2(10),
    JOB VARCHAR2(9),
    MGR NUMBER(4),
    HIREDATE DATE,
    SAL NUMBER(7,2),
    COMM NUMBER(7,2),
    DEPTNO NUMBER(2)
) rowdependencies;

INSERT INTO EMP VALUES
(7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,NULL,20);
INSERT INTO EMP VALUES
(7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30);
INSERT INTO EMP VALUES
(7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30);
INSERT INTO EMP VALUES
(7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,NULL,20);
INSERT INTO EMP VALUES
(7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30);
INSERT INTO EMP VALUES
(7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,NULL,30);
INSERT INTO EMP VALUES
(7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,NULL,10);
INSERT INTO EMP VALUES
(7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,NULL,20);
INSERT INTO EMP VALUES
(7839,'KING','PRESIDENT',NULL,to_date('17-11-1981','dd-mm-yyyy'),5000,NULL,10);
INSERT INTO EMP VALUES
(7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30);
INSERT INTO EMP VALUES
(7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,NULL,20);
INSERT INTO EMP VALUES
(7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,NULL,30);
INSERT INTO EMP VALUES
(7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,NULL,20);
INSERT INTO EMP VALUES
(7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,NULL,10);
commit;

select distinct ora_rowscn from emp;

ORA_ROWSCN
  13848205

In honor of Oracle’s 40-year birthday, SMITH is going to submit a batch job to give everyone a 10% raise, including himself. At the same time, MILLER is doing a reorganization: he assigns SMITH to department 30, using an OLTP application that updates both department number and salary.

Lost Updates

Here is what happens when an application neglects “optimistic locking”:

(Two sessions: BATCH and MILLER)

-- A1 (BATCH): SMITH submits a batch job for 10% raises

update emp set sal = sal * 1.1;

-- A2 (MILLER): MILLER queries SMITH and sees committed data

select empno, sal, deptno from emp
where ename = 'SMITH';

EMPNO   SAL   DEPTNO
 7369   800       20

-- A3 (MILLER): MILLER reassigns SMITH; the update waits for the batch job to commit

update emp set sal = 800, deptno = 30
where empno = 7369;

-- A4 (BATCH): the batch job commits

commit;

(MILLER's application now sees "1 row updated.")

-- A5 (MILLER): MILLER commits

commit;

-- A6: SMITH discovers he was the only one who didn't get a raise!

select empno, sal, deptno from emp
where ename = 'SMITH';

EMPNO   SAL   DEPTNO
 7369   800       30

You see what happened here? Oracle told MILLER that SMITH’s salary was 800, because that was the latest committed value. The batch update had not committed, so the new salary was invisible to any session but the batch session itself. Since the OLTP application updated the salary as well, it replaced the new salary with the old one.

For details on lost updates, see the documentation on Data Concurrency and Consistency.

Optimistic Locking using column values

To avoid lost updates, the application has to check during its update that someone else has not changed the data in the meantime.

(Two sessions: BATCH and MILLER)

-- B1 (BATCH): SMITH submits a batch job for 10% raises

update emp set sal = sal * 1.1;

-- B2 (MILLER): MILLER queries SMITH and sees committed data

select empno, sal, deptno from emp
where ename = 'SMITH';

EMPNO   SAL   DEPTNO
 7369   800       20

-- B3 (MILLER): MILLER reassigns SMITH; the update waits for the batch job to commit

update emp set sal = 800, deptno = 30
where empno = 7369
and sal = 800 and deptno = 20;

-- B4 (BATCH): the batch job commits

commit;

(MILLER's application now sees "0 rows updated."
and explains to MILLER that someone else
already changed the data.)

-- B5: SMITH got his raise!

select empno, sal, deptno from emp
where ename = 'SMITH';

EMPNO   SAL   DEPTNO
 7369   880       20

You may be wondering why Step B3 didn’t find the row it wanted to update. That is because of “restartability”:

  • Oracle did a “consistent get” to get the latest committed values, where SAL was 800
  • but then it did a “current get” to apply MILLER’s change to the current version of the data. This is when it blocked, waiting for the batch job to commit and release the lock on that row.
  • When the “current get” finally happened, Oracle noticed that the SAL value had changed, so it restarted the update: it did another consistent get, at which time it found no rows that matched the WHERE clause – so it did not do the update.
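To actually watch a restart happen, Tom Kyte's posts below use a row-level trigger; something along the lines of this diagnostic sketch (the trigger name is mine) makes the restart visible, because a restarted update fires the before-row trigger more than once for the same row:

```sql
-- diagnostic only: prints a line each time the row trigger fires,
-- so a restarted update shows up as repeated firings for one row
create or replace trigger emp_restart_demo
before update of sal on emp
for each row
begin
    dbms_output.put_line(
        'trigger fired: sal ' || :old.sal || ' -> ' || :new.sal);
end;
/
```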

For a detailed explanation of restartability, see these posts by Tom Kyte:

http://tkyte.blogspot.fr/2005/08/something-different-part-i-of-iii.html

http://tkyte.blogspot.fr/2005/08/part-ii-seeing-restart.html

http://tkyte.blogspot.fr/2005/09/part-iii-why-is-restart-important-to.html

Why not use ORA_ROWSCN?

Tom discovered that putting ORA_ROWSCN in the WHERE clause did not trigger a restart. In our scenario:

(Two sessions: BATCH and MILLER)

-- C1 (BATCH): SMITH submits a batch job for 10% raises

update emp set sal = sal * 1.1;

-- C2 (MILLER): MILLER queries SMITH and sees committed data

select empno, sal, deptno, ora_rowscn from emp
where ename = 'SMITH';

EMPNO   SAL   DEPTNO   ORA_ROWSCN
 7369   800       20     13857633

-- C3 (MILLER): MILLER reassigns SMITH; the update waits for the batch job to commit

update emp set sal = 800, deptno = 30
where empno = 7369 and ora_rowscn = 13857633;

-- C4 (BATCH): the batch job commits

commit;

(MILLER's application now sees "1 row updated.")

No reason to go any further: the update did not see that the batch job had changed the ORA_ROWSCN.

Making ORA_ROWSCN work!

Here is what I stumbled on: in the WHERE clause, referring to ORA_ROWSCN will not trigger a restart, but referring to SAL, which changed, will trigger a restart – and once the restart is triggered, then the new “consistent get” will see the new value of ORA_ROWSCN. We don’t need the old value of SAL, we just need to mention it. For example:

(Two sessions: BATCH and MILLER)

-- D1 (BATCH): SMITH submits a batch job for 10% raises

update emp set sal = sal * 1.1;

-- D2 (MILLER): MILLER queries SMITH and sees committed data

select empno, sal, deptno, ora_rowscn from emp
where ename = 'SMITH';

EMPNO   SAL   DEPTNO   ORA_ROWSCN
 7369   800       20     13865770

-- D3 (MILLER): MILLER reassigns SMITH; the update waits for the batch job to commit

update emp set sal = 800, deptno = 30
where empno = 7369
and ora_rowscn = 13865770
and coalesce('*',sal||deptno) is not null;

-- D4 (BATCH): the batch job commits

commit;

(MILLER's application now sees "0 rows updated."
and explains to MILLER that someone else
already changed the data.)

I decided to use COALESCE to refer to the column values because it uses short-circuit evaluation: once it determines that '*' is not null, it doesn't bother evaluating the concatenation of the columns. Also, concatenation implicitly converts each datatype to VARCHAR2, so I can list strings, dates, timestamps, numbers and intervals without worry.
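As a sketch, protecting all the updatable columns of EMP at once would look like this (the bind variable names are mine):

```sql
update emp set sal = :new_sal, deptno = :new_deptno
where empno = :empno
and ora_rowscn = :scn
-- never false: it just forces the columns into the WHERE clause,
-- so a change to any one of them triggers a restart
and coalesce('*', ename || job || mgr || hiredate || sal || comm || deptno)
    is not null;
```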

Conclusion

ORA_ROWSCN can indeed be used as a “version number” that can be checked to avoid lost updates. The UPDATE statement just has to list the appropriate columns in the WHERE clause, but it doesn’t have to check them against values from a prior SELECT statement. The only real value that needs to be checked is ORA_ROWSCN.

There are some limitations and caveats here:

  1. If your table has not been created with ROWDEPENDENCIES, ORA_ROWSCN will change every time any row in the block has changed. The table has to be “recreated” with ROWDEPENDENCIES for this to work.
  2. You cannot use this method with Index Organized Tables! If you create such a table with ROWDEPENDENCIES, then SELECT ORA_ROWSCN… will return NULL.
  3. You might wonder whether a “delayed block cleanout” will cause the ORA_ROWSCN to change without the data having changed. I don’t see how that could be a problem, since the SELECT done to get the data (and the ORA_ROWSCN) will do the cleanout as part of its processing.
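Regarding the first caveat: ALTER TABLE cannot switch an existing table to ROWDEPENDENCIES, so "recreating" means something like this sketch (in real life, constraints, indexes, grants and triggers must be recreated separately):

```sql
-- copy the data into a new table that reserves room for one SCN per row
create table emp_rd rowdependencies as select * from emp;

-- swap in the new table (dependent objects must be recreated separately)
drop table emp;
rename emp_rd to emp;
```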

Comparing Expression Lists (not tuples)

This post is a long-winded answer to a question Bryn Llewellyn asked me on Twitter about comparing expression lists.

Sabine Heimsath recently tweeted a regret about CASE and “tuples”, and gave an example of what she meant.

This is an example of a “simple” case statement (or expression?) using expression lists, not tuples. Tuples are unordered sets of attribute values, and here order matters. As Sabine discovered, this syntax is not supported.

[Update 2017-05-09: a case statement ends with END CASE and has statements after each THEN keyword. A case expression has expressions after THEN but does not have the CASE keyword at the end. Sabine appears to have mixed the two syntaxes.]

Oren Nakdimon then gave an example of a “searched” case expression that handles expression lists:

Sabine thought this meant that IN works and “=” does not. I replied that “=” works, but only in SQL, not in PL/SQL, and provided you put an extra pair of parentheses around the expression list on the right-hand side.

This led to a question from Bryn Llewellyn:

Since I need more than 140 characters, here is my attempt to answer Bryn.

His example is a simple comparison condition. In the syntax diagram from the documentation, Bryn is using the lower form, with parentheses around the left-hand expressions; therefore, the right-hand side must be an expression_list surrounded by parentheses.

Additionally, since there are multiple expressions to the left of the operator, the expression_list must itself be in its lower, multi-expression form, which also has surrounding parentheses.

So there must be an additional set of parentheses: one pair because the syntax requires a parenthesized expression_list, and one pair belonging to the expression_list itself.
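To make that concrete, here is the pattern with literals (a sketch; Sabine's original example used table columns):

```sql
-- one pair of parentheses around the left-hand expressions,
-- and a doubled pair around the right-hand expression list
select case
           when (1, 2) = ((1, 2)) then 'equal'
           else 'not equal'
       end as result
from dual;
```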

All of the above are from the SQL reference. The PL/SQL language reference has its own documentation on comparisons, and I have found no explicit reference to expression lists.