Optimistic Locking 4: the Good

After discussing lost updates, the SCN and my theoretical use case, here is my idea for doing SCN-based optimistic locking. I’m going to simplify and emphasize the advantages here; limitations and gruesome details will come later.

Context

Here is the situation I am thinking of, before adding my solution.

There is an OLTP application. The “client” part is either a generic middleman like ORDS or a program written in java, PHP, node.js, etc. . The “client” accesses the database via SQL*Net. To support updates, the “data” part of the application provides two APIs in the database, one for querying the data in the first place and another for processing the updates. The query API returns read-consistent data, and the update API is atomic: either all the updates occur or none of them occur.

The dialogue between “client” and “data” uses generic connections and stateless sessions, so optimistic locking is the only solution for avoiding lost updates. I’ll start with a simplistic “lost update” requirement: if the “client” wants to update some rows, and someone else has already updated one of them between query time and update time, then the update must not take place.

In order to avoid the “object-relational impedance mismatch”, without frameworks and without being language-specific, I am going to use JSON as my data-interchange format.

The application scenario is:

  1. Client calls the query API to display data to the user.
  2. The query API returns the data for the user.
    • If any additional data is needed to support optimistic locking, the API has to return it here.
  3. Client calls the update API to update the data.
    • Any additional data from step 2. has to be sent back along with the updates.
  4. The update API uses the additional data to check for intervening updates during its processing.

Adding the SCN solution

My idea is based on comparing:

  1. The “read-consistent SCN”:  it determines what read-consistent version of the data is returned from a query.
  2. ORA_ROWSCN : a pseudocolumn that “reflects the system change-number (SCN) of the most recent change to a row.”

As I showed in an earlier post, the “SCN of the most recent change” may be an upper bound, but the query itself will adjust that upper bound downwards so it will not be greater than the “read-consistent SCN”.

Now I’ll add the SCN to my scenario:

  1. Client calls the query API to display data to the user.
  2. The query API returns the data for the user.
    1. All the rows that might be updated are SELECTed here.
    2. The SELECT ensures that ORA_ROWSCN will be less than or equal to the “read-consistent SCN”.
    3. The data is read-consistent, and the “read-consistent SCN” is returned along with the data.
  3. Client calls the update API to update the data.
    1. The “read-consistent SCN” from step 2. is sent back along with the updates.
  4. The update API compares each ORA_ROWSCN to the”read-consistent SCN”.
    1. If any ORA_ROWSCN is greater, then some intervening update has occurred and the update cannot continue.
      All lost updates have been avoided.

Advantages

I’m concentrating on the “good” here, since I want you to keep reading. Several things tempt me in this approach:

  • No special “version number” or “change timestamp” columns needed in the table.
  • No CPU spent doing checksums or hashes.
  • Only one number, the SCN, sent back and forth between the “client” and the APIs.
  • The data used for optimistic locking (the SCN) can be kept separate from the user data.
  • The “client” is freed from practically all implementation details.

If you base optimistic locking on row version numbers, row timestamps, or checksums of row columns, then the client code has to keep track of that data and send it back with the updates. With the SCN, the data from each row is just user data, not technical data. It can be pivoted, unpivoted, aggregated or denormalized. With JSON, it is trivial to separate the SCN from all the user data.

All the client code has to do is keep the SCN somewhere and include it in the call to the update API.

Example

My query API is going to get the current SCN, then query the DEPT table.

create or replace function queryAPI return varchar2 is
  l_scn number;
  l_result varchar2(4000);
begin
  l_scn := dbms_flashback.get_system_change_number;
  select json_object(
    'SCN' value l_scn,
    'DEPTS' value json_arrayagg(
      json_object(
        'DEPTNO' value DEPTNO, 
        'DNAME' value DNAME, 
        'LOC' value LOC
      )
    )
  ) into l_result
  from dept;
  return l_result;
end queryAPI;
/

Now my “client” (Oracle SQL Developer) will call the API:

var v_result varchar2(4000);
exec :v_result := queryAPI;
select json_query(:v_result, '$' pretty) from dual;


{
  "SCN" : 22422562,
  "DEPTS" :
  [
    {
      "DEPTNO" : 10,
      "DNAME" : "ACCOUNTING",
      "LOC" : "NEW YORK"
    },
    {
      "DEPTNO" : 20,
      "DNAME" : "RESEARCH",
      "LOC" : "DALLAS"
    },
    {
      "DEPTNO" : 30,
      "DNAME" : "SALES",
      "LOC" : "CHICAGO"
    },
    {
      "DEPTNO" : 40,
      "DNAME" : "OPERATIONS",
      "LOC" : "BOSTON"
    }
  ]
}

After changes to DNAME and LOC in the user interface, the client is going to send this data to the update API:

{
  "Action": "Update",
  "SCN": 22422562,
  "DEPTS": [
    {
      "DEPTNO": 10,
      "DNAME": "Admin",
      "LOC": "Seattle"
    },
    {
      "DEPTNO": 20,
      "DNAME": "Marketing",
      "LOC": "Toronto"
    },
    {
      "DEPTNO": 30,
      "DNAME": "Purchasing",
      "LOC": "Seattle"
    },
    {
      "DEPTNO": 40,
      "DNAME": "HR",
      "LOC": "London"
    }
  ]
}

In the update API, I’ll use JSON_TABLE to convert the JSON data to SQL rows and columns, then use FORALL to do the updates in bulk. For any row, if ORA_ROWSCN is greater than the read-consistent SCN then the update will not take place and I raise an exception. That is enough to roll back all the work done in that call to the API.

create or replace procedure updateAPI(p_json_input varchar2) as
  cursor c_input is select 0 SCN, deptno, dname, loc from dept;
  type tt_dept is table of c_input%rowtype;
  lt_dept tt_dept;
  l_rowcount number;
  c_dummy varchar2(1) := substr(p_json_input,1,1);
begin

  -- get data in SQL format. Every row will have the SCN
  select SCN, deptno, dname, loc bulk collect into lt_dept
  from json_table(p_json_input, '$'
    columns(
      action path '$.Action',
      scn number path '$.SCN',
      nested path '$.DEPTS[*]'
      columns(
        deptno number path '$.DEPTNO',
        dname path '$.DNAME',
        loc path '$.LOC'
      )
    )
  )
  where action = 'Update';

  -- Use FORALL to update in bulk
  forall i in 1..lt_dept.count
    update dept set dname = lt_dept(i).dname, loc = lt_dept(i).loc
    where deptno = lt_dept(i).deptno
    and ora_rowscn <= lt_dept(i).scn
    and coalesce(c_dummy,dname||loc) is not null;

  -- Check for row not updated due to ORA_ROWSCN change
  for i in 1..lt_dept.count loop
    if sql%bulk_rowcount(i) = 0 then raise_application_error(
      -20001,
      'DEPT row with DEPTNO = ' 
      || lt_dept(i).deptno 
      ||' already changed by another user. No updates have been made.'
    );
    end if;
  end loop;
end updateAPI;
/

Now suppose that, between the query and the update, someone else has updated department 10. In that case:

declare
  l_scn varchar2(100);
begin
  select json_value(:v_result, '$.SCN') into l_scn from dual;
  updateAPI('{"Action":"Update","SCN":'||l_scn||',"DEPTS":[
          {"DEPTNO":10,"DNAME":"Admin","LOC":"Seattle"},
          {"DEPTNO":20,"DNAME":"Marketing","LOC":"Toronto"},
          {"DEPTNO":30,"DNAME":"Purchasing","LOC":"Seattle"},
          {"DEPTNO":40,"DNAME":"HR","LOC":"London"}
          ]}');
end;
/

ORA-20001: DEPT row with DEPTNO = 10 already changed by another user. No updates have been made.
ORA-06512: at "STEW.UPDATEAPI", line 28
ORA-06512: at line 5

There: that wasn’t too hard, was it?

Too Good to be True?

I hope I have gotten the basic idea across and aroused your interest. In my next few posts, I’ll talk in more detail about limitations or possible objections and how to deal with them. Here are some of the issues:

  • How do we get read-consistent data in the query API if it does more than one SELECT?
  • How do we know the exact value of the read-consistent SCN? My example code is not perfect in that regard.
  • ORA_ROWSCN can be maintained at the block level or the row level. What is the difference and which one should we use?
  • Tom Kyte gave up on this idea because a change in ORA_ROWSCN does not “trigger a restart” of the update, see https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2680538100346782134. How do we get the restart to happen when it should?
  • What happens if an update occurred but the columns we want to update did not change? How can we avoid “false alarms”?
  • How can we provide more information on the intervening update so the user can better decide what to do next?

More later!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s