Optimistic ORA_ROWSCN 2: the SCN

In my previous post, I covered lost updates. Before saying how ORA_ROWSCN can help avoid them, I need to talk about the SCN. It’s hard to be simple and correct about such a central element of the Oracle database, but I’ll try.

Aside from the documentation, I am relying on Oracle Core by Jonathan Lewis, chapter 3, along with this blog post and Jonathan’s answer to my question on OTN.

The Commit SCN

The documentation defines the SCN, or “System Change Number”, as a “database ordering primitive. The value of an SCN is the logical point in time at which changes are made to a database.” It also calls it a “stamp that defines a committed version of a database at a point in time. Oracle assigns every committed transaction a unique SCN.”

In other words:

  • Every change to data in the database is made as part of a transaction.
  • The changes in a transaction become effective when the transaction is committed.
  • Every committed transaction gets a unique SCN.
  • The SCN increments at each commit, which allows it to “order the database” in time.
  • Every “commit SCN” “defines a committed version of the database”, so the SCN is like a version number that applies to the entire database.

The “commit SCN” of a transaction is precisely the SCN that was assigned to a transaction on commit.
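
To make this a little more concrete, here is a minimal sketch for watching the SCN move across a commit. It assumes a session with EXECUTE on DBMS_FLASHBACK (already used further down in this post) and, for the V$DATABASE query, SELECT privilege on that view. The table SCN_DEMO is a hypothetical scratch table. Background activity also advances the SCN, so the difference between the two values will usually be more than 1.

```sql
-- SCN_DEMO is a hypothetical scratch table used only for this sketch.
create table SCN_DEMO (n number);

select dbms_flashback.get_system_change_number scn_before from dual;

insert into SCN_DEMO values (1);
commit;

-- The value returned here should be greater than scn_before.
select dbms_flashback.get_system_change_number scn_after from dual;

-- Alternative source for the current SCN (needs SELECT on V$DATABASE).
select current_scn from v$database;

-- Approximate wall-clock time that corresponds to a recent SCN.
select scn_to_timestamp(dbms_flashback.get_system_change_number) approx_time
from dual;

drop table SCN_DEMO purge;
```

Neither query returns the “commit SCN” of the transaction itself: they only show that the SCN is higher after the commit than before it.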

[UPDATE 2017-07-24: Jonathan Lewis just pointed me to a blog post by Frits Hoogland where he showed that it is possible for two commits to share the same SCN. To quote Frits: “both commits … have the same SCN. At this point I think it’s fitting to theorise why this is happening. I think this is happening and makes sense because the table is protected for consistency at another level in the database, which is the locking mechanisms (TX locks for rows, TM locks for segments), which guarantees that two transactions at the same time do not compromise data consistency. And because of that, it’s not a problem to batch them with the same SCN.”

Frits seems to have shown an error, or at least a simplification, in the documentation. Either that or “unique” means every transaction has just one “commit SCN”, even if one “commit SCN” can be associated with two simultaneous transactions.

Fortunately for me, this refinement doesn’t invalidate the rest of this post.]

The SCN and Read Consistency

In Oracle, queries are “read consistent”, which means “The data returned by a query is committed and consistent for a single point in time.” This point in time is defined by – you guessed it – an SCN. I’ll call that SCN the “read-consistent SCN”. By default Oracle uses the SCN that was current when the query started, but within read-only or serializable transactions it uses the SCN when the transaction began.
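
The three scopes mentioned above can be sketched as follows. This is only an illustration, assuming the table T1 created in the example later in this post; any table would do. Whether the second query in each pair returns a different result depends on what other sessions commit in the meantime.

```sql
-- Default: statement-level read consistency.
-- Each query gets its own read-consistent SCN when it starts.
select count(*) from T1;
select count(*) from T1;   -- may see changes committed since the first query

-- Transaction-level read consistency, variant 1: read-only transaction.
-- SET TRANSACTION must be the first statement of the transaction.
set transaction read only;
select count(*) from T1;
select count(*) from T1;   -- same read-consistent SCN as the previous query
commit;                    -- ends the read-only transaction

-- Variant 2: serializable transaction (DML is allowed here).
set transaction isolation level serializable;
select count(*) from T1;
select count(*) from T1;   -- same read-consistent SCN as the previous query
commit;
```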

Now comes the hard part.

When we submit a query, Oracle gets the data blocks it needs to read the rows of interest. The problem is that each data block may contain uncommitted data, or data that was committed after the “read-consistent SCN”. To determine that, Oracle consults the ITL (Interested Transaction List) in the block header, and if necessary the UNDO that the ITL entries point to.

There are three cases to consider:

  1. All the ITL entries show committed transactions having “commit SCNs” that are less than or equal to the “read-consistent” SCN.
  2. All the ITL entries show committed transactions, but there is at least one “commit SCN” greater than the “read-consistent” SCN.
  3. At least one ITL entry shows a transaction that has not committed.

Case 1. is the most frequent and the simplest: none of the data is more recent than the read-consistent SCN, so the query may use the block as is.

Case 2. is not as simple as you might think. The “commit SCN” in an ITL entry may be an upper bound: it cannot be less than the real “commit SCN”, but it could be greater. In this case Oracle cross-checks with the “transaction table slot” in the UNDO segment header. If it finds that the transaction was actually committed before the read-consistent SCN, it will use the block in the query, and it will lower the “commit SCN” stored in the ITL entry.

Case 3. is similar to case 2. The ITL entry says “transaction not committed”, but that may be because Oracle did not have time to update the ITL entry before the block was flushed from memory. Oracle must cross-check with the “transaction table slot” in the UNDO segment header. If it says “actually, the transaction has committed”, then the ITL entry is updated with an upper bound “commit SCN”. This is known as “delayed block cleanout”.

We can see case 3. and case 2. at work in the following example, using two independent sessions:

  • Session 1
create table T1 (a number, ts timestamp, val number);
create table T2 (a number, ts timestamp, val number);

begin
    insert into T1 values (1, localtimestamp, 0);
    insert into T1 values (2, localtimestamp, 0);
    commit;
end;
/
begin
    insert into T2 values (3, localtimestamp, 0);
    commit;
end;
/
select dbms_flashback.get_system_change_number current_scn from dual;

CURRENT_SCN
-----------
   20031509

update T1 set ts = localtimestamp, val = 1 where a = 2;
alter system flush buffer_cache; -- (1)
commit;

begin
    for i in 1 .. 5000 loop
        update T2 set val = val + 1;
        commit; -- (2)
    end loop;
end;
/
select dbms_flashback.get_system_change_number current_scn from dual;

CURRENT_SCN
-----------
   20036606

(1) Notice that we update T1, then flush the block from memory before the COMMIT. This prevents Oracle from doing the “commit cleanout” in memory that would normally put the real “commit SCN” in the ITL entry.

(2) After the update to T1, we do 5000 transactions to T2 in order to increment the SCN.

  • Session 2
set transaction read only; -- This simulates a long-running query
  • Session 1
begin
    for i in 1 .. 5000 loop
        update T2 set val = val + 1;
        commit;
    end loop;
end;
/
select dbms_flashback.get_system_change_number current_scn,
ora_rowscn from T1 t;

CURRENT_SCN ORA_ROWSCN
----------- ----------
   20041709   20041359
   20041709   20041359

We use the pseudocolumn ORA_ROWSCN to query the “last change SCN” of the block, which corresponds to the highest SCN in the ITL. Notice the value is the same for both rows, even though we updated one row later than the other. Notice also that ORA_ROWSCN shows an SCN value that is clearly less than the current SCN (so we know the data is read consistent with the query) but much greater than the real commit SCN.

What happened here was what I called case 3. above. The data block was flushed from memory before the commit, so Oracle had to go to the UNDO to find out what happened. It discovered the transaction was committed some time ago, and updated the ITL entry to an SCN it could find efficiently and that was “good enough” for the query being executed. This SCN is still almost 10,000 more than the real “commit SCN”.

  • Session 2
select dbms_flashback.get_system_change_number current_scn,
ora_rowscn from T1 t;

CURRENT_SCN ORA_ROWSCN
----------- ----------
   20041714   20036604
   20041714   20036604

This is an example of what I called case 2. above. Oracle got the data block, found an SCN that was too recent for the query, and went to the UNDO to find out if the real commit SCN was lower. Once it found an SCN that was not greater than the read-consistent SCN (at the beginning of the transaction), it updated the ITL entry (together with the “last change SCN”) and used the data block in the query.

  • Session 1
select dbms_flashback.get_system_change_number current_scn,
ora_rowscn from T1 t;

CURRENT_SCN ORA_ROWSCN
----------- ----------
   20041714   20036604
   20041714   20036604

This is just to show that the ITL entry was well and truly changed for all sessions, not just for Session 2.
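
For readers who want to look at the ITL entries directly rather than infer them from ORA_ROWSCN, a block dump is one option. The following is a sketch under several assumptions: ALTER SYSTEM DUMP DATAFILE is not part of the documented SQL syntax, it requires the ALTER SYSTEM privilege, and it expects an absolute file number, which may differ from the relative file number returned by DBMS_ROWID in some configurations. &file_no and &block_no are SQL*Plus substitution variables to be filled in by hand.

```sql
-- Locate the data block(s) that hold the rows of T1.
select dbms_rowid.rowid_relative_fno(rowid) file_no,
       dbms_rowid.rowid_block_number(rowid) block_no,
       a
from T1;

-- Dump that block to the session trace file.
-- Replace &file_no and &block_no with the values returned above
-- (assuming relative and absolute file numbers match on this database).
alter system dump datafile &file_no block &block_no;

-- Find the trace file that now contains the dump (11g and later).
select value from v$diag_info where name = 'Default Trace File';
```

In the trace file, the ITL is listed near the top of the block dump, with one line per entry showing a transaction id, an undo address, a flag and an SCN (typically in hexadecimal). The flag is commonly described as containing a C for a committed transaction and a U for an upper bound “commit SCN”, but the exact layout varies between versions.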

To summarize what we have seen: whenever a query accesses a data block, Oracle has to determine whether the block has changed earlier or later than the read-consistent SCN of the query. It may have to go to the UNDO to find that out, but if it does it updates one or more ITL entries so the next query doesn’t have to do the same work over again. At the end of the query, all the data blocks that were read consistent now have ORA_ROWSCN values no greater than the read-consistent SCN.

By the way, flashback queries work the same way. Using AS OF SCN nnnn, I can lower an ORA_ROWSCN several times in a row and get pretty close to the real commit SCN.
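
Here is a sketch of how one might try this. The SCN in the AS OF clause is an illustrative value chosen between the two CURRENT_SCN values shown in Session 1 above; substitute a value from your own system. As far as I understand the documentation, ORA_ROWSCN is not supported inside a flashback query, so the pseudocolumn is queried separately before and after. The outcome is not guaranteed: it depends on the state of the block and on the UNDO still being available.

```sql
-- 1. Check the current ORA_ROWSCN of the block.
select ora_rowscn, a from T1;

-- 2. Run a flashback query as of an SCN that lies between the real
--    commit SCN and the ORA_ROWSCN just displayed (illustrative value).
select a, val from T1 as of scn 20034000;

-- 3. Query ORA_ROWSCN again to see whether it has been lowered.
select ora_rowscn, a from T1;
```

Repeating steps 2 and 3 with progressively lower SCNs is how the value can be walked down towards the real commit SCN.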

ROWDEPENDENCIES

I said the pseudocolumn ORA_ROWSCN shows the most recent SCN in the block’s ITL. That is true if the table was created with NOROWDEPENDENCIES, which is the default. If we create the table using ROWDEPENDENCIES, Oracle makes room for an SCN for each row. For this kind of table, ORA_ROWSCN shows us the SCN stored with each row, not the block-level SCN we have discussed up until now. I’ll have more to say about this in a later blog post.
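
As a quick illustration of the difference, here is a minimal sketch using a hypothetical table T3 with the same columns as T1. The two rows are inserted in separate transactions, so they are expected to show different ORA_ROWSCN values even though they will most likely sit in the same block.

```sql
-- T3 is a hypothetical table for this sketch.
create table T3 (a number, ts timestamp, val number) rowdependencies;

insert into T3 values (1, localtimestamp, 0);
commit;

insert into T3 values (2, localtimestamp, 0);
commit;

-- With ROWDEPENDENCIES each row carries its own SCN.
select a, ora_rowscn from T3 order by a;

-- Check which setting each table has (ENABLED or DISABLED).
select table_name, dependencies
from user_tables
where table_name in ('T1', 'T3');
```

Running the same two inserts against T1 and querying ORA_ROWSCN should show one value for both rows, since T1 uses the default NOROWDEPENDENCIES.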

Summary

  • Oracle orders all transactions in time by assigning each one a unique SCN on commit: this is called the “commit SCN”. [UPDATE 2017-07-24: it may be possible for two transactions to share the same “commit SCN” (and therefore be simultaneous), but no transaction will ever have more than one “commit SCN”.]
  • Every query returns results that were consistent as of an SCN that I call the “read-consistent SCN”: it corresponds to the start of the query (or the start of the transaction).
  • During query processing, Oracle compares the two SCNs and returns results where the “commit SCN” is never greater than the “read-consistent SCN”.

Sometimes a data block has incomplete information in its ITL, and Oracle consults the UNDO to determine whether the real “commit SCN” is less than or equal to the “read-consistent SCN”. If it is, Oracle updates the ITL to save work for the next query.

The end result is that, after the query finishes, every ITL entry for eligible blocks will contain an SCN less than or equal to the “read-consistent SCN” of the query. This is crucial for doing optimistic locking with ORA_ROWSCN.

More later…
