Optimistic ORA_ROWSCN 1: Lost Updates

I’ve gotten lots of feedback about my proposal to use ORA_ROWSCN for optimistic locking. I’m going to start over in more detail and try to leave less scope for misinterpretation. Thanks to all for your comments, especially Jonathan Lewis who saved me from some technical doubts.

“Optimistic locking” is a type of solution to the problem of “lost updates”. So what do we mean by “lost update”, anyway?

Same row, same column

The Oracle documentation defines a lost update as: “A data integrity problem in which one writer of data overwrites the changes of a different writer modifying the same data.” The Concepts document gives an example where two sessions update the same column in the same row of the same table:

  • Session 1 updates, but does not commit yet
  • Session 2 selects the row, but does not see the change made by Session 1 because it is not committed yet.
  • Session 2 updates the same column in the same row. In fact, it has to wait on Session 1.
  • Session 1 commits.
  • Session 2 is no longer blocked by Session 1, so the Session 2 update overwrites the Session 1 update.
  • Session 2 commits.
  • Session 1 queries the row and discovers its update is “lost”.

Why say this update is “lost”? Simply because Session 2 did not see it when it queried the data. Session 2 overwrote the update without ever seeing it. The documentation also calls this a “conflicting write”.

I think the following scenario is also a “lost update”.

  • Session 2 selects a row.
  • Session 1 updates a column in the row and commits.
  • Session 2 updates the same column in the same row and commits.
  • Session 1 queries the row and discovers its update is “lost”.

In both scenarios, the Session 1 update is “lost” because Session 2 (that is, the user behind Session 2) did not see the change before overwriting it.

Why should we care?

In real life, what we have here is a failure to communicate. Behind the two sessions are two users who want to put different values into the same place in the database. In collaborative software such as Lotus Notes, Session 1 would create a document and Session 2 would create another document marked as a “conflict”: after the fact, human beings would work together to resolve the conflict.

In the database, we can’t store both values simultaneously, so we generally communicate the existence of a conflict by refusing to execute the second update. Whoever gets the error can query the row again, see the change made by the first update, then decide what to do next.

Update vs. Changed Value

An “update” to a column does not necessarily mean that the value of the column changed!

If I say UPDATE EMP SET SAL = SAL, Oracle will perform an update, executing the same triggers and creating UNDO as it does when the SAL value changes. (There is one difference: if the column is indexed and the value does not change, Oracle doesn’t touch the index.)

Updating columns to the same value may mean two different things: either

  1. the application uses generic UPDATE code and always updates the same columns, whether the user wanted to update all of them or just some of them; or
  2. a user deliberately updates a column with the same value it had previously, as a way of confirming that the value is still valid.

I suspect the first case is much more common than the second, but the second cannot be completely ruled out.

Same row, different columns

In both of the scenarios above, suppose Session 1 updates one column and Session 2 updates another column in the same row. As long as Session 2 leaves the first column alone, there is no conflict, no “lost update”.

Multiple rows and columns

I recently learned of a project where the “lost update” scenario has been extended to a group of rows. Since the rows are related functionally, the developers want to avoid applying changes to one row if any of the others in the group have changed. I’ll call this requirement “detecting changed context” instead of “detecting lost update”.

“Lost updates”, and beyond

Optimistic locking basically means detecting updates or changes that were committed between the time the user queried the data and the time (s)he updated it.

But: different applications may want to detect different things. For example:

  • Detect only changed values in columns the user really wants to change.
  • Detect changed values in any updatable column of the row the user wants to change.
  • Detect changed values in the row, even if the user cannot update them.
  • Detect any update to the row, whether values changed or not.
  • Detect changes in the context, meaning all rows related to the target row(s), even those the user does not intend to update.

Mechanisms for detecting changes / updates

In the old “client / server” architectures, each user kept the same session for querying the data and updating it. To avoid conflicts at the row level, it was sufficient to do SELECT FOR UPDATE instead of a simple SELECT. That way, the current user saw all previous changes and the later users would see all the current changes. “Lost updates” were purely and simply avoided. This is known as “pessimistic locking”, since we assume conflicts will occur and take steps to avoid them.

In stateless architectures today, a user may obtain one session for querying the data and another for updating it. No session will remember anything about what happened when the user accessed the database previously. Therefore, SELECT FOR UPDATE is out of the question. This led to various “optimistic locking” strategies, where the data is to some extent queried again and compared to the values that were originally shown to the user. Some methods are:

  • column compare (all columns or only those of interest)
  • hash compare (hashing all or some columns)
  • timestamp compare (a row has been updated if the timestamp is different)
  • version compare (a rows has been updated if the version number has incremented).

Note that for efficiency, conflicts within a row can be detected directly in the UPDATE statement – a prior SELECT FOR UPDATE is not needed.

Where do I fit in?

My proposal is inspired by Tom Kyte (even though he gave up on ORA_ROWSCN himself). I want to use ORA_ROWSCN to detect conflicting updates to a row, whether any values changed or not. We’ll see in the following posts how this works, what requirements it fits and how much we can rely on it.


5 thoughts on “Optimistic ORA_ROWSCN 1: Lost Updates

  1. My proposal is to accept the guidance of Oracle’s gurus and use a ThickDB approach to application design. Meaning, the middle-tier application code executes no SQL directly. The business rules belong in the database behind PL/SQL APIs called by middle tier code. All DML is done in PL/SQL. That solves the problem of locking instead of the symptom of lost updates.

    • Hi Peter,

      I hear they are changing the name from ThickDB to SmartDB, but whatever the name I generally agree with that approach. It doesn’t solve the problem though.

      The problem is human: you and I query a row and see A; you change it to B, then I change it to C without realizing you already changed it to B. Your update is “lost”.

      If you don’t see things that way, you need to explain your thoughts in a lot more detail. What exactly is your solution and how does it work?

      Best regards, Stew

  2. OK, I will agree that the problem is partly human– excluding application code sending back multiple columns for update when only one is really changing and the rest are just coming along for the ride. I just have not seen a good case in all my years where two users really differ on what the value of a column should be. If the problem is with the humans, no amount of technology is going to fix that.

  3. I will never use ora_rowscn for optimistic locking again. I experienced a bug after an oracle upgrade to 11.2 using transportable tablespaces where the ora_rowscn pseudofunction started returning null for the rowscn until the block was updated. It just have me pain and grief….

    • Hi Mike,

      That does sound horrible! Was the problem just in the ORA_ROWSCN pseudocolumn or was there an underlying problem with handling the SCN in the block itself? Could you post a bug or patch number please?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s