Optimistic ORA_ROWSCN 3: the Use Case

My goal is to show how ORA_ROWSCN can help deal with the problem of “lost updates”. Before that, I need to explain the use case I have in mind: an OLTP application that provides a transactional API in the database, using JSON for data interchange, across a stateless connection.

OLTP versus Batch

Technically, OLTP (Online Transaction Processing) implies lots of small transactions running concurrently, whereas “batch” implies changing lots of data in one process, using bulk or set-based techniques. There is another important difference: with OLTP, someone is there!

To avoid a “lost update”, someone has to be there to “find” it. Suppose you and I query a row at the same time, then we both try to update it. If your update happens first and commits, I need to know it so I can decide whether my update should overwrite yours. In other words, I need to “find” your update so I can think about how to deal with the conflict.
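To make the timeline concrete, here is a minimal sketch using the classic SCOTT.EMP table (the table and values are purely illustrative):

```sql
-- Both of us query the same row and see the same data:
select ename, sal from emp where empno = 7839;   -- we both see SAL = 5000

-- Your session updates first and commits:
update emp set sal = 5500 where empno = 7839;
commit;

-- My session, still working from the stale SAL = 5000, overwrites you:
update emp set sal = 5100 where empno = 7839;
commit;   -- your raise is silently "lost" unless someone tells me about it
```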

When a batch program runs, there is no one there to “find” an update made by something else while the batch is running.

A prerequisite of avoiding “lost updates” is the presence of a human being who can be notified in case of a conflict and decide what to do next.

Stateless: Request and Response

When you and I write SQL or PL/SQL, we generally use a tool like SQL Developer, Toad or SQL*Plus. These tools use the same client/server architecture that applications used in the 1980s: each person has a conversation with the database using a session that keeps track of what is going on. If I query 1000 rows in SQL Developer, the tool will execute the query and fetch 100 rows. If I scroll down, the tool will fetch 100 more rows from the same query – because the cursor was left open in my session.

The fact that the session remembers what is going on during these exchanges is called “state”.

In modern OLTP applications, people access servers using HTTP, then the servers access the database using connections from a pool. Each connection has a session that can be used by you, then me, then you again. When you use the session a second time, the session remembers nothing from your previous use – if it does, there is a huge bug in the application. This absence of “state” is what we call “stateless”.

Being “stateless” makes a big difference. A “stateful” interface can do a SELECT FOR UPDATE, then UPDATE the rows that I change. A “stateless” interface cannot do that, since the session I used for the SELECT and the session I use for the UPDATE are not necessarily the same.
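For contrast, here is what the stateful, pessimistic approach looks like (a sketch against the same illustrative EMP table); it only works because every statement runs in the same session:

```sql
-- One session, kept open for the whole conversation:
select ename, sal from emp where empno = 7839 for update;  -- row is now locked

-- ... the user looks at the data and decides on a change ...

update emp set sal = 5500 where empno = 7839;  -- no one else could sneak in
commit;                                        -- releases the lock
```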

A stateless application must use a pure “Request and Response” protocol with the database:

  • I ask the application to query some data for me. The application gets a connection from the pool, requests the data and shows me the response from the database. The database session forgets about me because the next user will likely be somebody else.
  • I ask the application to update some data. The application gets whatever connection and session are free. If the database needs some information from my first query, the application has to pass it back.

The whole idea of “optimistic locking” is based on this scenario: a stateless OLTP application with a real life user who has to be informed of update conflicts.

JSON and the Object-relational impedance mismatch

As stated in the JSON standard, JSON is a data interchange format. Did I emphasize that enough for you? JSON was not meant for storing data, but for exchanging it between programming languages. Starting with Oracle Database 12c, SQL can query JSON (meaning it can map JSON to SQL rows and columns); starting with version 12.2, SQL can also generate JSON from “normal” database data. In other words, SQL all by itself can now use JSON as a data interchange format.
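As a quick illustration (again with the classic EMP table), JSON_OBJECT generates JSON from rows (12.2) and JSON_TABLE maps JSON back to rows and columns (12c):

```sql
-- Relational to JSON (12.2 and later):
select json_object(
         'empno' value empno,
         'ename' value ename,
         'sal'   value sal
       ) as emp_json
from   emp
where  deptno = 10;

-- JSON to relational (12c and later):
select jt.*
from   json_table(
         '{"empno":7839,"ename":"KING","sal":5000}', '$'
         columns (
           empno number       path '$.empno',
           ename varchar2(10) path '$.ename',
           sal   number       path '$.sal'
         )
       ) jt;
```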

Ever since Java hype captured the minds of enterprise IT architects, developers are supposed to do object-oriented programming. Thinking of data as rows and columns in a SQL database is old hat. Instead of data, there are objects with attributes and methods; several EMP objects can point to the same DEPT object, whereas in a JOIN the same DEPT data is repeated across several rows. The conceptual differences between “object” and “relational” are important enough to have a fancy name: the “object-relational impedance mismatch”. People write object-relational mapping frameworks so that application code can ignore SQL as much as possible, leading to all sorts of wasted resources, time and money.

By using JSON intelligently as a data interchange format, we can map relational data to an “object-friendly” form right in our SQL statements. As long as the application developers and the SQL developers understand the mapping and what underlies it, each can work in their own environment with data everyone can access with relative ease.

Note I am doing the JSON-relational mapping in the database, in SQL. The database client (application or generic tool) accesses the database via SQL*Net and receives data in JSON format already. That data may or may not be a payload for a RESTful web service. JSON in the database can be used in browser-based, #ThinJava applications but also in enterprise applications with thick application layers.

Transactional API: read-consistent, write-coherent

I wish I had a better term than “transactional API”, because I don’t mean the same thing by it as others – sorry!  Let me explain.

Getting back to the “lost update” problem, the user needs one API to query some data and another API to update it. Each API has to include one “request” from the application and one “response” from the database (this is all stateless, remember). In my view, every “query” API should return read-consistent data, and every “update” API should be transactional – at least potentially.

As far as querying is concerned, I am not saying that two different responses from the database should be consistent – only that all the data contained within one response should be internally consistent. If data from two queries are combined in one response, then the queries should be “as of” the same SCN.
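One way to get that “as of the same SCN” behavior is flashback query: capture one SCN, then run every query in the response against it. A sketch (table and column names are illustrative, and DBMS_FLASHBACK requires the appropriate privilege):

```sql
-- Capture one SCN for the whole response:
var scn number
exec :scn := dbms_flashback.get_system_change_number;

-- Both queries see the database as of that same SCN:
select deptno, dname from dept as of scn :scn where deptno = 10;
select empno, ename  from emp  as of scn :scn where deptno = 10;
```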

With respect to “update” APIs (or any API that changes data), the API should provide everything the database needs to make a coherent set of changes that can reasonably be committed together. This is what we used to call a “logical unit of work”. If a change doesn’t make sense by itself, but only as part of a whole, then there should be no API to make just that change.

Unlike the #SmartDB paradigm, I don’t insist that the API commit before returning. It may be convenient for the application to do a complex transaction that involves two APIs which must succeed together or fail together. It may be that the application accesses multiple data sources and does two-phase commits. I do insist that the API is “atomic”: the changes either entirely succeed or entirely fail, and if they fail the database returns to precisely the state it was in before the API was called. For want of a better phrase, I’ll call this API a “write-coherent” API.
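PL/SQL provides this “all or nothing” behavior at the call level: if a stored procedure lets an exception escape to the caller, the database rolls back everything that call changed. A sketch of a “write-coherent” API (the procedure and its parameters are hypothetical, and it deliberately does not commit – the caller owns the transaction):

```sql
create or replace procedure transfer_sal(
  p_from   in emp.empno%type,
  p_to     in emp.empno%type,
  p_amount in number
) as
begin
  update emp set sal = sal - p_amount where empno = p_from;
  update emp set sal = sal + p_amount where empno = p_to;
  if sql%rowcount = 0 then
    -- the unhandled exception makes the database undo BOTH updates,
    -- leaving the data exactly as it was before the call
    raise_application_error(-20001, 'Target employee not found');
  end if;
end transfer_sal;
/
```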

When is there a lost update?

Let me add one final touch to my use case. Optimistic locking schemes are often row-based: if you change any column in a row, I get notified of a “lost update” even if I am changing another column. My approach will start with that idea, but then I will make it more granular so that only updates to the same column will cause a conflict.
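The row-based starting point is often implemented with a version column; a minimal sketch (the VERSION column is hypothetical):

```sql
-- Optimistic update: succeeds only if the row is unchanged since I read it
update emp
set    sal = :new_sal, version = version + 1
where  empno = :empno
and    version = :old_version;

-- zero rows updated means someone got there first: report the conflict
```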

Next up: an overview of the ORA_ROWSCN solution.


7 thoughts on “Optimistic ORA_ROWSCN 3: the Use Case”

  1. Hello Stew,

    With the risk of being accused of having an “old-fashioned thought”, I still cannot entirely comprehend the behavior of the stateless paradigm.
    I do understand that in a stateless environment there are probably different sessions servicing the successive database calls issued by the
    same user, but this should not prevent an application to perform its logic correctly, including data locks.

    For example, what happens if a database call DOES issue a SELECT FOR UPDATE while calling an API ?

    The data lock belongs to the data, so, as long as a COMMIT or ROLLBACK was not issued, the lock is still there.

    As we know, Oracle also keeps track of the various transactions that perform data locks, with the various details recorded in V$LOCK.

    What will happen if, in a stateless environment, a session DOES perform a data lock and then “it leaves” the user whose activity did issue
    that lock ?
    How long will that lock be active and how/when will it be released ?

    Or, otherwise put, when a new (next) session will continue to service the same user’s application session,
    and that new session will be asked to perform a COMMIT, how will it “know” that it should commit changes performed by
    and release locks acquired by another session ?

    I am by no way an APEX specialist, but, from the very few that I have read about this product, I understand that the APEX environment,
    which is also stateless, does perform a kind of “session state preservation” by recording the session state transparently in the database,
    so, when a new database session is assigned to service the next call of a user’s session, it can continue from the point at which the previous
    database session has left off.

    Session state is a great feature of Oracle, if we only think of the possibilities of caching that we can perform at session levels,
    using packages.
    Having to stop using all those benefits just because of the “more modern” stateless environments is a huge disadvantage in my opinion.

    Coding “statelessly” practically says that an interactive session, conducted by “a person who is there” as you called it,
    is in fact performing nothing else but “small batch-like API calls”, similar to those performed by batch jobs,
    aka, self-contained database calls, that are NOT part of any logical flow (including logic of data locking)
    and that, ideally, should probably also contain complete transactions, without leaving any changes and any locks behind.

    I don’t think that using or not using JSON for passing data (parameters) is the topmost problem in this context,
    but, rather, understanding how the various actions performed statelessly are treated on the database’s side,
    or, in other words, to switch a “normal” developer’s thinking from the stateful paradigm of development to the stateless one.

    The entire Oracle documentation, in presenting the locking concepts, is strongly based on the stateful thinking paradigm,
    at least as I am able to understand it.

    So, just saying “DON’T use SELECT FOR UPDATE …” because we “live” in a stateless environment, does NOT cover
    the entire problem.

    Maybe these concepts require a deeper insight ?

    Thanks a lot for these threads & best Regards,
    Iudith Mentzel

    • Hello Iudith,

      Thanks for another thoughtful comment. I’ll reply bit by bit.

      You said: “I do understand that in a stateless environment there are probably different sessions servicing the successive database calls issued by the same user, but this should not prevent an application to perform its logic correctly, including data locks. For example, what happens if a database call DOES issue a SELECT FOR UPDATE while calling an API ? The data lock belongs to the data, so, as long as a COMMIT or ROLLBACK was not issued, the lock is still there.”

      The data lock also belongs to the transaction, which belongs to one and only one session. Only the session that did the DML can do the COMMIT or ROLLBACK. That is the problem.

      You said: “What will happen if, in a stateless environment, a session DOES perform a data lock and then “it leaves” the user whose activity did issue that lock ? How long will that lock be active and how/when will it be released ?”

      The session doesn’t leave the user, the user leaves the session. By “user” I mean the application that calls the database on behalf of the user.

      I think your question is rhetorical. If locks are left hanging then a DBA will probably have to kill the session at some point.

      You said: “I am by no way an APEX specialist, but, from the very few that I have read about this product, I understand that the APEX environment, which is also stateless, does perform a kind of “session state preservation” by recording the session state transparently in the database, so, when a new database session is assigned to service the next call of a user’s session, it can continue from the point at which the previous database session has left off.”

      No, I think not. Web servers and web applications use the concept of “user session” which is not the same as a database session. I believe every time APEX receives a request from the user to change data, it commits before returning control to the user. The APEX “session” may be stateful, but the database session is stateless – at least it doesn’t preserve state across user interactions.

You said: “Session state is a great feature of Oracle, if we only think of the possibilities of caching that we can perform at session levels,
      using packages. Having to stop using all those benefits just because of the “more modern” stateless environments is a huge disadvantage in my opinion.”

      We don’t have to stop caching. All the technical benefits of cursor reuse, cached packages and such still remain. It is only user-related state that is not preserved.

      You said: “Coding “statelessly” practically says that an interactive session, conducted by “a person who is there” as you called it, is in fact performing nothing else but “small batch-like API calls”, similar to those performed by batch jobs, aka, self-contained database calls, that are NOT part of any logical flow (including logic of data locking) and that, ideally, should probably also contain complete transactions, without leaving any changes and any locks behind.”

      In web applications, there is a web server process or thread between the browser and the database; code runs inside it which I call the “application”. This code can get a connection with a session, make several calls, then commit and release the connection. There can be some logical flow, and some data locks, during that time, as long as everything happens after the session is obtained and before it is released.

      Even if the “application” can make several calls to the database, it must commit or roll back and release the connection before returning control to the user. So from the user’s point of view, I think you are basically correct.

      You said: “I don’t think that using or not using JSON for passing data (parameters) is the topmost problem in this context, but, rather, understanding how the various actions performed statelessly are treated on the database’s side, or, in other words, to switch a “normal” developer’s thinking from the stateful paradigm of development to the stateless one.”

      I agree there, but why not JSON too? If you had to deal with frameworks like Hibernate you would be happy to have an alternative that gives you back control of your SQL.

      You said: “The entire Oracle documentation, in presenting the locking concepts, is strongly based on the stateful thinking paradigm, at least as I am able to understand it.
      So, just saying “DON’T use SELECT FOR UPDATE …” because we “live” in a stateless environment, does NOT cover the entire problem.”

      I don’t understand what you mean here. I was immersed in stateless architectures long before I became an Oracle fanboy and I had no problem with the documentation. Your last sentence has no relation to anything I wrote: I said there are stateful architectures that can use SELECT FOR UPDATE, and stateless architectures that can’t. Yes, the latter are more common today.

      My point is that optimistic locking goes with stateless architectures. If your architecture is stateful, you are free to do pessimistic locking with SELECT FOR UPDATE.

      What is the “entire problem” that isn’t covered?

      Best regards, Stew

  2. Hello Stew,

    Thanks a lot for your fast feedback.

    I understand the implications of coding statelessly.
    The phrase that best explains what happens is that, if such a stateless API call performs a data lock without finishing the transaction,
    then the ONLY way to release that lock will be by having the DBA kill the session.
    This says ALL about the very strong limitations of such an environment.

    Caching of data at package levels, with the purpose of having the cached data available across database calls is NOT possible
    in such an environment.
    It is only possible within the boundaries of a single database call.

It is theoretically possible that several successive database calls for a single user (application) session will be serviced by the same database session, but there is no possibility of basing application logic on this, because the business application has no control over which database
session will service each database call.
    The entire control is in the hands of the application server (the middle-tier), for which the business application logic is transparent.

    Well … my thinking is basically influenced from the “hybrid” 3-tier environment of web-deployed Oracle Forms,
which I used along many, many years, and which, along with being deployed as a web application, is still completely stateful,
    and, when it comes to locking, it allows the developer the choice of using either a “half-pessimistic” locking (the default),
    or an optimistic locking, or, even to implement a totally custom-designed locking scheme.

    I still believe that what we call “stateless” environment was in fact a “low-hanging-fruit” implementation of what an ideal multi-threaded
    environment would be.
    Maybe this is the only possible implementation at this moment, but it is far from being ideal.

    The purpose of inventing this paradigm was basically that of allowing a limited number of database sessions to service a high number
    of user ( application ) sessions, and it comes at the price of strong limitations imposed on the features that those database sessions
    are practically able to use.

    The problem with the dedicated sessions environment was that, by its nature, a user interactive application session is idle most of the time,
    and then the database session dedicated to that user session is also idle most of the time.

    The ideal environment will be one in which a users’ application session still does have its own database session, including session state,
    locks, and so on, and, on top of all that high number of database sessions (most of the time idle, but still having a state preserved),
    have a limited number of (active) processes that alternatively switch their activity from one database session to another, when a database
    session becomes idle.

    Regarding the documentation … maybe I was not completely clear.
    What I intended to say is that, the documentation explains everything in terms of a “session” being the same as a “database session”.
    A good sample case are the examples that explain the locks and concurrency issues, usually with two sessions shown side by side,
    similarly to your previous posts on optimistic locking.
    In other words, the documentation is entirely for “inside the database”.
    Each behavior or feature explained this way is only reproducible by a database client application that is stateful,
    but almost meaningless for a stateless application, aka one that is allowed to only issue “closed API calls” (closed transaction calls)
    and not individual SQL statements to the database.

    I even wonder that, in light of the proliferation of the stateless environments, there is no dedicated Oracle manual
    ( or at least an extensive chapter in the various Developer Guides ) that will explain the differences between a stateful and a stateless
    environment and teach the coding practices and limitations that should be followed for stateless development.

    From a practical point of view, I think that it is practically impossible to port an application from a stateful to a stateless environment,
    while the opposite is almost trivial.

    I was never exposed to any stateless environment, but, instead, the stateful environment allowed me the freedom to implement
    all the features available for improving application quality.
It is kind of weird to teach various development best practices as far as SQL or PL/SQL is concerned,
    and afterwards having to come and say:
    “Well, boys, if you are developing for a stateless environment, then please forget about using all these techniques …”.

    Ok … this discussion might still go on … probably each of us is strongly devout to the development paradigm that looks more familiar to him,
    after many years of work.
    Habits are the most difficult ones to change, and this is especially true for the good and useful habits :)

    Thanks a lot again, I will closely follow the upcoming posts :)

    Best Regards,
    Iudith

    • Iudith,

      Your comments are interesting and insightful. I don’t mean to have the “last word”, but just a few clarifications:

      – You said: “Caching of data at package levels, with the purpose of having the cached data available across database calls is NOT possible
      in such an environment. It is only possible within the boundaries of a single database call.”

      I misunderstood what you meant by “caching” in packages. I meant package code remaining in memory for quick reuse, you meant caching of data across calls.

      Now that I understand what you mean, the “boundaries” are not a single database call, but the opening and closing of a connection. Suppose I am a Java program running within a web application server using a connection pool. When I “open” a connection, I am in fact getting a free, already open connection from the pool. I now have access to a database session and I can do whatever I want, until I “close” the connection, which in reality simply frees the connection for the next use.

      So there can be multiple, related database calls within those boundaries, and there usually are. The purpose of the #SmartDB paradigm is to move those multiple calls into stored procedures, so that the single database call (to a transactional API) replaces those multiple calls.

      – You say: “The ideal environment will be one in which a users’ application session still does have its own database session, including session state, locks, and so on, and, on top of all that high number of database sessions (most of the time idle, but still having a state preserved), have a limited number of (active) processes that alternatively switch their activity from one database session to another, when a database session becomes idle.”

      Isn’t that called “shared server”? Oracle implemented that a long time ago and I imagine you are using it.

      – You said, with respect to the documentation and two sessions shown side by side: “Each behavior or feature explained this way is only reproducible by a database client application that is stateful, but almost meaningless for a stateless application, aka one that is allowed to only issue “closed API calls” (closed transaction calls) and not individual SQL statements to the database.”

      Same thing I said before: stateless applications can issue individual SQL statements to the database within the boundaries of a connection. In fact, if there are “closed API calls” then the stored procedure still issues individual SQL statements within a session. Because of that, the features explained are not always “meaningless”, even though the illustrations use stateful tools like SQL*Plus.

      You are so right to underscore the fundamental difference between the two architectures and the “cultural change” required for developers to move from one to the other. However, I insist that I am not “devoted” to stateless architectures. I am simply describing a problem (lost updates) and proposing a solution (optimistic locking using ORA_ROWSCN in a new way). The problem is more acute in stateless architectures, that is all.

      Best regards, Stew

  3. Hello Stew,
    Since you already mentioned “shared servers” (or multithreaded architecture, as it is called),
it is now not clear to me whether they are the same as “the ideal solution” (as I imagine it)
    or they are rather another term for a “server pool” meant and only able to service a stateless architecture.

    If the shared server architecture behaved as I imagined the ideal solution,
    wouldn’t this mean that we already have the best of the two worlds, that is, both “few processes” and “many user sessions”,
    so why at all would we need a stateless architecture ?

    If a Java program does acquire a connection and KEEPS IT dedicated to that program for several database calls, before releasing it,
    this in fact is the same as using a dedicated server architecture, though maybe for a shorter time, because the program itself
    is like a batch job, without idle periods.

    What makes the difference is whether the application itself does or does not have control upon how long a database session
    (and server) remains dedicated to that application (program) only.
    That is why I said that in a stateless architecture, the session (or, better, the server process) is “leaving the user”,
    whether or not that user “would have preferred” to keep it dedicated to it.

In all our developments we always used only the dedicated server architecture; we never used shared servers or stateless architectures.

    So, yes, that “cultural change” as you so nicely called it, does require a more thorough education phase.

    Thanks a lot & Best Regards,
    Iudith

    • Iudith,

      Now we are not understanding each other. “Shared server” is a possible architecture of Oracle Net Services: when an application connects to the database, a dispatcher gives it a database foreground process from a pool, but the process uses the same session the application used previously. There are as many sessions as logged on users, but the number of processes is limited. I thought this was exactly what you were describing, and that as an Oracle Forms person you would have run across this architecture. See http://docs.oracle.com/database/122/CNCPT/application-and-networking-architecture.htm#CNCPT904

      This architecture requires more memory and is less scalable than an architecture where the application server uses a connection pool with a limited number of dedicated processes *and* sessions that are always paired and do not require a dispatcher to put them together.

      I see why you say the server process is “leaving the user” – but you should stop saying it anyway. It is misleading for three reasons:
      – to be “stateful”, the client application does not need to use the same process, it needs to access the same *session*
      – the client application decides when it “leaves” the server process.
      – the user does not connect to the database, the client application does.

      Don’t give the impression that the database process decides or knows or cares whether the application architecture is stateful or stateless. If the application architecture is stateless, the burden is entirely on the client application to leave the session in the same shape as when it arrived. The client can do SELECT FOR UPDATE and leave locks all over the place and the database will obey.

      Keep in mind that in stateless architectures the application code doesn’t stick around. The user hits a button, the web server calls a program, the program connects to the database and does some work, then the program returns something to the user’s screen and goes away. When the user hits another button, the program is called again from scratch. There is no long-lasting program to keep hold of a long-lasting session!

      Yes, in a stateless application architecture then dedicated servers are used, not shared servers.

      Best regards, Stew

  4. Hello Stew,

    We used Oracle Forms for many applications, but NEVER with shared servers architecture.
    Also, other Oracle (non-Forms) applications also used dedicated servers only.

    But, yes, as by your description, using shared servers accomplishes the same as the dedicated servers,
    aka, allowing full usage of ALL the features, locks included.
    As a result, for the end user and client application this appears identical to a dedicated server architecture.
    Among the other things, when the user logs on (ex. by hitting a button), the application connects for him to the database
    and that single session is kept alive for him, until he logs out.
    That is, from the user’s perspective, he “connects” to the database and “leaves” the database.

    On the other hand, a stateless application only “gives the impression” that the user is “working online against a database”.
He is in fact performing a series of independent and “closed” batch API calls, and, behind the scenes,
    each such call connects to the database, performs its action and then disconnects
    (aka, the session “leaves” the user unconnected after it has finished executing a database call).

    What we are discussing in these threads, namely “optimistic locking”, should maybe better be called in this context “optimistic update”,
    because the stateless application will never perform a separate database call for locking and another one for updating.

    The problem of avoiding lost updates is essentially the same, it is only more difficult to solve when pessimistic locking is not available.

    Thanks a lot & Best Regards,
    Iudith
