Ranges with NULLs 01: starting over

I have written a lot about ranges, mostly based on dates, and I have tried my best to avoid NULLs in “from” and “to” columns. I give up: NULLs are a fact of life and must be dealt with. This means revisiting all my previous work!

Current list of posts: 

What do I mean by “range”?

Let’s start with the SQL:2011 definition of PERIOD. A PERIOD identifies two datetime columns. The end datetime is greater than the start datetime. In the Oracle implementation, both start and end can be null.

A datetime value falls within the period if:

  • it is greater than or equal to the start datetime
    OR the start datetime is NULL
  • and it is less than the end datetime
    OR the end datetime is NULL.

A period that starts with NULL and ends with NULL will include any datetime value, including NULL!

My notion of “range” is basically an extension of this definition to more data types:

  • A range identifies a pair of values of the same data type.
  • The data type can be any scalar capable of equal, greater-than and less-than comparisons,
    though datetime types and numbers are by far the most common.
  • I tend to use “from” and “to” because “start” and “end” make me think of datetime ranges.
  • “From” is less than “to”
  • “From” and / or “to” can be NULL
  • A value falls within the range if:
    • it is greater than or equal to “from”
      OR “from” is NULL
    • and it is less than “to”
      OR “to” is NULL
  • In table definitions, I strongly recommend constraints
    • “from” < "to". This constraint should always be there.
    • “from” NOT NULL (unless a use case necessitates NULL).

Closed-Open ranges

When defining ranges, we always want to allow the possibility for two ranges to “meet”, meaning there is no gap between them yet they do not overlap.

The “closed-open” model allows this, because the “from” value belongs to the range but the “to” value does not.

If we have a range 1 to 2 and another range from 2 to 3:

  • The value 2 is not in the first range, but it is in the second.
  • There is no value from 1 to 3 that is in both ranges, so there is no overlap.
  • There is no value from 1 to 3 that is in neither range, so there is no gap.
  • In other words, every value from 1 to 3 is in exactly one of the two ranges.

This model will work will all types of range, and it makes some queries easier to write. It’s all good.

Objections to NULL in ranges

I once wrote SQL and date ranges: don’t make NULL mean something, where I gave five arguments against NULL. Here’s what I think as of today.

  1. “NULL should not mean anything”.
    I still agree with that, but what can you do?
  2. “We can use real values instead of NULL”.
    That may be the case for “from”, but not for “to” because there is no valid “to” value that is strictly greater than every valid “to” value!
  3. “Oracle doesn’t index NULLs”.
    Oracle does not index a row if all of the columns to be indexed are NULL. When this is a problem, it can be worked around.
  4. “Queries on NULLable ranges are hard”.
    We’ll see if I am up to the task…
  5. “Unique constraints don’t work with NULL”.
    Again, when this is a problem there are workarounds.

Next up: test data! If the queries are hard, I must try to test them thoroughly.

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s