Why NULLs don't work the way you think in SQL
Answer: Ternary logic.
If you’ve learned programming through a normal programming language, say, Java, Javascript, C++, Python, … etc., you’ve been trained that null
is a value, a special value. In your training, it means the absence.
However, in SQL, the situation is different: NULL
doesn’t mean the absence, it means UNKNOWN
in the so-called ternary/three-valued logic.
We’ll see what that means. But first, let’s see how NULL
breaks our expectations in SQL.
5 != NULL
is notTRUE
! (but notFALSE
, either. We’ll see)NULL = NULL
is notTRUE
! (but notFALSE
either)5 < NULL
,5 = NULL
, and5 > NULL
are all notTRUE
. (but notFALSE
either)
In SQL, NULL
is not a special value. In SQL
, NULL
means we don’t know the value of this column for this row. That’s how SQL is specified.
For example, a user’s address
being NULL
in SQL doesn’t mean this user is homeless/doesn’t have an address; it means this user’s address is unknown in the system.
So, if you ask SQL if a user’s address
being NULL
is not equal to 221B Baker Street
, SQL won’t answer with TRUE
since it doesn’t know this user’s address to be able to tell that it’s not 221B Baker Street
. However, it won’t answer with FALSE
either; it doesn’t know the user’s address
to tell you that it’s equal to 221B Baker Street
. So, what answer will it give? … SQL’s answer: Don’t know!, which is NULL
in SQL.
I.e., 5 != NULL
, 5 = NULL
, NULL = NULL
, 5 < NULL
, 5 > NULL
, … etc. all evaluate to NULL
in SQL because NULL
means UNKNOWN
and if you compare something to an unknown value, you can’t know the comparison result – hence, the NULL
result.
This is because SQL is designed to implement the so-called ternary/three-valued logic instead of Boolean logic. Instead of a proposition being TRUE
or FALSE
, it can be TRUE
, FALSE
, or UNKNOWN
. And it has its truth tables accordingly. The UNKNOWN
I’ve been highlighting is UNKNOWN
as in ternary logic. I.e., NULL
in SQL is the equivalent of UNKNOWN
in ternary logic.
Sample truth table in ternary logic:
A | B | A OR B |
---|---|---|
TRUE | TRUE | TRUE |
TRUE | UNKNOWN | TRUE |
TRUE | FALSE | TRUE |
UNKNOWN | TRUE | TRUE |
UNKNOWN | UNKNOWN | UNKNOWN |
UNKNOWN | FALSE | UNKNOWN |
FALSE | TRUE | TRUE |
FALSE | UNKNOWN | UNKNOWN |
FALSE | FALSE | FALSE |
However, let’s be real: that’s now what we mean when we use NULL
is SQL. We usually mean absence of value and/or that NULL
is a special value.
For example, what we mean with a tweet’s reply_to
being NULL
is this tweet is a brand new tweet, not a reply to another tweet, not we don’t know what this tweet is replying to as SQL thinks.
So, how do we survive in this reality?
- Operators
IS NULL
andIS NOT NULL
are your friend. You probably already know what they mean so won’t waste your time here. But the conclusion: always check withIS NULL/IS NOT NULL
first if the column is nullable. Example:address IS NULL OR address != '221B Baker Street
. - If you find yourself in a situation where you’re using some custom wrapper that doesn’t allow you to write full equivalents of SQL, e.g., one that just takes
filters: Array<{columnName: string, filterValue: string}>
and the column is nullable, make sure to check theNULL
cases by hand and see if they’ll evaluate to what you want.
Forgot to tell you: If the result of the WHERE
clause for a row is NULL
, SQL treats that as FALSE
and thus won’t include this row in the result set.
Also, don’t fully depend on the ternary-logic thing in SQL, unfortunately. For example, UNION(NULL, NULL)
should be {NULL, NULL}
but it’s just one NULL
…
I.e., use SQL
‘s NULL
as we mean it: absence of value. But make sure to use IS NULL
/IS NOT NULL
properly.
Ternary logic might be useful and might be a good design paradigm for some use-cases, but I wouldn’t depend on SQL being a ternary-logic system – I’d implement my own on top of it/use another library.
Resources
- Oracle DB’s page on NULLs, very nice and to the point.
- Wikipedia’s section on SQL being three-valued logic application.
- Julia Evans‘s NULL surprises, a page from her zine Become a SELECT star!.
Disclaimers
- For literary purposes (or otherwise), I may not have described everything accurately. Or may have made mistakes.
- I’m not responsible for anything in this article. Read on your own risk.
Privacy notes
- Buttondown tracks who opens, clicks, or forwards my emails, and shares this data with me. If you want to delete this data, email me and I’ll try my best!
- Buttondown adds “UTM sources” to my links to let websites I’ve linked to know that the source of the new traffic is my newsletter (as far as I understand).
- Buttondown gives me the choice to opt out of both points above, but I’ve chosen to opt in.
Copyright
This is an All-Rights Reserved work. Please, don’t rehost it.
Linking to it is naturally allowed (and appreciated!), but rehosting is not. Reasonable sharing like forwarding to a few friends or a quote screenshot is fair use, I believe.
You can reply directly to this email.