Snowflake Has Two Nulls
SQL
SQL's Null handling is confusing. Any value of any type can be Null, unless you impose a not null
constraint. Aggregations like count()
might include Nulls, or might not — it depends how you ask:
create or replace table null_demo ( val integer ); insert into null_demo values (123), (null); select count(*) as c1, count(val) as c2 from null_demo
Results
C1 C2 2 1
Null is not equal to anything, even itself, but it's not not equal. All equality checks return Null:
select 123 = null as c1, 123 != null as c2, null = null as c3, null != null as c4, null is null as c5
Results
C1 C2 C3 C4 C5 null null null null TRUE
SQL's "boolean" operations have three values, not two. Their truth tables can be surprising:
select (true and true) as tt, (true and false) as tf, (false and false) as ff, (true and null) as tn, (false and null) as fn, (null and null) as nn select (true or true) as tt, (true or false) as tf, (false or false) as ff, (true or null) as tn, (false or null) as fn, (null or null) as nn
Truth Table - AND
TT TF FF TN FN NN TRUE FALSE FALSE null FALSE null
Truth Table - OR
TT TF FF TN FN NN TRUE TRUE FALSE TRUE null null
For a rough mental model, replace Null with "Unknown". true or x
is true for any value of X. false and x
is always false. But true and x
could be either true or false, depending on whether X is true. If X is unknown, the result is also unknown, which means the truth value of true or null
is Unknown, or Null. 123 = x
could also be either true or false, depending on X, so 123 = null
is also Null. null = null
asks "Are these two unknown values equal?" We don't know, so the answer is Null!
This mental model isn't perfect. join
does not match Nulls with each other, but group by
does combine Nulls, as if Null were a discrete value instead of a placeholder for the unknown. Aggregates like max()
ignore Nulls altogether. SQL's three-value logic is not always logical:
select val, (val = 0 or val != 0) as tautology, (val = val) as tautology_2 from null_demo
Results
VAL TAUTOLOGY TAUTOLOGY_2 123 TRUE TRUE null null null
We might not know which integer this Null represents, but we know it must be either zero or nonzero, and we know it must equal itself. In SQL, tautologies aren't always true.
Weird!
Javascript
Javascript's Null handling is confusing. There are two ways to represent nonexistence: Null and Undefined. These are distinct values with distinct types, but the language sometimes pretends otherwise, often to stay compatible with decades-old bugs, oversights, and hacks. It's fun to read the history behind these behaviors. It's less fun to program around them. Open your browser's developer console and follow along!
typeof undefined; // undefined typeof null; // object ? undefined === undefined; // true typeof undefined; // undefined typeof document.all; // undefined document.all === undefined; // false ?
A nonexistent object property is Undefined, but a nonexistent HTML element is Null:
let data = {a: 1}; data.a; // 1 data.b; // undefined document.getElementByID('DoesNotExist'); // null
Null is a valid JSON value, but Undefined is not. JSON.stringify
sometimes throws Undefined values away:
let data = {'a': null, 'b': undefined}; JSON.stringify(data); // '{"a":null}' JSON.parse(JSON.stringify(data)); // {"a": null}
Sometimes converts them to Null:
let data = [1, 2, undefined]; JSON.stringify(data); // '[1,2,null]' JSON.parse(JSON.stringify(data)); // [1, 2, null]
And sometimes returns invalid JSON:
let null_json = JSON.stringify(null); JSON.parse(null_json); // null let undefined_json = JSON.stringify(undefined) JSON.parse(undefined_json); // Uncaught SyntaxError: "undefined" is not valid JSON
NaN, everyone's favorite number that isn't a number, is also invalid in JSON, but instead of throwing an error it's converted to Null:
let a = NaN; a == null; // false JSON.stringify(a); // null a == JSON.parse(JSON.stringify(a)); // false
Null can be coerced to 0. Undefined can be coerced to NaN, which is serialized as Null, which can be coerced to 0:
Number(); // 0 Number(null); // 0 Number(undefined); // NaN Number(JSON.stringify(undefined)); // NaN Number(JSON.parse(JSON.stringify(Number(JSON.stringify(undefined))))); // 0 ?
This means that depending on context, and which type coercions Javascript tries under the hood, Null and Undefined might be equal to each other, but also might not be.
null == undefined; // true null >= undefined; // false
Weird!
Snowflake
Snowflake's Null handling is confusing, because Snowflake is a SQL database that integrates with Javascript.
Snowflake lets you define Javascript functions which can be called inside SQL queries. Snowflake also supports Array, Object, and Variant data types, which make it easy to work with JSON. These are great features, and they help make Snowflake the nicest database software I've ever used, by far. For the most part, these systems work well enough that you can't see the seams. But once you hit an edge case, the scales will fall from your eyes.
Snowflake has two null values: SQL Null and JSON Null. When interfacing with Javascript or JSON, the familiar SQL Null maps to Undefined, while the new JSON Null maps to Javascript's Null.
create or replace function get_javascript_null() returns VARIANT language JAVASCRIPT as $$ return null; $$; create or replace function get_javascript_undefined() returns VARIANT language JAVASCRIPT as $$ return undefined; $$; select get_javascript_null() as javascript_null, parse_json('null') as json_null, get_javascript_undefined() as javascript_undefined, null as sql_null, javascript_null = json_null as null_equality, (javascript_undefined is null and sql_null is null) as undefined_equality
Results
JAVASCRIPT_NULL JSON_NULL JAVASCRIPT_UNDEFINED SQL_NULL NULL_EQUALITY UNDEFINED_EQUALITY NULLᴶ NULLᴶ null null TRUE TRUE
These two nulls look slightly different in the Snowflake UI. To make the distinction more obvious, I'll use lower case "null" to mean SQL Null, and upper-case "NULLᴶ" with a J superscript to represent JSON Null. This will cause zero confusion. Trust me.
Snowflake provides the functions to_json
and parse_json
, which are SQL versions of the Javascript functions JSON.parse
and JSON.stringify
. In Javascript, JSON.parse(undefined)
returns Undefined, so Snowflake's corresponding SQL expression parse_json('undefined')
returns SQL Null, which is Snowflake's equivalent of Undefined. What did I tell you? Zero confusion. None.
select null as sql_null, parse_json('undefined') as also_sql_null, parse_json('null') as json_null
SQL_NULL ALSO_SQL_NULL JSON_NULL null null NULLᴶ
When Snowflake creates JSON objects, it tries to match Javascript's serialization logic, kind of. This works for objects, which drop SQL Null just like Javascript drops Undefined:
select object_construct( 'a', 'hello', 'b', null, 'c', parse_json('null'), 'd', 'goodbye' ) as c1
Results
C1 { "a": "hello", "c": NULL, "d": "goodbye" }
However, Snowflake does not always match Javascript. Snowflake lets Undefined values slip through if they're in arrays, and it leaves NaN values alone:
select to_json(parse_json('[1, 2, undefined, null]')) as c1, to_json(parse_json('{"a": NaN}')) as c2
Results
C1 C2 '[1, 2, undefined, null]' '{"a": NaN}'
This means Snowflake can write JSON that Javascript refuses to read:
create or replace function parse_json_with_javascript(INPUT_STR VARCHAR) returns VARIANT language JAVASCRIPT as $$ return JSON.parse(INPUT_STR); $$; select parse_json_with_javascript(to_json(parse_json('[undefined]'))); -- Uncaught SyntaxError: Unexpected token 'u', "[undefined]" is not valid JSON
JSON Null can show up even if you're not touching Javascript / JSON. Snowflake has a phenomenally useful function called parse_url
, which breaks a URL into its hostname, path, query parameters, etc. If a URL has no query parameters, it will express this with JSON Null. This represents the fact that some URLs do have parameters, but this specific URL does not. On the other hand, asking for the URL's "fakeField" returns SQL Null. This represents the fact that no URL will ever have values for fakeField
, because fakeField
does not exist!
select parse_url('https://www.example.com') as url, url:"parameters" as parameters, url:"fakeField" as fake_field;
Results
PARAMETERS FAKE_FIELD NULLᴶ null
If you've ever used Snowflake, you may wonder why you haven't suffocated under the weight of two incompatible Nulls. It's surprisingly easy to use Snowflake for years and never notice JSON Null, because it's fragile. At the drop of a hat, or the slightest hint of type conversion, it will metamorphose into SQL Null.
select parse_json('null') as json_null parse_json('null')::text as sql_null, parse_json('null') and true as sql_null_2, parse_json('null') + 1 as sql_null_3
Results
JSON_NULL SQL_NULL SQL_NULL_2 SQL_NULL_3 NULLᴶ null null null
JSON Null is not SQL Null. It's distinct from SQL Null, and of unknown (null) equality with SQL Null. Unlike SQL Null, JSON Null does equal itself.
select parse_json('null') as json_null parse_json('null') is null as c1, parse_json('null') is distinct from null as c2, parse_json('null') = null as c3, parse_json('null') = parse_json('null') as c4
Results
JSON_NULL C1 C2 C3 C4 NULLᴶ FALSE TRUE null TRUE
SQL Null's type is... null, but JSON Null's type is NULL_VALUE
. Snowflake offers a SQL function called is_null_value
to test whether something is JSON Null. This function returns null when applied to SQL Null, since, after all, SQL Null could be anything — it could even be JSON Null!
select parse_json('null') as json_null, typeof(123) as c1, typeof(parse_json('null')) as c2, typeof(null) as c3, is_null_value(123) as c4, is_null_value(parse_json('null')) as c5, is_null_value(null) as c6
Results
JSON_NULL C1 C2 C3 C4 C5 C6 NULLᴶ INTEGER NULL_VALUE null FALSE TRUE null
Interestingly, while NULL_VALUE is its own (non-Null) type, you can't create a column of that type.
create or replace table null_demo ( c1 NULL_VALUE ); -- ERROR: Unsupported data type 'NULL_VALUE'
JSON Nulls are sometimes ignored in aggregations, just like SQL Nulls:
create or replace table null_demo ( val VARIANT ); insert into null_demo ( select 123 ); insert into null_demo ( select null ); insert into null_demo ( select parse_json('null') ); select count(*) as c1, count(val) as c2 from null_demo
Results
C1 C2 3 1
But this behavior interacts strangely with a not null
constraint. That constraint does not prevent loading JSON Nulls into a table. Instead, it changes the logic of count()
to always count them!
create or replace table null_demo ( val variant not null ); insert into null_demo ( select 123 ); --Fails insert into null_demo ( select null ); --Succeeds insert into null_demo ( select parse_json('null') ); select count(*) as c1, count(val) as c2 from null_demo
Results
C1 C2 2 2
Surprisingly, JSON Null is implemented as a string under the hood, disguised with enough special case logic to give engineers night terrors. In most cases, this is invisible. But when you combine SQL's confusing null handling, two null values, a hacky implementation, and implicit type coercion, there are times when all intuition is meaningless. Only madness remains.
select zeroifnull(123) as a1, ifnull(123, 0) as b1, ifnull(123, 'hello') as c1, zeroifnull(null) as a2, ifnull(null, 0) as b2, ifnull(null, 'hello') as c2, zeroifnull(parse_json('null')) as a3, ifnull(parse_json('null'), 0) as b3, ifnull(parse_json('null'), 'hello') as c3
Results
A1 B1 C1 A2 B2 C2 A3 B3 C3 123 123 123.0 0 0 hello 0 NULLᴶ hello
Weird!