NULL: How Do You Define Nothing? And Why Would You?

Null pointers: they quietly do nothing at all - and that’s exactly the problem. We use NULL (or null, nil, nullptr, etc.) in almost every program to signify “no value” or “nowhere to point to.” On the surface, it’s a simple concept. Yet these deceptive little placeholders have a long history of causing confusion, bugs, and crashes. In fact, the inventor of the null reference once apologized for creating it, calling it his “billion-dollar mistake”. Why is something so seemingly trivial also so troublesome? Let’s journey through the wild world of NULL to uncover its origins, how it works in memory, and how different programming languages have wrangled this notion of “nothing” into safer designs.

The “Billion-Dollar Mistake”: Origins of NULL

The story of NULL begins in the 1960s. Sir Tony Hoare, a British computer scientist known for inventing Quicksort, introduced the idea of a null reference in 1965 while designing the language ALGOL W. At the time, it seemed like an easy fix - a simple way to indicate that a pointer or reference doesn’t actually point to a valid object. This convenience had a huge catch: decades later, in a 2009 conference talk, Hoare reflected on that decision and famously dubbed it “my billion-dollar mistake”. He attributed countless system crashes, vulnerabilities, and hours of debugging to this single choice of allowing references to point to “nothing.”

Null references found their way into many later languages (C, C++, Java, and more) because they were useful and easy to implement. But they also opened the door to a whole new class of errors. By the 2000s, null pointer bugs were recognized as one of the most common software weaknesses. Simply put, a null reference introduced the possibility that any pointer or object reference might be invalid, and attempting to use it would blow up at runtime. Despite these risks, NULL survived and proliferated, becoming a standard tool to represent the absence of a value.

The Many Uses (and Misuses) of NULL

Why did Hoare (and everyone after him) find NULL so essential? In practice, having a representation of “no value” or “nowhere” is extremely useful. Programs routinely use null pointers and references to mark special conditions - for example, the end of a linked list (the last node’s next pointer is NULL), or as a signal that a search or function call failed to return a valid result. Instead of inventing a dummy object or using out-of-band communication, a NULL return or value cleanly indicates “nothing here.”

Some common situations where NULL is used include:

  • Sentinels in data structures: e.g. a tree node with a null child pointer represents “no subtree.” (The '\0' terminator at the end of a C string plays a similar sentinel role, but it is the NUL character, a zero-valued char, not a null pointer.)
  • Optional values: When a value may or may not be present. For instance, a function that tries to find an element might return a pointer to the element or NULL if not found.
  • Resource handles: If an API returns an object or handle, NULL can indicate failure to acquire the resource (file not opened, memory not allocated, etc).

In these cases, NULL is essentially a stand-in for “not applicable” or “not available.” It’s a placeholder rather than a real object. However, NULL can’t be used everywhere - and this is where the trouble often starts. Where can’t we use NULL? In contexts where a value is always expected:

  • Primitive values: For example, in C or Go, a plain int or float always holds some number; there is no separate “no value” state. (In C, int n = NULL; may even compile when NULL is defined as plain 0, but it just stores zero.) The same goes for a struct or other value type held directly - null only applies to pointers or references.
  • References that must refer to something: Some languages have reference types that disallow nulls. C++ references (e.g. int&) are supposed to always bind to a valid object. Similarly, in Rust, a normal reference &T cannot be null by design - if you have a reference, the language guarantees it’s valid.
  • Low-level hardware operations: At the hardware level, there is no explicit concept of a “null” address separate from any other address. You can’t mark a CPU register as “null” - it will always hold some binary value. This means certain contexts, like direct hardware registers or certain languages’ value types, simply have no room for an out-of-band “null” marker.

In summary, NULL is essential as a concept of “nothing” in many algorithms and systems, but it only makes sense for pointers or references. You can’t have a truly “null integer” or “null float” in C - you’d use 0 or some sentinel value instead. Likewise, some modern type-safe languages choose to avoid NULL entirely for regular references, offering different patterns to represent optional data. To appreciate those designs, we first need to understand what NULL is really under the hood.

Under the Hood: Memory, Pointers, and “No True NULL”

At the machine level, a pointer is just a number - specifically, a memory address. A pointer variable holds bits that correspond to a location in memory. There is no special “null pointer” bit pattern mandated by hardware. I once read in a discussion thread that summed it up perfectly: “On a hardware level, there is no such thing as a null pointer. A pointer is just a word, and a word always holds a number. So by convention, we picked zero.” That stuck with me. There’s no magic “null” in hardware - just the number zero, which we’ve collectively agreed means “points to nothing.” And that choice of zero isn’t a fluke either - there’s a very practical reason why address zero was chosen.

Why choose 0? There’s some elegance in zero: if a pointer is just a number, zero is a natural “no object here” value (it’s a clearly invalid address on most systems, as we’ll see). Using 0 also means we can conveniently check a pointer in code with a simple comparison to 0. For example, in C, you often see code like if (ptr != NULL) {...} - under the hood, this is just checking if the address stored in ptr is zero or not. Many architectures even have machine instructions that can directly test for zero efficiently.

Memory layout perspective: Even when a pointer is NULL, the variable that holds that pointer still occupies memory. Imagine a simple scenario in C:

int x = 42;          /* file scope: stored in static data */

/* inside a function: */
int *p = NULL;       /* local pointer, set to null */
int *q = &x;         /* local pointer, holds the address of x */

Here, p and q are local pointer variables, typically 8 bytes each on a 64-bit system, and x is a global. We’ve set p to null and q to point to x. In memory, it might look like this:

Stack:
    [0x7FFEEFD0] p = 0x0000000000000000   (NULL, points to nothing)
    [0x7FFEEFD8] q = 0x0000000000601060   (address of x)
Heap: (not used in this example)
    ...

Data (global/static memory):
    [0x601060] x = 42

In this example, p’s value is 0 - a bit pattern of all zeros on mainstream platforms - indicating null. q contains 0x601060 (an example address), which is the location of the integer x in memory. Notice that p still takes up space (8 bytes) to store that 0 value, even though it doesn’t point to a valid object. If you had an array of 1000 null pointers, that array would still occupy 1000 × 8 = 8000 bytes on a 64-bit system just to store all those “zero” values. NULL doesn’t mean “nothing stored here” - it means “stored here is a token that signifies no target.”

No true NULL at hardware level: Hardware doesn’t have a dedicated notion of a null pointer. The CPU simply sees the address 0x0 like any other number. What makes 0 special is how software (and the operating system) treats it. In most operating systems, address 0 resides in a protected region of memory. The OS deliberately does not map any valid memory to address 0. This means if a program tries to read or write memory at address 0, the CPU will raise a fault and the program will crash. This design decision - not mapping address 0 - turns the value 0 into a reliably invalid pointer value. It’s “safe” in the sense that using it causes an immediate failure, rather than accidentally corrupting some random memory.

Why not some other value? In theory, any reserved address could represent null. Some systems did use other patterns (like -1, or 0xDEADBEEF in debugging contexts) to mark invalid pointers. But zero became the standard. It’s simple, and it matches what you often get for free: operating systems hand out zero-filled pages, and C initializes static storage to zero. In source code, the C standard allows NULL to be defined as 0 or (void*)0 (and in C23, even as nullptr), and POSIX requires the (void*)0 form. Strictly speaking, the standard doesn’t promise that a null pointer is stored as all-zero bits, only that the constant 0 converts to one, but on practically every mainstream platform the two coincide.

Null checks and pointer operations: At runtime, checking for null is straightforward. High-level code if (p == NULL) turns into a machine instruction that compares the pointer’s value to 0. If we were to peek at some x86-64 assembly corresponding to using a pointer, it might look like:

cmp    QWORD PTR [rbp-8], 0       ; compare pointer p to 0 (NULL?)
je     .Lskip                     ; jump if equal (if p == NULL, skip dereference)
mov    rax, QWORD PTR [rbp-8]     ; load p's value (address) into RAX
mov    DWORD PTR [rax], 42        ; write 42 to *p (assumes p was not NULL)
.Lskip:

In this snippet, the program checks if p is zero and skips the write if so. If p were NULL, jumping to .Lskip avoids a catastrophe. If the check were absent and p = 0x0, the instruction mov DWORD PTR [rax], 42 would attempt to write to memory address 0, which would trigger a segmentation fault on a modern OS. On a lower level, the CPU doesn’t “know” the concept of null; it only knows it was asked to access address 0 and that address is off-limits, so it raises an exception.

It’s worth noting that not all environments enforce this. In some bare-metal or embedded systems, address 0 might actually be a valid memory location (sometimes holding important vectors or data). In those cases, a NULL pointer dereference might not crash - it could read or write some unintended hardware register or ROM. This is one reason why dereferencing null in C/C++ is officially undefined behavior - the language makes no guarantees about what will happen, and an optimizing compiler is allowed to assume it never happens. In conventional user-space applications, though, a null dereference almost always results in a crash.

The Danger Zone: Null Dereferencing and Memory Safety

The biggest problem with NULL is what happens when you forget to check it. A pointer or reference that’s supposed to refer to a valid object might, due to a bug, be NULL - and if the program tries to use it, things go south quickly. In C and C++, this often manifests as the dreaded segmentation fault. In higher-level languages like Java or C#, it throws a runtime exception (NullPointerException in Java). Either way, your program is halted at the point of misuse.

Let’s illustrate with a Java example. Consider:

String s = null;
System.out.println(s.length()); // Attempt to call method on null

When you run this, the JVM detects that s is null and throws a NullPointerException. Under the hood, what happens is akin to the earlier assembly - the JVM tries to load the object reference and then access a field or method. In simplified JIT-compiled machine code, it might do something like:

MOV EAX, [s]       ; load the address stored in s into EAX
MOV ECX, [EAX+4]   ; try to read something at offset 4 (e.g., the length field)

If s was null, EAX now contains 0x00000000. The second instruction tries to read from memory address 0x00000004 (offset 4 from 0), which is not a valid memory address. The CPU traps, but the JVM catches this trap and turns it into a high-level NullPointerException instead of letting the whole process segfault. Whether it’s a native crash or a managed exception, the cause is the same - you tried to use a pointer/reference that was NULL and the system intervened.

Memory safety concerns: Null dereferencing is one of the most common programming errors. Unlike buffer overflows, a null dereference doesn’t overwrite memory or continue execution in some weird state; it typically stops execution immediately. In security terms, a null dereference in user-space is usually a denial-of-service (the program crashes) rather than a direct exploit. However, in kernel or system software, a null pointer bug can sometimes be exploited - attackers have found ways to map a fake page at address 0 to trick the kernel into writing to or jumping to that memory. Thus, eliminating null-related bugs has been a priority for both reliability and security.

Billion-dollar mistake, indeed: null references have caused countless application crashes, and “forgot to check for null” remains a familiar root cause in production bug reports. This is why Tony Hoare’s regret is so widely quoted. It took the industry decades of null-pointer-induced pain to really appreciate safer approaches.

One Concept, Many Languages: Null and Its Alternatives

Every programming language needs to handle the concept of “no value” in one way or another, but they don’t all call it NULL or treat it the same. Let’s compare how six languages approach the absence of a value: C, C++, Rust, Go, Java, and Swift - each offering a unique perspective on our favorite nothing-value.

C: NULL as a Pointer to Nowhere

C was one of the early adopters of the null pointer concept. In C, NULL is a macro that represents a null pointer constant - typically defined as ((void*)0) or just 0 in standard headers. Writing ptr = NULL; in C is the same as ptr = 0; (the constant is implicitly converted to the appropriate pointer type). There is no special runtime support for NULL in C; it’s purely a convention. The responsibility is entirely on the programmer to check for NULL before dereferencing pointers.

#include <stdio.h>

int main(void) {
    int *p = NULL;               // p is a null pointer (points nowhere)
    if (p == NULL) {
        printf("p is null\n");
    }
    // *p = 42;  // (uncommenting this would cause a crash!)
    return 0;
}

Here we set p to NULL and explicitly check it. C provides no automatic protection - it trusts the programmer. The definition of NULL as 0 has a handy side effect: you can use if (p) or if (!p) as a shorthand to check non-null or null, since in C an expression like if (p) is true if p is not zero.

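One quirk: NULL in C is only meaningful for pointer types. Assigning it to a non-pointer variable like an int draws a compiler diagnostic when NULL is defined as ((void*)0) and silently stores zero when it is plain 0; either way you don’t get a distinct “no value” state, and a struct can’t be set to NULL at all. There is no “nullable int” in C - you’d use a separate flag or a special value (like -1) to signify something like “no valid number.”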

C++: nullptr and the Evolution of Null

C++ inherited C’s NULL macro, and for many years it was just as unsafe and implicit as in C. A constant 0 could be a null pointer, which sometimes led to surprises with overloads: given f(int) and f(char*), the call f(0) picks f(int), and f(NULL) either does the same or is ambiguous, depending on how the implementation defines NULL. In 2011, C++11 introduced a new keyword nullptr to improve things. nullptr is a literal of type std::nullptr_t that can convert to any pointer type but not to integers. This gives C++ a type-safe null pointer value.

#include <iostream>

int main() {
    int *p = nullptr;             // modern C++ null pointer
    if (p) {
        std::cout << *p;
    } else {
        std::cout << "p is null\n";
    }

    // int x = nullptr;  // Error: nullptr can't convert to int
    return 0;
}

In this snippet, we use nullptr to initialize p. The if (p) check works similarly to C (it checks for non-null). We also show that you cannot assign nullptr to an int (unlike the old NULL macro, which was 0 and might accidentally be used in the wrong context). Internally, nullptr still ends up as a 0 address when converted to a pointer, but the compiler keeps track of its special type.

Interesting tidbit: The C language itself has caught up a bit - in the C23 standard, C introduced nullptr as well (borrowing from C++), so that C code can opt into a typed null pointer constant. But in everyday C++ usage, prefer nullptr over NULL or 0 for clarity and type safety.

Rust: Options Instead of Null Pointers

Rust took a drastic approach: it completely eliminates null references in safe code. In Rust, you cannot have a null ordinary reference. A reference like &T is guaranteed to point to a valid object of type T. Rust achieves this by pushing the concept of “nullable” to the type system level through the Option<T> enum. Option<T> can be either Some(value) or None, effectively representing “maybe a T, or nothing.” This is very similar to the Option/Maybe types in functional languages.

fn get_even(number: i32) -> Option<i32> {
    if number % 2 == 0 {
        Some(number)
    } else {
        None
    }
}

fn main() {
    let x = get_even(4);
    let y = get_even(5);
    // x is Some(4), y is None
    if let Some(val) = y {
        println!("y is even: {}", val);
    } else {
        println!("y is nothing!");
    }
}

In this snippet, get_even returns an Option<i32> - it returns Some(number) if the input is even, or None if not. There is no way to mistakenly treat y as an integer and use it without unwrapping the option - the Rust compiler won’t let you. By forcing the programmer to consider the None case, Rust eliminates the risk of null dereferences in safe code.

Under the hood, Rust’s Option<&T> (an optional pointer/reference) is represented very efficiently. Rust knows that real references can never be null, so it can use a null value internally to represent the None case of an Option. This means an Option<&T> is essentially the size of a pointer; None is implemented as a null pointer value, and Some(&T) is a non-null pointer. But crucially, you can’t use that value without matching on the option - no unexpected null-dereferences.

This approach is a direct response to the problem Hoare described - by making the absence of a value part of the type system, Rust ensures you can’t forget to handle it.

Go: nil for Pointers (and More) with Run-Time Checks

Go is closer to C in spirit here, but with some twists. In Go, nil is a predeclared identifier that represents the zero-value for pointers, interfaces, maps, slices, channels, and function types. An uninitialized pointer in Go is nil by default. You cannot use nil with value types like integers or booleans - those have their own zero values (0, false, etc.).

package main
import "fmt"

func main() {
    var p *int = nil
    if p == nil {
        fmt.Println("p is nil")
    }
    // fmt.Println(*p)  // would trigger a runtime panic if uncommented
}

We declare a pointer p to int and set it to nil (it would default to nil even without the explicit assignment). If we tried to do *p (dereference), Go would panic at runtime - a controlled panic with a message and stack trace, rather than a raw segmentation fault. Unless something up the call stack calls recover, though, a Go nil dereference still terminates your program just like a C one.

One interesting aspect is how broad the idea of nil is in Go: it’s not just pointers. Slices, maps, channels, etc., all have nil as a valid state. For instance, a nil slice behaves like an empty slice (you can len() it, you get 0). A nil map can’t be written to (it will panic), but reading from it acts like empty. Go doesn’t prevent you from making mistakes, but it gives a consistent paradigm across multiple types.

Java: Universal null with Exceptions

Java inherited the concept of null references from earlier object-oriented languages. In Java, any object reference can be null. Object fields and array elements that haven’t been assigned default to null (local variables must be assigned before use, so the compiler rejects reading an uninitialized one). Primitive types (int, boolean, etc.) are not references and cannot be null - Java makes a clear distinction there.

Java’s approach to null is to mitigate its dangers at runtime. If you call a method or access a field on a null reference, the JVM throws a NullPointerException (NPE). Consider:

String text = null;
if (text == null) {
    System.out.println("No text provided.");
}
System.out.println(text.length());  // This will throw NullPointerException

The if check is fine: it detects null and prints a message. The final line, however, runs regardless and throws an NPE because text is null and we’re trying to call length(). Either avoid such calls by checking first, or handle the NPE via try-catch (though catching NPE is generally considered bad practice).

Java 8 introduced java.util.Optional<T> as a library class to encourage a more Rust/Swift-like handling of absent values. However, Java’s Optional is not used everywhere and is not as integrated into the language as Rust’s or Swift’s options. Modern Java IDEs and code analyzers also use annotations like @Nullable and @NotNull to help catch potential NPE issues statically. And Kotlin (which runs on the JVM) took null-safety further by making references non-nullable by default, with a special ? type for nullable references.

Swift: Optionals for Safer Code

Swift was designed with safety in mind, and one of its features is Optionals. In Swift, any variable that could be “no value” must be declared as an Optional type. Optional is written with a ? suffix on the type (e.g. String? is “optional string”). If a variable is a plain String (not optional), the compiler ensures it can never be nil. You can’t just assign nil to a non-optional value - it’s a compile-time error.

var title: String? = nil    // title can be nil or a String
if title == nil {
    print("No title yet.")
}
title = "Hello"
print(title?.uppercased() ?? "No title")

In this example, title is an optional string. We initialize it to nil, check it, and later assign a real string. When printing, we use optional chaining: title?.uppercased() means “call uppercased() if title is not nil, otherwise produce nil,” and the ?? "No title" provides a default. Swift forces you to either safely unwrap the optional (using if let, optional chaining, etc.) or explicitly say “I know this isn’t nil” by using ! to force unwrap. If you force unwrap a nil optional, it crashes with a runtime error.

Optionals in Swift are implemented as an enum under the hood (much like Rust’s Option), with cases .some(T) and .none. By designing the language this way, Swift essentially banishes the unchecked null reference - making handling the absence of a value an explicit part of the code, catching issues at compile time before they can cause trouble.

Conclusion: From Nothing to Something Learned

The humble NULL value - that seemingly simple stand-in for “nothing” - turns out to have a rich story behind it. We’ve seen how a quick fix in the 1960s led to decades of debugging headaches, earning the moniker “billion-dollar mistake.” Understanding how NULL works at a low level - as just a 0 address that the operating system and language treat specially - gives us insight into why forgetting a single check can crash an entire program. We delved into memory layouts and saw that there’s no magic to null: it’s an ordinary zero value that still takes up space, and it relies on operating system conventions to be safe. We also saw the flip side: without nulls, certain data structures and algorithms would be clunkier. It’s a design trade-off that language creators have grappled with.

Crucially, the evolution of NULL across languages teaches us about language design for safety. C and C++ give us power and rope (sometimes to hang ourselves) - they assume we’ll use NULL wisely. Higher-level and newer languages don’t assume that; they either sandbox the problem (Java catching NPEs) or eliminate it from the type system (Rust, Swift, etc.). The trend in programming language design has been to make “null mishaps” less likely. Rust’s Option, Swift’s Optionals, Kotlin’s non-null types, and even newer C# and Java features all aim to prevent that moment where a program tries to use a big fat nothing as if it were something.

For the curious systems programmer, NULL is a reminder that simple ideas in high-level coding can hide complexity in the machine. Every time you check a pointer for null, you’re connecting back to that hardware reality of memory addresses. And every time you design a data structure, you’re thinking: how do I represent the absence of a value? There’s a certain beauty in how newer languages have tackled this - not by ignoring the problem, but by integrating it into the code we write, making “nothing” a first-class citizen with Option/Optional types. It’s a great example of how understanding a low-level detail (like pointer values and memory) can lead to better high-level practices.

In the end, NULL values are everywhere in computing, and they can bite if we’re not careful. But by appreciating their origins and the nuances of their implementation, we become better equipped to avoid the pitfalls. Whether you’re carefully checking a NULL in C, using nullptr in C++, or unwrapping an Option in Rust or Swift, you’re engaging with the legacy of that wild world of null. It’s a world where nothing matters - and by knowing how nothing works, we can make our software something to be proud of, with fewer crashes along the way.
