RAGRETS.md: Bad default()s

This post is part of my continuing series on language regrets, where I discuss things I regret about C# and would avoid in a reboot. I previously wrote about covariant arrays, a feature widely regretted by the C# design team. This time I'd like to talk about my personal biggest regret, default(T).

First, a brief description of the feature for those not familiar. In C# (and .NET in general), all types have a default or "zero" value. For reference types (classes) this value is null. For value types (structs), this is a value where every field of the struct has its own default value. For the primitive types¹ (int, double, et al.), the default value is 0 (hence "zero value"). In C#, default(T) is an expression, legal in all expression contexts, that evaluates to the default value of the type T given between the parentheses.
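As a quick illustration of the rules above, here is a minimal sketch. The Point struct and the DefaultDemo class are hypothetical names introduced only for this example.

```csharp
using System;

// Point is a hypothetical struct used only for illustration.
struct Point
{
    public int X;        // value-type field: defaults to 0
    public string Label; // reference-type field: defaults to null
}

static class DefaultDemo
{
    public static void Main()
    {
        Console.WriteLine(default(int));            // 0
        Console.WriteLine(default(string) == null); // True

        // A default struct has every field set to that field's default value.
        Point p = default(Point);
        Console.WriteLine(p.X);                     // 0
        Console.WriteLine(p.Label == null);         // True
    }
}
```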

There are two reasons why I dislike the feature. The first is fairly simple: it requires a particular value to be in the domain of the type, which may not be desirable or even legal. Consider reference types: whether or not you want it, null is a valid value of every reference type. This is a plus when you want to store an "uninitialized" reference, but instance members don't support null as a receiver, so calling one throws a NullReferenceException. This can be pretty frustrating if you never want to store an uninitialized reference and any case where the reference is not properly initialized is simply a bug. The same thing can happen with structs. ImmutableArray<T> is a struct that contains a single reference to an array. If the ImmutableArray<T> is created using one of the standard creation functions, it's not possible for the internal array to be null. If, however, you use default(), the array field will be null and the indexer will throw, just like a member access on a null reference.

This probably all sounds familiar: these are the same reasons why null is considered Tony Hoare's "Billion Dollar Mistake", and they are the motivation for the C# 8 nullable reference types feature. That feature tries to reverse some previous decisions by no longer considering null a valid value of all reference types. Of course, this means that non-nullable reference types do not have a default value, and indeed default() will produce a warning if used where a non-nullable reference type is expected. As you can see, null is really just a particularly annoying case of the default() problem.
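The ImmutableArray<T> case described above can be sketched roughly as follows. This assumes the System.Collections.Immutable package is available; the class name ImmutableArrayDemo is made up for the example.

```csharp
using System;
using System.Collections.Immutable;

static class ImmutableArrayDemo
{
    public static void Main()
    {
        // Created through a standard factory: the backing array is never null.
        ImmutableArray<int> ok = ImmutableArray.Create(1, 2, 3);
        Console.WriteLine(ok[0]); // 1

        // Created through default: the single array field is null.
        ImmutableArray<int> bad = default(ImmutableArray<int>);
        Console.WriteLine(bad.IsDefault); // True

        try
        {
            // The indexer dereferences the null backing array.
            Console.WriteLine(bad[0]);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("indexing a default ImmutableArray<int> threw");
        }
    }
}
```

The IsDefault property exists precisely so callers can check for this state, which is a hint that the default value is not really a member of the type's intended domain.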

The second reason why I dislike the feature is that it undermines a fundamental guarantee of the type system. Consider a program which needs to take in user input and produce escaped output for some other service. This is a basic example, but it represents a broad class of problems, namely those where you have to parse input. In this example, you might write the function like this:

string Escape(string input) { ... }

One problem you might see with this function is that the type system doesn't know that you've called it. This means that it can't help prevent calling it twice. It's unlikely that you'll make the mistake of calling

Escape(Escape(input))

but it's certainly possible to accidentally call

var x = Escape(input);
...
var y = Escape(x);

if x and y are separated by sufficient distance in the code. Even worse, when written this way the type system also can't protect you from forgetting to escape the input string. Consider the function Output() which is meant to present an escaped string and has the signature

void Output(string s) { ... }

Because everything is a string, the type system is of no help. However, we can change that. Imagine a new type that pairs with the Escape method:

struct EscapedString
{
    private string Escaped { get; }
    private EscapedString(string escaped) { Escaped = escaped; }
    public static EscapedString Escape(string input)
    {
        // do escaping (the actual logic is elided; this placeholder just copies the input)
        string result = input;
        return new EscapedString(result);
    }
}

By encoding knowledge of escaping into a type, we've brought the power of the type system to bear on our problem. The original double escaping bug would now look like

var x = EscapedString.Escape(input);
...
var y = EscapedString.Escape(x); // type error: Escape takes a string, not an EscapedString

You would now have to go out of your way to re-escape the string; it wouldn't happen accidentally.

Similarly, if Output() is defined

void Output(EscapedString s) { ... }

then we've made it impossible to avoid escaping the string. The only way to call the EscapedString constructor is by calling EscapedString.Escape, which escapes the string.

Well, except for default(). You can always call Output(default(EscapedString)) and avoid running the logic that the type's construction path is supposed to enforce.
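To make the bypass concrete, here is a minimal sketch that mirrors the EscapedString type from earlier. The escaping rule (replacing '<' only), the IsInitialized property, and the BypassDemo class are all assumptions added for the demo; they are not part of the original type.

```csharp
using System;

readonly struct EscapedString
{
    private string Escaped { get; }
    private EscapedString(string escaped) { Escaped = escaped; }

    public static EscapedString Escape(string input)
    {
        // Placeholder escaping rule, assumed only for this sketch.
        return new EscapedString(input.Replace("<", "&lt;"));
    }

    // Hypothetical helper so the demo can observe whether Escape ever ran.
    public bool IsInitialized => Escaped != null;
}

static class BypassDemo
{
    public static void Output(EscapedString s)
    {
        Console.WriteLine(s.IsInitialized
            ? "value came from Escape"
            : "default value: Escape never ran");
    }

    public static void Main()
    {
        Output(EscapedString.Escape("<b>")); // the only intended construction path
        Output(default(EscapedString));      // compiles, but the private constructor never ran
        Output(default);                     // same thing via the default literal (C# 7.1+)
    }
}
```

Making the constructor private does nothing here: default produces a value without going through any constructor at all.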

And this is the main problem with default(): it subverts the guarantees of the type system. When the user can control exactly which paths can produce values of a type, they can use the type system to construct proofs that one of those paths must have been taken. But if the user can't control all those paths, the proof doesn't hold. That's a huge weakening of the entire system.

Now, things aren't completely broken. It's true that in many situations default wouldn't sneak in, or if it did it would quickly cause runtime exceptions. But default is common enough that I've seen these problems occur in practice. Overall, it feels like a bad language trade-off.

As to what I would replace it with—that will have to wait for another post.

¹ Which are value types, but special ones.
