Don’t use ToString() on discriminated unions in F#!

How to call ToString() on a discriminated union case

Let's look at some ways that we can call the ToString() method, directly and indirectly, on a discriminated union (DU) value in F#.

type MyDU =
    | Case1
    | Case2

Case1.ToString()
string Case1
sprintf "%O" Case1

In F#, the ToString() method is overridden to work like sprintf "%A", which uses reflection to find the DU case name. These all return "Case1".

Why do this?

I see this feature used quite often in F#. Each case name in a DU might also represent something meaningful that gets displayed to a user or written to some output. For example, a DU representing a column in a table might actually have the DU case name shown to the user as a column heading:

type Column =
    | Name
    | Date
    | Cost

(string Name) // returns "Name"

What's the alternative?

The alternative is to write a function that takes the DU value and returns a string. This function is boring and repetitive, especially when there are many cases.

let columnDisplayString column =
    match column with
    | Name -> "Name"
    | Date -> "Date"
    | Cost -> "Cost"

columnDisplayString Name // returns "Name"

You can also implement this as an instance member on the DU so that you can write Name.ColumnDisplayString etc.
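As a minimal sketch, the instance member version might look like the following. It reuses the Column type from above; the member name ColumnDisplayString is just the one suggested in the text.

```fsharp
type Column =
    | Name
    | Date
    | Cost
    // Explicit mapping from each case to its display text.
    member this.ColumnDisplayString =
        match this with
        | Name -> "Name"
        | Date -> "Date"
        | Cost -> "Cost"

let heading = Name.ColumnDisplayString // returns "Name"
```

Because this is an ordinary member, "find all references" in your editor shows every place it is used.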

Don't use ToString(). Write the boring function instead.

What are the downsides of using ToString() compared to a plain function?

It's slow.

I'm not a performance-obsessed developer, and most of the code that I write is not very performance sensitive, but this has caused problems for me in the past. Calling ToString() on the DU is much, much slower than calling a plain function. According to a very unscientific test in FSI, about 10,000x slower! It can become noticeable quite quickly in several use cases.
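If you want to try a similarly unscientific comparison yourself, here is a rough sketch for FSI. The time helper and the iteration count are my own choices for illustration, and the numbers you get will vary by machine and F# version, so treat them as indicative only.

```fsharp
open System.Diagnostics

type Column =
    | Name
    | Date
    | Cost

let columnDisplayString column =
    match column with
    | Name -> "Name"
    | Date -> "Date"
    | Cost -> "Cost"

// Hypothetical helper: runs f many times and prints the elapsed time.
let time (label: string) (f: unit -> string) =
    let sw = Stopwatch.StartNew()
    for _ in 1 .. 100000 do
        f () |> ignore
    sw.Stop()
    printfn "%s: %d ms" label sw.ElapsedMilliseconds

time "ToString()" (fun () -> Cost.ToString())
time "plain function" (fun () -> columnDisplayString Cost)
```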

Renaming a DU case in code could change runtime behaviour and even break it.

If you start using a code identifier like a DU case name during runtime, then something that should be a safe refactoring, like renaming it, may cause unexpected behaviour.
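Here is a small sketch of how that can go wrong. The savedColumn value and the tryParseColumn function are hypothetical; imagine the string was written to a file or database by an earlier version of the app.

```fsharp
type Column =
    | Name
    | Date
    | Cost

// Hypothetical: a column name that was persisted by an earlier build.
let savedColumn = "Cost"

// Finds the case whose ToString() output matches the stored text.
let tryParseColumn (text: string) =
    [ Name; Date; Cost ]
    |> List.tryFind (fun column -> string column = text)

let parsed = tryParseColumn savedColumn // Some Cost
```

If Cost is later renamed to Price, the code still compiles, but tryParseColumn savedColumn now returns None for data saved before the rename.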

It doesn't allow for different string representations for different use cases.

There may be multiple valid ways to represent a DU case depending on the context. For example, different user types may have different terminology for the same concept, or they may speak different languages.
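With plain functions, each context can have its own mapping. The sketch below is only an illustration: the Language type, the French headings, and the columnCsvHeader function are assumptions of mine, not part of any real app.

```fsharp
type Column =
    | Name
    | Date
    | Cost

// Hypothetical set of supported UI languages.
type Language =
    | English
    | French

// Heading shown to the user, depending on their language.
let columnHeading language column =
    match language, column with
    | English, Name -> "Name"
    | English, Date -> "Date"
    | English, Cost -> "Cost"
    | French, Name -> "Nom"
    | French, Date -> "Date"
    | French, Cost -> "Coût"

// A separate representation for a machine-readable export.
let columnCsvHeader column =
    match column with
    | Name -> "name"
    | Date -> "date"
    | Cost -> "cost_gbp"
```

A single ToString() could only ever serve one of these.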

What about overriding ToString() instead...? No. Don't do that either.

What if we override ToString() instead and put the long boring function in the override? That would solve some of these problems, like performance, but it still has quite a big problem of its own. Note that this one applies to overriding ToString() on any type.

You can't see where the override is used.

Since there are so many in-built ways to call ToString() indirectly, you can't tell where your override is being called from just by finding all references to it in your code editor.

This means you can't really tell what the effect will be of changing it. Will it just change this part of your app? Or will it change some other parts too?
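To make this concrete, here is a sketch of an override together with a few of the indirect ways it ends up being called. The "Cost (GBP)" text is made up for the example, and the interpolated string needs F# 5 or later.

```fsharp
open System

type Column =
    | Name
    | Date
    | Cost
    override this.ToString() =
        match this with
        | Name -> "Name"
        | Date -> "Date"
        | Cost -> "Cost (GBP)"

// None of these lines mention ToString, but each one calls the override.
let viaString = string Cost
let viaSprintf = sprintf "%O" Cost
let viaInterpolation = $"{Cost}"
let viaFormat = String.Format("{0}", Cost)
```

Searching for references to the override would not find any of these four call sites.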

Exceptions to the rule

I'm writing this from the perspective of someone writing production code for an application or library. There will always be cases where it is fine to use ToString() for its convenience or its direct representation of code, such as in debug code, logging, or exploratory script code that will be thrown away.

However, for any sizeable production codebase where correctness is important, I strictly adhere to my rules above. This comes from the painful experience and regret of running up against all of the problems listed above.

Beware! 😱