United Kingdom: +44 (0)208 088 8978

Custom Equality and Comparison in F#

F# Records support full equality and comparison for free. Sometimes, though, this isn't what you want! Isaac Abraham explains when this might be, and how to optimise for your use cases.

F# records generally fit what we need when working with data in terms of comparison and equality:

  • They compare on value, not reference. Two different instances of a record of the same type with the same contents will be equal.
  • They perform deep equality checking by default. Each field in a record is itself compared for equality; if that field itself implements equality, it too will participate in the check.
  • Records support automatic comparison as well, by comparing each field one at a time until it encounters a field which is greater or less in one record than the other.

This behaviour works well for many, many situations. However, there are times when you may not want this behaviour. For example, imagine that you only care about a single field for equality e.g. CustomerId, or you only want to compare against a subset of fields on the record. A second case may be if you want to optimise for performance and can guarantee that data won't ever change in a record - for example, static datasets which will not change for the lifetime of the application - in other words, long-term immutable datasets.

Implementing Manual Custom Equality

In such cases, you can opt to manually "switch" to custom equality in code at time of comparison. Let's assume a simple record:

type Customer =
    { CustomerId : int
      Name : string
      Age : int
      Town : string }
let areTheSame =
    customerA.CustomerId = customerB.CustomerId // Equality check against the ID field only.

Or within a dictionary:

// Create a lookup of orders against CustomerId
customers
|> Array.map(fun c -> c.CustomerId, loadOrders c.CustomerId)
|> readOnlyDict

Of course, this means extra coding, the risk of accidentally using comparison or equality fields inconsistently, as well as not explicitly stating in your domain what the comparison fields really are.

Implementing Automatic Equality Checking

F# allows you to override the equality checking in a few ways that you should take care of - in other words, whenever you do a .Equals or a = check. To do this, you can apply what is known as custom equality and custom comparison. Start by applying the following:

[<CustomEquality>]
type Customer =
    { CustomerId : int; Name : string; Age : int; Town : string }

This will initially lead to a compiler error:

A type with attribute 'CustomEquality' must have an explicit implementation of
at least one of 'Object.Equals(obj)', 'System.IEquatable<_>' or
'System.Collections.IStructuralEquatable

In other words, you need to specify some custom equality implementation. The reality is actually somewhat more complicated, as we'll see. Let's start by providing a custom Object.Equals implementation:

[<CustomEquality>]
type Customer =
    { CustomerId : int; Name : string; Age : int; Town : string }

     // custom check - compare against CustomerId only
     override this.Equals other =
        match other with
        | :? Customer as p -> p.CustomerId.Equals this.CustomerId
        | _ -> false

Now, you'll get another compiler error:

The 'CustomEquality' attribute must be used in conjunction with the 'NoComparison' or 'CustomComparison' attributes

For now, we can fix this by adding NoComparison:

[<CustomEquality; NoComparison>]
type Customer =
...

Still not done! We also now get a warning that we've not implemented GetHashCode - it's good practice to ensure that this behaves using the same fields as Equals does:

[<CustomEquality; NoComparison>]
type Customer =
    { CustomerId : int; Name : string; Age : int; Town : string }
    ...
    override this.GetHashCode () = this.CustomerId.GetHashCode() // custom hash check

Observe the behaviour below for two records that have the same ID but different data:

let a = { CustomerId = 1; Name = "Test"; Age = 12; Town = "Town" }
let b = { CustomerId = 1; Name = "Test"; Age = 12; Town = "TownTwo" }

a = b // true
a.Equals b // true

let lookup = [ a, "Hello"; b, "Goodbye" ] |> readOnlyDict
lookup.Count // 1

For efficiency, we also provide an implementation of IEquatable<'T>, so that when you know both sides are of 'T, you can avoid the unnecessary pattern match / type check. The final implementation looks as follows:

[<CustomEquality; NoComparison>]
type Customer =
    { CustomerId : int; Name : string; Age : int; Town : string }

    interface IEquatable<Customer> with
        member this.Equals other = other.CustomerId.Equals this.CustomerId

    override this.Equals other =
        match other with
        | :? Customer as p -> (this :> IEquatable<_>).Equals p
        | _ -> false
    override this.GetHashCode () = this.CustomerId.GetHashCode()

Implementing Custom Comparison

If you also want to take part in custom "default" sorting - so ordering Customers in a list, or within an F# Set, you'll need to add the following:

[<CustomComparison; CustomEquality>]
type Customer =
    { CustomerId : int; Name : string; Age : int; Town : string }
    interface IComparable with
        member this.CompareTo other =
            match other with
            | :? Customer as p -> (this :> IComparable<_>).CompareTo p
            | _ -> -1

    interface IComparable<Customer> with
        member this.CompareTo other = other.CustomerId.CompareTo this.CustomerId

Now that we've implemented this fully, we can simply use Customer values themselves to handle equality and comparison - so, for example, as a key in a dictionary.

Performance of Custom Comparison

Let's now compare the performance of four different data structures for the following test:

Given a collection of items of a given type, put them into a Dictionary. Then, look up every item in that dictionary.

This test should check the performance for both insertions and lookups.

The four structures were:

  1. A Customer record with four simple fields (int and strings) using default equality.
  2. The same record but with custom equality / comparison implementations that uses just a single int field, CustomerId.
  3. A record just with a single CustomerId field on it.
  4. A raw integer.

Even with a simple record of just four primitive fields, we can observe that implementing an optimised equality & comparison implementation gives us benefits of between 50% and 70% compared to the default comparer. If your record is more complex - using nested records, discriminated unions etc. then this benefit will be even more pronounced.

As in the previous article in this series, you can find the full source code here.

Summary

F#'s default record equality / comparison implementation is a great starting point. However, if you need to change these semantics, you can do so with a couple of attributes and interfaces. Doing so not only can reduce code and complexity, but can yield significant performance boosts.