Benefits of pure functions

In the second post of our functional programming fundamentals series, Matt discusses the benefits of pure functions.

Pure function benefits

In the last post in this series, pure functions were defined as functions that are deterministic and have no side effects. I demonstrated that this makes pure functions simpler than impure ones (they have only direct input and output) and isolates them from the rest of the program.

In this post, I'll show some of the resulting benefits, some of which I'll present as drawbacks of impure functions.

Impure functions are harder to use correctly

Consider the functions f and g below.

let mutable counter = 0

let f () =
    counter <- counter + 1
    "f"

let g () =
    if counter < 3 then "All good!"
    else failwith "counter should never be 3 or more!"

f changes state that g uses. Because of this, the following sequence of function invocations results in an exception, which might not be obvious at a glance:

g () // "All good!"
f ()
f ()
g () // "All good!"
f ()
g () // System.Exception: counter should never be 3 or more!

To avoid the bug, a programmer working with f and g needs to understand the indirect inputs and outputs of both functions. Of course, that's not too hard in this example, but it gets harder as more impure functions interacting with the same state are added to a codebase.

A pure function has no reliance on program state, so this kind of bug can't happen when using a pure function.
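
One way to make the hidden dependency visible, as a minimal sketch rather than the only possible design, is to pass the counter in as an argument and return the new value instead of mutating shared state. The names fPure and gPure are hypothetical and are not part of the example above; returning a Result instead of throwing is also my own choice.

```fsharp
// Hypothetical pure counterparts of f and g: the counter is a direct
// input and, for fPure, part of the direct output.
let fPure counter =
    "f", counter + 1

let gPure counter =
    if counter < 3 then Ok "All good!"
    else Error "counter should never be 3 or more!"

// The caller threads the counter through, so every dependency is visible.
let _, c1 = fPure 0
let _, c2 = fPure c1
let beforeThird = gPure c2   // Ok "All good!"
let _, c3 = fPure c2
let afterThird = gPure c3    // Error "counter should never be 3 or more!"
```

Here the value that makes gPure fail appears at the call site, so a reader can see which call to fPure tipped it over.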

Pure functions are easier to test

For this section, I'm going to borrow Mark Seemann's restaurant reservation example from his blog series From dependency injection to dependency rejection. It's simple enough to understand easily, but complex enough to feel like a real-world problem.

type Reservation =
    { Date: System.DateTime
      Quantity: int
      IsAccepted: bool }

let tryAcceptImpure capacity readReservations createReservation reservation = async {
    let! existing = readReservations reservation.Date
    let reservedSeats = existing |> List.sumBy (fun x -> x.Quantity)

    if reservedSeats + reservation.Quantity <= capacity then
        let! reservation = createReservation { reservation with IsAccepted = true }
        return Some reservation
    else
        return None
}

let tryAcceptPure capacity reservations reservation =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity then
        Some { reservation with IsAccepted = true }
    else
        None

To check that the impure function is working as expected when there is enough available capacity, a test needs to check both the return value and the side effect. This requires setting up test doubles:

let testImpure () = async {
    let proto = { Date = System.DateTime(2021, 09, 01); Quantity = 0; IsAccepted = true }
    let existing =
        [ { proto with Quantity = 4 }
          { proto with Quantity = 6 }
          { proto with Quantity = 2 } ]
    let readReservations _ = async.Return existing
    // A reference cell is used here because closures (and the rest of the
    // async block) cannot capture a local `let mutable` value.
    let newlyCreated = ref []
    let createReservation r =
        newlyCreated.Value <- r :: newlyCreated.Value
        async.Return r
    let request = { proto with Quantity = 4; IsAccepted = false }

    let! actual = tryAcceptImpure 20 readReservations createReservation request

    let expected = { request with IsAccepted = true }
    return actual = Some expected && newlyCreated.Value = [ expected ]
}

To test the pure function, there is no need for test doubles and no need to check side effects. The test is both easier to write and easier to read:

let testPure () =
    let proto = { Date = System.DateTime(2021, 09, 01); Quantity = 0; IsAccepted = true }
    let existing =
        [ { proto with Quantity = 4 }
          { proto with Quantity = 6 }
          { proto with Quantity = 2 } ]
    let request = { proto with Quantity = 4; IsAccepted = false }

    let actual = tryAcceptPure 20 existing request

    actual = Some { request with IsAccepted = true }
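
Because the pure function needs no setup, covering more cases is cheap. As a sketch, the boundary between accepting and rejecting could be checked with a small table of capacities. The helper name acceptsWithCapacity and the chosen capacities are my own assumptions; the type and function are repeated only so that the snippet runs on its own.

```fsharp
type Reservation =
    { Date: System.DateTime
      Quantity: int
      IsAccepted: bool }

let tryAcceptPure capacity reservations reservation =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity then
        Some { reservation with IsAccepted = true }
    else
        None

let proto = { Date = System.DateTime(2021, 9, 1); Quantity = 0; IsAccepted = true }
// 4 + 6 + 2 = 12 seats are already reserved.
let existing =
    [ { proto with Quantity = 4 }
      { proto with Quantity = 6 }
      { proto with Quantity = 2 } ]
let request = { proto with Quantity = 4; IsAccepted = false }

// Hypothetical helper: is the request accepted for a given capacity?
let acceptsWithCapacity capacity =
    tryAcceptPure capacity existing request |> Option.isSome

// 12 seats are taken and 4 are requested, so 16 is the smallest capacity that fits.
let boundaryResults =
    [ 15; 16; 17 ] |> List.map acceptsWithCapacity
```

Each case is just another input value, with no test doubles to reset between runs.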

Functions with side effects are harder to parallelise

Suppose you have a slow function that you want to speed up, such as the one below, which fetches ten items one after another.

let getItem i = async {
    do! Async.Sleep 100
    return i
}

// Slow
let sumUntilMoreThanTenImpureSequential =
    // A reference cell, because the async block cannot capture a local `let mutable`
    let acc = ref 0
    let impureConditionalAdd i = async {
        let! next = getItem i
        acc.Value <-
            if acc.Value > 10 then acc.Value
            else acc.Value + next
    }

    [ 1 .. 10 ]
    |> List.map impureConditionalAdd
    |> Async.Sequential
    |> Async.RunSynchronously
    |> ignore

    acc.Value

If the function relies on side effects, naively parallelising it may change the result. Below, the only change is swapping Async.Sequential for Async.Parallel. The answer now varies from run to run, because the additions may happen in any order:

// Faster but almost always gives a different answer
let sumUntilMoreThanTenImpureParallel =
    // A reference cell, because the async block cannot capture a local `let mutable`
    let acc = ref 0
    let impureConditionalAdd i = async {
        let! next = getItem i
        acc.Value <-
            if acc.Value > 10 then acc.Value
            else acc.Value + next
    }

    [ 1 .. 10 ]
    |> List.map impureConditionalAdd
    |> Async.Parallel
    |> Async.RunSynchronously
    |> ignore

    acc.Value

If you avoid using side effects altogether, parallelisation is safe:

// Faster and same answer
let sumUntilMoreThanTenPure =
    let conditionalAdd acc next =
        if acc > 10 then acc else acc + next

    [ 1 .. 10 ]
    |> List.map getItem
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.reduce conditionalAdd

Pure functions can be memoised

Because pure functions are deterministic, you can cache their results for later. This costs space (memory) but saves time. For example, evaluating fib 45 below is slow on my computer.

let rec fib n =
    if n = 0 || n = 1 then 1
    else fib (n - 2) + fib (n - 1)

fib 45 // Slow.
fib 45 // Slow.

Evaluating fibCached 45 is slow the first time, but fast on subsequent evaluations.

let fibCached =
    let d = System.Collections.Generic.Dictionary ()
    let cached n =
        if d.ContainsKey n then
            d.[n]
        else
            let result = fib n
            d.Add(n, result)
            result
    cached

fibCached 45 // Slow.
fibCached 45 // Fast.

This technique is known as memoisation.
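
The caching logic in fibCached does not depend on fib, so it could be pulled out into a reusable wrapper. The following is a minimal sketch under that assumption. The names memoise, square and calls are hypothetical, and the call counter is deliberately impure instrumentation that exists only to show how often the wrapped body runs.

```fsharp
open System.Collections.Generic

// Hypothetical helper: wraps any function of one argument in a cache.
// Only safe for pure functions, since a cached result is reused as-is.
let memoise f =
    let cache = Dictionary<_, _>()
    fun x ->
        match cache.TryGetValue x with
        | true, result -> result
        | false, _ ->
            let result = f x
            cache.Add(x, result)
            result

// Instrumentation only: counts how often the body of square actually runs.
let mutable calls = 0
let square n =
    calls <- calls + 1
    n * n

let squareCached = memoise square
let first = squareCached 12    // runs square
let second = squareCached 12   // served from the cache
```

Like fibCached, this caches only calls made through the wrapper, and the dictionary is not safe to share between threads without extra synchronisation.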

Impurity is necessary

At this point, you may be saying that pure functions look great, but wondering how code that doesn't read from an input stream or emit output to an output stream can be useful. The answer is, of course, that it can't: you need I/O somewhere in your codebase. But, by making the functions that require impurity as simple as possible and pushing them to the edges of your program, you can minimise the drawbacks caused by impurity and maximise the benefits of purity in your codebase. Mark Seemann has a good blog post on this subject: Impureim sandwich.
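
Applied to the reservation example, pushing impurity to the edges might look like the sketch below: an impure read, a pure decision, then an impure write. The in-memory store and the names readReservations, createReservation and tryAccept are hypothetical stand-ins for a real database layer, and the type and pure function are repeated only so that the snippet runs on its own.

```fsharp
type Reservation =
    { Date: System.DateTime
      Quantity: int
      IsAccepted: bool }

// Pure core, as defined earlier in the post.
let tryAcceptPure capacity reservations reservation =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity then
        Some { reservation with IsAccepted = true }
    else
        None

// Hypothetical in-memory stand-ins for a real database.
let store = System.Collections.Generic.List<Reservation>()

let readReservations (date: System.DateTime) = async {
    return store |> Seq.filter (fun r -> r.Date = date) |> List.ofSeq
}

let createReservation (r: Reservation) = async {
    store.Add r
    return r
}

// Impure bread, pure filling: read, decide, write.
let tryAccept capacity (reservation: Reservation) = async {
    let! existing = readReservations reservation.Date        // impure
    match tryAcceptPure capacity existing reservation with   // pure
    | Some accepted ->
        let! saved = createReservation accepted              // impure
        return Some saved
    | None ->
        return None
}

let request = { Date = System.DateTime(2021, 9, 1); Quantity = 4; IsAccepted = false }
let firstResult = tryAccept 6 request |> Async.RunSynchronously    // accepted: 0 + 4 <= 6
let secondResult = tryAccept 6 request |> Async.RunSynchronously   // rejected: 4 + 4 > 6
```

All of the decision logic stays in tryAcceptPure, which can be tested without the store. The impure wrapper is thin enough that there is little left in it to get wrong.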

Summary

Pure functions have some benefits over impure ones. We covered the following in this post:

  • They are easier to use correctly
  • They are easier to test
  • They are easier to parallelise
  • They can be memoised

A useful program has to do some I/O, but can be composed of mostly pure functions with a few simple impure ones at the edges. This allows the codebase to make the most of the benefits of pure functions.