Pure function benefits
In the last post in this series, I defined pure functions as functions that are deterministic and have no side effects. I demonstrated that this makes pure functions simpler than impure ones (they have only direct input and output) and isolates them from the rest of the program.
In this post, I'll show some of the resulting benefits, some of which I'll present as drawbacks of impure functions.
Impure functions are harder to use correctly
Consider the functions f and g below.
let mutable counter = 0

let f () =
    counter <- counter + 1
    "f"

let g () =
    if counter < 3 then "All good!"
    else failwith "counter should never be 3 or more!"
f changes state that g uses. Because of this, the following sequence of function invocations results in an exception, which might not be obvious at a glance:
g () // "All good!"
f ()
f ()
g () // "All good!"
f ()
g () // System.Exception: counter should never be 3 or more!
To avoid the bug, a programmer working with f and g needs to understand the indirect inputs and outputs of both functions. Of course, that's not too hard in this example, but it gets harder as more impure functions interacting with the same state are added to a codebase.
A pure function has no reliance on program state, so this kind of bug can't happen when using a pure function.
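As a minimal sketch of what that could look like, the counter can be passed in and returned explicitly. The names fPure and gPure are hypothetical and not part of the example above, and I've assumed a Result in place of the exception so that the outcome is an ordinary return value:

```fsharp
// Hypothetical pure counterparts of f and g: the counter is a direct
// input and output instead of shared mutable state.
let fPure counter =
    "f", counter + 1

let gPure counter =
    if counter < 3 then Ok "All good!"
    else Error "counter should never be 3 or more!"

// The dependency between the two functions is now visible at the call site.
let _, c1 = fPure 0
let _, c2 = fPure c1
let _, c3 = fPure c2

let before = gPure c2 // Ok "All good!"
let after = gPure c3  // Error "counter should never be 3 or more!"
```

Calling gPure with the same counter always gives the same result, no matter how many times fPure has been called elsewhere in the program.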
Pure functions are easier to test
For this section, I'm going to borrow the restaurant reservation example from Mark Seemann's blog series From dependency injection to dependency rejection, because it's simple enough to understand easily but complex enough to feel like a real-world problem.
type Reservation =
    { Date: System.DateTime
      Quantity: int
      IsAccepted: bool }

let tryAcceptImpure capacity readReservations createReservation reservation = async {
    let! existing = readReservations reservation.Date
    let reservedSeats = existing |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity then
        let! reservation = createReservation { reservation with IsAccepted = true }
        return Some reservation
    else
        return None
}

let tryAcceptPure capacity reservations reservation =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity then
        Some { reservation with IsAccepted = true }
    else
        None
To check that the impure function is working as expected when there is enough available capacity, a test needs to check both the return value and the side effect. This requires setting up test doubles:
let testImpure () = async {
    let proto = { Date = System.DateTime(2021, 09, 01); Quantity = 0; IsAccepted = true }
    let existing =
        [ { proto with Quantity = 4 }
          { proto with Quantity = 6 }
          { proto with Quantity = 2 } ]
    let readReservations _ = async.Return existing
    let mutable newlyCreated = []
    let createReservation r =
        newlyCreated <- r :: newlyCreated
        async.Return r
    let request = { proto with Quantity = 4; IsAccepted = false }
    let! actual = tryAcceptImpure 20 readReservations createReservation request
    let expected = { request with IsAccepted = true }
    return actual = Some expected && newlyCreated = [ expected ]
}
To test the pure function, there is no need for test doubles and no need to check side effects. The test is both easier to write and easier to read:
let testPure () =
    let proto = { Date = System.DateTime(2021, 09, 01); Quantity = 0; IsAccepted = true }
    let existing =
        [ { proto with Quantity = 4 }
          { proto with Quantity = 6 }
          { proto with Quantity = 2 } ]
    let request = { proto with Quantity = 4; IsAccepted = false }
    let actual = tryAcceptPure 20 existing request
    actual = Some { request with IsAccepted = true }
Functions with side effects are harder to parallelise
You may have a slow function that you want to speed up, like this one, which awaits each item in turn:
let getItem i = async {
    do! Async.Sleep 100
    return i
}

// Slow
let sumUntilMoreThanTenImpureSequential =
    let mutable acc = 0
    let impureConditionalAdd i = async {
        let! next = getItem i
        acc <-
            if acc > 10 then acc
            else acc + next
    }
    [ 1 .. 10 ]
    |> List.map impureConditionalAdd
    |> Async.Sequential
    |> Async.RunSynchronously
    |> ignore
    acc
If it relies on side effects, naively parallelising it may give different results. Here the only change is to use Async.Parallel instead of Async.Sequential; the answer varies because the additions may be done in any order:
// Faster but almost always gives a different answer
let sumUntilMoreThanTenImpureParallel =
    let mutable acc = 0
    let impureConditionalAdd i = async {
        let! next = getItem i
        acc <-
            if acc > 10 then acc
            else acc + next
    }
    [ 1 .. 10 ]
    |> List.map impureConditionalAdd
    |> Async.Parallel
    |> Async.RunSynchronously
    |> ignore
    acc
If you avoid using side effects altogether, parallelisation is safe:
// Faster and same answer
let sumUntilMoreThanTenPure =
    let conditionalAdd acc next =
        if acc > 10 then acc else acc + next
    [ 1 .. 10 ]
    |> List.map getItem
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.reduce conditionalAdd
Pure functions can be memoised
Because pure functions are deterministic, you can cache their results for later. This costs space (memory) but saves time. For example, evaluating fib 45 below is slow on my computer.
let rec fib n =
    if n = 0 || n = 1 then 1
    else fib (n - 2) + fib (n - 1)

fib 45 // Slow.
fib 45 // Slow.
Evaluating fibCached 45 is slow the first time, but fast on subsequent evaluations.
let fibCached =
    let d = System.Collections.Generic.Dictionary ()
    let cached n =
        if d.ContainsKey n then
            d.[n]
        else
            let result = fib n
            d.Add(n, result)
            result
    cached

fibCached 45 // Slow.
fibCached 45 // Fast.
This technique is known as memoisation.
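The same idea can be pulled out into a reusable helper. The sketch below is one possible way to do it; memoise and fibMemoised are hypothetical names, and the helper assumes the wrapped function is pure and that the cache is only used from a single thread:

```fsharp
open System.Collections.Generic

// Hypothetical helper: wraps any single-argument function in a cache.
// Only safe when f is pure, because a cached result replaces a real call.
let memoise (f: 'a -> 'b) : 'a -> 'b =
    let cache = Dictionary<'a, 'b>()
    fun x ->
        match cache.TryGetValue x with
        | true, value -> value
        | false, _ ->
            let value = f x
            cache.Add(x, value)
            value

// Same definition of fib as above, repeated so the sketch stands alone.
let rec fib n =
    if n = 0 || n = 1 then 1
    else fib (n - 2) + fib (n - 1)

let fibMemoised = memoise fib

fibMemoised 30 // Computed by calling fib.
fibMemoised 30 // Looked up in the cache.
```

As with fibCached, only the top-level calls are cached; the recursive calls inside fib still go to the uncached function.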
Impurity is necessary
At this point, you may be saying that pure functions look great, but wondering how code that doesn't read from an input stream or emit output to an output stream can be useful. The answer is, of course, that it can't: you need I/O somewhere in your codebase. But, by making the functions that require impurity as simple as possible and pushing them to the edges of your program, you can minimise the drawbacks caused by impurity and maximise the benefits of purity in your codebase. Mark Seemann has a good blog post on this subject: Impureim sandwich.
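To make this concrete with the reservation example from earlier, here is a rough sketch of an impure function that wraps the pure tryAcceptPure. The name tryAccept is hypothetical, readReservations and createReservation stand in for real I/O, and this is my own illustration rather than code taken from the linked post:

```fsharp
type Reservation =
    { Date: System.DateTime
      Quantity: int
      IsAccepted: bool }

// Pure core, repeated from earlier so the sketch stands alone.
let tryAcceptPure capacity reservations reservation =
    let reservedSeats = reservations |> List.sumBy (fun x -> x.Quantity)
    if reservedSeats + reservation.Quantity <= capacity then
        Some { reservation with IsAccepted = true }
    else
        None

// Hypothetical composition: impure read, pure decision, impure write.
let tryAccept capacity readReservations createReservation reservation = async {
    let! existing = readReservations reservation.Date       // impure
    match tryAcceptPure capacity existing reservation with  // pure
    | Some accepted ->
        let! created = createReservation accepted           // impure
        return Some created
    | None ->
        return None
}
```

All of the decision logic lives in tryAcceptPure, so it can be tested without test doubles, while tryAccept is thin enough that there is little left in it to get wrong.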
Summary
Pure functions have some benefits over impure ones. We covered the following in this post:
- They are easier to use correctly
- They are easier to test
- They are easier to parallelise
- They can be memoised
A useful program has to do some I/O, but can be composed of mostly pure functions with a few simple impure ones at the edges. This allows the codebase to make the most of the benefits of pure functions.
In the next post, we'll look at immutability and expressions.