Benchmarking F# code

Continuing his look at performance tests in F#, Isaac shows how to quickly use BenchmarkDotNet to start profiling F# code.

Profiling code is a common requirement, but often we'll resort to some pretty simplistic ways of comparing "before / after" performance:

  • Console logging with start / stop times.
  • Using the Stopwatch type to make timings.
  • Simply running your app and seeing "what happens".
  • Just going on "gut feel".

None of these are particularly effective or useful. There are also profiling tools, either standalone or built into popular IDEs, that you can look at. However, they generally still require you to compare before / after results manually, and they don't give you statistical evidence one way or the other, e.g. by running the same piece of code many times.

Creating your own Benchmarker

You may decide to write a piece of "timing" code:

let withTiming func arg =
    let sw = System.Diagnostics.Stopwatch.StartNew()
    func arg |> ignore
    sw.Stop()
    sw.Elapsed

let runNTimes times func arg =
    Array.init times (fun _ -> func arg)

let benchmark func arg =
    let func = withTiming func
    runNTimes 500 func arg
    |> Array.averageBy(fun r -> r.TotalSeconds)
    |> System.TimeSpan.FromSeconds

Now you can benchmark some code as follows:

// three different implementations of the same code...
let firstVersion x = ...
let secondVersion x = ...
let thirdVersion x = ...

// run all three functions and get average run times to compare
let firstTime = benchmark firstVersion "test"
let secondTime = benchmark secondVersion "test"
let thirdTime = benchmark thirdVersion "test"
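
To make this concrete, here is a minimal, self-contained sketch. The two reverse functions are hypothetical stand-ins for the elided implementations, and benchmarkWithSpread is an illustrative variation on the runner above rather than part of the original code: it reports the standard deviation alongside the mean, which gives a first hint of how noisy hand-rolled measurements can be.

```fsharp
// Time a single call of func with the given argument.
let withTiming func arg =
    let sw = System.Diagnostics.Stopwatch.StartNew()
    func arg |> ignore
    sw.Stop()
    sw.Elapsed

// Hypothetical helper: run func `times` times and return the mean and
// population standard deviation of the timings, in milliseconds.
let benchmarkWithSpread times func arg =
    let samples =
        Array.init times (fun _ -> (withTiming func arg).TotalMilliseconds)
    let mean = Array.average samples
    let variance = samples |> Array.averageBy (fun s -> (s - mean) ** 2.0)
    mean, sqrt variance

// Hypothetical stand-ins for "different implementations of the same code".
let reverseWithArray (s: string) =
    System.String(Array.rev (s.ToCharArray()))

let reverseWithFold (s: string) =
    s |> Seq.fold (fun acc c -> string c + acc) ""

let input = String.replicate 100 "abcdefghij"

for (name, func) in [ "array", reverseWithArray; "fold", reverseWithFold ] do
    let mean, stdDev = benchmarkWithSpread 500 func input
    printfn "%s: mean %.4f ms, std dev %.4f ms" name mean stdDev
```

Even with the spread reported, this sketch does no warmup and takes no account of JIT compilation or garbage collection pauses, so the numbers should be treated as indicative only.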

Congratulations - you've just created a benchmark runner! However, it is usually better to use a dedicated tool for this, such as BenchmarkDotNet. It is a purpose-built "runner" with which you can quickly create your own benchmarks and run them in a repeatable manner to generate statistically meaningful results.

A sample benchmark

As BenchmarkDotNet is simply a NuGet package, we can run it through a console application. Let's create a sample benchmark that tests creating a collection and then performing a simple map against all the elements. We'll compare Seq, Array and List with collection sizes between 100 and 1,000,000 elements:

open System

open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running
open BenchmarkDotNet.Jobs

[<SimpleJob (RuntimeMoniker.NetCoreApp50)>]
type Benchmarks() =
    [<Params(100, 1000, 10000, 100000, 1000000)>]
    member val size = 0 with get, set

    [<Benchmark(Baseline = true)>]
    member this.Array () = [| 0 .. this.size |] |> Array.map ((+) 1)
    [<Benchmark>]
    member this.List () = [ 0 .. this.size ] |> List.map ((+) 1)
    [<Benchmark>]
    member this.Seq () = seq { 0 .. this.size } |> Seq.map ((+) 1) |> Seq.length // force evaluation

BenchmarkRunner.Run<Benchmarks>() |> ignore

You can run the application as follows: dotnet run -c Release.

You must run your benchmark in Release configuration.

The runner will perform warmups, repeated tests for each data set etc. and at the end you'll get a result set printed to the console (as well as outputs in text files for you to analyse later).

Method     size              Mean            Error           StdDev  Ratio  RatioSD
Array       100          807.6 ns         16.11 ns         21.50 ns   1.00     0.00
List        100        1,530.2 ns         29.12 ns         32.37 ns   1.90     0.06
Seq         100        1,150.7 ns         16.38 ns         14.52 ns   1.42     0.06
Array      1000        6,673.4 ns        132.92 ns        186.34 ns   1.00     0.00
List       1000       14,014.3 ns        272.78 ns        241.81 ns   2.08     0.07
Seq        1000       10,655.3 ns        207.50 ns        213.09 ns   1.59     0.06
Array     10000       60,775.7 ns      1,181.36 ns      1,264.04 ns   1.00     0.00
List      10000      193,136.3 ns      3,347.61 ns      3,287.80 ns   3.18     0.08
Seq       10000       95,063.3 ns      1,598.60 ns      1,495.33 ns   1.57     0.04
Array    100000    1,293,077.8 ns      9,469.96 ns      8,394.87 ns   1.00     0.00
List     100000    3,928,031.6 ns     25,723.13 ns     20,082.93 ns   3.04     0.03
Seq      100000      922,725.2 ns      7,162.96 ns      6,349.78 ns   0.71     0.01
Array   1000000   16,954,461.7 ns    333,122.39 ns    798,139.97 ns   1.00     0.00
List    1000000  133,597,745.5 ns  2,633,701.96 ns  4,177,330.40 ns   7.82     0.51
Seq     1000000    9,851,197.8 ns    106,198.69 ns     99,338.32 ns   0.57     0.03

Note that we also marked the Array test as the baseline (Baseline = true) against which the other tests are measured, so the Ratio column shows whether List or Seq is faster or slower than Array. This is useful for testing competing alternatives against an existing piece of code.
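
As a rough sanity check on the Ratio column, the sketch below divides each mean by the Array mean for size 100, using numbers copied from the results above. The helper name ratioToBaseline is made up for illustration. It gives roughly 1.89 for List and 1.42 for Seq; the reported figures can differ slightly (1.90 for List), most likely because the tool works from the underlying measurements rather than from these rounded means.

```fsharp
// Hypothetical helper: how many times slower (> 1.0) or faster (< 1.0)
// a mean timing is compared to the baseline mean.
let ratioToBaseline (baseline: float) (mean: float) = mean / baseline

// Mean timings in nanoseconds for size = 100, copied from the results table.
let arrayMean = 807.6 // Array is the baseline
let others = [ "List", 1530.2; "Seq", 1150.7 ]

for (name, mean) in others do
    printfn "%s / Array = %.2f" name (ratioToBaseline arrayMean mean)
```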

What's also impressive is that you can have multiple input parameters, and BenchmarkDotNet will test every combination of them. In addition to timings, you can add extensions (known as diagnosers) to measure other things, such as memory allocation.
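
A minimal sketch of both ideas follows, assuming the same console application setup as the sample benchmark. The type name MapBenchmarks and the Size and Increment properties are made up for illustration. MemoryDiagnoser is one such extension shipped with the library and adds allocation columns to the results. With two values for each parameter and two methods, the runner should execute eight benchmark cases.

```fsharp
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running

// MemoryDiagnoser asks the runner to report allocations as well as timings.
[<MemoryDiagnoser>]
type MapBenchmarks() =
    // Hypothetical parameter: number of elements in the collection.
    [<Params(100, 10000)>]
    member val Size = 0 with get, set

    // Hypothetical second parameter: the value added to every element.
    [<Params(1, 10)>]
    member val Increment = 0 with get, set

    [<Benchmark(Baseline = true)>]
    member this.Array () =
        Array.init this.Size id |> Array.map ((+) this.Increment)

    [<Benchmark>]
    member this.List () =
        List.init this.Size id |> List.map ((+) this.Increment)

// Every combination of Size and Increment is run for each benchmark method.
BenchmarkRunner.Run<MapBenchmarks>() |> ignore
```

As before, this needs to be run in Release configuration.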

Summary

If you're trying to do some performance testing, don't build your own benchmark framework or library - use something like BenchmarkDotNet. Although it doesn't have an "F# first" API, it is still relatively easy to work with from F#, and the benefits you'll gain far outweigh the minor effort of using an OO-styled API to define your benchmarks.