F# 4.7 introduced a couple of interesting additions for throttling async workflows, allowing you to restrict how many workflows execute at any given time. In this post, I'll take you through the "why" and "how" of working with them.
Note: If you're using VS2019, be aware that earlier versions came with a flawed implementation of this. It's now been fixed - make sure you've upgraded to at least 16.4!
F# and Fork / Join today
F# already has an implementation of the fork / join pattern over async { } blocks, using the existing Async.Parallel method:
Async.Parallel : seq<Async<'a>> -> Async<'a []>
In other words, given a sequence of Async computations, each returning some 'a, Parallel will return a single Async computation that returns an array of 'a. Observe an example that "simulates" making 10 network round trips to a remote server to double a number:
let doWork number = async {
    do! Async.Sleep 100 // simulate some network latency...
    return number * 2
}

[ 1 .. 10 ]                 // [ 1; 2; 3 ... ]
|> List.map doWork          // [ Async 2; Async 4; Async 6 ... ]
|> Async.Parallel           // Async [| 2; 4; 6 ... |]
|> Async.RunSynchronously   // block and unwrap to [| 2; 4; 6 ... |]
The issue comes when you need to introduce some form of throttling into your asynchronous work. For example, assume that you needed to make 100 async web calls, but the server rate-limited you to a maximum of 3 requests at any one time. With the standard Async.Parallel implementation, you would send all 100 requests simultaneously, quickly leading to a server error.
Async Throttling
Whilst you can implement your own throttling algorithm, F# now comes with two new functions in the Async module that do this for you:
- Async.Sequential, which enforces sequential processing of multiple workflows.
- An overload for Async.Parallel that gives you fine-grained control of the degree of throttling.
To illustrate some differences in performance characteristics, we ran the same ten asynchronous workflows against four different fork/join algorithms and visualised them here, so that you can see the differences between them. Each workflow simply waits a random time between 1 and 5 seconds (importantly: we re-used the exact same ten workflows across each algorithm, so the timings across the simulations are identical).
1. Sequential
Async.Sequential allows you to run one workflow at a time. There's no parallelisation here, so it completes in the sum of the times that the individual workflows take (40 seconds):
Use this if you want to complete workflows one-at-a-time and achieve a kind of "for each" behaviour.
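As a minimal sketch of that "for each" behaviour, assuming FSharp.Core 4.7 or later: fetch below is a hypothetical stand-in for a real async call, not something from the simulation.

```fsharp
// Hypothetical workload: pretend to call a remote service, then double the id.
let fetch (id: int) = async {
    do! Async.Sleep 50 // simulate some latency
    return id * 2
}

let sequentialResults =
    [ 1 .. 5 ]
    |> List.map fetch          // five workflows, none started yet
    |> Async.Sequential        // Async<int []>: runs them one at a time, in order
    |> Async.RunSynchronously  // block until the last one finishes

printfn "%A" sequentialResults
```

The results come back in the same order as the input workflows, and no workflow starts until the previous one has finished.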
2. Parallel
Async.Parallel allows you to run all workflows simultaneously. There's no throttling though, so you run the risk of e.g. being rate limited or similar. It will complete in the same time as the longest workflow takes to execute:
In other words, if throttling isn't an issue, Async.Parallel will maximise resources as much as possible and return as quickly as possible (in our simulation, around 5 seconds).
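One rough way to see this for yourself is to time a handful of workflows with a Stopwatch. This is only a sketch: sleepFor and the delay values are made up for illustration, and the exact elapsed time will vary from machine to machine.

```fsharp
open System.Diagnostics

// Hypothetical workload: each workflow just sleeps for the given number of milliseconds.
let sleepFor (ms: int) = async {
    do! Async.Sleep ms
    return ms
}

let delays = [ 100; 200; 300; 400; 500 ] // sum = 1500 ms, longest = 500 ms

let stopwatch = Stopwatch.StartNew()

let finished =
    delays
    |> List.map sleepFor
    |> Async.Parallel          // all five start together
    |> Async.RunSynchronously

stopwatch.Stop()

printfn "Ran %d workflows in %d ms" finished.Length stopwatch.ElapsedMilliseconds
```

The elapsed time should land close to the longest delay rather than the sum of all of them.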
3. Hand-rolled Throttling
Before F# 4.7, a common way to achieve throttling was to combine a hand-rolled "sequential" Async accumulator with Seq.chunkBySize and Seq.concat: split the workflows into "chunks", run the workflows within each chunk in parallel, process the chunks one after another, and combine the results once all chunks had completed.
As you can see, it executes each "chunk" of workflows in parallel before sequentially executing the next chunk.
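Here is one possible shape for such a helper. It is a sketch rather than the exact code behind the simulation: chunkedThrottle and work are hypothetical names, a for loop inside async { } stands in for the explicit accumulator, and Array.concat is used in place of Seq.concat to flatten the results.

```fsharp
/// Hypothetical helper: runs workflows in chunks of `chunkSize`.
/// Each chunk runs in parallel; chunks run one after another.
let chunkedThrottle (chunkSize: int) (workflows: seq<Async<'a>>) : Async<'a []> = async {
    let results = ResizeArray<'a []>()
    for chunk in Seq.chunkBySize chunkSize workflows do
        // Start every workflow in this chunk and wait for ALL of them...
        let! chunkResults = Async.Parallel chunk
        results.Add chunkResults
    // ...then flatten the per-chunk arrays back into a single array.
    return Array.concat results
}

// Hypothetical workload for trying it out.
let work (n: int) = async {
    do! Async.Sleep 50
    return n * 2
}

let chunkedResults =
    [ 1 .. 7 ]
    |> List.map work
    |> chunkedThrottle 3       // chunks of 3, 3 and 1
    |> Async.RunSynchronously
```

Note that the next chunk cannot start until the slowest workflow in the current chunk has finished.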
4. Parallel with Throttling
This is the new overload supplied in F# core. It's a tupled method, rather than curried, so you may wish to hand-roll your own curried version to make pipelining easier.
module Async =
    /// A wrapper around the throttling overload of Parallel to allow easier pipelining.
    let ParallelThrottle throttle workflows =
        Async.Parallel(workflows, throttle)
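With a wrapper like that in place, the throttled version drops into a pipeline much like the unthrottled one. The snippet below restates the wrapper so that it runs on its own; double is a hypothetical workload and the throttle of 3 is just an example value.

```fsharp
module Async =
    /// Curried wrapper around the tupled Async.Parallel overload (as above).
    let ParallelThrottle throttle workflows =
        Async.Parallel(workflows, throttle)

// Hypothetical workload.
let double (n: int) = async {
    do! Async.Sleep 50
    return n * 2
}

let throttledResults =
    [ 1 .. 10 ]
    |> List.map double
    |> Async.ParallelThrottle 3   // at most three workflows in flight
    |> Async.RunSynchronously
```

Taking the throttle as the first argument is what lets the workflows flow in through the pipe as the last argument.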
It behaves like the "hand-rolled" throttling example above, with one important distinction which makes it more efficient: observe that in simulation 3, each "chunk" waits until all workflows in that chunk are completed before moving to the next chunk. This means that there will be time within each chunk when only 2 or even only 1 workflow is running.
You can observe that in the timeline graph above between seconds 2 and 5, where only two workflows (Workers 0 and 1) are active, or seconds 14 and 15, where only one workflow (Worker 7) is running. Compare this with the behaviour below:
Notice how as soon as Worker 2 completes, Worker 3 begins. Indeed, if you look vertically along the timeline, you will see that at any given second, there will always be three workers active! This means it completes more quickly than simulation 3 (particularly when there's a wide variance between timings of individual workflows): in this case, 14 seconds instead of 18 seconds.
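If you want to check the concurrency limit without a chart, you can count the active workflows yourself. The sketch below assumes a fixed FSharp.Core (see the note at the top of this post); gate, active, peak and tracked are hypothetical names, and a simple lock is used to keep the counters consistent.

```fsharp
let gate = obj ()   // guards the two counters below
let active = ref 0  // workflows currently running
let peak = ref 0    // highest value of `active` seen so far

// Hypothetical workload that records how many workflows run alongside it.
let tracked (id: int) = async {
    lock gate (fun () ->
        active.Value <- active.Value + 1
        peak.Value <- max peak.Value active.Value)
    do! Async.Sleep 100 // simulate some work
    lock gate (fun () -> active.Value <- active.Value - 1)
    return id
}

let trackedResults =
    Async.Parallel(List.map tracked [ 1 .. 10 ], maxDegreeOfParallelism = 3)
    |> Async.RunSynchronously

printfn "Peak concurrency: %d" peak.Value
```

The reported peak should never exceed the throttle of 3, even though ten workflows were submitted.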
Conclusion
Async workflows in F# are a powerful tool. The existing parallelisation capabilities are good, but occasionally not fit for purpose. Thankfully, these capabilities have now been enhanced to allow throttling that is simple and efficient to use.
Here's the code snippet that lets you experiment with and visualise the four different behaviours of fork-join with async workflows in F# yourself.
Hope you enjoyed this. Have (fun _ -> Ok)!
Isaac