F# with Python pt 2 – TensorFlow binding

This week Ryan is back with a deeper dive into F# - Python interop using Fable 4, creating a rudimentary TensorFlow binding.

I recently posted a blog detailing my adventures generating Python code from F# using the preview of Fable 4.

I showed how to get set up with the tools in VS Code, and even got as far as importing the popular machine learning framework TensorFlow to query the host machine's CPU info.

This was a tantalising glimpse at what is possible with Fable. Just as its JavaScript compilation brought the world of front-end web development to F#, Python support opens a massive ecosystem of tools and libraries, particularly for machine learning.

It also provides a tempting proposition for Python developers who are wishing for a more type-safe experience but need their bread-and-butter tools available. Indeed, at CIT we have worked with at least as many JS developers moving to F# as .NET developers moving into web client programming.

I decided that a fun challenge would be to start a proper, type-safe F# binding for TensorFlow, rather than continuing to access it in the unsafe, dynamic way we did last time.

The great thing about creating bindings like this is that you don't need to try to wrap the whole library. With something the size of TensorFlow, that is a serious undertaking! Rather, you can just wrap the bits you want to use. If it is a popular community tool, perhaps other people will help you fill in the gaps!

With that in mind, I decided to bind just enough to recreate the standard beginner sample of handwritten digit recognition which I previously put together for my TensorFlow.NET blog.

Getting Started

Setting up was essentially the same as in the previous blog, but I needed to update the Fable dotnet tool to 4.0.0-theta-003 and the Fable.Core NuGet package to 4.0.0-theta-001 (note the different version numbers - this caught me out!).
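If you already have the tooling from last time, updating the Fable tool should just be a case of something like this (assuming it was installed as a local dotnet tool - add -g if you installed it globally):

dotnet tool update fable --version 4.0.0-theta-003

The Fable.Core package is referenced directly from the script via a #r "nuget: ..." directive, as shown below.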

We start by opening up Fable.Core and importing TensorFlow.

#r"nuget: Fable.Core, 4.0.0-theta-001"

open Fable.Core

[<ImportAll("tensorflow")>]
let tensorflow: obj = nativeOnly

Last time, we accessed the CPU info like so:

tensorflow?config?list_physical_devices("CPU")

This time, rather than using the dynamic ? operator to dig into the tensorflow object, we will define a proper type for it.

Provided our abstract member names match those in the underlying library, they will be bound automatically.

#r"nuget: Fable.Core, 4.0.0-theta-001"

open Fable.Core

type IPhysicalDevice =
    abstract name: string
    abstract device_type: string

type IConfig =
    abstract list_physical_devices : string -> IPhysicalDevice array

type ITensorFlow =
    abstract config : IConfig

[<ImportAll("tensorflow")>]
let tensorflow : ITensorFlow = nativeOnly

tensorflow.config.list_physical_devices("CPU")

Now we just need to run the fable build command in our console:

dotnet fable --lang Python "Fable 4 Python Example.fsx"

This generates the following Python script:

from __future__ import annotations
from abc import abstractmethod
import tensorflow
from typing import Protocol
from fable_modules.fable_library.types import Array

class IPhysicalDevice(Protocol):
    @property
    @abstractmethod
    def device_type(self) -> str:
        ...

    @property
    @abstractmethod
    def name(self) -> str:
        ...

class IConfig(Protocol):
    @abstractmethod
    def list_physical_devices(self, __arg0: str) -> Array[IPhysicalDevice]:
        ...

class ITensorFlow(Protocol):
    @property
    @abstractmethod
    def config(self) -> IConfig:
        ...

tensorflow.config.list_physical_devices("CPU")

Running the script in Python Interactive gives the expected output:

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

Remember to check that your Python Interactive session is running Python 3.9.13!

After this, I followed a similar pattern to create types that let me navigate to and load the MNIST handwritten digit dataset using Keras.
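For illustration, the extra types look something like the following - a simplified sketch rather than the exact code in the full sample, with the members named after the Python attributes they bind (keras, datasets, mnist and load_data) and with ITensorFlow gaining a keras member alongside config:

type IMnist =
    // load_data() returns ((image_train, label_train), (image_test, label_test))
    abstract load_data : unit -> (obj * obj) * (obj * obj)

type IDatasets =
    abstract mnist : IMnist

type IKeras =
    abstract datasets : IDatasets

type ITensorFlow =
    abstract config : IConfig
    abstract keras : IKeras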

Going further: ndarray

TensorFlow makes frequent use of the NumPy ndarray type.

This is an advanced and optimised array type with many helpful functions.

The first step in the sample is to load our test and training data, along with their labels, from the Keras datasets. These are loaded as a tuple of tuples of ndarrays, i.e. (image_train, label_train), (image_test, label_test).

I initially tried binding ndarray as just a normal array. This worked, and I could use the F# Array module functions to access items.

type INDArray<'a> = array<'a>

let ((image_train:INDArray<INDArray<obj>>, label_train:INDArray<obj>), (image_test:INDArray<INDArray<obj>>, label_test:INDArray<obj>)) = 
   tensorflow.keras.datasets.mnist.load_data()

printfn $"Image train image 1 pixel row 1: %A{image_train |> Array.head |> Array.head}"
Image train image 1 pixel row 1: [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

This was great, but when I decided I wanted to expose ndarray's shape property I realised that you are not allowed to extend the array type.
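To illustrate (a hypothetical snippet, purely to show the compiler pushing back - it is not part of the sample):

type INDArray<'a> = array<'a>

// Does not compile: type abbreviations cannot have augmentations,
// so there is nowhere on INDArray itself to hang a shape member.
type INDArray<'a> with
    member this.shape : int array = nativeOnly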

I decided to create a dedicated NDArray type to hold the member functions, and to include toArray and fromArray methods so I could move back and forth between it and a regular array when necessary.

A second issue was that shape returns a tuple of dimension sizes, which can be any length. Since tuples have to be declared with a fixed length in F#, I tried using an array return type instead, which Fable seemed happy with.

If you want to override the default Fable-generated binding, you can use the Emit attribute to customise it. I used this with toArray and fromArray to trick the type system and just emit the input straight back out again:

type NDArray =
    abstract shape : int array
    [<Emit("$0")>]
    abstract fromArray: 'a[] ->  NDArray
    [<Emit("$0")>]
    abstract toArray: unit -> array<'a>

let ((image_train, label_train), (image_test, label_test)) = 
   tensorflow.keras.datasets.mnist.load_data()

printfn $"Image train shape: {image_train.shape}"
Image train shape: (60000, 28, 28)

This all works almost exactly the same way as the JS Fable bindings, which are well documented.

Reshape and divide

The ndarray.reshape function takes a tuple of arbitrary length determining the size of each output dimension.

Checking the Fable docs again, I found there is support for this using the ParamList attribute. This, however, meant I needed to make the abstract NDArray type concrete.

I also took the opportunity to add the / operation at the same time, which allows you to divide an ndarray by a scalar.

Now we can flatten each image into a single array of pixel values instead of an array of arrays, and normalise them from the range 0-255 to 0-1:

type NDArray =
    [<Emit("$0.reshape($1...)")>]
    member this.reshape([<ParamList>] args : int[]): NDArray = nativeOnly
    [<Emit("$0.ndim")>]
    member this.ndim : int = nativeOnly
    [<Emit("$0.shape")>]
    member this.shape : int array = nativeOnly
    [<Emit("$0 / $1")>]
    member this.``/``(arg : obj) : NDArray = nativeOnly
    [<Emit("$0")>]
    static member fromArray(arg : 'a[]): NDArray = nativeOnly
    [<Emit("$0")>]
    member this.toArray<'a>() : array<'a> = nativeOnly

//.....

let ((image_train, label_train), (image_test, label_test)) = 
   tensorflow.keras.datasets.mnist.load_data()

printfn $"Image train shape: {image_train.shape}"

let image_train_flat = (image_train.reshape [| 60000; 784 |]).``/`` 255

printfn $"Image train flat shape: {image_train_flat.shape}"

Compiling to Python and running in interactive shows us the result:

Image train shape: (60000, 28, 28)
Image train flat shape: (60000, 784)

Named parameters

The final challenge I faced was the requirement for named parameters when calling some functions.

Thankfully, after reaching out to the project maintainers, I found that this is supported with another handy attribute: NamedParams.

This meant I could do things such as the following to define the shape of the Keras model inputs:

type IKeras =
    [<NamedParams(fromIndex = 0)>]
    abstract Input: 
        // it would be nice if this could be a ParamList
        shape:int[] ->
            IKerasTensor

// ...

let inputs = tensorflow.keras.Input(shape = [| 784 |])

Conclusion

I managed to get the rest of the binding finished off pretty quickly. You can find the full sample on our GitHub.

As you can see, whilst bindings aren't particularly complex, there are interesting challenges and choices to be made about how to represent things - not least how closely you stick to the original API versus tailoring it to F#'s strengths, like DUs and pipelining. Here I have kept almost exactly to the Python spec in order to make it familiar to people and compatible with the existing docs.

I might get around to actually publishing this as the start of a proper open-source binding project. I'd love to get other experienced Fable developers' feedback on ways I could have done it better or differently, so feel free to comment on the repo.

Image by Craiyon