Working with OpenAI and F# – Prompting with larger datasets

Isaac continues his exploration of OpenAI and F#, this time looking at some larger dataset processing challenges.

In the previous post in this series, I looked at the basics of the OpenAI SDK and how to set up a basic conversation with it, using local state to manage the conversation history. In this post I would like to look at a slightly more involved example that feeds richer, semi-structured input data into the prompt to produce a fuzzy answer - the sort of thing that AI can be a good fit for.

Let's imagine you're in a band playing a small set at the same venue every week. You have a number of songs the band has learned, and would like to create an effective setlist for next week's gig. How could OpenAI help us here?

Here's an example setlist:

setlist.csv

Title,Artist,Style,Tempo,Popularity,Difficulty
God Bless The Child,Billie Holiday,Jazz,Slow,3,Easy
Mercy Mercy Mercy,Cannonball Adderley,Instrumental / Jazz,Medium,2,Easy
The Chicken,Pee Wee Ellis,Instrumental / Jazz,High,3,Medium
Long Train Runnin',Doobie Brothers,Soul / Funk / Rock,High,5,Easy
Cantaloupe Island,Herbie Hancock,Instrumental / Jazz,Medium,2,Easy
To Love Somebody,Bee Gees,Pop / Rock,Medium,5,Easy
Wind Cries Mary,Jimi Hendrix,Blues / Rock,Slow,3,Easy
All Blues,Miles Davis,Instrumental / Blues,Medium,3,Easy
Dock of the Bay,Otis Redding,Soul,Slow,5,Very Easy
Roxanne,The Police,Pop / Rock,Medium,5,Easy

A basic example

Here's a simple starting point that we can prompt OpenAI with:

let setList = System.IO.File.ReadAllText "setlist.csv"
let gigPlanner = Conversation.StartNew()
gigPlanner.SendMessage $"Please generate a possible setlist for me for my next gig. The gig should have 3-4 songs in it. Here's our setlist: {setList}"

Observe how we simply use string interpolation to inject the CSV directly into the question text for OpenAI. We don't need to worry about providing the data in a specific format - as a CSV it's already structured enough (and even if it weren't, OpenAI can parse and make sense of unstructured text easily enough). I got back a response like this:

Here's a suggested setlist for your next gig based on the songs you provided:

1. **Long Train Runnin'** - Doobie Brothers  
   *Style: Soul / Funk / Rock*  
   *Tempo: High*  
   *Popularity: 5*  
   *Difficulty: Easy*

2. **Mercy Mercy Mercy** - Cannonball Adderley  
   *Style: Instrumental / Jazz*  
   *Tempo: Medium*  
   *Popularity: 2*  
   *Difficulty: Easy*

3. **Dock of the Bay** - Otis Redding  
   *Style: Soul*  
   *Tempo: Slow*  
   *Popularity: 5*  
   *Difficulty: Very Easy*

4. **Roxanne** - The Police  
   *Style: Pop / Rock*  
   *Tempo: Medium*  
   *Popularity: 5*  
   *Difficulty: Easy*

This setlist provides a nice variety of tempos and styles, starting off with an upbeat groove before transitioning to a soulful classic and ending on an engaging rock tune. It should keep the audience entertained!

More data, better results

By providing some more context and data, you can get better results from an AI, so let's provide a history of sets we've done - that way, we can ask the AI to ensure we get a fair distribution of songs over time.

gigs.csv

Date,Song
1st June 2024,God Bless The Child
1st June 2024,The Chicken
1st June 2024,Long Train Runnin'
1st June 2024,Wind Cries Mary

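etc.

// Assumption: the gig history shown above is loaded from gigs.csv in the same way as the setlist.
let sampleHistory = System.IO.File.ReadAllText "gigs.csv"
let gigPlanner = Conversation.StartNew()
gigPlanner.SendMessage $"Please generate a possible setlist for me for my next gig. The gig should have 3-4 songs in it. Here's our setlist: {setList}. Also, here's the history of our previous gigs: {sampleHistory}."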

OpenAI now returns a different response:

Based on your previous gigs and the provided setlist, here’s a suggested setlist for your next performance that avoids repetition while still keeping a good mix of styles:

### Setlist for Next Gig:

... [setlist details omitted for brevity]

### Rationale:
- **"Mercy Mercy Mercy"** opens the set with a jazzy, upbeat feel, setting a lively mood.
- **"All Blues"** then shifts listeners into a classic instrumental blues, showcasing musicianship.
- **"The Chicken"** includes an energetic instrumental that maintains the upbeat tempo and adds excitement.
- **"Wind Cries Mary"** closes out the set on a more reflective note, creating a nice contrast with the previous songs.

This setlist offers variety while avoiding songs you've recently played, ensuring that your audience hears something new this time around!

OpenAI gives us a setlist and also explains its choices. In addition, it states that it has avoided songs we've played recently.
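A claim like "avoids songs you've recently played" is easy to check locally before trusting it. The following is a minimal sketch, not part of the original code: `playedSongs` and `repeats` are hypothetical helper names, and the parsing assumes the simple two-column `Date,Song` layout shown above, with no commas in the date. The sample uses only the four history rows visible in this post (the full history is elided), so it illustrates the check rather than passing a verdict on the response.

```fsharp
open System

/// Extracts the song titles from history CSV text with a "Date,Song" header.
/// Assumes the date contains no commas, so the title is everything after the first comma.
let playedSongs (historyCsv: string) : Set<string> =
    let lines = historyCsv.Split([| '\n'; '\r' |], StringSplitOptions.RemoveEmptyEntries)
    lines
    |> Array.skip (min 1 lines.Length)          // drop the header row, if there is one
    |> Array.choose (fun line ->
        match line.IndexOf(',') with
        | -1 -> None                            // ignore rows without a comma
        | i -> Some (line.Substring(i + 1).Trim()))
    |> Set.ofArray

/// Returns the suggested titles that already appear in the history.
let repeats (historyCsv: string) (suggested: string list) : string list =
    let played = playedSongs historyCsv
    suggested |> List.filter (fun title -> played.Contains title)

// Only the history rows shown in this post; the real file is longer.
let history =
    String.concat "\n" [
        "Date,Song"
        "1st June 2024,God Bless The Child"
        "1st June 2024,The Chicken"
        "1st June 2024,Long Train Runnin'"
        "1st June 2024,Wind Cries Mary"
    ]

// Titles taken from the rationale in the response above.
let suggested = [ "Mercy Mercy Mercy"; "All Blues"; "The Chicken"; "Wind Cries Mary" ]

let alreadyPlayed = repeats history suggested
printfn "Already played: %A" alreadyPlayed
```

A check like this could also be fed back into the conversation as a follow-up message if the suggestion repeats more than you would like.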

Prompting

One of the challenges of AI is to "prompt" the system effectively, so that it clearly understands what you want to receive. In this case, that might mean clearly specifying the rules that we want the AI to adhere to when building the setlist; in addition, we might like to show an example of the output format that we want.

Here's an example of how we might better formulate the prompt. I'm using an F# list here, but the rules (and the data) don't have to be static: you could just as well retrieve them from remote systems, the file system, a database etc. Being able to feed in data programmatically like this is one of the things that distinguishes this approach from using the ChatGPT app:

let rules = [
    "There should be at least one jazz number that we would start with"
    "We would end on a 'big' number - a song that gets people's attention, typically something a bit louder and more rock"
    "We want at least one blues song"
]

// String.concat (lower-case, from FSharp.Core) joins the rules with a separator
let concatenatedRules = String.concat "; " rules

gigPlanner.SendMessage
    $"Please generate a possible setlist for me for my next gig. The gig should have 3-4 songs in it. Here's our setlist: {setList}. Also, here's the history of our previous gigs: {sampleHistory}. I have some specific rules for how you can help pick the next setlist: {concatenatedRules}. Please summarise how the setlist fulfills each of the rules once you present the setlist to me. Give both the setlist and rule summary in tabular format."

This is a more explicit set of guidelines for the AI - not only are we giving it more context with two different datasets, but we also provide a clear set of rules as well as some indication of the required output.
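Since the rules and data don't have to be hard-coded, the prompt construction can be pulled into a small function and the rules read from disk. This is a minimal sketch under stated assumptions: `loadRules` and `buildPrompt` are hypothetical names that don't appear in the original code, and the one-rule-per-line text file is just one possible storage format.

```fsharp
open System.IO

/// Loads one rule per line from a text file, skipping blank lines.
/// (Hypothetical helper; the file format is an assumption.)
let loadRules (path: string) : string list =
    File.ReadAllLines(path)
    |> Array.map (fun line -> line.Trim())
    |> Array.filter (fun line -> line <> "")
    |> List.ofArray

/// Builds the prompt text from the song list, the gig history and the rules.
/// The wording mirrors the prompt used above.
let buildPrompt (setList: string) (history: string) (rules: string list) : string =
    let concatenatedRules = String.concat "; " rules
    $"Please generate a possible setlist for me for my next gig. The gig should have 3-4 songs in it. Here's our setlist: {setList}. Also, here's the history of our previous gigs: {history}. I have some specific rules for how you can help pick the next setlist: {concatenatedRules}. Please summarise how the setlist fulfills each of the rules once you present the setlist to me. Give both the setlist and rule summary in tabular format."

// Usage, assuming a hypothetical rules.txt next to the CSV files:
// let prompt = buildPrompt setList sampleHistory (loadRules "rules.txt")
// gigPlanner.SendMessage prompt
```

Keeping the prompt in a pure function like this also makes it easy to inspect or log the exact text before it is sent.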

Now we get the following back from OpenAI:

Here's a possible setlist based on your requirements and previous gig history:

| Order | Title                | Artist               | Style                  | Tempo | Popularity | Difficulty |
|-------|----------------------|----------------------|------------------------|-------|------------|------------|
| 1     | Cantaloupe Island    | Herbie Hancock       | Instrumental / Jazz    | Medium| 2          | Easy       |
| 2     | All Blues            | Miles Davis          | Instrumental / Blues   | Medium| 3          | Easy       |
| 3     | Dock of the Bay      | Otis Redding         | Soul                   | Slow  | 5          | Very Easy  |
| 4     | Roxanne              | The Police           | Pop / Rock             | Medium| 5          | Easy       |

### Rule Summary

| Rule                                                                 | Fulfillment     |
|----------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| At least one jazz number to start with.                              | "Cantaloupe Island" by Herbie Hancock is chosen as the opening jazz number.  |
| End on a 'big' number, typically something louder and more rock.     | "Roxanne" by The Police is selected as the final song; it is upbeat and well-known, capturing the audience's attention.  |
| At least one blues song included.                                    | "All Blues" by Miles Davis is included as the second song, fulfilling the requirement for a blues number in the setlist.  |

This setlist maintains variety while adhering to your specific performance guidelines.

Summary

It's possible - and quite easy - to use OpenAI with datasets. Although the OpenAI chat interface only really accepts strings for data, it is able to parse and understand data embedded in those strings, and to use that to shape the responses it gives back.

In the next post of this mini-series, I'll be showing how we can provide functions to OpenAI so that it can interact with systems that exist outside of the OpenAI platform.