Programming against OpenAI
The OpenAI platform is a set of developer APIs for programming against the same AI models (and more) that power ChatGPT. It includes image generation, code generation and speech recognition, as well as (of course) the GPT models, which you can use just as you would use ChatGPT, except from code. In this series of posts, I'd like to give a gentle introduction to working with it and illustrate some ways that you can start to use AI in your applications today.
To get started with OpenAI, you'll first need a paid account with some credits - contrary to what some of the documentation suggests, there doesn't appear to be a free tier any longer. However, paying just a few dollars will give you enough credits for a large number of interactions with the service (the charging model can be complex, and depends on many factors such as which service you're using, which model you choose, and the amount of content being transmitted).
The power of scripts
As I've shown many times in the past, F# scripts are a great way to start working with a new API. OpenAI is no different, especially when working with chat bots, where you have an interactive conversation and may need to "break out" of it temporarily before resuming. Let's start by simply getting a reference to the OpenAI SDK and creating a client:
#r "nuget: OpenAI, 2.0.0"
open OpenAI.Chat
let client = ChatClient("gpt-4o-mini", "API-KEY-GOES-HERE")
let response = client.CompleteChat([ UserChatMessage "Please send a friendly greeting back to me!" :> ChatMessage ])
response.Value.Content[0].Text
(*
{
Content = seq [Hello! I hope you're having a wonderful day! How can I assist you today?];
ContentTokenLogProbabilities = seq [];
CreatedAt = 13/10/2024 15:34:32 +00:00;
FinishReason = Stop;
FunctionCall = null;
Id = "chatcmpl-....";
Model = "gpt-4o-mini-2024-07-18";
Role = Assistant;
SystemFingerprint = ".....";
ToolCalls = seq [];
Usage = OpenAI.Chat.ChatTokenUsage;
}
*)
let secondResponse = client.CompleteChat([ UserChatMessage "Please send a funny greeting back to me!" :> ChatMessage ])
secondResponse.Value.Content[0].Text
(*
Sure thing! Here you go: "Why did the computer go to therapy? Because it had too many bytes of emotional baggage! Hope your day is full of laughter and zero glitches!"
*)
Note the extra :> ChatMessage on the call to CompleteChat. Without it, type inference would infer the list to be a list of UserChatMessage, which won't compile because F# lists are not covariant. This is typical when interacting with libraries that use inheritance-based type hierarchies, such as the OpenAI one. In F#, it's much more common to use discriminated unions for this kind of "is-a" relationship.
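To illustrate that last point, here is a minimal sketch of how a discriminated union could sit in front of the SDK's class hierarchy, so that the upcast lives in one place. The Message type and the toChatMessage function are hypothetical names made up for this example; they are not part of the OpenAI SDK.

```fsharp
#r "nuget: OpenAI, 2.0.0"
open OpenAI.Chat

/// Hypothetical wrapper type (not part of the OpenAI SDK).
type Message =
    | User of string
    | Assistant of string

/// Convert a wrapper value into the SDK's class hierarchy.
/// The explicit upcast is needed once here, rather than at every call site.
let toChatMessage (message: Message) : ChatMessage =
    match message with
    | User text -> UserChatMessage(text) :> ChatMessage
    | Assistant text -> AssistantChatMessage(text) :> ChatMessage

// A plain F# list of union cases, mapped to ChatMessage values at the boundary.
let history =
    [ User "Please send a funny greeting back to me!"
      Assistant "Why did the computer go to therapy?"
      User "Now send me a short one." ]
    |> List.map toChatMessage
```

The resulting history value is a ChatMessage list, so it could be passed to CompleteChat without any further casts.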
Conversations with the OpenAI ChatBot
For one-off questions this sample works, but conversations are more complicated: OpenAI is a stateless service, which means that you need to store the full conversation yourself and send its entire history on every call, marking which messages came from the user and which from the assistant.
Let's see an example of this:
let responseOfConversation = client.CompleteChat [
    UserChatMessage "Please send a funny greeting back to me!" :> ChatMessage
    AssistantChatMessage
        "Sure thing! Here you go: Why did the computer go to therapy? Because it had too many bytes of emotional baggage! Hope your day is full of laughter and zero glitches!"
    UserChatMessage "Now send me a short one."
]
responseOfConversation.Value.Content[0].Text // "Why don’t scientists trust atoms? Because they make up everything! Have a great day!"
In this case I've hard-coded the list, but you'll obviously want a way to engage with the service interactively. In a more reliable implementation you might use a fold, but for the purposes of today we'll use a couple of mutable ResizeArray values - one for user input and one for assistant responses - which allows us to write code such as the following:
type Conversation = {
    Messages: ResizeArray<UserChatMessage>
    Responses: ResizeArray<ChatCompletion>
} with
    /// Initialise a new Conversation
    static member StartNew() = // (implementation omitted)
    /// Interleave both Messages and Responses into a single set of ChatMessage values
    member this.History() = // (implementation omitted)
/// Given a message and a conversation, send it all to OpenAI, recording the response and returning the text content.
let sendMessage (message: string) conversation : string = // (implementation omitted)
let conversation = Conversation.StartNew()
conversation |> sendMessage "Please send a funny greeting back to me!"
conversation |> sendMessage "Now send a short, unfunny one."
conversation |> sendMessage "Yes, I'm having a great day thanks! How about you?"
conversation.History() // return the entire history
This looks a little unusual for F# code, with its encapsulated state and hidden mutation. However, for an interactive scripting session, it is a perfectly reasonable solution.
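The bodies of StartNew, History and sendMessage are not shown in this post. As a rough sketch only, here is one way they might be filled in; it is a guess at a workable implementation, not necessarily what the linked sample does. It assumes the API key lives in an environment variable named OPENAI_API_KEY (my choice for this example), and it creates the client lazily so that loading the script doesn't fail before the key is set.

```fsharp
#r "nuget: OpenAI, 2.0.0"
open System
open OpenAI.Chat

// Assumption: the key is stored in an environment variable called OPENAI_API_KEY.
// The client is created on first use, not when the script is loaded.
let client =
    lazy (ChatClient("gpt-4o-mini", Environment.GetEnvironmentVariable "OPENAI_API_KEY"))

type Conversation =
    { Messages: ResizeArray<UserChatMessage>
      Responses: ResizeArray<ChatCompletion> }

    /// Initialise a new, empty Conversation
    static member StartNew() =
        { Messages = ResizeArray(); Responses = ResizeArray() }

    /// Interleave Messages and Responses: user 0, assistant 0, user 1, ...
    member this.History() : ChatMessage list =
        [ for i in 0 .. this.Messages.Count - 1 do
            yield this.Messages[i] :> ChatMessage
            // The most recent user message may not have a response yet
            if i < this.Responses.Count then
                yield AssistantChatMessage(this.Responses[i].Content[0].Text) :> ChatMessage ]

/// Record the user message, send the whole history, record and return the reply text.
let sendMessage (message: string) (conversation: Conversation) : string =
    conversation.Messages.Add(UserChatMessage(message))
    let completion = client.Value.CompleteChat(conversation.History()).Value
    conversation.Responses.Add(completion)
    completion.Content[0].Text
```

Because the user message is added before History() is called, the history sent to the service always ends with the newest user message, which has no matching response yet.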
Basic interaction with .NET
Let's finish this post with some basic data interaction: getting back data that we can program against, rather than simply passing text back and forth. In this case, I want to get a list of points about what F# is and do something with it programmatically:
let fsConversation = Conversation.StartNew()
fsConversation |> sendMessage "What is the F# programming language?"
fsConversation |> sendMessage "Give me those points as some JSON that I can parse, with properties Name and Description. Don't include any other information or headings - just plain JSON please that can be directly parsed by a machine."
fsConversation |> sendMessage "No markdown formatting. PLAIN JSON ONLY."
The first prompt returned eight points about F#; the second gave back some JSON, but wrapped in a markdown code fence (despite my clear request); a third, more strongly worded prompt did the trick, and I can now program against that data:
open System.Text.Json
let data =
    JsonSerializer.Deserialize<{| Name: string; Description: string |} array>(
        fsConversation.Responses |> Seq.item 2 |> _.Content[0].Text
    )
(*
[|{ Description =
"F# emphasizes immutability, first-class functions, and higher"+[107 chars]
Name = "Functional Programming" };
{ Description =
"F# has a powerful type inference system, which allows the com"+[121 chars]
Name = "Type Inference" };
...
|]
*)
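As an alternative to re-prompting, you could also strip a surrounding markdown fence defensively before parsing. The following is a minimal sketch of that idea; stripMarkdownFence and parsePoints are hypothetical helper names made up for this example, and it only handles the simple case of a single fence wrapping the whole response.

```fsharp
open System
open System.Text.Json

// The three-backtick fence marker, built with String.replicate so that this
// sample does not itself contain a literal fence.
let fence = String.replicate 3 "`"

/// Hypothetical helper: drop a surrounding markdown code fence, if there is one.
let stripMarkdownFence (text: string) =
    let trimmed = text.Trim()
    let firstNewline = trimmed.IndexOf('\n')
    let isFenced =
        firstNewline >= 0
        && trimmed.StartsWith(fence, StringComparison.Ordinal)
        && trimmed.EndsWith(fence, StringComparison.Ordinal)
    if isFenced then
        // Skip the opening line (fence plus optional language tag) and the closing fence
        trimmed.Substring(firstNewline + 1, trimmed.Length - firstNewline - 1 - fence.Length).Trim()
    else
        trimmed

/// Hypothetical helper: parse a response into Name / Description pairs, fenced or not.
let parsePoints (raw: string) =
    JsonSerializer.Deserialize<{| Name: string; Description: string |} array>(stripMarkdownFence raw)

// Usage with a locally constructed, fenced sample rather than a live response.
let sampleJson = """[{"Name":"Type Inference","Description":"Types are inferred."}]"""
let fencedSample = String.concat "\n" [ fence + "json"; sampleJson; fence ]
let names = parsePoints fencedSample |> Array.map (fun point -> point.Name)
```

With a helper like this, the second response in the conversation above could probably have been parsed directly, although asking the model for plain JSON remains the simpler route.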
And just for fun, I ended with this prompt:
fsConversation
|> sendMessage "Now return that as a fully-featured HTML page with styles built on top of the Bulma CSS framework."
open System.Diagnostics
open System.IO
File.WriteAllText("sample.html", fsConversation.Responses |> Seq.item 3 |> _.Content[0].Text)
Process.Start(ProcessStartInfo("sample.html", UseShellExecute = true)) |> ignore
And lo and behold, the browser opens with the generated page - pretty neat!
Summary
The OpenAI SDK allows you to programmatically interact with the same models that run ChatGPT. In conjunction with F# scripting, you can start to interactively experiment with the service and incorporate the data it creates with .NET tools. In the next post in this series, I'd like to look at how to start to introduce your own custom .NET functionality into OpenAI using its Functions API. You can play around with this sample here.