Interfacing with large language models

Demo 1

Author: Emi Tanaka

Published: October 15, 2024

[Figure: a futuristic scene with a humanoid robot at the center.]

This demo assumes that you have all the software requirements.

To access an LLM API, we can construct HTTP requests directly with the httr2 R package (as shown in Section 2), but the elmer R package (Section 3) provides a more user-friendly interface, so we use the latter as our primary method. Finally, for those interested in producing a Shiny web app, shinychat and Shiny Assistant are briefly demonstrated in Section 4.

Warning

The elmer R package is under very active development. Breaking changes may be introduced in the near future (hopefully not during the workshop).

I will demonstrate two methods: Open AI and Ollama. You can code along with one, both, or neither! As mentioned before, there will likely not be time to debug individual technical issues.

1 Setting up to use Open AI

First, to use Open AI you will need an Open AI API key. Register an account with Open AI and generate a new API key from the API keys page of your account.

You will need to store the API key as OPENAI_API_KEY in your .Renviron file. The easiest way to find this file is to run usethis::edit_r_environ(), which should open it. In this file, enter your key as shown below.

.Renviron
OPENAI_API_KEY=<YOUR API KEY>

You will need to restart your R session for the above environment variable to take effect.

To check that this worked, run the command below and confirm that your API key is printed.

Sys.getenv("OPENAI_API_KEY")
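If you would rather not print the key itself to the console, a quick alternative (a sketch using base R's `nzchar()`) is to check only that a non-empty value is set:

```r
# TRUE if OPENAI_API_KEY is set to a non-empty string, without revealing it
key_is_set <- nzchar(Sys.getenv("OPENAI_API_KEY"))
key_is_set
```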

2 httr2

The httr2 package provides tools to create, modify, and perform HTTP requests. This approach requires more work from the user, so you will probably prefer the elmer package instead.

library(httr2)
content <- "All models are wrong, but some models are"

2.1 Using Open AI

json <- request(base_url = "https://api.openai.com/v1/chat/completions") |> 
  req_headers('Authorization' = paste('Bearer', Sys.getenv("OPENAI_API_KEY")),
              'Content-Type' = 'application/json') |> 
  req_body_json(data = list(model = "gpt-4o-mini",
                            messages = list(list(role = "user",
                                                 content = content)))) |> 
  req_perform() |> 
  resp_body_json()

cat(json$choices[[1]]$message$content)
useful. This phrase, often attributed to the statistician George E. P. Box, emphasizes that while no model can perfectly represent reality, many can still provide valuable insights and predictions. The key is to understand the limitations of a model and to use it appropriately within its context. It's about leveraging the model's strengths while recognizing its shortcomings.

2.2 Using Ollama

json <- request(base_url = "http://localhost:11434/api/chat") |> 
  req_body_json(data = list(model = "llama3.1:8b",
                            messages = list(list(role = "user",
                                                 content = content)),
                            stream = FALSE)) |> 
  req_perform() |> 
  resp_body_json()

cat(json$message$content)
A famous quote from George Box!

"Essentially, all models are wrong, but some are useful."

― George E. P. Box, statistician and mathematician

Box was a renowned expert in statistics and modeling, and his quote highlights the importance of understanding the limitations of any model or prediction system.

In essence, it means that no mathematical model can perfectly capture reality, as there will always be some degree of error or uncertainty involved. However, by acknowledging these limitations and selecting models based on their usefulness and applicability to a specific problem, we can still gain valuable insights and make informed decisions.

This quote is often cited in the context of data science, machine learning, and statistics, serving as a reminder that:

1. **Models are simplifications**: They reduce complex phenomena to manageable mathematical expressions.
2. **Data is noisy**: There will always be some degree of error or variability in observations.
3. **Uncertainty exists**: It's impossible to capture all the nuances and complexities of reality with absolute precision.

Despite these limitations, models can still provide valuable predictions, insights, and guidance for decision-making, as long as they are carefully selected, validated, and used within their realm of applicability.

What inspired you to ask about this quote? Do you have any specific context or interest in modeling, statistics, or data science?
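Note that the two responses above have different shapes: Open AI nests the reply under `choices[[1]]$message$content`, while Ollama's `/api/chat` endpoint returns `message$content` at the top level. As a sketch, a small helper (hypothetical, not part of httr2) can normalise both shapes; it is illustrated here with mock response lists rather than live API calls:

```r
# Hypothetical helper: extract the assistant's reply from either response shape
extract_reply <- function(json) {
  if (!is.null(json$choices)) {
    json$choices[[1]]$message$content  # Open AI-style response
  } else {
    json$message$content               # Ollama-style response
  }
}

# Mock responses mimicking each API's JSON structure
openai_mock <- list(choices = list(list(message = list(content = "useful."))))
ollama_mock <- list(message = list(content = "A famous quote from George Box!"))

extract_reply(openai_mock)  # "useful."
extract_reply(ollama_mock)  # "A famous quote from George Box!"
```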

3 elmer

library(elmer)

3.1 Using AI vendor’s API

elmer currently supports the AI vendors listed in Table 1. To use an API, you need to sign up for an account with the vendor and obtain an API key (you may need to purchase a small amount of credit). We will focus only on Open AI, where I have put in a credit of US$5 (so far) to use their services. The API key can be set in a similar manner to the Open AI API key (described in Section 1), but using the corresponding environment variable name.

Table 1: AI vendors

| AI vendor | Name | Link | Environment variable |
|---|---|---|---|
| Anthropic | claude | https://www.anthropic.com/claude | ANTHROPIC_API_KEY |
| Google | gemini | https://gemini.google.com/ | GOOGLE_API_KEY |
| GitHub (waitlist only) | github | https://github.com/marketplace/models | GITHUB_PAT |
| Groq | groq | https://groq.com | GROQ_API_KEY |
| Ollama | ollama | https://ollama.com/ | N/A |
| Open AI | openai | https://platform.openai.com/ | OPENAI_API_KEY |
| Perplexity AI | perplexity | https://www.perplexity.ai | PERPLEXITY_API_KEY |
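As with Open AI in Section 1, each vendor's key goes in your .Renviron file under the environment variable name listed in Table 1. For example (placeholders, not real keys):

```
ANTHROPIC_API_KEY=<YOUR API KEY>
GROQ_API_KEY=<YOUR API KEY>
```

Remember to restart your R session after editing .Renviron.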

Once you have set up the API key, you can use the elmer package to interact with the AI vendor’s API.

chat_openai <- chat_openai(model = "gpt-4o-mini", seed = 1, echo = TRUE)

chat_openai$chat("Tell me a statistics joke!")
Why did the statistician bring a ladder to the bar?

Because he heard the drinks were on the house!

3.2 Using a local LLM

chat_ollama <- chat_ollama(model = "llama3.1:8b", seed = 1, echo = TRUE)

chat_ollama$chat("Tell me a statistics joke!")
A stats joke has to be "number one" in your book, right?

Here's the joke:

Why did the mean go to therapy?

Because it was feeling a little "off-average"!

(Sorry, I know it's a bit of a statistician's pun... but that's just a p-value 
chance, right?)

Which joke was funnier? 😄

3.3 Interactive usage

You can use a chatbot via console:

chat_console(chat_ollama) # live_console(chat_ollama) if you have a newer version

Or alternatively, via the browser:

chat_browser(chat_ollama) # live_browser(chat_ollama) if you have a newer version

4 Shiny

Shiny helps you write web apps easily with R. This has proved massively useful for fast deployment of proof-of-concept web apps for data science. You can integrate a chatbot interface using shinychat, or use Shiny Assistant to help you write Shiny apps.

4.1 shinychat

You can easily make a Shiny web app that incorporates a nice chat user interface using shinychat.

library(shiny)
library(shinychat)

ui <- fluidPage(
  chat_ui("chat")
)

server <- function(input, output, session) {
  chat <- elmer::chat_openai(system_prompt = "You're a helpful statistics tutor.")
  
  observeEvent(input$chat_user_input, {
    stream <- chat$stream_async(input$chat_user_input)
    chat_append("chat", stream)
  })
}

shinyApp(ui, server)

4.2 Shiny Assistant

Shiny Assistant is an AI assistant to help you build Shiny apps! See the announcement blog post for details; to use it, go to https://gallery.shinyapps.io/assistant.