Function calling (tool use)
Use an LLM to select amongst provided tools
Function calling requires an LLM deployed using the TensorRT-LLM Engine Builder.
If you want to try this function calling example code for yourself, deploy this implementation of Llama 3.1 8B.
To use function calling:
- Define a set of functions/tools in Python.
- Pass the function set to the LLM with the tools argument.
- Receive selected function(s) as output.
With function calling, it's essential to understand that the LLM itself is not capable of executing the code in the function. Instead, the LLM is used to suggest appropriate function(s), if they exist, based on the prompt. Any code execution must be handled outside of the LLM call, which makes it a great use for chains.
Define functions in Python
Functions can be anything: API calls, ORM access, SQL queries, or just a script. It's essential that functions are well-documented; the LLM relies on the docstrings to select the correct function.
As a simple example, consider the four basic functions of a calculator:
def multiply(a: float, b: float):
    """
    A function that multiplies two numbers

    Args:
        a: The first number to multiply
        b: The second number to multiply
    """
    return a * b

def divide(a: float, b: float):
    """
    A function that divides two numbers

    Args:
        a: The dividend
        b: The divisor
    """
    return a / b

def add(a: float, b: float):
    """
    A function that adds two numbers

    Args:
        a: The first number
        b: The second number
    """
    return a + b

def subtract(a: float, b: float):
    """
    A function that subtracts two numbers

    Args:
        a: The number to subtract from
        b: The number to subtract
    """
    return a - b
These functions must be serialized into LLM-accessible tools:
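from transformers.utils import get_json_schema

# Map tool names to the Python callables so they can be looked up later
calculator_functions = {
    'multiply': multiply,
    'divide': divide,
    'add': add,
    'subtract': subtract
}
# Build one JSON schema per function from its signature and docstring
tools = [get_json_schema(f) for f in calculator_functions.values()]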
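For orientation, each entry in tools is a JSON description of one function. The sketch below hand-writes what such an entry might look like for multiply, assuming the OpenAI-style layout that this guide refers to when passing tools. It is an illustration of the shape, not captured output from get_json_schema, and the exact fields may vary between transformers versions.

```python
import json

# Hand-written illustration of a single tool entry.
# Assumption: OpenAI-style layout; not actual get_json_schema output.
multiply_tool = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "The first number to multiply"},
                "b": {"type": "number", "description": "The second number to multiply"},
            },
            "required": ["a", "b"],
        },
    },
}

# The entry has to be JSON-serializable because it travels in the request body.
print(json.dumps(multiply_tool, indent=2))
```

The description fields are where the docstrings end up, which is why well-documented functions matter for tool selection.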
Pass functions to the LLM
The input spec for models like Llama 3.1 includes a tools key that we use to pass the functions:
import json
import requests

payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "What is 3.14^3.14?"},
    ],
    "tools": tools,  # tools are provided in the same format as OpenAI's API
    "tool_choice": "auto",  # auto is default - the model will choose whether or not to make a function call
}

MODEL_ID = ""
BASETEN_API_KEY = ""
resp = requests.post(
    f"https://model-{MODEL_ID}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {BASETEN_API_KEY}"},
    json=payload,
)
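The request above leaves MODEL_ID and BASETEN_API_KEY as empty strings to be filled in. One possible way to avoid hardcoding them is to read both from the environment and fail early when either is missing. This is a minimal sketch: the helper names predict_url and auth_headers and the environment variable names are hypothetical, chosen only for illustration.

```python
import os

def predict_url(model_id: str) -> str:
    """Build the production predict URL in the same form as the request above."""
    if not model_id:
        raise ValueError("model_id is empty; set it before calling the model")
    return f"https://model-{model_id}.api.baseten.co/production/predict"

def auth_headers(api_key: str) -> dict:
    """Build the Authorization header in the same form as the request above."""
    if not api_key:
        raise ValueError("api_key is empty; set it before calling the model")
    return {"Authorization": f"Api-Key {api_key}"}

# Hypothetical environment variable names; adjust them to your own setup.
MODEL_ID = os.environ.get("BASETEN_MODEL_ID", "")
BASETEN_API_KEY = os.environ.get("BASETEN_API_KEY", "")
```

The results can then be passed to requests.post(predict_url(MODEL_ID), headers=auth_headers(BASETEN_API_KEY), json=payload), optionally followed by resp.raise_for_status() so that HTTP errors surface before any parsing is attempted.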
tool_choice: auto (default) - may return a function
The default tool_choice option, auto, leaves it up to the LLM whether to return one function, multiple functions, or no functions at all, depending on what it judges most appropriate for the prompt.
tool_choice: required - will always return a function
The required option for tool_choice means that the LLM is guaranteed to choose at least one function, no matter what.
tool_choice: none - will never return a function
The none option for tool_choice means that the LLM will not return a function, and will instead produce ordinary text output. This is useful when you want to provide the full context of a conversation without adding and dropping the tools parameter call-by-call.
tool_choice: direct - will return a specified function
You can also pass a specific function directly into the call, which is guaranteed to be returned. This is useful if you want to hardcode specific behavior into your model call for testing or conditional execution.
"tool_choice": {"type": "function", "function": {"name": "subtract"}}
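The four tool_choice options described above can be gathered into one small helper so that a typo in the mode fails fast instead of reaching the model. This is a sketch only: make_tool_choice and build_payload are hypothetical names, and "direct" is used here as a local label for the specific-function form rather than a value the API itself accepts.

```python
from typing import Optional, Union

def make_tool_choice(mode: str, function_name: Optional[str] = None) -> Union[str, dict]:
    """Return a tool_choice value for one of the four options in this guide."""
    if mode in ("auto", "required", "none"):
        return mode
    if mode == "direct":
        # "direct" is a local label; the request carries the dict form below.
        if not function_name:
            raise ValueError("direct mode needs a function name")
        return {"type": "function", "function": {"name": function_name}}
    raise ValueError(f"unknown tool_choice mode: {mode!r}")

def build_payload(prompt: str, tools: list, mode: str = "auto",
                  function_name: Optional[str] = None) -> dict:
    """Assemble a request body in the same shape as the payload shown earlier."""
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": prompt},
        ],
        "tools": tools,
        "tool_choice": make_tool_choice(mode, function_name),
    }
```

For example, build_payload("What is 5 minus 2?", tools, "direct", "subtract") yields a payload whose tool_choice matches the dictionary form shown above.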
Receive function(s) as output
When the model returns functions, they'll be a list that can be parsed as follows:
func_calls = json.loads(resp.text)
# In this example, we execute the first function (one of +-/*) on the provided parameters
func_call = func_calls[0]
calculator_functions[func_call["name"]](**func_call["parameters"])
After reading the LLM's selection, your execution environment can run the necessary functions. For more on combining LLMs with other logic, see the chains documentation.
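Because the model only suggests calls, the execution step is a natural place to guard against unknown function names and bad arguments. The sketch below runs every returned call rather than only the first. It assumes the response is a JSON list of objects with name and parameters keys, as in the parsing example above. The helper run_tool_calls, the sample string, and the one-line stand-in functions are hypothetical and exist only to keep the example self-contained.

```python
import json

# One-line stand-ins for the calculator functions defined earlier.
registry = {
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
}

def run_tool_calls(response_text: str, registry: dict) -> list:
    """Execute each suggested call; return (name, result or error string) pairs."""
    results = []
    for call in json.loads(response_text):
        name = call.get("name")
        func = registry.get(name)
        if func is None:
            # The model suggested a function that was never registered.
            results.append((name, f"error: unknown function {name!r}"))
            continue
        try:
            results.append((name, func(**call.get("parameters", {}))))
        except (TypeError, ZeroDivisionError) as exc:
            # Wrong or missing arguments, or division by zero.
            results.append((name, f"error: {exc}"))
    return results

# Hypothetical response text, written by hand for illustration.
sample = json.dumps([
    {"name": "multiply", "parameters": {"a": 2, "b": 4}},
    {"name": "divide", "parameters": {"a": 1, "b": 0}},
    {"name": "power", "parameters": {"a": 2, "b": 3}},
])
results = run_tool_calls(sample, registry)
print(results)
```

Returning errors as values instead of raising keeps one bad suggestion from discarding the results of the others; whether that trade-off fits depends on how the results are fed back into the conversation.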