Function calling (tool use)
Use an LLM to select amongst provided tools
Function calling requires an LLM deployed using the TensorRT-LLM Engine Builder.
If you want to try this function calling example code for yourself, deploy this implementation of Llama 3.1 8B.
To use function calling:
- Define a set of functions/tools in Python.
- Pass the function set to the LLM with the `tools` argument.
- Receive selected function(s) as output.
With function calling, it's essential to understand that the LLM itself is not capable of executing the code in the function. Instead, the LLM is used to suggest appropriate function(s), if they exist, based on the prompt. Any code execution must be handled outside of the LLM call, which makes this a great use for chains.
Define functions in Python
Functions can be anything: API calls, ORM access, SQL queries, or just a script. It's essential that functions are well-documented; the LLM relies on the docstrings to select the correct function.
As a simple example, consider the four basic functions of a calculator:
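Here's one possible sketch (names, type hints, and docstrings are illustrative; any well-documented Python functions work):

```python
def add(a: float, b: float) -> float:
    """Add two numbers and return the sum."""
    return a + b


def subtract(a: float, b: float) -> float:
    """Subtract the second number from the first."""
    return a - b


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b


def divide(a: float, b: float) -> float:
    """Divide the first number by the second."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
```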
These functions must be serialized into LLM-accessible tools:
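One common approach, assuming your deployment accepts the OpenAI-style tool schema, is a list of JSON-serializable dictionaries that mirror each function's signature and docstring:

```python
# Illustrative serialization of the `multiply` function; the other three
# follow the same pattern. Field names assume the OpenAI-style tool
# schema; check your deployment's input spec.
tools = [
    {
        "type": "function",
        "function": {
            "name": "multiply",
            "description": "Multiply two numbers and return the product.",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {"type": "number", "description": "The first factor."},
                    "b": {"type": "number", "description": "The second factor."},
                },
                "required": ["a", "b"],
            },
        },
    },
    # ...repeat for add, subtract, and divide
]
```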
Pass functions to the LLM
The input spec for models like Llama 3.1 includes a `tools` key that we use to pass the functions:
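A sketch of the call, assuming Baseten's standard REST endpoint (the model ID below is a placeholder):

```python
import os

import requests

MODEL_ID = "abcd1234"  # hypothetical; replace with your deployment's model ID

resp = requests.post(
    f"https://model-{MODEL_ID}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"},
    json={
        "messages": [{"role": "user", "content": "What is 3 times 4?"}],
        "tools": tools,  # the serialized function list from above
    },
)
print(resp.json())
```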
`tool_choice: auto` (default) - may return a function
The default `tool_choice` option, `auto`, leaves it up to the LLM whether to return one function, multiple functions, or no functions at all, depending on what the model determines is most appropriate for the prompt.
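For example (the payload shape assumes the OpenAI-style spec above):

```python
payload = {
    "messages": [{"role": "user", "content": "What is 3 times 4?"}],
    "tools": tools,
    "tool_choice": "auto",  # explicit here, but this is the default
}
```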
`tool_choice: required` - will always return a function
The `required` option for `tool_choice` means that the LLM is guaranteed to choose at least one function, no matter what.
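For example:

```python
payload = {
    "messages": [{"role": "user", "content": "Tell me a joke."}],
    "tools": tools,
    "tool_choice": "required",  # the model must return at least one function call
}
```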
`tool_choice: none` - will never return a function
The `none` option for `tool_choice` means that the LLM will not return a function and will instead produce ordinary text output. This is useful when you want to provide the full context of a conversation without adding and dropping the `tools` parameter call-by-call.
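For example:

```python
payload = {
    "messages": [{"role": "user", "content": "Summarize our conversation so far."}],
    "tools": tools,  # tools stay in the payload for later turns
    "tool_choice": "none",  # this turn gets a plain-text response
}
```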
`tool_choice: direct` - will return a specified function
You can also pass a specific function directly into the call, which is guaranteed to be returned. This is useful if you want to hardcode specific behavior into your model call for testing or conditional execution.
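For example, to force a call to `multiply` (again assuming the OpenAI-style schema):

```python
payload = {
    "messages": [{"role": "user", "content": "What is 3 times 4?"}],
    "tools": tools,
    # Guarantees the response is a call to the named function.
    "tool_choice": {"type": "function", "function": {"name": "multiply"}},
}
```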
Receive function(s) as output
When the model returns functions, they'll be a list that can be parsed as follows:
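A parsing sketch, assuming an OpenAI-style response shape (adjust the field names to match your deployment's actual output):

```python
import json

# Map tool names back to the Python functions defined earlier.
FUNCTIONS = {"add": add, "subtract": subtract, "multiply": multiply, "divide": divide}

response = resp.json()
tool_calls = response["choices"][0]["message"].get("tool_calls", [])

for call in tool_calls:
    name = call["function"]["name"]
    args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
    result = FUNCTIONS[name](**args)  # execution happens in your code, not in the LLM
    print(f"{name} -> {result}")
```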
After reading the LLM's selection, your execution environment can run the necessary functions. For more on combining LLMs with other logic, see the chains documentation.