Have you ever noticed how using large language models feels like chatting with a good friend? They understand you, recall details from previous interactions, and even make your life easier by performing tasks for you. However, that intimate conversation can quickly turn chaotic in more complex scenarios, especially when multiple goals are at play. This is where LLM function calling comes in. Function calling allows large language models to talk to external functions or applications to streamline operations and keep users on track. This article will explore function calling, its benefits, and how to implement it in your applications.
Suppose you want to integrate LLM function calling into your product to enable seamless automation, enhance model interactivity, and improve efficiency in your application’s AI-driven features. In that case, Lamatic’s generative AI tech stack can help. Our tools simplify the process of getting LLMs to work for you so you can focus on your business instead of getting bogged down in the technical details.
What is LLM Function Calling, and What Problem Does It Solve?
LLM function calling, also known as tool use or API calling, allows large language models to access external functions or tools to improve their performance on complex tasks. For all their impressive capabilities, LLMs have a fundamental weakness: on their own, they can't do anything. They read a sequence of input tokens (the prompt) and produce a sequence of output tokens, one at a time, known as the completion. There are no side effects, just inputs and outputs. So something else, such as the application you are building, has to take the LLM's output and do something useful with it.
Ensuring Reliable Outputs: Using Function Calls to Align LLM Responses with Application Requirements
But how can we get an LLM to generate output that reliably conforms to our application's requirements? Function calls, also known as tool use, make it easier for your application to do something useful with an LLM's output.
What Problem Does LLM Function Calling Solve?
Function calling, also known as tool use or API calling, is a technique that allows LLMs to interface with external systems, APIs, and tools. By providing the LLM with a set of functions or tools, along with their descriptions and usage instructions, the model can intelligently select and invoke the appropriate functions to accomplish a given task. This game-changing capability frees LLMs from their text-based limitations and lets them interact with the real world: instead of merely generating text, they can execute actions, control devices, retrieve database information, and perform various tasks by leveraging external tools and services.
Evaluating Function-Calling LLMs: Capabilities, Benchmarks, and Practical Applications
Not every LLM can use function calling. Only models specifically trained or fine-tuned for it can determine whether a prompt demands a function call. The Berkeley Function-Calling Leaderboard provides insight into how LLMs perform across various programming languages and API scenarios, showing the versatility and robustness of function-calling models in handling multiple, parallel, and complex function executions.
This versatility is crucial for developing AI agents that can operate across different software ecosystems and handle tasks that require simultaneous actions.
How Does LLM Function Calling Work?
Applications typically invoke the LLM with function-calling capabilities twice:
- Once to map the prompt to the target function name and its input arguments
- Again to send the output of the invoked function back to the model so it can generate the final response
The workflow below shows how the application, function, and LLM exchange messages to complete the cycle.
- The user sends a prompt that may demand access to the function. For example, “What’s the current weather in New Delhi?”
- The application sends the prompt along with all the available functions. In our example, this may be the prompt plus the input schema of a get_current_weather(city) function. The LLM determines whether the prompt requires a function call. If it does, the model looks up the provided list of functions and their schemas and responds with a JSON dictionary populated with the functions to call and their input arguments.
- The application parses the LLM response. If it contains function calls, the application invokes them sequentially or in parallel.
- The output from each function is then included in the final prompt and sent to the LLM. Since the model now has access to the data, it responds with an answer based on the factual data provided by the functions.
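To make this message exchange concrete, here is a minimal sketch of the two-call workflow using the OpenAI Python SDK. The model name is illustrative, and get_current_weather is a hypothetical stand-in for a real weather API:

```python
import json
from openai import OpenAI  # pip install openai; assumes OPENAI_API_KEY is set

client = OpenAI()

def get_current_weather(city: str) -> str:
    """Hypothetical stand-in for a real weather API call."""
    return json.dumps({"city": city, "temperature_c": 31, "condition": "sunny"})

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the current weather in New Delhi?"}]

# Call 1: the model maps the prompt to a function name and arguments.
first = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
tool_call = first.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)

# The application, not the model, actually runs the function.
result = get_current_weather(**args)

# Call 2: return the function output so the model can compose the final answer.
messages.append(first.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})
final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
print(final.choices[0].message.content)
```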
Function Calling Misnomers and Misconceptions
The name ‘function call’ is somewhat misleading because it sounds like the LLM will actually do something on your behalf (and thereby cause side effects). But it doesn’t. When the LLM decides to ‘call’ a function, it will generate output that represents a request to call that function.
It’s still the responsibility of your application to handle that request and do something with it, but now you can trust the shape of the payload.
Conceptual Function Definitions: Enhancing LLM Flexibility and Application Alignment
For this reason, an LLM function doesn't need to map directly to any real function or method in your application, or to any real API. Instead, LLM functions can (and probably should) be defined more conceptually, from the LLM's perspective.
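For example, here is a hypothetical tool definition, written as the JSON-schema dictionary most function-calling APIs expect. Nothing in the application needs to be named escalate_to_support; when the model "calls" it, the application might create a ticket, post a Slack message, and write an audit log:

```python
# A purely conceptual tool definition (hypothetical name and fields).
# The LLM only ever sees this schema; the application decides what the
# "call" actually does behind the scenes.
escalate_to_support = {
    "type": "function",
    "function": {
        "name": "escalate_to_support",
        "description": "Escalate the conversation to a human support agent",
        "parameters": {
            "type": "object",
            "properties": {
                "reason": {"type": "string", "description": "Why escalation is needed"},
                "urgency": {"type": "string", "enum": ["low", "medium", "high"]},
            },
            "required": ["reason", "urgency"],
        },
    },
}
```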
What are the Use Cases of Function Calling?
The primary use cases of function calling fall into two major categories.
1. Agents
Function calling allows you to create intelligent AI agents capable of interacting with external systems and performing complex tasks:
- Fetch data: AI agents can perform tasks such as:
- Performing web searches
- Fetching data from APIs
- Browsing the Internet to provide real-time information
For example, ChatGPT can retrieve the latest news or weather updates by invoking external services.
- Execute complex workflows: By leveraging function calling, agents can execute complex workflows that involve:
- Multiple tasks
- Parallel tasks
- Sequential tasks
For example, within a single multi-turn interaction, an agent might:
- Book a flight
- Update a calendar
- Send a confirmation email
- Implement agentic RAG: Function calling allows the LLM to dynamically generate SQL queries or API calls, which means it can fetch relevant information from databases or knowledge repositories.
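As a sketch of this agentic RAG pattern, the application can expose a single hypothetical query_database tool: the model writes the SQL, and the application executes it (ideally against a read-only connection) and feeds the rows back in the next prompt. The tool name, schema, and database file below are all illustrative:

```python
import json
import sqlite3

# Hypothetical tool schema: the model generates the SQL, the app runs it.
query_database_tool = {
    "type": "function",
    "function": {
        "name": "query_database",
        "description": "Run a read-only SQL query against the products database",
        "parameters": {
            "type": "object",
            "properties": {"sql": {"type": "string", "description": "A single SELECT statement"}},
            "required": ["sql"],
        },
    },
}

def query_database(sql: str) -> str:
    """Execute model-generated SQL and return the rows as JSON for the next prompt."""
    if not sql.lstrip().lower().startswith("select"):
        return json.dumps({"error": "only SELECT statements are allowed"})
    with sqlite3.connect("products.db") as conn:  # placeholder database file
        rows = conn.execute(sql).fetchall()
    return json.dumps(rows)
```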
2. Data extraction
Function calling requires the LLM to emit each call in an exact format defined by the function's signature and description; JSON is the usual representation. That precision makes function calling an effective tool for extracting structured data from unstructured text. For example, an LLM with function-calling capabilities can process large volumes of medical records, extracting key information such as:
- Patient names
- Diagnoses
- Prescribed medications
- Treatment plans
Converting unstructured data into structured JSON objects allows healthcare providers to easily parse the information for further analysis, reporting, or integration into EHR systems.
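Here is a minimal sketch of that extraction pattern with the OpenAI SDK. The record_patient_info function and its fields are hypothetical and exist only as an output schema; tool_choice forces the model to respond with a structured call rather than free-form text:

```python
import json
from openai import OpenAI  # assumes OPENAI_API_KEY is set

client = OpenAI()

extraction_tool = {
    "type": "function",
    "function": {
        "name": "record_patient_info",  # hypothetical: used purely as an output schema
        "description": "Record structured fields extracted from a medical note",
        "parameters": {
            "type": "object",
            "properties": {
                "patient_name": {"type": "string"},
                "diagnoses": {"type": "array", "items": {"type": "string"}},
                "medications": {"type": "array", "items": {"type": "string"}},
                "treatment_plan": {"type": "string"},
            },
            "required": ["patient_name", "diagnoses", "medications", "treatment_plan"],
        },
    },
}

note = "Pt. Jane Doe, dx: type 2 diabetes. Rx: metformin 500 mg. Plan: diet changes, follow-up in 3 months."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any function-calling-capable model
    messages=[{"role": "user", "content": f"Extract the key fields from this note:\n{note}"}],
    tools=[extraction_tool],
    # Forcing the tool guarantees a structured call instead of free text.
    tool_choice={"type": "function", "function": {"name": "record_patient_info"}},
)
record = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
print(record["diagnoses"])
```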
When to Use Function Calling for LLMs
What are some examples of situations where using function calling for LLMs makes sense?
- When you need to send the output of an LLM call to another function or API. The function-calling feature was designed with exactly this scenario in mind: extracting information from a document and passing it to another function or API call. These situations are the strongest candidates for function calling.
- Function calls can be used when you need to extract structured data from a freeform text document. For example, if you want to extract a specific name or date from an unstructured text, this is a great case for function calls.
- When your outputs must have a consistent format. While it is sometimes possible to get an LLM to return structured output without function calling, there are far fewer guarantees about the consistency of the output format.
When Not to Use Function Calling for LLMs
Here are some examples of situations in which you may be better off investigating other techniques for optimizing LLMs.
Function Calling for Structured Output
When your goal is only to improve predictive performance. The magic of function calling comes from converting unstructured text into structured data. It is primarily used to ensure that models produce appropriately formatted output, not to enhance the accuracy of the information in that output. Suppose your main goal is to improve the predictive performance of your model rather than to ensure that the output is formatted correctly.
In that case, you may be better off looking into techniques like:
- Retrieval augmented generation
- Prompt chaining
- Basic prompt engineering
Avoiding Vendor Lock-In with Function Calling
When you want to avoid vendor lock-in. Native function-calling capabilities are built into a limited set of models, each behind a vendor-specific API. If you want to avoid a situation where you are tightly locked into one specific vendor, you may not want to lean heavily on these capabilities.
Related Reading
- LLM Security Risks
- What is an LLM Agent
- AI in Retail
- LLM Deployment
- How to Run LLM Locally
- How to Use LLM
- LLM Model Comparison
- AI-Powered Personalization
- How to Train Your Own LLM
Open-Source vs. Proprietary Models for Function Calling
Function calling is a powerful capability that enhances the utility of LLMs for automation tasks. Proprietary LLMs like GPT-4 have built-in support for function calling, making integration more straightforward. By contrast, using open-source models like Mistral or Llama 3.1 can be more complex and require additional customization. So, why choose open-source models for function calling? Let's take a look.
Data Security and Privacy
Many enterprises prioritize keeping sensitive data and processes in-house. Open-source LLMs provide greater control over data handling and privacy. This means organizations can feel more comfortable using these models for function calling with sensitive information.
Proprietary Integration
Many function-calling use cases involve accessing:
- Proprietary systems
- Data sources
- Knowledge bases
For example, an LLM agent responsible for automating a business-critical process may need access to sensitive, organization-specific workflows and data. Private deployment with open-source models can keep proprietary information secure and private.
Customization
Open-source models can be fine-tuned to adapt to specific formats or domain-specific languages, often using proprietary and sensitive training data. This level of customization can prove invaluable for organizations that require tailored solutions to address their unique function-calling tasks.
Challenges in Function Calling with Open-Source LLMs
While open-source LLMs offer advantages in terms of flexibility and customization, you may find it difficult to implement function calling with these models, given the following challenges:
Constrained Output
One of the primary difficulties is getting open-source LLMs to produce outputs in the specific formats that function calls require.
Structured Outputs
Function calling often relies on well-structured data formats like JSON or YAML that match the expected parameters of the function being called. This is essential to making function calling reliable, especially when the LLM is integrated into automated workflows.
Open-source LLMs can sometimes deviate from instructions and produce outputs that are poorly formatted or contain unnecessary information. Several tools and techniques, such as Outlines, Instructor, and Jsonformer, have been developed to address this challenge.
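As one illustration, here is a hedged sketch using the Instructor library with Pydantic validation. The base URL and model name are placeholders for an open-source model served behind a local OpenAI-compatible endpoint (for example, via vLLM or Ollama):

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class WeatherCall(BaseModel):
    city: str
    unit: str  # e.g., "celsius" or "fahrenheit"

# Placeholder endpoint and model name for a locally hosted open-source LLM.
client = instructor.from_openai(
    OpenAI(base_url="http://localhost:8000/v1", api_key="unused"),
    mode=instructor.Mode.JSON,  # JSON mode suits models without native tool support
)

# Instructor validates the reply against WeatherCall and retries on parse failures.
call = client.chat.completions.create(
    model="local-model",  # placeholder
    response_model=WeatherCall,
    messages=[{"role": "user", "content": "What's the weather in New Delhi, in celsius?"}],
)
print(call.city, call.unit)
```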
Specialized Formats
Specific tasks require LLMs to generate more specialized outputs, such as SQL queries or domain-specific languages. This is often implemented using grammar or regex modes, restricting the LLM’s output to match a predefined format.
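For instance, here is a hedged sketch of regex-constrained generation with the Outlines library (0.x-style API; the model and pattern are illustrative). During decoding, the sampler can only choose tokens that keep the output matching the pattern:

```python
# Outlines 0.x-style API; the library has since evolved, so check current docs.
import outlines

model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

# Constrain the output to a 24-hour time such as "14:30".
generator = outlines.generate.regex(model, r"([01]\d|2[0-3]):[0-5]\d")
print(generator("The meeting starts at half past two in the afternoon. Time: "))
```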
LLM Capabilities
Different open-source LLMs vary in their capabilities and limitations when it comes to supporting multiple types of function calls, such as:
- Single
- Parallel
- Nested (sequential)
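Whichever styles a model supports, the application-side handling looks similar. Below is a small sketch, assuming OpenAI-style tool-call objects, of a dispatcher that executes every call in an assistant message (sequentially here; independent calls could run concurrently) and builds the tool replies for the follow-up request:

```python
import json

def dispatch_tool_calls(assistant_message, registry):
    """Run each tool call in an assistant message and return the 'tool'-role
    replies to append to the conversation before the follow-up model call.
    `registry` maps function names to real Python callables."""
    replies = []
    for tool_call in assistant_message.tool_calls or []:
        fn = registry[tool_call.function.name]
        result = fn(**json.loads(tool_call.function.arguments))
        replies.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })
    return replies
```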
Building and Scaling Compound AI Systems
An LLM application with function calling typically includes multiple components, such as:
- Model inference
- Custom business logic
- User-defined functions
- External API integrations
Related Reading
- How to Fine Tune LLM
- How to Build Your Own LLM
- LLM Prompting
- What LLM Does Copilot Use
- LLM Evaluation Metrics
- LLM Use Cases
- LLM Sentiment Analysis
- LLM Evaluation Framework
- LLM Benchmarks
- Best LLM for Coding
Choosing LLMs With Function Calling Support
Function calling puts LLMs on steroids. By letting models call external functions, you automatically expand what they can do. Selecting the right LLM for your function-calling project is critical for success.
LLM Function Calling Basics
Function calling lets LLMs interact with custom applications and external data sources to retrieve information and perform computations. This capability significantly enhances the performance of LLMs for many tasks. With function calling, LLMs can go beyond their trained knowledge and reduce hallucinations by accessing real-time data. The more robust an LLM’s function calling capabilities are, the better it will perform for your specific use case.
Models with Advanced Function Calling Features
Not all LLMs support function calling, and those that do can vary dramatically in performance. OpenAI’s GPT-4 and GPT-3.5 Turbo models are the most well-known commercial LLMs that support function calling. This allows developers to define custom functions that the LLM can call during inference to retrieve external data or perform computations. The LLM outputs a JSON object containing the function name and arguments.
The developer's code executes the call and returns the function output to the LLM. Google's Gemini LLM also supports function calling through Vertex AI and Google AI Studio. Developers can define functions and descriptions, which the Gemini model can invoke during inference by returning structured JSON data. Anthropic's Claude 3 family of LLMs has an API that enables function-calling capabilities similar to OpenAI's models.
Exploring Leading LLMs with Function-Calling Capabilities: From Cohere to Open-Source Models
Cohere’s Command R and Command R+ LLMs also provide an API for function calling, allowing integration with external tools and data sources. The open-source Mistral 7B LLM has demonstrated function-calling capabilities, allowing developers to define custom functions the model can invoke during inference.
NexusRaven is an open-source 13B LLM specifically designed for advanced function calling, surpassing even GPT-4 in some benchmarks for invoking cybersecurity tools and APIs. The Gorilla OpenFunctions model is a 7B LLM fine-tuned on API documentation. It can generate accurate function calls and API requests from natural language prompts.
Comparing Advanced Function-Calling Models: FireFunction V1, Hermes 2 Pro, and Orion-14B-Chat-Plugin
Fireworks FireFunction V1 is an open-source function-calling model based on the Mixtral 8x7B model. It achieves near GPT-4-level quality for real-world use cases of structured information generation and routing decision-making. Hermes 2 Pro is a 7B-parameter model that excels at function calling, JSON-structured outputs, and general tasks, achieving 90% accuracy on a function-calling evaluation and 81% on a structured JSON output evaluation built with Fireworks.ai.
Hermes 2 Pro is available in versions fine-tuned from both Mistral 7B and Llama 3 8B, offering developers a choice of base model. The Orion-14B-Chat-Plugin stands out for function calling thanks to its specialized design and impressive performance. It's an excellent choice for those looking to leverage large language models for plugin and function-call tasks.
Evaluation Criteria: What to Look For
Function calling support is essential for creating agentic workflows and retrieval-augmented generation agents. The next step is to evaluate LLM options and choose the one that best fits your use case. Key criteria to consider include:
Ease of Use
Look for LLMs with function calling features that are easy to use and customize. A straightforward setup and execution process will save your team time and frustration.
Customization
Every use case is unique. Choose an LLM that lets you create tailored functions to suit your specific needs. This will help improve the overall performance of your function calling LLM.
JSON Output Structure
Function calling inherently relies on structured outputs to work effectively. Evaluate LLMs based on their ability to generate structured outputs, specifically in JSON format. The better an LLM is at this task, the more seamless your integration will be.
Model Performance
Consider the base performance of the LLM you choose. Before introducing function calling, read up on benchmarks and testing to assess how well the model performs for your intended use case.
Start Building GenAI Apps for Free Today with Our Managed Generative AI Tech Stack
Lamatic offers a managed Generative AI Tech Stack.
Our solution provides:
- Managed GenAI Middleware
- Custom GenAI API (GraphQL)
- Low Code Agent Builder
- Automated GenAI Workflow (CI/CD)
- GenOps (DevOps for GenAI)
- Edge deployment via Cloudflare workers
- Integrated Vector Database (Weaviate)
Lamatic empowers teams to rapidly implement GenAI solutions without accruing tech debt. Our platform automates workflows and ensures production-grade deployment on the edge, enabling fast, efficient GenAI integration for products needing swift AI capabilities.
Start building GenAI apps for free today with our managed generative AI tech stack.
Related Reading
- Best LLM for Data Analysis
- Rag vs LLM
- AI Application Development
- Gemini Alternatives
- AI Development Platforms
- Best AI App Builder
- LLM Distillation
- AI Development Cost
- Flowise AI
- LLM vs SLM
- SageMaker Alternatives
- LangChain Alternatives
- LLM Quantization