At 2501.ai, we tackled the instability of function calling across LLM platforms and inference methods by building a universal adapter.
At 2501.ai, we're dedicated to pushing the boundaries of Large Language Model (LLM) capabilities to maximize performance. One significant challenge we've repeatedly encountered while building our native LLM orchestration framework is the instability and inconsistent support of function calling across multiple platforms and inference methods. This inconsistency restricts how LLMs can be integrated and used in complex environments.
Today, we're sharing how we overcame this barrier: how we built an in-house universal function call adapter to improve the compatibility and reliability of the different models and inference platforms we use.
Function calling mechanisms in LLMs are powerful tools that enable dynamic interactions and extend a model's capabilities. However, we kept running into three problems:
Each platform implements function calling differently in its models, which leads to unpredictable behavior and unstable performance. We had to figure out how to make every model behave consistently, or ideally find a single reliable function calling implementation to standardize on.
Despite efforts by the big players to standardize function calling, there is still no universally accepted standard, which causes compatibility issues. Most LLM providers and inference engines use considerably different conventions for configuring function calls in their APIs and SDKs.
Not all LLMs or inference engines support function calling natively. This limitation restricts the deployment of advanced functionality and degrades the overall user experience. It's a shame to see a high-performance model penalized simply because its SDK or API lacks native function calling.
To provide the best possible service and keep our LLM orchestration robust and flexible, we needed a system that behaves consistently across models and inference platforms, works even when function calling isn't supported natively, and adds as little overhead as possible.
We developed a universal function call adapter that operates within the system prompt of the LLM. Here’s how it addresses the challenges:
Embedding the adapter directly into the system prompt means that it doesn’t rely on external functions or platform-specific features. This approach increases compatibility and reduces dependencies.
Our adapter is designed to be minimalistic. It provides essential instructions without adding unnecessary complexity, ensuring efficient operation.
The core of our function call adapter is an add-on template that we inject into our final system prompt.
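In broad strokes, it looks like the sketch below, written here as a TypeScript template string. The exact wording, the {{FUNCTIONS}} placeholder, and the name/arguments field names are illustrative; the ##FUNCTIONS_JSON## markers are the delimiters we rely on for extraction further down.

```typescript
// Sketch of the add-on template. The wording and field names are
// illustrative; {{FUNCTIONS}} stands for the list of available functions
// and their parameters, filled in at orchestration time.
const FUNCTION_CALL_ADAPTER = `
You have access to the following functions:
{{FUNCTIONS}}

When you decide to call one or more functions, answer with a JSON array
wrapped exactly between these markers, and nothing else between them:

##FUNCTIONS_JSON##
[{"name": "<function_name>", "arguments": {"<param>": "<value>"}}]
##END_FUNCTIONS_JSON##
`;

// The add-on is simply appended to whatever system prompt the agent
// already uses, regardless of the underlying provider.
const basePrompt = 'You are a helpful assistant.';
const systemPrompt = `${basePrompt}\n${FUNCTION_CALL_ADAPTER}`;
```

Because the adapter is plain text, the same injection works with any provider, whether or not its API exposes native function calling.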
Following lightweight principles, the template deliberately includes only the minimum it needs to work properly: unambiguous delimiters (##FUNCTIONS_JSON## and ##END_FUNCTIONS_JSON##) so the calls can be extracted reliably, and a strict JSON structure for the calls themselves.
To extract the function calls from the LLM's response, we use a regular expression.
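In TypeScript, it looks roughly like this (the extractFunctionCalls helper around the pattern is a simplified sketch):

```typescript
// Capture the payload between the two delimiters. [\s\S]*? matches across
// newlines, and the lazy quantifier stops at the first closing marker.
const FUNCTIONS_REGEX = /##FUNCTIONS_JSON##([\s\S]*?)##END_FUNCTIONS_JSON##/;

// Simplified sketch of the extraction step.
function extractFunctionCalls(response: string): unknown[] | null {
  const match = response.match(FUNCTIONS_REGEX);
  if (!match) return null;            // the model made no function call
  return JSON.parse(match[1].trim()); // the JSON array of function calls
}
```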
This regex captures everything between ##FUNCTIONS_JSON## and ##END_FUNCTIONS_JSON##, allowing us to isolate the JSON containing the function calls.
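For example, given a response that follows the template (the get_weather function here is purely illustrative):

```typescript
const response = `
Let me check that for you.
##FUNCTIONS_JSON##
[{"name": "get_weather", "arguments": {"city": "Paris"}}]
##END_FUNCTIONS_JSON##
`;

console.log(extractFunctionCalls(response));
// -> [ { name: 'get_weather', arguments: { city: 'Paris' } } ]
```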
After extensive testing of this system prompt add-on, we're happy to share the results.
Implementing a universal function call adapter has significantly improved the stability and compatibility of our LLM orchestration at 2501.ai. By adhering to JSON standards and integrating directly with the system prompt, we developed a solution that’s both efficient and widely compatible.
This approach not only solves our immediate challenges, but also sets a foundation for future developments. We hope that sharing our experience can help others facing similar issues in the LLM space.
At 2501.ai, we’re committed to continuous innovation and collaboration. If you’re interested in learning more about our work or have any questions, feel free to reach out. Together, we can push the boundaries of what’s possible with AI.