Function Calling#

LightLLM supports function calling for several mainstream model families and exposes it through an OpenAI-compatible API.

Supported Models#

Qwen2.5/Qwen3#

Parser: qwen25

Format:

<tool_call>
{"name": "function_name", "arguments": {"param": "value"}}
</tool_call>

Startup:

python -m lightllm.server.api_server \
    --model_dir /path/to/qwen2.5 \
    --tool_call_parser qwen25 \
    --tp 1
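To make the `<tool_call>` format above concrete, here is a minimal sketch of how such output could be split into plain text and tool calls. It is an illustration only and not LightLLM's `qwen25` parser; `extract_qwen_tool_calls` is a hypothetical helper name.

```python
import json
import re

# One <tool_call>...</tool_call> block; DOTALL lets the JSON span several lines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_qwen_tool_calls(text):
    """Return (plain_text, calls) for Qwen-style output (hypothetical helper)."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            obj = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip a malformed block instead of failing the whole response
        if isinstance(obj, dict) and "name" in obj:
            calls.append({"name": obj["name"], "arguments": obj.get("arguments", {})})
    plain_text = TOOL_CALL_RE.sub("", text).strip()
    return plain_text, calls


sample = (
    "Let me check.\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Beijing"}}\n'
    "</tool_call>"
)
plain, calls = extract_qwen_tool_calls(sample)
print(plain)
print(calls)
```

When the server is started with `--tool_call_parser qwen25`, this splitting happens server-side; the sketch is only meant to show what the raw format encodes.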

Llama 3.2#

Parser: llama3

Format: <|python_tag|>{"name": "func", "arguments": {...}}

Startup:

python -m lightllm.server.api_server \
    --model_dir /path/to/llama-3.2 \
    --tool_call_parser llama3 \
    --tp 1

Mistral#

Parser: mistral

Format: [TOOL_CALLS] [{"name": "func", "arguments": {...}}, ...]
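The `[TOOL_CALLS]` format carries a JSON array, so several calls can appear in one response. The following sketch shows one way to read that array; it is an illustration and not LightLLM's `mistral` parser, and `extract_mistral_tool_calls` is a hypothetical helper name.

```python
import json

MARKER = "[TOOL_CALLS]"


def extract_mistral_tool_calls(text):
    """Return the list of calls that follows the [TOOL_CALLS] marker (hypothetical helper)."""
    pos = text.find(MARKER)
    if pos == -1:
        return []
    # raw_decode does not skip leading whitespace, so strip it first.
    payload = text[pos + len(MARKER):].lstrip()
    try:
        # raw_decode parses one JSON value and ignores any trailing text.
        items, _end = json.JSONDecoder().raw_decode(payload)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    return [
        {"name": item["name"], "arguments": item.get("arguments", {})}
        for item in items
        if isinstance(item, dict) and "name" in item
    ]


sample = (
    '[TOOL_CALLS] [{"name": "get_weather", "arguments": {"city": "Beijing"}}, '
    '{"name": "get_weather", "arguments": {"city": "Shanghai"}}]'
)
calls = extract_mistral_tool_calls(sample)
print(calls)
```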

DeepSeek-V3#

Parser: deepseekv3

Format:

<｜tool▁calls▁begin｜>
<｜tool▁call▁begin｜>function<｜tool▁sep｜>func_name
```json
{"param": "value"}
```
<｜tool▁call▁end｜>
<｜tool▁calls▁end｜>
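The DeepSeek-V3 format above wraps each call's arguments in a JSON code block. As a rough sketch of how that structure could be read (not LightLLM's `deepseekv3` parser; `extract_deepseek_v3_tool_calls` is a hypothetical helper name):

```python
import json
import re

FENCE = "`" * 3  # built indirectly so the code-fence marker never starts a line here

# One call: type, separator, function name, then a fenced JSON block with the arguments.
CALL_RE = re.compile(
    r"<｜tool▁call▁begin｜>(?P<type>.*?)<｜tool▁sep｜>(?P<name>.*?)\n"
    + FENCE + r"json\n(?P<args>.*?)\n" + FENCE
    + r"\s*<｜tool▁call▁end｜>",
    re.DOTALL,
)


def extract_deepseek_v3_tool_calls(text):
    """Return the calls found in DeepSeek-V3 style output (hypothetical helper)."""
    calls = []
    for match in CALL_RE.finditer(text):
        try:
            args = json.loads(match.group("args"))
        except json.JSONDecodeError:
            continue  # skip calls whose argument block is not valid JSON
        calls.append({"name": match.group("name").strip(), "arguments": args})
    return calls


sample = "\n".join([
    "<｜tool▁calls▁begin｜>",
    "<｜tool▁call▁begin｜>function<｜tool▁sep｜>get_weather",
    FENCE + "json",
    '{"city": "Beijing"}',
    FENCE,
    "<｜tool▁call▁end｜>",
    "<｜tool▁calls▁end｜>",
])
calls = extract_deepseek_v3_tool_calls(sample)
print(calls)
```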

DeepSeek-V3.1#

Parser: deepseekv31

Format: A simplified version of the V3 format in which the arguments are inlined directly instead of being wrapped in a JSON code block.

Basic Usage#

Define Tools#

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["city"]
            }
        }
    }
]
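Because `parameters` is a JSON Schema object, its `required` list can be used on the client side to catch incomplete arguments before a tool runs. The sketch below assumes the tool definition shape shown above; `missing_required_args` is a hypothetical helper that checks only for missing keys, not types.

```python
def missing_required_args(tool, arguments):
    """List required parameter names from the tool schema that are absent from arguments."""
    schema = tool["function"].get("parameters", {})
    return [name for name in schema.get("required", []) if name not in arguments]


weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}

print(missing_required_args(weather_tool, {}))
print(missing_required_args(weather_tool, {"city": "Beijing"}))
```

For full validation (types, enums, nested objects), a JSON Schema validator library would be a better fit than this key check.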

Non-Streaming#

import requests
import json

url = "http://localhost:8088/v1/chat/completions"
data = {
    "model": "model_name",
    "messages": [
        {"role": "user", "content": "What's the weather in Beijing?"}
    ],
    "tools": tools,
    "tool_choice": "auto"  # "auto" | "none" | "required"
}

response = requests.post(url, json=data).json()
message = response["choices"][0]["message"]

if message.get("tool_calls"):
    for tc in message["tool_calls"]:
        print(f"Tool: {tc['function']['name']}")
        print(f"Args: {tc['function']['arguments']}")
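Printing the call is usually followed by executing it. The sketch below assumes, as in the OpenAI API, that `arguments` arrives as a JSON-encoded string; `get_weather`, `TOOL_REGISTRY`, and `run_tool_call` are hypothetical names, and the weather values are placeholders.

```python
import json


def get_weather(city):
    """Placeholder implementation; a real one would query a weather service."""
    return {"city": city, "temperature": 15, "condition": "sunny"}


# Maps tool names from the `tools` list to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call):
    """Decode one tool call and run the matching local function."""
    fn = tool_call["function"]
    handler = TOOL_REGISTRY.get(fn["name"])
    if handler is None:
        return {"error": f"unknown tool: {fn['name']}"}
    try:
        args = json.loads(fn["arguments"] or "{}")
        return handler(**args)
    except (json.JSONDecodeError, TypeError) as exc:
        # Invalid JSON, or arguments that do not match the function signature.
        return {"error": f"invalid arguments: {exc}"}


example_call = {"function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}}
print(run_tool_call(example_call))
```

Returning an error dictionary instead of raising keeps the conversation going: the error can be sent back to the model as the tool result so it can correct the call.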

Streaming#

data = {
    "model": "model_name",
    "messages": [{"role": "user", "content": "Check weather for Beijing and Shanghai"}],
    "tools": tools,
    "stream": True
}

response = requests.post(url, json=data, stream=True)
tool_calls = {}

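for line in response.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload.strip() == b"[DONE]":  # OpenAI-style end-of-stream sentinel, not JSON
        break
    chunk = json.loads(payload)
    if not chunk.get("choices"):
        continue
    delta = chunk["choices"][0]["delta"]

    for tc in delta.get("tool_calls") or []:
        idx = tc.get("index", 0)
        if idx not in tool_calls:
            tool_calls[idx] = {"function": {"name": "", "arguments": ""}}

        # A chunk may carry only part of a call, so "function" can be absent.
        fn = tc.get("function") or {}
        if fn.get("name"):
            tool_calls[idx]["function"]["name"] = fn["name"]
        if fn.get("arguments"):
            # Argument text arrives in fragments; concatenate them in order.
            tool_calls[idx]["function"]["arguments"] += fn["arguments"]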
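Once the stream ends, the accumulated argument strings still have to be decoded. The following sketch assumes the `{index: {"function": {...}}}` dictionary built by the loop above; `finalize_tool_calls` is a hypothetical helper name.

```python
import json


def finalize_tool_calls(tool_calls):
    """Turn the {index: partial_call} dict into an ordered list with parsed arguments."""
    finished = []
    for idx in sorted(tool_calls):
        fn = tool_calls[idx]["function"]
        try:
            args = json.loads(fn["arguments"]) if fn["arguments"] else {}
        except json.JSONDecodeError:
            args = None  # the stream was cut off or a fragment was lost
        finished.append({"index": idx, "name": fn["name"], "arguments": args})
    return finished


# Simulated result of accumulating a two-call stream.
accumulated = {
    1: {"function": {"name": "get_weather", "arguments": '{"city": "Shanghai"}'}},
    0: {"function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}},
}
print(finalize_tool_calls(accumulated))
```

An `arguments` value of `None` signals that the JSON never completed, which is a useful check when debugging incomplete streams.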

Multi-Turn Conversation#

# 1. User question
messages = [{"role": "user", "content": "How's the weather in Beijing?"}]

# 2. Model calls tool
response1 = requests.post(url, json={
    "messages": messages,
    "tools": tools
}).json()

tool_call = response1["choices"][0]["message"]["tool_calls"][0]
messages.append(response1["choices"][0]["message"])

# 3. Return tool result
weather_result = {"temperature": 15, "condition": "sunny"}
messages.append({
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "name": tool_call["function"]["name"],
    "content": json.dumps(weather_result)
})

# 4. Generate final answer
response2 = requests.post(url, json={"messages": messages}).json()
print(response2["choices"][0]["message"]["content"])
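The four steps above handle exactly one tool call. A model may instead answer directly or request tools over several rounds, so a loop is often more robust. The sketch below is one possible shape: `run_tool_loop`, `post`, and `execute` are hypothetical names, and `post` is injected so the loop can be exercised without a running server.

```python
import json


def run_tool_loop(post, messages, tools, execute, max_rounds=5):
    """Call the model until it answers without tool calls, or give up after max_rounds.

    post(payload) returns the parsed response dict; execute(name, args) returns a
    JSON-serializable tool result. Both are supplied by the caller.
    """
    for _ in range(max_rounds):
        message = post({"messages": messages, "tools": tools})["choices"][0]["message"]
        messages.append(message)
        tool_calls = message.get("tool_calls") or []
        if not tool_calls:
            return message.get("content")
        for tc in tool_calls:
            name = tc["function"]["name"]
            args = json.loads(tc["function"]["arguments"] or "{}")
            messages.append({
                "role": "tool",
                "tool_call_id": tc["id"],
                "name": name,
                "content": json.dumps(execute(name, args)),
            })
    raise RuntimeError("model kept requesting tools after max_rounds")
```

Against a live server, `post` could be `lambda payload: requests.post(url, json=payload).json()`, with `url` as defined in the non-streaming example.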

Advanced Features#

Parallel Tool Calls#

data = {
    "messages": messages,
    "tools": tools,
    "parallel_tool_calls": True  # Enable parallel calls
}
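When the model returns several tool calls in one message, the client is free to run them concurrently. A minimal sketch using a thread pool follows; `execute_in_parallel` and `execute` are hypothetical names, and threads are assumed to be appropriate because typical tools are I/O-bound.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def execute_in_parallel(tool_calls, execute, max_workers=4):
    """Run every tool call concurrently and return one role="tool" message per call."""

    def run_one(tc):
        name = tc["function"]["name"]
        args = json.loads(tc["function"]["arguments"] or "{}")
        return {
            "role": "tool",
            "tool_call_id": tc["id"],
            "name": name,
            "content": json.dumps(execute(name, args)),
        }

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so messages line up with tool_calls.
        return list(pool.map(run_one, tool_calls))


example_calls = [
    {"id": "call_a", "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}},
    {"id": "call_b", "function": {"name": "get_weather", "arguments": '{"city": "Shanghai"}'}},
]
tool_messages = execute_in_parallel(example_calls, lambda name, args: {"city": args["city"]})
print(tool_messages)
```

Each resulting message keeps the `tool_call_id` of the call it answers, so the model can match results to requests regardless of execution order.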

Force Specific Tool#

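data = {
    "messages": [{"role": "user", "content": "What's the weather in Beijing?"}],
    "tools": tools,
    # Name a function from `tools` to force the model to call it.
    "tool_choice": {
        "type": "function",
        "function": {"name": "get_weather"}
    }
}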
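A forced tool name that is missing from `tools` is an easy mistake to make. The sketch below builds the `tool_choice` value and checks the name first; `force_tool` is a hypothetical helper name.

```python
def force_tool(tools, name):
    """Build a tool_choice value for `name`, failing early if the tool is not defined."""
    available = [tool["function"]["name"] for tool in tools]
    if name not in available:
        raise ValueError(f"tool {name!r} is not defined; available: {available}")
    return {"type": "function", "function": {"name": name}}


example_tools = [{"type": "function", "function": {"name": "get_weather"}}]
print(force_tool(example_tools, "get_weather"))
```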

Integration with Reasoning Models#

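data = {
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "What's the weather in Beijing?"}],
    "tools": tools,
    "chat_template_kwargs": {"enable_thinking": True},
    "separate_reasoning": True  # Return reasoning separately from the final content
}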

response = requests.post(url, json=data).json()
message = response["choices"][0]["message"]

print("Reasoning:", message.get("reasoning_content"))
print("Tool calls:", message.get("tool_calls"))

Common Issues#

Tool calls are not triggered: Check the --tool_call_parser parameter and the tool descriptions.
Parameter parsing errors: Confirm that the parser matches the model, and check the model's output format.
Incomplete streaming: Process every chunk, and use the index field to assemble multiple calls.
Integration with reasoning models fails: Use the latest version, and set both separate_reasoning and chat_template_kwargs.
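One way to spot a parser mismatch is to look for raw tool-call markers left in `content` when `tool_calls` is empty. The sketch below uses the markers from the formats listed under Supported Models; `find_unparsed_marker` is a hypothetical helper, the assumption that leftover markers indicate a mismatch is a heuristic, and DeepSeek-V3.1 is omitted because its exact markers are not shown on this page.

```python
# Marker text taken from the formats documented above, mapped to the parser name.
RAW_MARKERS = {
    "<tool_call>": "qwen25",
    "<|python_tag|>": "llama3",
    "[TOOL_CALLS]": "mistral",
    "<｜tool▁calls▁begin｜>": "deepseekv3",
}


def find_unparsed_marker(message):
    """Return the parser whose marker appears unparsed in message content, else None."""
    if message.get("tool_calls"):
        return None  # the server already parsed the calls
    content = message.get("content") or ""
    for marker, parser in RAW_MARKERS.items():
        if marker in content:
            return parser
    return None


suspicious = {"content": '<tool_call>\n{"name": "get_weather", "arguments": {}}\n</tool_call>'}
print(find_unparsed_marker(suspicious))
```

If the helper returns a parser name that differs from the `--tool_call_parser` value used at startup, that flag is the first thing to check.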

Technical Details#

Core Files:
- lightllm/server/function_call_parser.py - Parser implementation
- lightllm/server/api_openai.py - API integration
- lightllm/server/build_prompt.py - Tool injection
- test/test_api/test_openai_api.py - Test examples

Related PRs:
- PR #1158: Function call in reasoning content support

References#

OpenAI Function Calling: https://platform.openai.com/docs/guides/function-calling
JSON Schema: https://json-schema.org/
LightLLM GitHub: ModelTC/lightllm
