LightLLM OpenAI API Usage Examples

Contents

LightLLM OpenAI API Usage Examples#

LightLLM provides an interface that is fully compatible with OpenAI API, supporting all standard OpenAI features including function calling. This document provides detailed information on how to use LightLLM’s OpenAI interface.

Basic Configuration#

First, ensure that the LightLLM service is started:

# Start LightLLM service
python -m lightllm.server.api_server \
    --model_dir /path/to/your/model \
    --port 8088 \
    --tp 1

Basic Conversation Examples#

1. Simple Conversation#

import requests
import json

# Configuration
url = "http://localhost:8088/v1/chat/completions"
headers = {"Content-Type": "application/json"}

# Request data
data = {
    "model": "your_model_name",
    "messages": [
        {"role": "user", "content": "Hello, please introduce yourself"}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
}

# Send request
response = requests.post(url, headers=headers, json=data)

if response.status_code == 200:
    result = response.json()
    print("Reply:", result["choices"][0]["message"]["content"])
else:
    print("Error:", response.status_code, response.text)

2. Streaming Conversation#

import requests
import json

url = "http://localhost:8088/v1/chat/completions"
headers = {"Content-Type": "application/json"}

data = {
    "model": "your_model_name",
    "messages": [
        {"role": "user", "content": "Please write a short essay about artificial intelligence"}
    ],
    "stream": True,
    "temperature": 0.7,
    "max_tokens": 1000
}

# Streaming request
response = requests.post(url, headers=headers, json=data, stream=True)

if response.status_code == 200:
    for line in response.iter_lines():
        if line:
            line = line.decode('utf-8')
            if line.startswith('data: '):
                data_str = line[6:]  # Remove "data: " prefix
                if data_str == '[DONE]':
                    break
                try:
                    chunk = json.loads(data_str)
                    if chunk['choices'][0]['delta'].get('content'):
                        print(chunk['choices'][0]['delta']['content'], end='', flush=True)
                except json.JSONDecodeError:
                    continue
else:
    print("Error:", response.status_code, response.text)

Function Calling Examples#

LightLLM supports OpenAI’s function calling functionality, providing function call parsing for three models. Specify the –tool_call_parser parameter when starting the service to choose. The service launch command is:

python -m lightllm.server.api_server \
    --model_dir /path/to/your/model \
    --port 8088 \
    --tp 1 \
    --tool_call_parser qwen25
# Optional parameters are qwen25, llama3, mistral

1. Basic Function Calling#

import requests
import json

url = "http://localhost:8088/v1/chat/completions"
headers = {"Content-Type": "application/json"}

# Define functions
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get current weather information for a specified city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g.: Beijing, Shanghai"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["city"]
            }
        }
    }
]

# Request data
data = {
    "model": "your_model_name",
    "messages": [
        {"role": "user", "content": "What's the weather like in Beijing today?"}
    ],
    "tools": tools,
    "tool_choice": "auto",  # Let the model automatically decide whether to call functions
    "temperature": 0.7,
    "max_tokens": 1000
}

# Send request
response = requests.post(url, headers=headers, json=data)

if response.status_code == 200:
    result = response.json()
    message = result["choices"][0]["message"]

    # Check if there are function calls
    if message.get("tool_calls"):
        print("Model decided to call functions:")
        for tool_call in message["tool_calls"]:
            print(f"Function name: {tool_call['function']['name']}")
            print(f"Arguments: {tool_call['function']['arguments']}")
    else:
        print("Reply:", message["content"])
else:
    print("Error:", response.status_code, response.text)

2. Streaming Function Calling#

import requests
import json

url = "http://localhost:8088/v1/chat/completions"
headers = {"Content-Type": "application/json"}

tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "Mathematical expression"}
                },
                "required": ["expression"]
            }
        }
    }
]