You've already forked ModuleLLM-OpenAI-Plugin
mirror of
https://github.com/m5stack/ModuleLLM-OpenAI-Plugin.git
synced 2026-05-20 11:37:26 -07:00
2.8 KiB
2.8 KiB
OpenAI Compatible API Server For StackFlow
Overview
This server provides an OpenAI-compatible API interface supporting multiple AI model backends including LLMs, vision models, speech synthesis (TTS), and speech recognition (ASR).
Quick Start
- Install dependencies:
pip install -r requirements.txt
- Start the server:
python3 api_server.py
Supported Endpoints
Chat Completions
- Endpoint:
POST /v1/chat/completions - Request Format: OpenAI-compatible chat completion request
- Streaming: Supported
Text Completions
- Endpoint:
POST /v1/completions - Request Format: OpenAI-compatible completion request
- Streaming: Supported
Speech Synthesis (TTS)
- Endpoint:
POST /v1/audio/speech - Parameters:
model: TTS model nameinput: Text to synthesizevoice: Voice typeresponse_format: Audio format (mp3, wav, etc.)
Speech Recognition (ASR)
- Transcription:
POST /v1/audio/transcriptions- Converts speech to text in the same language
- Translation:
POST /v1/audio/translations- Converts speech to English text
- Parameters:
file: Audio filemodel: ASR model namelanguage(transcription only): Source languageprompt: Optional prompt
List Models
- Endpoint:
GET /v1/models - Returns: List of available models
FAQ
Q: Why am I getting "Unsupported model" errors?
A: The model name must exactly match one of the configured models in your config file.
Q: How do I enable streaming responses?
A: Set "stream": true in your request body for chat/completion endpoints.
Q: What audio formats are supported for ASR?
A: The supported formats depend on your ASR backend implementation.
Q: How do I manage model memory usage?
A: The server implements a pool system for LLM models - adjust pool_size in the config to control concurrent instances.
Troubleshooting
- Logs: Check server logs for detailed error messages
- Model Initialization: Verify all required backend services are running
- Configuration: Double-check model names and parameters in config.yaml
Example Requests
Chat Completion
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_KEY" \
-d '{
"model": "qwen2.5-0.5B-p256-ax630c",
"messages": [{"role": "user", "content": "Hello!"}],
"temperature": 0.7
}'
Speech Synthesis
curl -X POST "http://localhost:8001/v1/audio/speech" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_KEY" \
-d '{
"model": "melotts_zh-cn",
"input": "Hello world!",
"voice": "alloy"
}' \
--output output.mp3