2025-03-27 19:17:58 +08:00
2025-03-25 11:36:12 +08:00
2025-03-25 11:36:12 +08:00
2025-03-27 19:17:58 +08:00
2025-02-20 17:40:30 +08:00
2025-03-27 19:17:58 +08:00
2025-03-27 14:21:21 +08:00
2025-03-27 16:06:26 +08:00

OpenAI Compatible API Server For StackFlow

Overview

This server provides an OpenAI-compatible API interface supporting multiple AI model backends including LLMs, vision models, speech synthesis (TTS), and speech recognition (ASR).

Quick Start

  1. Install dependencies:
pip install -r requirements.txt
  1. Start the server:
python main.py

Supported Endpoints

Chat Completions

  • Endpoint: POST /v1/chat/completions
  • Request Format: OpenAI-compatible chat completion request
  • Streaming: Supported

Text Completions

  • Endpoint: POST /v1/completions
  • Request Format: OpenAI-compatible completion request
  • Streaming: Supported

Speech Synthesis (TTS)

  • Endpoint: POST /v1/audio/speech
  • Parameters:
    • model: TTS model name
    • input: Text to synthesize
    • voice: Voice type
    • response_format: Audio format (mp3, wav, etc.)

Speech Recognition (ASR)

  • Transcription: POST /v1/audio/transcriptions
    • Converts speech to text in the same language
  • Translation: POST /v1/audio/translations
    • Converts speech to English text
  • Parameters:
    • file: Audio file
    • model: ASR model name
    • language (transcription only): Source language
    • prompt: Optional prompt

List Models

  • Endpoint: GET /v1/models
  • Returns: List of available models

FAQ

Q: Why am I getting "Unsupported model" errors?

A: The model name must exactly match one of the configured models in your config file.

Q: How do I enable streaming responses?

A: Set "stream": true in your request body for chat/completion endpoints.

Q: What audio formats are supported for ASR?

A: The supported formats depend on your ASR backend implementation.

Q: How do I manage model memory usage?

A: The server implements a pool system for LLM models - adjust pool_size in the config to control concurrent instances.

Troubleshooting

  • Logs: Check server logs for detailed error messages
  • Model Initialization: Verify all required backend services are running
  • Configuration: Double-check model names and parameters in config.yaml

Example Requests

Chat Completion

curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_KEY" \
-d '{
  "model": "gpt-3.5-turbo",
  "messages": [{"role": "user", "content": "Hello!"}],
  "temperature": 0.7
}'

Speech Synthesis

curl -X POST "http://localhost:8000/v1/audio/speech" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_KEY" \
-d '{
  "model": "tts-1",
  "input": "Hello world!",
  "voice": "alloy"
}'

Required Libraries:

License

S
Description
No description provided
Readme 1.8 MiB
Languages
Python 100%