# OpenAI-Compatible API Server for StackFlow

## Overview
This server provides an OpenAI-compatible API interface supporting multiple AI model backends including LLMs, vision models, speech synthesis (TTS), and speech recognition (ASR).

## Quick Start

1. Install dependencies:
```bash
pip install -r requirements.txt
```

2. Start the server:
```bash
python3 api_server.py 
```

## Supported Endpoints

### Chat Completions
- **Endpoint**: `POST /v1/chat/completions`
- **Request Format**: OpenAI-compatible chat completion request
- **Streaming**: Supported

### Text Completions
- **Endpoint**: `POST /v1/completions`
- **Request Format**: OpenAI-compatible completion request
- **Streaming**: Supported
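
A text completion request can also be built from Python. The sketch below uses only the standard library and assumes the server listens on `http://localhost:8000` with a bearer token, as in the chat example in this README; the helper names `build_completion_request` and `complete` are illustrative and not part of the server, and the response shape is assumed to follow the OpenAI format.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: host/port from the chat example
API_KEY = "YOUR_KEY"                # placeholder token


def build_completion_request(model, prompt, stream=False):
    """Build (but do not send) a POST request for /v1/completions."""
    payload = {"model": model, "prompt": prompt, "stream": stream}
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def complete(model, prompt):
    """Send a non-streaming request and return the first choice's text.

    Assumes an OpenAI-style response: {"choices": [{"text": ...}]}.
    """
    with urllib.request.urlopen(build_completion_request(model, prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

Calling `complete("qwen2.5-0.5B-p256-ax630c", "Once upon a time")` against a running server should return the generated continuation, provided that model is configured.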

### Speech Synthesis (TTS)
- **Endpoint**: `POST /v1/audio/speech`
- **Parameters**:
  - `model`: TTS model name
  - `input`: Text to synthesize
  - `voice`: Voice type
  - `response_format`: Audio format (mp3, wav, etc.)
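
The four parameters above map directly onto a JSON body. A minimal Python sketch, assuming the same base URL and bearer token as the other examples in this README; `build_speech_request` and `synthesize_to_file` are hypothetical helper names, and whether a given `voice` or `response_format` is accepted depends on the configured TTS backend.

```python
import json
import urllib.request


def build_speech_request(text, model="melotts_zh-cn", voice="alloy",
                         response_format="wav",
                         base_url="http://localhost:8000", api_key="YOUR_KEY"):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def synthesize_to_file(text, path, **kwargs):
    """Send the request and write the returned audio bytes to `path`."""
    with urllib.request.urlopen(build_speech_request(text, **kwargs)) as resp:
        audio = resp.read()
    with open(path, "wb") as fh:
        fh.write(audio)
    return len(audio)
```

For example, `synthesize_to_file("Hello world!", "output.wav")` would save the synthesized audio next to the script.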

### Speech Recognition (ASR)
- **Transcription**: `POST /v1/audio/transcriptions`
  - Converts speech to text in the same language
- **Translation**: `POST /v1/audio/translations`
  - Converts speech to English text
- **Parameters**:
  - `file`: Audio file
  - `model`: ASR model name
  - `language` (transcription only): Source language
  - `prompt`: Optional prompt
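
Because `file` is an upload, these endpoints are assumed to take `multipart/form-data` rather than JSON, as the OpenAI audio API does. The sketch below encodes such a request with the standard library only; `encode_multipart` and `build_transcription_request` are illustrative helpers, and the model name passed in must be replaced with one configured on your server.

```python
import uuid
import urllib.request


def encode_multipart(fields, file_field, filename, file_bytes,
                     content_type="application/octet-stream"):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         f"Content-Type: {content_type}\r\n\r\n").encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcription_request(audio_bytes, filename, model,
                                language=None, prompt=None,
                                base_url="http://localhost:8000",
                                api_key="YOUR_KEY"):
    """Build (but do not send) a POST request for /v1/audio/transcriptions."""
    fields = {"model": model}
    if language:
        fields["language"] = language  # transcription only
    if prompt:
        fields["prompt"] = prompt
    body, content_type = encode_multipart(fields, "file", filename, audio_bytes)
    return urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type,
                 "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
```

To send it, read the audio file as bytes, pass the resulting request to `urllib.request.urlopen`, and parse the response body. For translation, the same body would go to `/v1/audio/translations` without the `language` field.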

### List Models
- **Endpoint**: `GET /v1/models`
- **Returns**: List of available models

## FAQ

### Q: Why am I getting "Unsupported model" errors?
A: The model name must exactly match one of the configured models in your config file.
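
One way to check a name is to compare it against what `GET /v1/models` returns. The sketch below assumes the response follows the OpenAI list shape (`{"data": [{"id": ...}]}`), which this README does not spell out; the helper names are hypothetical.

```python
import json
import urllib.request


def extract_model_ids(models_response):
    """Pull model ids out of an OpenAI-style /v1/models response dict."""
    return [entry["id"] for entry in models_response.get("data", [])]


def fetch_model_ids(base_url="http://localhost:8000", api_key="YOUR_KEY"):
    """Query the server for its model list (requires a running server)."""
    req = urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_model_ids(json.load(resp))


def check_model_name(name, available):
    """Return None if `name` matches exactly, otherwise a hint string."""
    if name in available:
        return None
    # Matching is exact, so a difference in letter case is a common culprit.
    near = [m for m in available if m.lower() == name.lower()]
    if near:
        return f"'{name}' differs only in case from: {', '.join(near)}"
    return f"'{name}' not found; available: {', '.join(available)}"
```

With a running server, `check_model_name("my-model", fetch_model_ids())` returns `None` when the name is usable and a hint otherwise.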

### Q: How do I enable streaming responses?
A: Set `"stream": true` in the request body of the chat completion or text completion endpoint.
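
On the client side, a streamed response then has to be read incrementally. The sketch below assumes the server emits OpenAI-style server-sent events (`data: {...}` lines ending with `data: [DONE]`), which this README does not confirm; `iter_stream_text` is an illustrative helper, not part of the server.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style server-sent event lines.

    `lines` may be any iterable of str or bytes, such as an open HTTP response.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Chat chunks carry text in delta.content; text completion
            # chunks carry it in text.
            text = choice.get("delta", {}).get("content") or choice.get("text")
            if text:
                yield text
```

A response object returned by `urllib.request.urlopen` iterates line by line, so it can be passed directly to `iter_stream_text` and each fragment printed as it arrives.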

### Q: What audio formats are supported for ASR?
A: The supported formats depend on your ASR backend implementation.

### Q: How do I manage model memory usage?
A: The server keeps a pool of LLM instances. Adjust `pool_size` in the config to control how many instances run concurrently.

## Troubleshooting

- **Logs**: Check server logs for detailed error messages
- **Model Initialization**: Verify all required backend services are running
- **Configuration**: Double-check model names and parameters in `config.yaml`

## Example Requests

### Chat Completion
```bash
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_KEY" \
-d '{
  "model": "qwen2.5-0.5B-p256-ax630c",
  "messages": [{"role": "user", "content": "Hello!"}],
  "temperature": 0.7
}'
```

### Speech Synthesis
```bash
curl -X POST "http://localhost:8000/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_KEY" \
  -d '{
    "model": "melotts_zh-cn",
    "input": "Hello world!",
    "voice": "alloy"
  }' \
  --output output.mp3
```

## Required Libraries

- [StackFlow](https://github.com/m5stack/StackFlow)

## License

- [M5Module-LLM_OpenAI_API - MIT](LICENSE)