
# Python Inference Service

A YOLOv8 object-detection inference service built on FastAPI.

## Features

- YOLOv8 model inference
- RESTful API
- Accepts Base64-encoded images and file uploads
- GPU acceleration (optional)
- Docker deployment support

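## Model Requirements

This service uses **YOLOv8** (Ultralytics) for object detection.

### Preparing the Model Files

1. **Model file**: name your trained YOLOv8 weights `best.pt` and place the file in the `models/` directory
2. **Class file** (optional): create `classes.txt` with one class name per line
3. **Config file**: set the model parameters in `models.json`

### Directory Structure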

```
python-inference-service/
├── app/
│   ├── __init__.py
│   ├── main.py          # FastAPI application
│   ├── detector.py      # Detector wrapper
│   └── models.py        # Data models
├── models/
│   ├── best.pt          # YOLOv8 model file (required)
│   ├── classes.txt      # Class names (optional)
│   ├── yolov8_model.py  # YOLOv8 model wrapper class
│   └── models.json      # Model configuration
├── requirements.txt
└── Dockerfile
```
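
Before starting the service, the layout above can be checked with a short script. This is a minimal sketch, not part of the service: `missing_model_files` is a hypothetical helper, and treating `yolov8_model.py` and `models.json` as required (in addition to `best.pt`) is an assumption, since the tree only marks `best.pt` explicitly.

```python
from pathlib import Path

# Assumption: these three files must exist; classes.txt is optional.
REQUIRED_FILES = ("best.pt", "yolov8_model.py", "models.json")


def missing_model_files(models_dir="models"):
    """Return the names of required files that are absent from models_dir."""
    base = Path(models_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

An empty list means every file the sketch looks for is in place.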

## Installing Dependencies

```bash
pip install -r requirements.txt
```

Main dependencies:
- `ultralytics>=8.0.0` - YOLOv8 framework
- `fastapi` - web framework
- `uvicorn` - ASGI server
- `opencv-python` - image processing
- `torch` - PyTorch

## Configuring the Model

Edit `models/models.json`:

```json
[
  {
    "name": "yolov8_detector",
    "path": "models/yolov8_model.py",
    "size": [640, 640],
    "comment": "YOLOv8 detection model"
  }
]
```

Parameters:
- `name`: model name (used when calling the API)
- `path`: path to the model wrapper class
- `size`: input image size as [width, height]
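
The fields above can be sanity-checked before the service starts. The following is a minimal sketch rather than the service's own loader: `validate_model_configs` and `load_model_configs` are hypothetical helpers, and the rule that `size` must hold exactly two positive integers is an assumption inferred from the example.

```python
import json
from pathlib import Path


def validate_model_configs(entries):
    """Check a parsed models.json list and return it unchanged if valid."""
    if not isinstance(entries, list):
        raise ValueError("models.json must contain a JSON array")
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {i} must be a JSON object")
        for key in ("name", "path", "size"):
            if key not in entry:
                raise ValueError(f"entry {i} is missing '{key}'")
        size = entry["size"]
        # Assumption: size is [width, height] with two positive integers.
        if (
            not isinstance(size, list)
            or len(size) != 2
            or not all(isinstance(v, int) and v > 0 for v in size)
        ):
            raise ValueError(f"entry {i}: 'size' must be [width, height]")
    return entries


def load_model_configs(path="models/models.json"):
    """Read the config file from disk and validate it."""
    text = Path(path).read_text(encoding="utf-8")
    return validate_model_configs(json.loads(text))
```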

## Starting the Service

### Run Locally

```bash
# Start the service (default port 8000)
uvicorn app.main:app --host 0.0.0.0 --port 8000

# Or run uvicorn as a Python module
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000
```

### Run with Docker

```bash
# Build the image
docker build -t python-inference-service .

# Run the container
docker run -p 8000:8000 \
  -v $(pwd)/models:/app/models \
  python-inference-service
```

### Using a GPU

```bash
# Make sure the NVIDIA Docker runtime is installed
docker run --gpus all -p 8000:8000 \
  -v $(pwd)/models:/app/models \
  python-inference-service
```

## API Endpoints

Once the service is running, open http://localhost:8000/docs to view the API documentation.

### 1. Health Check

```bash
GET /health
```

### 2. List Available Models

```bash
GET /api/models
```

### 3. Detection from a Base64 Image

```bash
POST /api/detect
Content-Type: application/json

{
  "model_name": "yolov8_detector",
  "image_data": "base64_encoded_image_string"
}
```
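
A client for this endpoint can be written with the Python standard library alone. This is a minimal sketch: `build_detect_request` and `detect` are hypothetical helper names, the base URL is the local default shown above, and sending the raw Base64 string without a `data:` URI prefix is an assumption about what the service expects.

```python
import base64
import json
import urllib.request


def build_detect_request(image_bytes, model_name="yolov8_detector",
                         base_url="http://localhost:8000"):
    """Build a POST /api/detect request carrying a Base64-encoded image."""
    payload = {
        "model_name": model_name,
        # Assumption: plain Base64 text, no "data:image/...;base64," prefix.
        "image_data": base64.b64encode(image_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url=f"{base_url}/api/detect",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(image_bytes, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_detect_request(image_bytes, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `detect(open("frame.jpg", "rb").read())` would return the parsed response body; `frame.jpg` is a placeholder file name.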

### 4. Detection from an Uploaded File

```bash
POST /api/detect/file
Content-Type: multipart/form-data

model_name: yolov8_detector
file: <image_file>
```
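
The upload can also be built by hand with the standard library. The sketch below assembles a `multipart/form-data` body with the two form fields listed above (`model_name` and `file`); `build_upload_request` is a hypothetical helper, and the content type guessed for the file part is an assumption, since the service's validation rules are not documented here.

```python
import mimetypes
import urllib.request
import uuid


def build_upload_request(image_bytes, filename="frame.jpg",
                         model_name="yolov8_detector",
                         base_url="http://localhost:8000"):
    """Build a POST /api/detect/file request with a multipart body."""
    boundary = uuid.uuid4().hex
    # Assumption: the server accepts the MIME type guessed from the name.
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model_name"\r\n\r\n'
        f"{model_name}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/detect/file",
        data=head + image_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Pass the returned object to `urllib.request.urlopen` to send it. An HTTP client library that handles multipart encoding would shorten this considerably; the manual version is shown only to keep the example dependency-free.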

## Response Format

```json
{
  "model_name": "yolov8_detector",
  "detections": [
    {
      "label": "[yolov8_detector] class_name",
      "confidence": 0.95,
      "x": 100,
      "y": 150,
      "width": 200,
      "height": 180,
      "color": 65280
    }
  ],
  "inference_time": 45.6
}
```
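
On the client side, the response can be mapped onto a small data class. This is a sketch under stated assumptions: the `Detection` class and `parse_detections` are hypothetical, `(x, y)` is assumed to be the top-left corner of the box, and `color` is assumed to be packed as `0xRRGGBB`. None of these conventions is confirmed by the format above.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float
    x: int
    y: int
    width: int
    height: int
    color: int

    def corners(self):
        """Return (x1, y1, x2, y2), assuming (x, y) is the top-left corner."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)

    def rgb(self):
        """Unpack color as (r, g, b), assuming a 0xRRGGBB integer."""
        return (
            (self.color >> 16) & 0xFF,
            (self.color >> 8) & 0xFF,
            self.color & 0xFF,
        )


def parse_detections(response):
    """Turn the 'detections' array of a response dict into Detection objects."""
    return [Detection(**item) for item in response["detections"]]
```

Under these assumptions, the sample detection above has corners `(100, 150, 300, 330)` and its color `65280` unpacks to `(0, 255, 0)`.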

## Custom Models

To use a YOLOv8 model you trained yourself:

1. **Train the model**: train your model with Ultralytics YOLOv8
   ```python
   from ultralytics import YOLO

   model = YOLO('yolov8n.yaml')
   model.train(data='your_data.yaml', epochs=100)
   ```

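2. **Export the model**: when training finishes, Ultralytics writes a `best.pt` file (typically in the `weights/` folder of the training run directory)

3. **Prepare the class file**: create `classes.txt`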
   ```
   class1
   class2
   class3
   ```

4. **Place the files**: copy `best.pt` and `classes.txt` into the `models/` directory

5. **Update the configuration**: make sure `models.json` is correct

6. **Restart the service**
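
Steps 3 and 4 can be scripted. The following is a minimal sketch: `stage_model` is a hypothetical helper, and the weights path you pass in depends on where your training run saved its output.

```python
import shutil
from pathlib import Path


def stage_model(weights_path, class_names, models_dir="models"):
    """Copy trained weights to models/best.pt and write models/classes.txt."""
    target = Path(models_dir)
    target.mkdir(parents=True, exist_ok=True)
    # The service expects the weights under the fixed name best.pt.
    shutil.copyfile(weights_path, target / "best.pt")
    # One class name per line, matching the classes.txt format above.
    (target / "classes.txt").write_text(
        "\n".join(class_names) + "\n", encoding="utf-8"
    )
    return target
```

The order of `class_names` should match the class indices used during training, otherwise labels will be attached to the wrong detections.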

## Environment Variables

- `MODEL_DIR`: path to the model directory (default: `/app/models`)
- `MODELS_JSON`: path to the model configuration file (default: `models/models.json`)
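
For illustration, this is roughly how such variables are read with their documented defaults. It is a sketch, not the service's actual startup code; `resolve_paths` is a hypothetical helper.

```python
import os
from pathlib import Path


def resolve_paths(env=None):
    """Return (model_dir, models_json) using the documented defaults."""
    env = os.environ if env is None else env
    model_dir = Path(env.get("MODEL_DIR", "/app/models"))
    models_json = Path(env.get("MODELS_JSON", "models/models.json"))
    return model_dir, models_json
```

Passing a plain dict instead of `os.environ` makes the lookup easy to test, for example `resolve_paths({"MODEL_DIR": "/data/models"})`.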

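## Performance Tuning

### GPU Acceleration

The service detects a GPU automatically and uses it when one is available. On a machine with several GPUs, you can select one: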

```bash
CUDA_VISIBLE_DEVICES=0 uvicorn app.main:app --host 0.0.0.0 --port 8000
```
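
Automatic device selection usually comes down to a check like the one below. This is a sketch of the general PyTorch pattern, not the service's code; `pick_device` is a hypothetical helper that falls back to the CPU when PyTorch is missing or no CUDA device is visible.

```python
def pick_device():
    """Return 'cuda:0' when PyTorch can see a CUDA device, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"
```

When `CUDA_VISIBLE_DEVICES` restricts the process to one GPU, that GPU appears as index 0 inside the process, so `cuda:0` refers to whichever card was selected.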

### Confidence Threshold

Adjust it in `yolov8_model.py`:

```python
self.conf_threshold = 0.25  # a lower threshold detects more objects
```
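
To see what the threshold does, the same cut can be applied on the client to the `detections` array of a response. This is only an illustrative sketch with a hypothetical helper name; a client-side filter can discard detections the service returned but cannot recover ones the service already dropped.

```python
def filter_by_confidence(detections, threshold=0.25):
    """Keep only detections whose confidence is at least the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]
```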

## Troubleshooting

### Model Fails to Load

```
Error: best.pt not found
Fix: make sure the model file is in the models/ directory
```

### GPU Not Available

```
Error: CUDA not available
Fix:
1. Check the NVIDIA driver
2. Check that the GPU build of PyTorch is installed
3. Check CUDA version compatibility
```

### Slow Inference

```
Fix:
1. Use GPU acceleration
2. Use a smaller model (such as yolov8n.pt)
3. Reduce the input image size
```

## For Developers

To modify or extend the service, see:
- `app/main.py` - API route definitions
- `app/detector.py` - detector base class
- `models/yolov8_model.py` - YOLOv8 model wrapper class

## License

[Fill in according to the project's actual license]