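# Sileya Skill Integration: lite-llm-admin
## Goal
Sileya holds the admin key and can manage the LiteLLM Gateway through its API:
- Query usage per agent
- Generate and block keys
- Dynamically adjust the fallback strategy
- Monitor gateway health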
---
## Core Skill Interface
### Basics
```
Base URL: http://<your-server>:4000
Admin Key: sk-litellm-admin-key (held by Sileya)
Header: Authorization: Bearer <admin_key>
```
### API List
#### 1. Generate a new agent key
```http
POST /key/generate
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "models": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.5-Lightning"],
  "metadata": {
    "agent": "李狗蛋",
    "owner": "sileya"
  }
}

Response:
{
  "key": "sk-litellm-xxxxx",
  "expires": null,
  "models": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.5-Lightning"]
}
```
#### 2. Query usage for a key
```http
GET /key/info?key=sk-litellm-xxxxx
Authorization: Bearer <admin_key>

Response:
{
  "key": "sk-litellm-xxxxx",
  "spend": 0.0001065,
  "models": [...],
  "expires": null
}
```
#### 3. Query total usage
```http
GET /spend
Authorization: Bearer <admin_key>

Response:
{
  "total_spend": 12.345,
  "key_spends": [
    {"key": "sk-litellm-xxxxx", "spend": 5.123, "agent": "李狗蛋"},
    {"key": "sk-litellm-yyyyy", "spend": 7.222, "agent": "妮可"}
  ]
}
```
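For reconciliation it can help to roll the per-key entries up by agent. The following is a minimal sketch that assumes the response has exactly the shape shown above. The helper name `spend_by_agent` and the `"unknown"` bucket for unlabeled keys are illustrative choices, not part of the gateway API.

```python
from collections import defaultdict


def spend_by_agent(spend_response: dict) -> dict[str, float]:
    """Sum spend per agent from a response shaped like the example above."""
    totals: dict[str, float] = defaultdict(float)
    for entry in spend_response.get("key_spends", []):
        # Entries without an "agent" label are grouped under "unknown" (an assumption).
        totals[entry.get("agent", "unknown")] += entry["spend"]
    return dict(totals)


# Sample data copied from the response example above.
sample = {
    "total_spend": 12.345,
    "key_spends": [
        {"key": "sk-litellm-xxxxx", "spend": 5.123, "agent": "李狗蛋"},
        {"key": "sk-litellm-yyyyy", "spend": 7.222, "agent": "妮可"},
    ],
}
totals = spend_by_agent(sample)
```

Comparing `sum(totals.values())` against `total_spend` gives a quick consistency check on the report.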
#### 4. Block a key
```http
POST /key/block
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "key": "sk-litellm-xxxxx"
}
```
#### 5. Unblock a key
```http
POST /key/unblock
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "key": "sk-litellm-xxxxx"
}
```
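Block and unblock differ only in the path, so one helper can cover both. The sketch below builds the request with the standard library and leaves sending to the caller. The function name `build_key_action_request` and the `localhost` base URL are assumptions made for illustration.

```python
import json
import urllib.request


def build_key_action_request(
    base_url: str, admin_key: str, action: str, key: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to /key/block or /key/unblock."""
    if action not in ("block", "unblock"):
        raise ValueError(f"unsupported action: {action}")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/{action}",
        data=json.dumps({"key": key}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute the real base URL and admin key.
req = build_key_action_request(
    "http://localhost:4000", "sk-litellm-admin-key", "unblock", "sk-litellm-xxxxx"
)
```

The caller can send it with `urllib.request.urlopen(req, timeout=10)`, which raises `urllib.error.HTTPError` on a non-2xx response.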
#### 6. Add a model dynamically (takes effect without a restart)
```http
POST /model/new
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "model_name": "gpt-4o",
  "litellm_params": {
    "model": "openai/gpt-4o",
    "api_key": "sk-openai-xxx",
    "rpm": 60
  }
}
```
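The request body above carries the provider API key in plain text, so the skill should probably not hardcode it. Here is a minimal sketch of building the same payload with the key read from the environment. `OPENAI_API_KEY` and the helper name `build_model_payload` are hypothetical names chosen for this example.

```python
import os
from typing import Mapping


def build_model_payload(
    model_name: str,
    provider_model: str,
    rpm: int,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Build the /model/new body, reading the provider key from `env`."""
    # OPENAI_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {
        "model_name": model_name,
        "litellm_params": {"model": provider_model, "api_key": api_key, "rpm": rpm},
    }


# A plain dict stands in for the environment so the example is self-contained.
payload = build_model_payload(
    "gpt-4o", "openai/gpt-4o", 60, env={"OPENAI_API_KEY": "sk-openai-xxx"}
)
```

Passing `env` explicitly keeps the helper easy to test. In normal use the default `os.environ` applies.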
---
## Sileya Skill Implementation
```python
# Core logic for SKILL.md
import os

import requests

LITELLM_BASE = os.environ["LITELLM_BASE_URL"]  # http://<server>:4000
ADMIN_KEY = os.environ["LITELLM_ADMIN_KEY"]
TIMEOUT = 10  # seconds; avoids hanging if the gateway is unreachable


def _headers():
    """Common headers for every admin request."""
    return {"Authorization": f"Bearer {ADMIN_KEY}", "Content-Type": "application/json"}


def generate_agent_key(agent_name: str, models: list[str]):
    """Issue a dedicated key for a new agent."""
    resp = requests.post(
        f"{LITELLM_BASE}/key/generate",
        headers=_headers(),
        json={"models": models, "metadata": {"agent": agent_name}},
        timeout=TIMEOUT,
    )
    resp.raise_for_status()
    return resp.json()["key"]


def get_all_spend():
    """Query total usage."""
    resp = requests.get(f"{LITELLM_BASE}/spend", headers=_headers(), timeout=TIMEOUT)
    resp.raise_for_status()
    return resp.json()


def block_agent_key(key: str):
    """Block an agent's key."""
    resp = requests.post(
        f"{LITELLM_BASE}/key/block",
        headers=_headers(),
        json={"key": key},
        timeout=TIMEOUT,
    )
    resp.raise_for_status()


def get_key_info(key: str):
    """Query details for a single key."""
    resp = requests.get(
        f"{LITELLM_BASE}/key/info",
        headers=_headers(),
        params={"key": key},
        timeout=TIMEOUT,
    )
    resp.raise_for_status()
    return resp.json()
```
---
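## Hot-Update Guarantees
Sileya's key (`sk-sileya-fixed`) is pinned in `config.yaml`. When an administrator modifies other keys through the API, Sileya's key is not affected at all.

The gateway needs a restart only when the model routing strategy in `config.yaml` itself changes. This is rare (perhaps once a month), and a restart takes under 5 seconds.
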
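
Because a restart is brief, the skill can simply wait for the gateway to come back instead of failing. The sketch below polls a caller-supplied probe until it succeeds or a deadline passes. The helper name `wait_until_ready` and its defaults are assumptions. The probe itself is left to the caller: it could be any cheap authenticated call, or a request to a health endpoint if the deployed gateway version exposes one (this document does not name one).

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 5.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            pass  # e.g. connection refused while the gateway is still starting
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Demo with a fake probe that fails twice and then succeeds; sleeping is skipped.
attempts = iter([False, False, True])
ready = wait_until_ready(lambda: next(attempts), sleep=lambda _s: None)
```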
---
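## Use Cases
1. **New agent goes online:** Sileya calls `/key/generate` to create a dedicated key.
2. **Abnormal usage detected:** Sileya calls `/key/info` to investigate and, if necessary, `/key/block`.
3. **Start-of-month reconciliation:** Sileya calls `/spend` to total the spend of all agents.
4. **529 peak periods:** Sileya monitors fallback behavior and confirms that requests fall back correctly.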
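
The abnormal-usage flow in item 2 can be sketched as a check followed by a block. The version below takes the lookup and block functions as parameters so that it can run without a gateway. In the skill they would be `get_key_info` and `block_agent_key` from the implementation above. The `limit` threshold and the name `check_and_block` are hypothetical.

```python
from typing import Callable


def check_and_block(
    key: str,
    get_info: Callable[[str], dict],
    block: Callable[[str], None],
    limit: float,
) -> bool:
    """Block `key` if its reported spend exceeds `limit`; return True if blocked."""
    info = get_info(key)
    if info["spend"] > limit:
        block(key)
        return True
    return False


# Demo with in-memory stand-ins for the gateway calls.
fake_spend = {"sk-litellm-xxxxx": 5.123, "sk-litellm-yyyyy": 7.222}
blocked: list[str] = []
results = {
    k: check_and_block(k, lambda key: {"spend": fake_spend[key]}, blocked.append, limit=6.0)
    for k in fake_spend
}
```

Only the key above the limit ends up in `blocked`. The other is left untouched.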