# Sileya Skill Integration: lite-llm-admin

## Goals

Sileya holds the admin key and can manage the LiteLLM Gateway through its API:
- Query usage for each agent
- Generate and block keys
- Dynamically adjust the fallback strategy
- Monitor gateway health

---

## Skill Core Interface

### Basics
```
Base URL: http://<your-server>:4000
Admin Key: sk-litellm-admin-key (held by Sileya)
Header: Authorization: Bearer <admin_key>
```

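As a minimal sketch of the connection details above, the helper below builds an authenticated admin request without sending it. It uses only the Python standard library (`urllib`) rather than `requests`. The name `build_admin_request`, its parameters, and the `localhost` host are illustrative assumptions, not part of LiteLLM.

```python
import json
import urllib.request
from typing import Optional


def build_admin_request(base_url: str, admin_key: str, path: str,
                        payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated admin request.

    Hypothetical helper: a JSON payload makes it a POST, otherwise a GET.
    """
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=data,
        method="POST" if payload is not None else "GET",
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host and keys, mirroring the values shown above.
req = build_admin_request("http://localhost:4000", "sk-litellm-admin-key",
                          "/key/block", {"key": "sk-litellm-xxxxx"})
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would actually send it. Keeping construction separate from sending makes the header and body easy to inspect in tests.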
### API List

#### 1. Generate a new agent key
```http
POST /key/generate
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "models": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.5-Lightning"],
  "metadata": {
    "agent": "李狗蛋",
    "owner": "sileya"
  }
}

Response:
{
  "key": "sk-litellm-xxxxx",
  "expires": null,
  "models": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.5-Lightning"]
}
```

#### 2. Query usage for a key
```http
GET /key/info?key=sk-litellm-xxxxx
Authorization: Bearer <admin_key>

Response:
{
  "key": "sk-litellm-xxxxx",
  "spend": 0.0001065,
  "models": [...],
  "expires": null
}
```

#### 3. Query total usage
```http
GET /spend
Authorization: Bearer <admin_key>

Response:
{
  "total_spend": 12.345,
  "key_spends": [
    {"key": "sk-litellm-xxxxx", "spend": 5.123, "agent": "李狗蛋"},
    {"key": "sk-litellm-yyyyy", "spend": 7.222, "agent": "妮可"}
  ]
}
```

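The response shape above lends itself to a small post-processing step. The sketch below assumes exactly the fields shown (`total_spend`, `key_spends`, `spend`, `agent`). The function names and the `"unknown"` fallback label are hypothetical.

```python
from collections import defaultdict


def spend_by_agent(report: dict) -> dict:
    """Sum spend per agent from a /spend-style response."""
    totals = defaultdict(float)
    for entry in report.get("key_spends", []):
        # Keys without an agent label are grouped under "unknown" (an assumption).
        totals[entry.get("agent", "unknown")] += entry["spend"]
    return dict(totals)


def is_consistent(report: dict, tolerance: float = 1e-6) -> bool:
    """Check that the per-key spends add up to the reported total."""
    summed = sum(entry["spend"] for entry in report.get("key_spends", []))
    return abs(summed - report["total_spend"]) <= tolerance


# Sample data copied from the response above.
sample = {
    "total_spend": 12.345,
    "key_spends": [
        {"key": "sk-litellm-xxxxx", "spend": 5.123, "agent": "李狗蛋"},
        {"key": "sk-litellm-yyyyy", "spend": 7.222, "agent": "妮可"},
    ],
}
per_agent = spend_by_agent(sample)
consistent = is_consistent(sample)
```

The consistency check compares with a tolerance because floating-point sums rarely match a reported total exactly.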
#### 4. Block a key
```http
POST /key/block
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "key": "sk-litellm-xxxxx"
}
```

#### 5. Unblock a key
```http
POST /key/unblock
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "key": "sk-litellm-xxxxx"
}
```

#### 6. Add a model dynamically (takes effect without a restart)
```http
POST /model/new
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "model_name": "gpt-4o",
  "litellm_params": {
    "model": "openai/gpt-4o",
    "api_key": "sk-openai-xxx",
    "rpm": 60
  }
}
```

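One way to assemble the request body above without hard-coding the provider key is sketched below. It is an illustration only: `build_model_payload` and the idea of passing an environment variable name are assumptions, and the payload fields are limited to those shown in the example.

```python
import os


def build_model_payload(model_name: str, provider_model: str,
                        api_key_env: str, rpm: int) -> dict:
    """Assemble the JSON body for POST /model/new.

    The provider key is read from the environment variable named by
    `api_key_env`, so it does not need to appear in source code.
    """
    api_key = os.environ.get(api_key_env)
    if not api_key:
        raise RuntimeError(f"environment variable {api_key_env} is not set")
    if rpm <= 0:
        raise ValueError("rpm must be a positive integer")
    return {
        "model_name": model_name,
        "litellm_params": {
            "model": provider_model,
            "api_key": api_key,
            "rpm": rpm,
        },
    }
```

Calling `build_model_payload("gpt-4o", "openai/gpt-4o", "OPENAI_API_KEY", 60)` would reproduce the body shown above, provided that variable is set.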
---

## Sileya Skill Implementation

```python
# Core logic for SKILL.md

import os

import requests

LITELLM_BASE = os.environ["LITELLM_BASE_URL"]  # http://<server>:4000
ADMIN_KEY = os.environ["LITELLM_ADMIN_KEY"]


def _headers():
    """Build the auth headers sent with every admin request."""
    return {"Authorization": f"Bearer {ADMIN_KEY}", "Content-Type": "application/json"}


def generate_agent_key(agent_name: str, models: list[str]):
    """Issue a dedicated key for a new agent."""
    resp = requests.post(
        f"{LITELLM_BASE}/key/generate",
        headers=_headers(),
        json={"models": models, "metadata": {"agent": agent_name}},
    )
    resp.raise_for_status()
    return resp.json()["key"]


def get_all_spend():
    """Query total usage across all keys."""
    resp = requests.get(f"{LITELLM_BASE}/spend", headers=_headers())
    resp.raise_for_status()
    return resp.json()


def block_agent_key(key: str):
    """Block an agent's key."""
    resp = requests.post(
        f"{LITELLM_BASE}/key/block",
        headers=_headers(),
        json={"key": key},
    )
    resp.raise_for_status()


def get_key_info(key: str):
    """Query the details of a single key."""
    resp = requests.get(
        f"{LITELLM_BASE}/key/info",
        headers=_headers(),
        params={"key": key},
    )
    resp.raise_for_status()
    return resp.json()
```

---

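## Hot-Update Guarantees

Sileya's own key (`sk-sileya-fixed`) is fixed in `config.yaml`. When an administrator changes other keys through the API, Sileya's key is unaffected.

The gateway needs a restart only when the model routing strategy in `config.yaml` itself changes. This is rare (perhaps once a month), and a restart takes less than 5 seconds.
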
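Because a restart briefly interrupts the gateway, a caller may want to wait until it responds again. The sketch below polls a caller-supplied `probe` until it succeeds or a timeout expires. `wait_until_healthy` is a hypothetical helper, and what the probe actually checks (for example, a GET against whichever health endpoint the gateway exposes) is left as an assumption.

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], bool],
                       timeout_s: float = 10.0,
                       interval_s: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses.

    Connection errors (OSError and subclasses) are treated as "not ready yet",
    since the port is typically closed while the gateway restarts.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            pass  # gateway still restarting
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The default timeout of 10 seconds is an arbitrary margin over the restart time of under 5 seconds stated above.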
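---

## Use Cases

1. **New agent goes live** → Sileya calls `/key/generate` to create a dedicated key
2. **Abnormal usage detected** → Sileya calls `/key/info` to investigate, then `/key/block` if necessary
3. **Start-of-month reconciliation** → Sileya calls `/spend` to total all agents' spend
4. **529 peak periods** → Sileya monitors fallback behavior and confirms that fallback works as expected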
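
The abnormal-usage scenario can be sketched as a single check-then-block step. This is an illustration under assumptions: `enforce_budget` and the `limit` threshold are hypothetical. The lookup and block operations are passed in as callables, so functions such as `get_key_info` and `block_agent_key` can be supplied in production and simple fakes in tests.

```python
from typing import Callable


def enforce_budget(key: str, limit: float,
                   get_info: Callable[[str], dict],
                   block: Callable[[str], None]) -> bool:
    """Block `key` when its reported spend exceeds `limit`.

    Returns True if the key was blocked. `limit` is a hypothetical per-key
    budget expressed in the same unit as the `spend` field.
    """
    spend = get_info(key).get("spend", 0.0)
    if spend > limit:
        block(key)
        return True
    return False
```

A call might look like `enforce_budget("sk-litellm-xxxxx", 10.0, get_key_info, block_agent_key)`, where the budget of 10.0 is only an example value.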