litellm-gateway/docs/SKILL_INTEGRATION.md

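Sileya Skill Integration: lite-llm-admin

Goal

Sileya holds the admin key and can manage the LiteLLM Gateway through its API:

  • Query usage per agent
  • Generate and block keys
  • Adjust the fallback strategy dynamically
  • Monitor gateway health

Skill core interface

Basic information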

Base URL:  http://<your-server>:4000
Admin Key: sk-litellm-admin-key (held by Sileya)
Header:    Authorization: Bearer <admin_key>

API list

1. Generate a new agent key

POST /key/generate
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "models": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.5-Lightning"],
  "metadata": {
    "agent": "李狗蛋",
    "owner": "sileya"
  }
}

Response:
{
  "key": "sk-litellm-xxxxx",
  "expires": null,
  "models": ["MiniMax-M2.7", "MiniMax-M2.5", "MiniMax-M2.5-Lightning"]
}

2. Query usage for a key

GET /key/info?key=sk-litellm-xxxxx
Authorization: Bearer <admin_key>

Response:
{
  "key": "sk-litellm-xxxxx",
  "spend": 0.0001065,
  "models": [...],
  "expires": null
}

3. Query total spend

GET /spend
Authorization: Bearer <admin_key>

Response:
{
  "total_spend": 12.345,
  "key_spends": [
    {"key": "sk-litellm-xxxxx", "spend": 5.123, "agent": "李狗蛋"},
    {"key": "sk-litellm-yyyyy", "spend": 7.222, "agent": "妮可"}
  ]
}
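
For reconciliation it can help to total spend per agent rather than per key. The following is a minimal sketch that assumes the response has exactly the shape shown in the example above; the function name `spend_by_agent` and the "unknown" bucket for unlabeled keys are illustrative choices, not part of the gateway.

```python
from collections import defaultdict


def spend_by_agent(spend_response: dict) -> dict[str, float]:
    """Sum spend per agent from a response shaped like the example above."""
    totals: dict[str, float] = defaultdict(float)
    for entry in spend_response.get("key_spends", []):
        # Assumption: entries without an "agent" label are grouped as "unknown".
        totals[entry.get("agent", "unknown")] += entry["spend"]
    return dict(totals)


# Sample data copied from the response example above.
example = {
    "total_spend": 12.345,
    "key_spends": [
        {"key": "sk-litellm-xxxxx", "spend": 5.123, "agent": "李狗蛋"},
        {"key": "sk-litellm-yyyyy", "spend": 7.222, "agent": "妮可"},
    ],
}
per_agent = spend_by_agent(example)
```

Comparing the sum of `per_agent` values against `total_spend` is a cheap consistency check before the figures are reported.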

4. Block a key

POST /key/block
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "key": "sk-litellm-xxxxx"
}

5. Unblock a key

POST /key/unblock
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "key": "sk-litellm-xxxxx"
}
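
The block and unblock calls differ only in their path, so one helper can build both requests. This sketch uses the standard-library `urllib` and only constructs the request without sending it; the helper name `build_key_request` and the `http://localhost:4000` base URL are placeholders for illustration.

```python
import json
import urllib.request


def build_key_request(base_url: str, admin_key: str, action: str, key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /key/block or /key/unblock."""
    if action not in ("block", "unblock"):
        raise ValueError(f"unsupported action: {action}")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/{action}",
        data=json.dumps({"key": key}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute the real base URL and admin key.
req = build_key_request("http://localhost:4000", "sk-litellm-admin-key", "unblock", "sk-litellm-xxxxx")
# Sending is left to the caller, e.g. urllib.request.urlopen(req, timeout=10)
```

Separating request construction from sending keeps the helper easy to inspect and test without a running gateway.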

6. Add a model dynamically (takes effect without a restart)

POST /model/new
Content-Type: application/json
Authorization: Bearer <admin_key>

{
  "model_name": "gpt-4o",
  "litellm_params": {
    "model": "openai/gpt-4o",
    "api_key": "sk-openai-xxx",
    "rpm": 60
  }
}
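
Because the request body carries the provider's API key, one option is to read that key from the environment instead of writing it into the Skill. The sketch below only builds the JSON body shown above; the function name `build_model_payload` and the environment variable `DEMO_PROVIDER_KEY` are hypothetical.

```python
import os


def build_model_payload(model_name: str, provider_model: str, api_key_env: str, rpm: int) -> dict:
    """Build the JSON body for POST /model/new, reading the provider key from the environment."""
    api_key = os.environ.get(api_key_env)
    if not api_key:
        raise RuntimeError(f"environment variable {api_key_env} is not set")
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return {
        "model_name": model_name,
        "litellm_params": {"model": provider_model, "api_key": api_key, "rpm": rpm},
    }


# Demo only: DEMO_PROVIDER_KEY is a made-up variable name with a placeholder value.
os.environ.setdefault("DEMO_PROVIDER_KEY", "sk-openai-xxx")
payload = build_model_payload("gpt-4o", "openai/gpt-4o", "DEMO_PROVIDER_KEY", 60)
```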

Sileya Skill implementation

# SKILL.md core logic

import requests
import os

LITELLM_BASE = os.environ["LITELLM_BASE_URL"]  # http://<server>:4000
ADMIN_KEY = os.environ["LITELLM_ADMIN_KEY"]

def _headers():
    return {"Authorization": f"Bearer {ADMIN_KEY}", "Content-Type": "application/json"}

def generate_agent_key(agent_name: str, models: list[str]):
    """Issue a dedicated key for a new agent."""
    resp = requests.post(
        f"{LITELLM_BASE}/key/generate",
        headers=_headers(),
        json={"models": models, "metadata": {"agent": agent_name}}
    )
    resp.raise_for_status()
    return resp.json()["key"]

def get_all_spend():
    """Query total spend across all keys."""
    resp = requests.get(f"{LITELLM_BASE}/spend", headers=_headers())
    resp.raise_for_status()
    return resp.json()

def block_agent_key(key: str):
    """Block an agent's key."""
    resp = requests.post(
        f"{LITELLM_BASE}/key/block",
        headers=_headers(),
        json={"key": key}
    )
    resp.raise_for_status()

def get_key_info(key: str):
    """Query the details of a single key."""
    resp = requests.get(
        f"{LITELLM_BASE}/key/info",
        headers=_headers(),
        params={"key": key}
    )
    resp.raise_for_status()
    return resp.json()

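Hot-update guarantee

Sileya's own key (sk-sileya-fixed) is pinned in config.yaml. When an administrator modifies other keys through the API, Sileya's key is not affected.

A gateway restart is needed only when the model routing strategy in config.yaml itself changes. This is rare (perhaps once a month), and the restart takes under 5 seconds.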
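
After such a restart, the Skill may want to wait until the gateway answers again before issuing admin calls. The following is a rough sketch of a polling loop; the probe is injected by the caller, and any health URL used with it (for example a liveness path on the gateway) is an assumption to verify against the deployed LiteLLM version, since this document does not name one.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            # A refused connection while the gateway is still starting counts as "not ready".
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

A probe could be as small as a function that opens the health URL with `urllib.request.urlopen` and returns whether the status is 200; `urllib.error.URLError` is a subclass of `OSError`, so connection failures are treated as "not ready" rather than raised.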


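Use cases

  1. A new agent goes live → Sileya calls /key/generate to create a dedicated key
  2. Abnormal usage is detected → Sileya calls /key/info to investigate and, if necessary, /key/block
  3. Start-of-month reconciliation → Sileya calls /spend to total spend across all agents
  4. 529 peak periods → Sileya monitors fallback behavior and confirms that requests fall back correctly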
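
The abnormal-usage scenario can be sketched as a simple threshold check over the `key_spends` entries returned by /spend. The source does not define what counts as abnormal, so the `budget` threshold and the function name `keys_over_budget` are hypothetical.

```python
def keys_over_budget(key_spends: list[dict], budget: float) -> list[str]:
    """Return the keys whose spend exceeds `budget`, highest spend first."""
    over = [entry for entry in key_spends if entry["spend"] > budget]
    over.sort(key=lambda entry: entry["spend"], reverse=True)
    return [entry["key"] for entry in over]


# Sample entries copied from the /spend response example; 6.0 is an arbitrary budget.
sample = [
    {"key": "sk-litellm-xxxxx", "spend": 5.123, "agent": "李狗蛋"},
    {"key": "sk-litellm-yyyyy", "spend": 7.222, "agent": "妮可"},
]
suspects = keys_over_budget(sample, budget=6.0)
```

Each returned key could then be passed to `get_key_info` for inspection before any decision to call `block_agent_key`.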