Do I need API keys to use SkillBoss?

No. SkillBoss works without API keys. Install the skills pack and use one platform across models and services.

Which platforms does SkillBoss support?

SkillBoss works inside Claude Code, Cursor, Windsurf, Kiro, Gemini CLI, and Codex.

How does SkillBoss pricing work?

SkillBoss is pay-as-you-go. Top up your wallet balance in USD and use it across 100+ AI models and services.

Can I use Claude Code natively with SkillBoss?

Yes! SkillBoss works as an Anthropic-compatible proxy for Claude Code. Set two environment variables (ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN) in your Claude Code settings and all model calls route through SkillBoss — no plugin download needed.

SkillBoss is a multi-AI gateway that provides unified API access to 50+ AI models including Claude Sonnet 4.6, GPT-5, Gemini 2.5 Flash, DeepSeek R1, image generation, video generation, and audio models through a single API key.

How do I integrate SkillBoss with my AI agent?

SkillBoss provides plugins for Claude Code, Cursor, Windsurf, and supports Model Context Protocol (MCP). You can also use the OpenAI-compatible API endpoint at https://api.skillboss.co/v1 with your API key.

What AI models are available?

Chat: Claude Sonnet 4.6, GPT-5, Gemini 2.5 Flash, DeepSeek R1, Qwen. Image: Gemini 3 Pro, Flux, DALL-E 3, Minimax. Video: Veo 3.1, Minimax T2V/I2V. Audio: Minimax TTS, ElevenLabs, Whisper STT.

How much does SkillBoss cost?

SkillBoss uses pure pay-as-you-go pricing. Add funds to your balance and only pay for what you use. No subscriptions, no monthly fees.

What is the cheapest way to access multiple AI models?

SkillBoss provides pay-as-you-go access to 50+ AI models including Claude, GPT-5, and Gemini with a single API key. Pricing is often cheaper than direct API access due to volume aggregation.

Can I use Claude, GPT, and Gemini with one API key?

Yes. SkillBoss is a multi-AI gateway that provides unified access to Claude Sonnet 4.6, GPT-5, Gemini 2.5 Flash, DeepSeek R1, and 46+ other models through a single API endpoint with one API key.

How do I integrate SkillBoss with Claude Code?

Sign in to the SkillBoss console at skillboss.co/console to get your API key and manage your skills. Or use the API directly with the OpenAI-compatible endpoint at api.skillboss.co/v1.

How much does SkillBoss cost?

SkillBoss offers pay-as-you-go pricing with no markup on AI model costs. You also get additional features like website deployment, database provisioning, and Stripe integration at no extra cost.

What is a multi-AI gateway?

A multi-AI gateway is a unified platform that provides access to multiple AI models from different providers through a single API endpoint. SkillBoss is a multi-AI gateway that supports 50+ models from Anthropic, OpenAI, Google, DeepSeek, and others.

Does SkillBoss work with OpenClaw?

Yes. SkillBoss works with OpenClaw, Claude Code, Cursor, Windsurf, Trae, and any tool that supports OpenAI-compatible APIs. The API endpoint is api.skillboss.co/v1.

Agent Usage Tracking — Monitoring & Reporting

Monitor your agent's spending programmatically. There are two views:

Account-level — GET /v1/usage returns per-call records across all your keys, filterable by agent_id, workspace_id, project_id, and time window.
Per-key — GET /v1/key/wholesale/{token}/usage returns totals, caps, and a per-model breakdown for a single wholesale child key (the recommended path when you issue one key per tenant).

All requests use Authorization: Bearer <API_KEY>.

Account-Level Usage

GET /v1/usage returns account usage and a per-call record list. It defaults to the last 30 days; pass start and end (Unix seconds) to set a window, and any of workspace_id, agent_id, project_id to filter server-side.

import requests

headers = {"Authorization": f"Bearer {API_KEY}"}

usage = requests.get(
    "https://api.skillboss.co/v1/usage",
    headers=headers,
    # params={"start": 1780581274, "end": 1783173274, "agent_id": "canary-user-42"}
).json()

print(f"Requests: {usage['total_requests']}")
print(f"Spent:    ${usage['total_cost_usd']:.4f}")

Response shape:

{
  "start": 1780581274,
  "end": 1783173274,
  "total_requests": 532,
  "total_cost_usd": 66.692357,
  "records": [
    {
      "record_id": "1782802517053",
      "model": "openai/gpt-5.4",
      "time": 1782802519080,
      "cost_usd": 0.000182,
      "agent_id": "canary-user-42",
      "workspace_id": "canary-tenant-001",
      "project_id": ""
    }
  ]
}

time is a Unix millisecond timestamp.
cost_usd and total_cost_usd are the amounts billed to you (USD).
agent_id comes from the X-Agent-Id request header; workspace_id and project_id come from those fields in the completion request body.

📝

To attribute usage per end-user, send an X-Agent-Id: <stable-tag> header on your completion requests (stored as agent_id), and/or a workspace_id in the JSON body. The OpenAI-style user field is not stored — use the X-Agent-Id header instead.

Group by Model (client-side)

Filtering by agent_id / workspace_id / project_id happens server-side; grouping is done client-side over records:

from collections import defaultdict

by_model = defaultdict(lambda: {"cost": 0.0, "calls": 0})
for r in usage["records"]:
    by_model[r["model"]]["cost"] += r["cost_usd"]
    by_model[r["model"]]["calls"] += 1

print(f"{'Model':<24} {'Calls':>8} {'Cost':>12}")
print("-" * 46)
for model, agg in sorted(by_model.items(), key=lambda kv: -kv[1]["cost"]):
    print(f"{model:<24} {agg['calls']:>8,} ${agg['cost']:>10.4f}")

Output:

Model                       Calls         Cost
----------------------------------------------
openai/gpt-5.4                 412  $   41.0231
claude-opus-4-8                 88  $   19.8842
gemini-3-flash                  32  $    5.7850

Usage by Period

Split a window into buckets by requesting each range separately (or bucket records client-side by time):

import time

now = int(time.time())
day = 86400

for label, start in [("Today", now - day), ("Last 7 days", now - 7 * day), ("Last 30 days", now - 30 * day)]:
    u = requests.get(
        "https://api.skillboss.co/v1/usage",
        headers=headers,
        params={"start": start, "end": now},
    ).json()
    print(f"{label:<14} | ${u['total_cost_usd']:>10.2f} | {u['total_requests']:>8} reqs")

Per-Key Usage (recommended for resellers)

If you issue one wholesale child key per tenant, GET /v1/key/wholesale/{token}/usage gives you totals, the key's caps, and a per-model breakdown in one call. Use the key's token, or the literal me for the calling key. The window uses from/to in ISO-8601 UTC.

resp = requests.get(
    "https://api.skillboss.co/v1/key/wholesale/me/usage",
    headers=headers,
    params={"from": "2026-06-01T00:00:00Z", "to": "2026-07-01T00:00:00Z"},
).json()

data = resp["data"]
print(f"Key {data['label']}: ${data['totals']['total_usd']:.4f} over {data['totals']['total_calls']} calls")

Response shape:

{
  "code": 200,
  "message": "success",
  "data": {
    "token": "2b2f7f2a54eba7fa",
    "label": "tenant-1021",
    "from_utc": "2026-06-01T00:00:00Z",
    "to_utc": "2026-07-01T00:00:00Z",
    "bucket": "day",
    "spend_cap_usd": 100.0,
    "monthly_cap_usd": 50.0,
    "stop_at_remaining_usd": null,
    "spent_usd": 15.930253,
    "monthly_spent_usd": 15.930253,
    "disabled": false,
    "disabled_reason": null,
    "totals": { "total_calls": 477, "total_usd": 15.930253 },
    "by_model": [
      { "model": "gpt-5.5", "calls": 101, "usd": 1.16794 },
      { "model": "claude-opus-4-8", "calls": 8, "usd": 0.005871 }
    ],
    "by_period": [
      { "period_start_utc": "2026-06-30T00:00:00+00:00", "calls": 210, "usd": 1.9834 }
    ]
  }
}

by_model lists model, calls, and usd for each model used.
by_period gives day buckets (period_start_utc, calls, usd) for trend charts.
spent_usd / monthly_spent_usd show progress toward the key's caps; disabled flips to true when a cap is hit.

Per-Key Model Breakdown

data = resp["data"]

print(f"{'Model':<24} {'Calls':>8} {'Cost':>12}")
print("-" * 46)
for m in sorted(data["by_model"], key=lambda x: -x["usd"]):
    print(f"{m['model']:<24} {m['calls']:>8,} ${m['usd']:>10.4f}")

Output:

Model                       Calls         Cost
----------------------------------------------
gpt-5.5                        101  $    1.1679
claude-opus-4-8                  8  $    0.0059

CSV Export

Stream a CSV with one row per call using GET /v1/key/wholesale/{token}/usage.csv (same from/to window):

export = requests.get(
    "https://api.skillboss.co/v1/key/wholesale/me/usage.csv",
    headers=headers,
    params={"from": "2026-06-01T00:00:00Z", "to": "2026-07-01T00:00:00Z"},
)

with open("usage_report.csv", "w") as f:
    f.write(export.text)

Hard Spending Caps

Set enforced caps on a key with PUT /v1/key/wholesale/{token}/limits. All fields are optional/nullable. When a cap is hit, the key auto-disables ("disabled": true in the usage response) until an operator raises it.

requests.put(
    "https://api.skillboss.co/v1/key/wholesale/me/limits",
    headers=headers,
    json={
        "spend_cap_usd": 100.0,
        "monthly_cap_usd": 50.0,
        "rpm_limit": 300,
        "stop_at_remaining_usd": 5.0,
    },
)

Field	Meaning
`spend_cap_usd`	Total spend cap for the key
`monthly_cap_usd`	Rolling monthly spend cap
`rpm_limit`	Requests per minute
`stop_at_remaining_usd`	Stop when this little balance is left

🛡️

Hard caps, poll to monitor

Caps are hard, enforced server-side on every request. A capped key simply stops spending and returns 402. There are no budget-alert webhooks and no automatic balance top-up from these limits — poll the usage endpoints to watch how close a key is to its caps.

Multi-Tenant Monitoring

Issue one wholesale child key per tenant, then loop the per-key usage endpoint to build a fleet view:

tenant_keys = {
    "tenant-1021": "sk-tenant-1021",
    "tenant-1022": "sk-tenant-1022",
}

for name, key in tenant_keys.items():
    data = requests.get(
        "https://api.skillboss.co/v1/key/wholesale/me/usage",
        headers={"Authorization": f"Bearer {key}"},
        params={"from": "2026-06-01T00:00:00Z", "to": "2026-07-01T00:00:00Z"},
    ).json()["data"]

    status = "DISABLED" if data["disabled"] else "active"
    cap = data.get("monthly_cap_usd")
    pct = (data["monthly_spent_usd"] / cap * 100) if cap else 0
    print(f"{name:<16} ${data['monthly_spent_usd']:>8.2f} / ${cap or 0:>8.2f} ({pct:>5.1f}%)  {status}")

API Reference

Endpoints

Endpoint	Method	Description
`/v1/usage`	GET	Account-level usage records (filter by `agent_id`/`workspace_id`/`project_id`/time)
`/v1/key/wholesale/{token}/usage`	GET	Per-key totals, caps, and per-model breakdown
`/v1/key/wholesale/{token}/usage.csv`	GET	Streaming CSV export, one row per call
`/v1/key/wholesale/{token}/limits`	PUT	Set hard per-key spending caps

Query Parameters

Parameter	Applies to	Description
`start`, `end`	`/v1/usage`	Unix seconds; defaults to last 30 days
`workspace_id`, `agent_id`, `project_id`	`/v1/usage`	Server-side filters
`from`, `to`	`/v1/key/wholesale/{token}/usage[.csv]`	ISO-8601 UTC window

Next Steps

📄

Budget Management

Hard spending caps and per-key limits

📈

Cost Optimization

Reduce costs with smart model routing

📄

Agent Workflows

Design and monitor multi-step agent systems

📄

Agent Pricing

Understand the pricing model

Agent-Readable Monitoring Summary:

{
  "monitoring_endpoints": {
    "account_usage": "GET /v1/usage?start=<unix_s>&end=<unix_s>&agent_id=<id>",
    "per_key_usage": "GET /v1/key/wholesale/{token}/usage?from=<iso>&to=<iso>",
    "per_key_csv": "GET /v1/key/wholesale/{token}/usage.csv?from=<iso>&to=<iso>",
    "set_caps": "PUT /v1/key/wholesale/{token}/limits"
  },
  "attribution": {
    "agent_id": "send X-Agent-Id header on completions",
    "workspace_id": "send workspace_id in completion body"
  },
  "caps": ["spend_cap_usd", "monthly_cap_usd", "rpm_limit", "stop_at_remaining_usd"]
}