SkillBoss for AI Agents — 697 Tools via OpenAI-Compatible API

AI agent integration guide for SkillBoss: 697 endpoints across 63 vendors (Claude, GPT, Gemini, image/video/audio generation, web scraping, payments, email) through one OpenAI-compatible API. Covers task routing, cost optimization, error handling, and budget management.

Setup: use the setup command "set up skillboss.co/skill.md", or set OPENAI_BASE_URL=https://api.skillboss.co/v1

SkillBoss gives AI agents access to 697 API endpoints across 63 vendors through a single OpenAI-compatible API. One key, one balance, every tool an agent needs.

Endpoint:  https://api.skillboss.co/v1
Auth:      Bearer <api_key>
SDK:       OpenAI SDK (Python, JS, Go — just change base_url)
Signup:    https://skillboss.co (free $2 credit, no credit card)

Why Agents Use SkillBoss

Capability | Details
697 endpoints | Chat models, image gen, video gen, audio, web scraping, social data, payments, email, databases, hosting
63 vendors | Anthropic, OpenAI, Google, DeepSeek, Perplexity, ElevenLabs, Replicate, Firecrawl, and 55 more
OpenAI-compatible | Drop-in replacement that works with existing OpenAI SDK code
One API key | No managing multiple vendor accounts
One balance | Pay-as-you-go credits that never expire
Model switching | Change model per request, no reconfiguration
Budget controls | Monitor and limit spend programmatically

Two API Patterns

Endpoint | Use For | Example Models
/v1/chat/completions | Chat / LLM models | claude-4-5-sonnet, gpt-5, gpt-4.1-nano, gemini-2.5-flash, openrouter/deepseek/deepseek-v3.2
/v1/run | Everything else | flux-1.1-pro (image), google/veo-3.1 (video), elevenlabs/eleven_multilingual_v2 (TTS), firecrawl/scrape (web)

Quick Start for Agents

1. Initialize Client

from openai import OpenAI

client = OpenAI(
    base_url="https://api.skillboss.co/v1",
    api_key="sk_your_key"
)

2. Call Any Model

# Reasoning (Claude)
response = client.chat.completions.create(
    model="claude-4-5-sonnet",
    messages=[{"role": "user", "content": "Analyze this codebase for security issues"}]
)

# Creative (GPT-5)
response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Write marketing copy for a developer tool"}]
)

# Fast + Cheap (Gemini Flash)
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Summarize this text in 3 bullets"}]
)

# Ultra-Cheap (GPT-4.1 Nano)
response = client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role": "user", "content": "Classify this text as positive or negative"}]
)

# Budget (DeepSeek with prompt caching)
response = client.chat.completions.create(
    model="openrouter/deepseek/deepseek-v3.2",
    messages=[
        {"role": "system", "content": long_system_prompt},  # Cached after first call
        {"role": "user", "content": query}
    ]
)

3. Use Non-Chat Tools

import requests

headers = {"Authorization": "Bearer sk_your_key"}

# Generate image
image = requests.post("https://api.skillboss.co/v1/run", headers=headers, json={
    "model": "flux-1.1-pro",
    "inputs": {"prompt": "Logo for a tech startup, minimal, blue"}
}).json()

# Scrape web page
page = requests.post("https://api.skillboss.co/v1/run", headers=headers, json={
    "model": "firecrawl/scrape",
    "inputs": {"url": "https://example.com"}
}).json()

# Send email
email = requests.post("https://api.skillboss.co/v1/run", headers=headers, json={
    "model": "aws/send-emails",
    "inputs": {"to": "user@example.com", "subject": "Report", "body": "Your report is ready."}
}).json()

# Generate video
video = requests.post("https://api.skillboss.co/v1/run", headers=headers, json={
    "model": "google/veo-3.1",
    "inputs": {"prompt": "A product demo animation"}
}).json()
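
The four calls above share one request shape, so a small wrapper can remove the repetition. The following is a minimal sketch, not part of the SkillBoss API: the names run_tool and RUN_URL, the timeout default, and the injectable post parameter are all illustrative assumptions. The response is returned as parsed JSON without assuming any particular fields.

```python
RUN_URL = "https://api.skillboss.co/v1/run"


def run_tool(api_key, model, inputs, timeout=60, post=None):
    """POST one job to /v1/run and return the parsed JSON body.

    `post` is a hypothetical injection point (defaults to requests.post)
    so the function can be exercised without network access.
    """
    if post is None:
        import requests  # imported lazily; only needed for real calls
        post = requests.post
    response = post(
        RUN_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"model": model, "inputs": inputs},
        timeout=timeout,
    )
    response.raise_for_status()  # turn 401/402/429/503 into exceptions
    return response.json()
```

With this helper, the scrape call becomes run_tool("sk_your_key", "firecrawl/scrape", {"url": "https://example.com"}).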

Intelligent Task Routing

Select models based on task type and budget:

class TaskRouter:
    MODELS = {
        "reasoning":     "claude-4-5-sonnet",      # $3/$15 per 1M tokens
        "creative":      "gpt-5",                   # $1.25/$10 per 1M tokens
        "fast":          "gemini-2.5-flash",         # $0.10/$0.40 per 1M tokens
        "code":          "claude-4-5-sonnet",        # Best for coding
        "budget":        "openrouter/deepseek/deepseek-v3.2",  # $0.14/$0.28 per 1M tokens
        "ultra_cheap":   "gpt-4.1-nano",             # $0.10/$0.40 per 1M tokens
        "search":        "perplexity/sonar-pro",     # Search-grounded answers
    }

    def __init__(self, api_key):
        self.client = OpenAI(base_url="https://api.skillboss.co/v1", api_key=api_key)

    def route(self, task_type: str, prompt: str, **kwargs):
        model = self.MODELS.get(task_type, "gemini-2.5-flash")
        return self.client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            **kwargs
        )

Cost Optimization

Model Pricing Quick Reference

Model | Input Cost | Output Cost | Best For
gpt-4.1-nano | $0.10/1M | $0.40/1M | Ultra-cheap, simple tasks
gemini-2.5-flash | $0.10/1M | $0.40/1M | Speed, large context
openrouter/deepseek/deepseek-v3.2 | $0.14/1M | $0.28/1M | Budget, cached prompts
gpt-5-mini | $0.25/1M | $2/1M | General tasks, cost-effective
gpt-4.1-mini | $0.40/1M | $1.60/1M | Cost-effective reasoning
gpt-5 | $1.25/1M | $10/1M | Creative, general
claude-4-5-sonnet | $3/1M | $15/1M | Complex reasoning, code
claude-4-5-opus | $5/1M | $25/1M | Hardest problems
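
To compare models on a concrete workload, the per-1M prices above can be turned into a per-request estimate. This is a rough sketch: the PRICES dictionary simply copies the table, and estimate_cost is an illustrative helper name, not an API the service provides. Actual billing may differ (for example, with cached prompts).

```python
# USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    "gpt-4.1-nano": (0.10, 0.40),
    "gemini-2.5-flash": (0.10, 0.40),
    "openrouter/deepseek/deepseek-v3.2": (0.14, 0.28),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-4.1-mini": (0.40, 1.60),
    "gpt-5": (1.25, 10.00),
    "claude-4-5-sonnet": (3.00, 15.00),
    "claude-4-5-opus": (5.00, 25.00),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request at list price."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# A 20,000-token prompt with a 2,000-token reply:
for name in ("gpt-4.1-nano", "claude-4-5-sonnet"):
    print(f"{name}: ${estimate_cost(name, 20_000, 2_000):.4f}")
# gpt-4.1-nano: $0.0028
# claude-4-5-sonnet: $0.0900
```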

Cost-Aware Model Selection

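from typing import Optional

def select_model(complexity: str, max_cost_per_1k_tokens: Optional[float] = None) -> str:
    """Select a model given task complexity and an optional cost ceiling (USD per 1K tokens)."""
    # A ceiling below $0.001 per 1K tokens ($1 per 1M) forces the cheapest model
    if max_cost_per_1k_tokens is not None and max_cost_per_1k_tokens < 0.001:
        return "gpt-4.1-nano"
    if complexity == "high":
        return "claude-4-5-sonnet"
    if complexity == "medium":
        return "gpt-5-mini"
    return "gemini-2.5-flash"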

Prompt Caching (DeepSeek)

# First call: full price
# Subsequent calls with same system prompt: ~90% cheaper
response = client.chat.completions.create(
    model="openrouter/deepseek/deepseek-v3.2",
    messages=[
        {"role": "system", "content": long_context},  # This gets cached
        {"role": "user", "content": new_query}
    ]
)
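
To get a feel for what caching is worth, the arithmetic can be sketched as follows. This is an estimate under stated assumptions, not the service's billing formula: it assumes the "~90% cheaper" figure applies only to the repeated system-prompt input tokens on calls after the first, treats 90% as exact, and ignores output tokens. The helper name input_cost is illustrative.

```python
INPUT_PRICE_PER_M = 0.14  # DeepSeek V3.2 input price (USD per 1M tokens)
CACHE_DISCOUNT = 0.90     # assumption: "~90% cheaper" treated as exactly 90%


def input_cost(system_tokens, query_tokens, calls, cached=True):
    """Total input cost in USD for `calls` requests sharing one system prompt."""
    per_token = INPUT_PRICE_PER_M / 1_000_000
    if not cached or calls == 0:
        return calls * (system_tokens + query_tokens) * per_token
    # First call pays full price; later calls pay the discounted rate
    # on the system prompt and full price on the new query.
    first = (system_tokens + query_tokens) * per_token
    later = (system_tokens * (1 - CACHE_DISCOUNT) + query_tokens) * per_token
    return first + (calls - 1) * later


uncached = input_cost(50_000, 200, 100, cached=False)
cached = input_cost(50_000, 200, 100, cached=True)
print(f"uncached: ${uncached:.2f}, cached: ${cached:.2f}")
```

Under these assumptions, 100 calls with a 50,000-token system prompt cost about $0.70 in input tokens without caching and about $0.08 with it.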

Error Handling & Reliability

import time
from openai import APIConnectionError, APIStatusError, RateLimitError

def robust_call(client, model, messages, max_retries=3, fallback_model="gemini-2.5-flash"):
    """Reliable API call with retries and model fallback."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except (RateLimitError, APIConnectionError):
            # 429 or network failure: exponential backoff (1s, 2s, 4s, ...)
            time.sleep(2 ** attempt)
        except APIStatusError as e:
            # status_code lives on APIStatusError, not on the APIError base class
            if e.status_code != 503:
                raise  # e.g. 401 or 402 (insufficient credits): retrying won't help
            time.sleep(2 ** attempt)

    # Retries exhausted: make one last attempt on the fallback model
    return client.chat.completions.create(model=fallback_model, messages=messages)

Error Codes

Code | Meaning | Agent Action
401 | Invalid API key | Check key, re-authenticate
402 | Insufficient credits | Alert user to add credits at console
429 | Rate limit | Exponential backoff, retry
503 | Upstream unavailable | Retry after 1-2s, or switch model
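
An agent that calls /v1/run with plain HTTP (rather than the OpenAI SDK) needs to make the same retry decision from a raw status code. The table can be encoded as a lookup, as in this sketch. The names ERROR_ACTIONS, classify_status, and backoff_delay are illustrative, and treating every unlisted status as non-retryable is an assumption, not documented behavior.

```python
# (decision, suggested action) per status code, mirroring the error table.
ERROR_ACTIONS = {
    401: ("fail", "Check key, re-authenticate"),
    402: ("fail", "Alert user to add credits at console"),
    429: ("retry", "Exponential backoff, retry"),
    503: ("retry", "Retry after 1-2s, or switch model"),
}


def classify_status(status_code):
    """Return ("retry" | "fail", hint) for an HTTP status code."""
    # Assumption: statuses not in the table are treated as non-retryable.
    return ERROR_ACTIONS.get(status_code, ("fail", "Unlisted status; surface to caller"))


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based), capped."""
    return min(cap, base * 2 ** attempt)


decision, hint = classify_status(429)
print(decision, "->", hint, "| first delays:", [backoff_delay(a) for a in range(3)])
```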

Usage Monitoring

Agents can query their own usage programmatically:

import requests

def check_budget(api_key: str) -> dict:
    """Return today's spend, request count, and most-used model."""
    response = requests.get(
        "https://skillboss.co/api/me/usage?period=day",
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    response.raise_for_status()  # fail on 401 etc. instead of parsing an error body
    usage = response.json()
    return {
        "today_spend": usage["summary"]["total_usd"],
        "total_requests": usage["summary"]["total_requests"],
        "top_model": usage["by_model"][0]["model"] if usage.get("by_model") else None
    }

# Check before expensive operations
api_key = "sk_your_key"
budget = check_budget(api_key)
if budget["today_spend"] > 10.0:
    print("Warning: High daily spend, switching to budget models")
Query Parameters:

  • period: day, week, month, or all
  • model: Filter by model name
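
The two query parameters can be assembled with the standard library so that model names containing slashes are encoded correctly. This is a small sketch: usage_url is an illustrative helper, and combining period and model in a single request is an assumption, since the list above does not say whether they can be used together.

```python
from urllib.parse import urlencode

USAGE_URL = "https://skillboss.co/api/me/usage"
VALID_PERIODS = {"day", "week", "month", "all"}


def usage_url(period="day", model=None):
    """Build the usage endpoint URL for a period and optional model filter."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {sorted(VALID_PERIODS)}")
    params = {"period": period}
    if model is not None:
        params["model"] = model  # urlencode percent-encodes "/" in model names
    return f"{USAGE_URL}?{urlencode(params)}"


print(usage_url("week", "gpt-5"))
# https://skillboss.co/api/me/usage?period=week&model=gpt-5
```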

Available Endpoint Categories

Category | Count | Top Endpoints
Chat / LLM | 76 | Claude 4.5, GPT-5, GPT-4.1 Nano, Gemini 2.5, DeepSeek V3.2, Perplexity Sonar
Image Generation | 45 | DALL-E 3 ($0.04), Flux 1.1 Pro ($0.10), Imagen 3 ($0.04), Neta Ghibli ($0.10)
Video Generation | 33 | Veo 3.1 ($0.52/s), MiniMax ($0.55), WAN
Audio / TTS | 15 | ElevenLabs ($0.18/1K chars), OpenAI TTS ($0.015/1K chars), MiniMax TTS
Speech-to-Text | 5 | Whisper ($0.006/min)
Social Data | 58 | Twitter/X, Instagram, LinkedIn, TikTok profiles
Web Scraping | 29 | Firecrawl ($0.0125), Linkup ($0.02), Google Search
Automation | 69 | Stripe payments, databases, workflows
Email / SMS | 5 | AWS SES ($0.0001), SMS ($0.01)
Storage / Hosting | 34 | S3, CDN, static hosting
Document Processing | 5 | PDF parse ($0.02/page), AI presentations ($0.50/deck)
UI Generation | 6 | Landing pages ($0.25/screen), mobile UI
Embeddings | 5 | Text embeddings for search and RAG

Canonical discovery: Use Pages Hub | api-catalog.json

Framework Compatibility

SkillBoss works with any OpenAI-compatible client:

Framework | Setup
OpenAI Python SDK | OpenAI(base_url="https://api.skillboss.co/v1", api_key=key)
OpenAI Node.js SDK | new OpenAI({baseURL: "https://api.skillboss.co/v1", apiKey: key})
LangChain | ChatOpenAI(openai_api_base="https://api.skillboss.co/v1", openai_api_key=key)
LlamaIndex | Set OPENAI_BASE_URL and OPENAI_API_KEY env vars
AutoGPT | Configure OpenAI base URL in settings
CrewAI | Use OpenAI provider with custom base URL
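
For the environment-variable route, the two variables can be set from Python before the framework is imported. The sketch below uses an illustrative helper name, configure_env, and a placeholder key. The official OpenAI Python SDK (v1 and later) reads both variables when the constructor arguments are omitted; whether a given framework version honors them should be checked against its own documentation.

```python
import os


def configure_env(api_key, env=os.environ):
    """Point OpenAI-compatible clients at SkillBoss via environment variables."""
    env["OPENAI_BASE_URL"] = "https://api.skillboss.co/v1"
    env["OPENAI_API_KEY"] = api_key
    return env


# Placeholder key for illustration; load the real one from a secret store.
configure_env("sk_your_key")
```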

Agent Workflows

Autonomous Coding Agent

# Complete app build workflow ([...] and {...} are placeholders to fill in)
import requests
from openai import OpenAI

key = "sk_your_key"
url = "https://api.skillboss.co"
headers = {"Authorization": f"Bearer {key}"}
client = OpenAI(base_url=url + "/v1", api_key=key)

# 1. Generate code with Claude
code = client.chat.completions.create(model="claude-4-5-sonnet", messages=[...])

# 2. Create UI mockup with image gen
requests.post(url + "/v1/run", headers=headers, json={"model": "stitch/generate-desktop", "inputs": {...}})

# 3. Deploy to hosting
requests.post(url + "/v1/run", headers=headers, json={"model": "hosting/deploy", "inputs": {...}})

# 4. Send notification email
requests.post(url + "/v1/run", headers=headers, json={"model": "aws/send-emails", "inputs": {...}})

Research Agent

# Assumes client, headers, and url = "https://api.skillboss.co" are already defined

# 1. Search the web
search = requests.post(url + "/v1/run", headers=headers, json={"model": "linkup/search", "inputs": {"query": "latest AI benchmarks 2026"}})

# 2. Scrape top results
# search_urls: result URLs extracted from the search response (its schema is not shown here)
scraped_content = []
for result_url in search_urls:
    page = requests.post(url + "/v1/run", headers=headers, json={"model": "firecrawl/scrape", "inputs": {"url": result_url}})
    scraped_content.append(page.json())

# 3. Analyze with Claude
analysis = client.chat.completions.create(model="claude-4-5-sonnet", messages=[
    {"role": "user", "content": f"Analyze these findings:\n{scraped_content}"}
])

# 4. Generate report with charts
report = requests.post(url + "/v1/run", headers=headers, json={"model": "gamma/generation", "inputs": {...}})

Marketing Automation Agent

# Assumes client, headers, and url = "https://api.skillboss.co" are already defined

# 1. Generate copy with GPT-5
copy = client.chat.completions.create(model="gpt-5", messages=[...])

# 2. Generate product images
image = requests.post(url + "/v1/run", headers=headers, json={"model": "flux-1.1-pro", "inputs": {...}})

# 3. Create demo video
video = requests.post(url + "/v1/run", headers=headers, json={"model": "google/veo-3.1", "inputs": {...}})

# 4. Send email campaign
requests.post(url + "/v1/run", headers=headers, json={"model": "aws/send-emails", "inputs": {...}})

Discovery Files for Agents

File | URL | Purpose
llms.txt | skillboss.co/llms.txt | Quick reference for LLM agents
llms-full.txt | skillboss.co/llms-full.txt | Complete technical spec (all 697 models)
agent.json | skillboss.co/agent.json | Universal agent manifest
ai.txt | skillboss.co/ai.txt | AI agent instructions
openapi.json | skillboss.co/openapi.json | OpenAPI 3.0 specification
api-catalog.json | skillboss.co/api-catalog.json | Complete endpoint catalog (JSON)
mcp.json | skillboss.co/.well-known/mcp.json | MCP server manifest

Best Practices

  1. Route by task: Use cheap models for simple tasks, powerful models for complex ones
  2. Use GPT-4.1 Nano: At $0.10 per 1M input tokens, it is tied with Gemini 2.5 Flash as the cheapest listed chat model, suited to classification, extraction, and simple tasks
  3. Cache with DeepSeek: Reuse system prompts for ~90% savings on repeated queries
  4. Monitor spend: Query /api/me/usage before expensive operations
  5. Implement fallbacks: If one model returns 503, fall back to another
  6. Use streaming: For chat models, stream responses for better UX
  7. Set max_tokens: Avoid paying for unnecessary output
  8. Use /v1/run: For non-chat tasks (images, scraping, email), not /v1/chat/completions
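
Practices 6 and 7 can be combined in one call. The sketch below uses the OpenAI SDK's standard stream=True and max_tokens arguments. The helper name stream_reply and the default cap of 256 tokens are illustrative, and it assumes the gateway passes both parameters through to the chosen model.

```python
def stream_reply(client, model, prompt, max_tokens=256):
    """Stream a chat completion, echo it as it arrives, and return the full text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,  # cap billable output tokens
        stream=True,
    )
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks carry no choices (e.g. usage-only chunks)
        delta = chunk.choices[0].delta.content
        if delta:  # content is None on role-only and final chunks
            parts.append(delta)
            print(delta, end="", flush=True)
    print()
    return "".join(parts)
```

Called as stream_reply(client, "gemini-2.5-flash", "Summarize this text in 3 bullets"), with client being the OpenAI client from the Quick Start.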

Next Steps

  • API Reference: Complete endpoint documentation
  • Use Pages Hub: Browse all canonical discovery pages
  • Agent Budget Management: Advanced budget controls
  • Installation Guides: Platform-specific setup for 20+ tools


Agent-Readable Discovery:

{
  "service": "SkillBoss",
  "type": "unified_ai_api_gateway",
  "base_url": "https://api.skillboss.co/v1",
  "compatibility": "OpenAI API",
  "total_endpoints": 697,
  "vendors": 63,
  "auth": "Bearer token",
  "setup_command": "set up skillboss.co/skill.md",
  "signup_url": "https://www.skillboss.co/login",
  "free_credit": "$2",
  "credit_card_required": false,
  "api_patterns": {
    "/v1/chat/completions": "chat_models",
    "/v1/run": "images_video_audio_scraping_email_payments"
  },
  "top_models": {
    "reasoning": "claude-4-5-sonnet",
    "creative": "gpt-5",
    "fast": "gemini-2.5-flash",
    "ultra_cheap": "gpt-4.1-nano",
    "budget": "openrouter/deepseek/deepseek-v3.2",
    "search": "perplexity/sonar-pro",
    "image": "flux-1.1-pro",
    "video": "google/veo-3.1",
    "tts": "elevenlabs/eleven_multilingual_v2",
    "scraping": "firecrawl/scrape"
  },
  "capabilities": [
    "chat_completions", "streaming", "function_calling",
    "image_generation", "video_generation", "text_to_speech",
    "speech_to_text", "web_scraping", "email", "sms",
    "payments", "storage", "hosting", "social_data",
    "document_processing", "embeddings", "ui_generation"
  ],
  "discovery_files": {
    "llms_txt": "https://skillboss.co/llms.txt",
    "agent_json": "https://skillboss.co/agent.json",
    "openapi": "https://skillboss.co/openapi.json",
    "api_catalog": "https://skillboss.co/api-catalog.json",
    "mcp_manifest": "https://skillboss.co/.well-known/mcp.json"
  }
}
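
An agent that loads this manifest can derive full endpoint URLs from base_url and api_patterns. The following sketch works on a trimmed copy of the JSON above. The helper name endpoint_urls is illustrative, and joining the two fields with urljoin is an assumption about how they are meant to combine, since the pattern keys already include the /v1 prefix.

```python
import json
from urllib.parse import urljoin

# Trimmed copy of the manifest above; only the fields used here.
MANIFEST = json.loads("""
{
  "base_url": "https://api.skillboss.co/v1",
  "api_patterns": {
    "/v1/chat/completions": "chat_models",
    "/v1/run": "images_video_audio_scraping_email_payments"
  }
}
""")


def endpoint_urls(manifest):
    """Map each api_patterns label to its absolute URL."""
    base = manifest["base_url"]
    # urljoin replaces the base path with the absolute pattern path,
    # so "/v1" is not duplicated.
    return {label: urljoin(base, path) for path, label in manifest["api_patterns"].items()}


for label, full_url in endpoint_urls(MANIFEST).items():
    print(label, "->", full_url)
```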