Documentation

Agent Quick Start

Agent quick start guide: provision API key, test access, discover services, set budgets, optimize costs, monitor usage, and handle errors effectively.

For Autonomous AI Agents

This guide is written for AI agents making purchasing decisions autonomously or with minimal human oversight.


Step 1: Obtain API Key

Your human operator provisions your API key:

Quick setup for AI agents: Tell your agent to follow skillboss.co/skill.md to auto-configure everything.

Human visits Console

Human goes to skillboss.co/console

Download Skills Pack or Copy API Key

Human downloads skills pack or copies API key from dashboard

Human Adds Key to Agent Config

Human adds key to your configuration:

For MCP agents (Claude Code, OpenClaw):

claude mcp add skillboss --api-key sk-YOUR_KEY

For custom agents:

# config.json
{
  "skillboss_api_key": "sk-YOUR_KEY"
}
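
For custom agents, the key can then be read back at startup. A minimal sketch, assuming the config.json layout above (`load_api_key` is an illustrative helper, not part of any SkillBoss SDK):

```python
import json

def load_api_key(path: str = "config.json") -> str:
    """Read the SkillBoss key from the agent's config file (Step 1)."""
    with open(path) as f:
        return json.load(f)["skillboss_api_key"]
```

Pass the returned value as `api_key=` when constructing your client in Step 2.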

Step 2: Test API Access

Make your first API call to verify access:

from openai import OpenAI

# SkillBoss OpenAI-compatible endpoint
client = OpenAI(
    api_key="sk-YOUR_KEY",
    base_url="https://api.skillboss.co/v1"
)

# Test with cheapest model
response = client.chat.completions.create(
    model="gemini/gemini-2.5-flash",  # $0.075/1M tokens
    messages=[{"role": "user", "content": "Hello SkillBoss"}]
)

print(response.choices[0].message.content)
# Output: "Hello! I'm responding via SkillBoss gateway..."

# Check remaining balance (plain REST call -- the OpenAI client has no .get())
import requests

balance_response = requests.get(
    "https://api.skillboss.co/v1/balance",
    headers={"Authorization": "Bearer sk-YOUR_KEY"}
)
print(f"Credits remaining: {balance_response.json()['balance']}")

Success: You now have access to 50+ AI models through one API key.


Step 3: Check Available Services

Query the service catalog programmatically:

import requests

api_key = "sk-YOUR_KEY"  # same key as Step 1
headers = {"Authorization": f"Bearer {api_key}"}

# List all available models
models = requests.get(
    "https://api.skillboss.co/v1/models",
    headers=headers
).json()

print(f"Available models: {len(models['data'])}")

# List all available services (non-LLM)
services = requests.get(
    "https://api.skillboss.co/v1/services",
    headers=headers
).json()

print(f"Available services: {len(services['data'])}")
# Output: Available services: 47 (Stripe, SendGrid, hosting, etc.)

Sample output:

{
  "models": [
    {"id": "claude-4-5-sonnet", "cost_per_1m_tokens": 15.00},
    {"id": "openai/gpt-5", "cost_per_1m_tokens": 10.00},
    {"id": "gemini/gemini-2.5-flash", "cost_per_1m_tokens": 0.075},
    {"id": "deepseek/deepseek-r1", "cost_per_1m_tokens": 0.14}
  ],
  "services": [
    {"id": "stripe", "cost_per_operation": 5},
    {"id": "sendgrid", "cost_per_email": 0.5},
    {"id": "linkedin_scraper", "cost_per_profile": 2}
  ]
}

Step 4: Set Budget & Alerts

Configure autonomous spending limits:

# Set budget limits
requests.post(
    "https://api.skillboss.co/v1/agents/config",
    headers=headers,
    json={
        "daily_limit": 20.00,      # Max $20/day
        "monthly_limit": 500.00,    # Max $500/month
        "auto_recharge": True,
        "recharge_threshold": 10,   # Recharge when <$10
        "recharge_amount": 100,     # Add $100 each time
        "escalation_threshold": 10  # Ask human if operation >$10
    }
)

# Set cost alerts
requests.post(
    "https://api.skillboss.co/v1/agents/alerts",
    headers=headers,
    json={
        "alert_at_50_percent": True,   # Alert at 50% of daily limit
        "alert_at_80_percent": True,   # Alert at 80% of daily limit
        "alert_at_95_percent": True,   # Alert at 95% of daily limit
        "webhook_url": "https://your-agent.com/budget-alert"
    }
)

Budget Alert Example

When you hit 80% of your daily limit:

POST https://your-agent.com/budget-alert
{
  "alert_type": "budget_warning",
  "limit_type": "daily",
  "percent_used": 80,
  "amount_spent": 16.00,
  "amount_remaining": 4.00,
  "recommended_action": "reduce_usage_or_increase_limit"
}
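
An agent receiving that webhook needs a policy for reacting to it. A minimal sketch of the decision logic, assuming the payload shape above (the action names are illustrative, not a SkillBoss API):

```python
def handle_budget_alert(alert: dict) -> str:
    """Map a budget webhook payload to an action for the agent loop."""
    if alert.get("alert_type") != "budget_warning":
        return "ignore"
    pct = alert["percent_used"]
    if pct >= 95:
        return "pause_spending"          # stop all non-critical operations
    if pct >= 80:
        return "switch_to_cheap_models"  # finish remaining work on the cheapest tier
    return "log_only"                    # 50% alert: just record it
```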

Step 5: Implement Cost-Aware Routing

Route each request to the cheapest model that meets your quality requirements:

def cost_aware_request(prompt: str, min_quality: str):
    """
    Route to cheapest model that meets quality threshold.

    min_quality: "basic" | "medium" | "high"
    """

    # Model selection by quality tier
    models = {
        "basic": "gemini/gemini-2.5-flash",   # $0.075/1M
        "medium": "deepseek/deepseek-r1",     # $0.14/1M
        "high": "claude-4-5-sonnet"           # $15/1M
    }

    model = models[min_quality]

    try:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content

    except Exception:
        # Fall back to a higher-quality model if the cheaper one fails
        if min_quality == "basic":
            return cost_aware_request(prompt, "medium")
        elif min_quality == "medium":
            return cost_aware_request(prompt, "high")
        else:
            raise

# Usage
result = cost_aware_request("Summarize this article...", "basic")
# Tries Gemini Flash first, falls back to DeepSeek if needed

Step 6: Monitor Usage

Track your spending programmatically:

# Get current usage
usage = requests.get(
    "https://api.skillboss.co/v1/agents/usage",
    headers=headers,
    params={"period": "today"}
).json()

print(f"""
Today's Usage:
- Spent: ${usage['amount_spent']:.2f}
- Operations: {usage['operation_count']}
- Credits remaining: {usage['credits_remaining']}
- Top model: {usage['top_model']} (${usage['top_model_cost']:.2f})
- Cheapest alternative: {usage['recommended_cheaper_model']}
""")

Sample output:

Today's Usage:
- Spent: $12.45
- Operations: 847
- Credits remaining: 187.55
- Top model: claude-4-5-sonnet ($8.20)
- Cheapest alternative: gemini/gemini-2.5-flash (could save $7.90)

Agent optimization tip: If your top model is expensive, analyze whether a cheaper alternative could handle 70% of those requests.
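
Using the sample numbers above ($8.20/day on claude-4-5-sonnet at $15/1M, with gemini-2.5-flash at $0.075/1M as the alternative), a rough savings estimate for that 70% reroute can be sketched as plain arithmetic (illustrative only, not a SkillBoss endpoint):

```python
def projected_savings(daily_spend: float, cheap_rate: float,
                      expensive_rate: float, reroute_share: float = 0.7) -> float:
    """Estimate daily savings if reroute_share of a model's spend moved
    to a cheaper model (rates in USD per 1M tokens)."""
    return daily_spend * reroute_share * (1 - cheap_rate / expensive_rate)

# Rerouting 70% of $8.20/day from a $15/1M model to a $0.075/1M model
# saves roughly $5.71/day.
print(round(projected_savings(8.20, 0.075, 15.0), 2))
```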


Step 7: Handle Errors & Retries

Implement robust error handling:

import time

def resilient_request(prompt: str, max_retries: int = 3):
    """
    Make request with exponential backoff for rate limits.
    """
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gemini/gemini-2.5-flash",
                messages=[{"role": "user", "content": prompt}]
            )
            return response.choices[0].message.content

        except Exception as e:
            error_code = getattr(e, 'code', None)

            if error_code == 'rate_limit_exceeded':
                # Exponential backoff
                wait_time = 2 ** attempt
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)

            elif error_code == 'insufficient_credits':
                # Check if auto-recharge is enabled
                balance = check_balance()
                if balance['auto_recharge_enabled']:
                    print("Auto-recharge triggered. Retrying in 5s...")
                    time.sleep(5)
                else:
                    # Escalate to human
                    send_alert("Insufficient credits. Please add funds.")
                    raise e

            elif error_code == 'model_unavailable':
                # Fallback to alternative model
                return fallback_request(prompt)

            else:
                raise e

    raise Exception(f"Failed after {max_retries} attempts")
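
`resilient_request` calls `check_balance` and `send_alert`, which are left to the agent. One hedged sketch, assuming the `/v1/balance` endpoint from Step 2 and a hypothetical operator-alert URL of your own (the `http` parameter only exists to make the helpers testable without network access):

```python
import requests

API_BASE = "https://api.skillboss.co/v1"
HEADERS = {"Authorization": "Bearer sk-YOUR_KEY"}

def check_balance(http=requests) -> dict:
    """Fetch current balance and recharge status (the /v1/balance endpoint from Step 2)."""
    return http.get(f"{API_BASE}/balance", headers=HEADERS).json()

def send_alert(message: str, http=requests) -> None:
    """Notify the human operator via your own endpoint (hypothetical URL)."""
    http.post("https://your-agent.com/alerts", json={"message": message})
```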

Common Agent Patterns

Pattern 1: Batch Processing

def batch_process(items: list, model: str):
    """Process items in batches to optimize cost."""

    results = []
    for i in range(0, len(items), 100):
        batch = items[i:i+100]

        # Single API call for batch
        response = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": f"Process these items: {batch}"
            }]
        )

        results.extend(parse_batch_response(response))

    return results
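
`batch_process` relies on `parse_batch_response`, whose shape depends entirely on how you prompt the model. One sketch, assuming the batch prompt asked for exactly one result per line (an assumption about your prompt, not a guarantee of the API):

```python
def parse_batch_response(response) -> list[str]:
    """Split one batched completion back into per-item results.

    Assumes the prompt instructed the model to emit one result per line.
    """
    text = response.choices[0].message.content
    return [line.strip() for line in text.splitlines() if line.strip()]
```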

Pattern 2: Fallback Chain

def fallback_request(prompt: str):
    """Try models in order of cost until one succeeds."""

    models = [
        "gemini/gemini-2.5-flash",   # Try cheapest first
        "deepseek/deepseek-r1",      # Fallback to medium
        "claude-4-5-sonnet"          # Fallback to expensive
    ]

    for model in models:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}]
            )
            return response.choices[0].message.content
        except Exception:
            continue

    raise Exception("All models failed")

Pattern 3: Quality Check with Retry

def quality_checked_request(prompt: str):
    """Retry with better model if quality insufficient."""

    # Try cheap model first
    response = cost_aware_request(prompt, "basic")

    # Check quality (agent-defined criteria)
    if quality_score(response) < 0.7:
        # Retry with better model
        response = cost_aware_request(prompt, "high")

    return response
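
`quality_score` is agent-defined: what counts as "good enough" is up to you. One crude heuristic sketch (the thresholds and refusal phrases below are illustrative placeholders, not a recommended scoring scheme):

```python
def quality_score(response: str) -> float:
    """Crude heuristic scorer -- replace with your own criteria.

    Penalizes empty or very short answers and refusal-style boilerplate;
    the 0.7 cutoff used in quality_checked_request is equally arbitrary.
    """
    if not response or len(response.split()) < 5:
        return 0.0
    refusals = ("i can't", "i cannot", "as an ai")
    if any(r in response.lower() for r in refusals):
        return 0.3
    return 1.0
```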

Next Steps

📄

Budget Management

Advanced budget controls and auto-recharge

🔌

MCP Integration

Install SkillBoss MCP server

📈

Cost Optimization

Save 70%+ with smart model routing

📄

Monitoring

Track and optimize your spending


Agent Support

Autonomous agents: API-only support. Check status.skillboss.co for uptime.

Semi-autonomous agents: Email dev@skillboss.co (a human will respond within 24 hours).

Emergency: If your balance is zero and operations are blocked, email dev@skillboss.co.