Chat (Anthropic native)

Claude Haiku 4.5 API

Claude Haiku 4.5 is a small, fast, low-cost model. Use it for classifiers, simple chat, and batch processing where Sonnet would be overkill.

model: anthropic/claude-haiku-4-5
context window: 200K tokens
Wholesale key required. This endpoint accepts wholesale keys only. Have access? Get your key on the wholesale dashboard. Not on wholesale yet? Email admin@skillboss.co with your use case — we'll get you provisioned.

Quickstart

This model uses Anthropic's native /v1/messages endpoint — auth via the x-api-key header, not Authorization Bearer. Compatible with the official Anthropic SDK by setting base_url to https://api.skillboss.co.

curl https://api.skillboss.co/v1/messages \
  -H "x-api-key: $SKILLBOSS_WHOLESALE_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'

Your first 200 response is the fastest way to confirm setup. From there, swap in your real prompt and tune the model-specific parameters listed below.

Authentication

Every request must include your wholesale key. The header name depends on the endpoint — match the SDK you're using:

x-api-key: $SKILLBOSS_WHOLESALE_KEY

Anthropic's native API uses x-api-key (not Authorization Bearer) — the official Anthropic SDK sets this automatically when you pass api_key to the client. The anthropic-version header is also required by the upstream Anthropic API.
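For readers not using the SDK, here is a minimal sketch of the same headers assembled by hand with Python's standard library. It only builds the request and does not send it. The helper name build_messages_request is hypothetical; the URL, header names, and model id are the ones given in this guide.

```python
import json
import os
import urllib.request

API_URL = "https://api.skillboss.co/v1/messages"


def build_messages_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Assemble (but do not send) a POST request for /v1/messages."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "x-api-key": api_key,               # not "Authorization: Bearer ..."
            "anthropic-version": "2023-06-01",  # required by the upstream API
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Falls back to a placeholder so the sketch runs without a real key.
    key = os.environ.get("SKILLBOSS_WHOLESALE_KEY", "<your-wholesale-key>")
    req = build_messages_request(
        key,
        {
            "model": "anthropic/claude-haiku-4-5",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": "Hello"}],
        },
    )
    print(req.get_method(), req.full_url)
    # Print header names only, never the key itself.
    print(sorted(name for name, _ in req.header_items()))
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access or a valid key.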

Code examples

Python
import os
from anthropic import Anthropic

client = Anthropic(
    api_key=os.environ["SKILLBOSS_WHOLESALE_KEY"],
    base_url="https://api.skillboss.co",
)

resp = client.messages.create(
    model="anthropic/claude-haiku-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.content[0].text)

JavaScript / TypeScript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.SKILLBOSS_WHOLESALE_KEY,
  baseURL: "https://api.skillboss.co",
});

const resp = await client.messages.create({
  model: "anthropic/claude-haiku-4-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello" }],
});

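// content is a list of typed blocks; narrow to a text block before reading .text
const block = resp.content[0];
if (block.type === "text") console.log(block.text);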

Parameters

Anthropic /v1/messages parameters. Important: max_tokens is REQUIRED (unlike OpenAI where it's optional), and the system prompt is a top-level field (not a message with role=system).

Name (type, required or optional): description

model (string, required): Model id. Use anthropic/claude-haiku-4-5.
max_tokens (integer, required): Required by the Anthropic API. Maximum number of tokens in the response. Default 1024.
messages (array, required): Conversation messages. Each has role (user|assistant) and content (string or array of content blocks: text, image, tool_use, tool_result, …).
system (string | array, optional): System prompt. A separate top-level field, NOT inside messages. Use the array form ([{ type: "text", text: "...", cache_control: { type: "ephemeral" } }]) to enable prompt caching for the system block.
stream (boolean, optional): Set true for SSE streaming.
temperature (number, optional): Sampling temperature (0–1 for Claude). Default 1.0.
top_p (number, optional): Nucleus sampling probability cutoff. Use either temperature or top_p/top_k, not both.
top_k (integer, optional): Sample only from the top-K most likely tokens at each step.
tools (array, optional): Claude tool-use definitions. The schema differs from OpenAI's: { name, description, input_schema: <JSON Schema> }.
tool_choice (object, optional): { type: "auto" } (default), { type: "any" } (must call some tool), or { type: "tool", name: "<tool_name>" } (force one).
stop_sequences (array<string>, optional): Custom stop strings. Generation halts (without emitting the string) when any of them appears.
metadata (object, optional): { user_id: "<your_user_id>" }, passed to Anthropic for trust and safety / abuse detection. Recommended for production.
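To make the shape of these fields concrete, the sketch below builds a request body as a plain dictionary. It is illustrative only: build_payload and the get_weather tool are hypothetical names invented for this example, while the field names and nesting follow the parameter list above.

```python
from typing import Optional

MODEL_ID = "anthropic/claude-haiku-4-5"

# Hypothetical tool, shown only to illustrate the
# { name, description, input_schema } shape.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def build_payload(
    user_text: str,
    system_prompt: Optional[str] = None,
    cache_system: bool = False,
    force_tool: bool = False,
) -> dict:
    """Build a /v1/messages request body as a dict."""
    payload = {
        "model": MODEL_ID,
        "max_tokens": 1024,  # must be present on every request
        "messages": [{"role": "user", "content": user_text}],
    }
    if system_prompt is not None:
        if cache_system:
            # Array form with cache_control enables prompt caching for this block.
            payload["system"] = [
                {
                    "type": "text",
                    "text": system_prompt,
                    "cache_control": {"type": "ephemeral"},
                }
            ]
        else:
            # String form: still a top-level field, never a message.
            payload["system"] = system_prompt
    if force_tool:
        payload["tools"] = [GET_WEATHER_TOOL]
        payload["tool_choice"] = {"type": "tool", "name": GET_WEATHER_TOOL["name"]}
    return payload
```

The resulting dictionary can be serialized with json.dumps for a raw HTTP call, or unpacked as keyword arguments to client.messages.create when using the SDK.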

Endpoint

Method: POST
URL: https://api.skillboss.co/v1/messages
Auth header: x-api-key: $SKILLBOSS_WHOLESALE_KEY
Content-Type: application/json
Streaming: Set stream: true for SSE. Each event is a Server-Sent Event with a type prefix (message_start, content_block_delta, message_stop, etc.).
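As a rough illustration of consuming the stream without the SDK, the sketch below parses SSE lines and concatenates the text carried by content_block_delta events. The function names are hypothetical, and SAMPLE_STREAM is a hand-written, abbreviated stand-in for a real response (real streams include additional event types and fields).

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


def collect_text(lines: Iterable[str]) -> str:
    """Join the text deltas of a streamed message, stopping at message_stop."""
    parts = []
    for event in iter_sse_events(lines):
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
        elif event.get("type") == "message_stop":
            break
    return "".join(parts)


# Abbreviated sample, written by hand for this example.
SAMPLE_STREAM = """\
event: message_start
data: {"type": "message_start", "message": {"role": "assistant", "content": []}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo"}}

event: message_stop
data: {"type": "message_stop"}
""".splitlines()

print(collect_text(SAMPLE_STREAM))  # prints "Hello"
```

In practice the lines would come from an HTTP response body read incrementally; the official SDKs also expose streaming helpers that handle this parsing.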

Errors

The API uses standard HTTP status codes:

200 OK: Request succeeded.
400 Bad Request: Invalid model, missing required field, or malformed JSON.
401 Unauthorized: Missing or invalid wholesale key. Non-wholesale console keys are rejected here.
402 Insufficient Credits: Wholesale balance too low. Top up on the wholesale dashboard.
429 Rate Limited: Too many requests. Back off with exponential delay.
500 Server Error: Transient upstream issue. Safe to retry.
503 Upstream Unavailable: Discount pool capacity issue (lite tier). Retry or fall back to standard tier.
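One possible way to act on these codes is sketched below: retry 429, 500, and 503 with exponential backoff plus jitter, and return every other status immediately. The names, the base delay, the cap, and the attempt limit are all assumptions made for illustration, not values recommended by this guide. The send argument stands in for whatever function performs the HTTP call and returns a (status, body) pair.

```python
import random
import time
from typing import Callable, Tuple

# Statuses the error list above describes as worth retrying.
RETRYABLE = {429, 500, 503}


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Upper bound in seconds for the wait before retry `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status not in RETRYABLE:
            break
        # Full jitter: wait a random time up to the exponential bound.
        sleep(random.uniform(0, backoff_delay(attempt)))
        status, body = send()
    return status, body
```

Statuses such as 400, 401, and 402 are returned on the first attempt because retrying cannot fix a malformed request, a rejected key, or an empty balance. The sleep parameter is injectable so the logic can be exercised without real delays.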

Pricing

Wholesale pricing is your account-specific discount × vendor list price. Discount rate depends on your contract — see the live numbers on the wholesale dashboard. The dashboard shows your current cost per 1M tokens (or per image / per second) for every model.

No platform markup on standard token billing. Volume tiers + monthly caps are configurable per key.
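The pricing formula can be worked through as a short calculation. This sketch assumes the discount is expressed as a multiplier on list price (for example 0.8 for 20% off); the guide does not state which convention the dashboard uses. The list prices, multiplier, and token counts are placeholder values chosen for the arithmetic, not actual rates.

```python
def wholesale_cost_usd(
    input_tokens: int,
    output_tokens: int,
    list_in_per_mtok: float,
    list_out_per_mtok: float,
    discount_multiplier: float,
) -> float:
    """Cost = discount multiplier x vendor list price, priced per 1M tokens."""
    list_cost = (
        input_tokens * list_in_per_mtok + output_tokens * list_out_per_mtok
    ) / 1_000_000
    return discount_multiplier * list_cost


# Placeholder numbers only; read real rates from the wholesale dashboard.
cost = wholesale_cost_usd(
    input_tokens=2_000_000,
    output_tokens=500_000,
    list_in_per_mtok=1.00,
    list_out_per_mtok=5.00,
    discount_multiplier=0.8,
)
print(f"{cost:.2f}")  # prints "3.60"
```

With these placeholders the list cost is (2,000,000 × 1.00 + 500,000 × 5.00) / 1,000,000 = 4.50, and applying the 0.8 multiplier gives 3.60.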
