Wholesale Documentation

SkillBoss Wholesale API

One wholesale key, 85 frontier models — OpenAI GPT-5, Claude 4.6/4.7, Gemini 3, Qwen 3.5, image + video generation, embeddings. Pick a model from the left for its integration guide with copy-paste code samples and the exact model identifier to pass in your request.

Have wholesale access?
Generate or rotate your wholesale key on the wholesale dashboard.
Open wholesale dashboard →
Not on wholesale yet? Wholesale access is gated for verified accounts. Email admin@skillboss.co with your use case and expected volume, and we'll get you provisioned.
Endpoints at a glance
POST /v1/chat/completions — OpenAI-compatible chat (GPT, Qwen, and lite tiers). Use any OpenAI SDK.
POST /v1/messages — Anthropic Claude native. Use the Anthropic SDK.
POST /v1beta/models/<model>:generateContent — Gemini native. Use the Gemini SDK, Gemini CLI, or opencode.
POST /v1/embeddings — OpenAI-compatible embeddings.
POST /v1/run — Image + video generation (Nano Banana, GPT Image 2, VEO).
Base URL for all paths: https://api.skillboss.co

OpenAI (GPT)

GPT-5 family chat models + text embeddings. OpenAI SDK-compatible.

5 models

Anthropic (Claude)

Claude Opus, Sonnet, Haiku — native /v1/messages endpoint.

6 models

Google (Gemini text)

Gemini Pro/Flash text + multimodal chat. Native Gemini SDK-compatible.

4 models

Nano Banana (image gen)

Google Gemini image generation — three quality tiers. /v1/run shape.

3 models

OpenAI GPT Image 2

OpenAI gpt-image-2 image generation. Three quality tiers.

3 models

VEO (text-to-video)

Google DeepMind VEO 3.1 — photorealistic text-to-video.

2 models

Shotstack (video editing)

JSON-driven video editing — trim, merge, subtitles, transitions, filters.

10 models
Video generation
Shotstack Edit
Programmatic video editing via JSON timeline. Trim, merge, subtitle, transitions, filters.
shotstack/edit
Video generation
Shotstack Status
Look up render status and output URL by task_id.
shotstack/status
Video generation
Shotstack Template Create
Save a reusable video template from timeline + output JSON.
shotstack/template-create
Video generation
Shotstack Template List
List all templates owned by your Shotstack account.
shotstack/template-list
Video generation
Shotstack Template Get
Retrieve a single template by id.
shotstack/template-get
Video generation
Shotstack Template Update
Update an existing template name and/or definition.
shotstack/template-update
Video generation
Shotstack Template Delete
Delete a template by id.
shotstack/template-delete
Video generation
Shotstack Template Render
Render a video from a saved template with merge fields.
shotstack/template-render
Video generation
Shotstack Probe
Inspect a media file — codec, resolution, bitrate, duration.
shotstack/probe
Video generation
Shotstack Ingest
Fetch a media file from a public URL for timeline use.
shotstack/ingest

Higgsfield (image + image-to-video)

Soul text-to-image + DoP image-to-video. /v1/run shape.

4 models

Reve (text-to-image)

Versatile, low-cost text-to-image generation. /v1/run shape.

1 model

Kling (image-to-video)

Kling 2.1 Pro image-to-video — high-fidelity motion. /v1/run shape.

1 model

MiniMax Voice (TTS + Music)

MiniMax speech synthesis + music generation. /v1/run shape.

3 models

Qwen (Alibaba)

Multilingual chat + vision-language + code models. OpenAI SDK-compatible.

10 models
Chat
Qwen 3.5 Flash
Cheapest Qwen chat model. Multilingual, fast.
qwen/qwen3.5-flash
Chat
Qwen 3.5 Plus
Vision-language. Multimodal chat with deep visual reasoning.
qwen/qwen3.5-plus
Chat
Qwen Max
Top-tier Qwen for hardest reasoning tasks.
qwen/qwen-max
Chat
Qwen Plus
Balanced Qwen chat. Default for production routes.
qwen/qwen-plus
Chat
Qwen Flash
Fast, cheap Qwen. Light tasks.
qwen/qwen-flash
Chat
Qwen VL Max
Vision-language at top tier. Long video + image understanding.
qwen/qwen-vl-max
Chat
Qwen VL Plus
Cheaper VL. Production default for multimodal Qwen.
qwen/qwen-vl-plus
Chat
Qwen3 Coder Plus
Code-tuned Qwen. Strong on Chinese + English code tasks.
qwen/qwen3-coder-plus
Chat
Qwen3 Coder Flash
Fast code model. High-volume coding routes.
qwen/qwen3-coder-flash
Chat
Qwen Coder Plus
Previous-gen coder. Stable, still capable.
qwen/qwen-coder-plus

GLM (Z.ai)

Z.ai's flagship chat model. Strong CN/EN, tool use, fine-grained streaming.

9 models
Chat
GLM 5.2
Z.ai's newest flagship — top reasoning, tool use, strong CN/EN.
glm/glm-5.2
Chat
GLM 5.1
Z.ai flagship — fine-grained streaming, tool use, strong CN/EN.
glm/glm-5.1
Chat
GLM 5
Balanced GLM 5 — strong general chat at a lower price than 5.2.
glm/glm-5
Chat
GLM 5 Turbo
Faster GLM 5 — lower latency for high-throughput chat + agents.
glm/glm-5-turbo
Chat
GLM 4.7
Cost-effective GLM 4.7 — reliable general chat + tool use.
glm/glm-4.7
Chat
GLM 4.5 Air
Lightweight GLM — cheap + fast for high-volume routes.
glm/glm-4.5-air
Chat
GLM 4.7 Flash
Cheapest GLM — fastest, ideal for fan-out + simple tasks.
glm/glm-4.7-flash
Chat
GLM 4.5V
Vision-language GLM — image + text understanding, OCR, charts.
glm/glm-4.5v
Speech-to-text
GLM ASR 2512
Z.ai speech-to-text — low CER (0.0717), token-billed, audio URL in / text out.
glm/glm-asr-2512

OpenAI Lite (-l models)

80–90% cheaper OpenAI access via lower-priority routing. Same models.

18 models
Chat
Lite
GPT-5.4 (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-5.4l
Chat
Lite
GPT-5.2 (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-5.2l
Chat
Lite
GPT-5.1 (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-5.1l
Chat
Lite
GPT-5 (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-5l
Chat
Lite
GPT-5 Mini (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-5-minil
Chat
Lite
GPT-5 Nano (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-5-nanol
Chat
Lite
GPT-4.1 (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-4.1l
Chat
Lite
GPT-4.1 Mini (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-4.1-minil
Chat
Lite
GPT-4.1 Nano (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-4.1-nanol
Chat
Lite
GPT-4o (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-4ol
Chat
Lite
GPT-4o Mini (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/gpt-4o-minil
Chat
Lite
o1 (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/o1l
Chat
Lite
o1 Mini (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/o1-minil
Chat
Lite
o1 Pro (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/o1-prol
Chat
Lite
o3 (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/o3l
Chat
Lite
o3 Mini (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/o3-minil
Chat
Lite
o4 Mini (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/o4-minil
Chat
Lite
ChatGPT 4o (latest) (Lite)
Lite tier: 80–90% cheaper, lower-priority pool.
openai/chatgpt-4o-latestl

Anthropic Lite (-l models)

Discounted Claude access via lower-priority routing.

4 models

Google Lite (-l models)

Discounted Gemini access via lower-priority routing.

2 models
Wholesale API Docs — Models & Integration Guides | SkillBoss