Chat Completions API - SkillBoss Docs
The Chat Completions API is 100% OpenAI-compatible and provides access to 679+ models, including Claude 4.5 Sonnet, GPT-5, Gemini 2.5 Flash, and DeepSeek R1.
Endpoint
POST https://api.skillboss.co/v1/chat/completions
Authentication
All API requests require an API key in the Authorization header:
```bash
curl https://api.skillboss.co/v1/chat/completions \
  -H "Authorization: Bearer sk-sb-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-4.5-sonnet",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g., claude-4.5-sonnet, gpt-5) |
| messages | array | Yes | Array of message objects |
| max_tokens | integer | No | Maximum tokens to generate (default: 4096) |
| temperature | number | No | Sampling temperature, 0-2 (default: 1) |
| top_p | number | No | Nucleus sampling, 0-1 (default: 1) |
| stream | boolean | No | Enable streaming responses (default: false) |
| stop | string/array | No | Stop sequences |
| presence_penalty | number | No | Penalize new topics, -2 to 2 (default: 0) |
| frequency_penalty | number | No | Penalize repetition, -2 to 2 (default: 0) |
| user | string | No | End-user ID for tracking |
Messages Format
```
{
  "messages": [
    {
      "role": "system" | "user" | "assistant",
      "content": string | array
    }
  ]
}
```
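In multi-turn chats, this array grows with each exchange: you append the assistant's reply, then the next user turn. A minimal illustrative sketch of a helper that accumulates messages in the shape above (the `Conversation` class is our own, not part of any SDK):

```typescript
// Illustrative helper (not part of the SkillBoss API or any SDK): keeps a
// running conversation in the messages shape shown above.
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

class Conversation {
  private messages: Message[] = [];

  constructor(systemPrompt?: string) {
    // A system message, if present, should be the first entry.
    if (systemPrompt) this.messages.push({ role: "system", content: systemPrompt });
  }

  add(role: Role, content: string): void {
    this.messages.push({ role, content });
  }

  // Returns a copy suitable for the request's `messages` field.
  toMessages(): Message[] {
    return [...this.messages];
  }
}

const convo = new Conversation("You are a helpful assistant.");
convo.add("user", "Hello!");
```

After each completion, call `convo.add('assistant', reply)` before sending the next user message, so the model sees the full history.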
Response Format
Non-Streaming Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709251200,
  "model": "claude-4.5-sonnet",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```
Streaming Response
When stream: true, responses are sent as Server-Sent Events (SSE):
```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1709251200,"model":"claude-4.5-sonnet","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1709251200,"model":"claude-4.5-sonnet","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1709251200,"model":"claude-4.5-sonnet","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```
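If you consume the stream without an SDK, each `data:` line can be parsed by hand. A rough sketch, assuming the chunk shape shown above (`extractContent` is our own illustrative helper, not official parsing code):

```typescript
// Illustrative parser for the SSE chunk format shown above.
// Each event line is "data: <json>"; the stream ends with "data: [DONE]".
function extractContent(sseLine: string): string | null {
  if (!sseLine.startsWith("data: ")) return null;
  const payload = sseLine.slice("data: ".length).trim();
  if (payload === "[DONE]") return null; // end-of-stream sentinel, not JSON
  const chunk = JSON.parse(payload);
  // delta.content is absent on the role-only first chunk and the final chunk.
  return chunk.choices?.[0]?.delta?.content ?? null;
}

const lines = [
  'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}',
  'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
  'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
  "data: [DONE]",
];

const text = lines
  .map(extractContent)
  .filter((c): c is string => c !== null)
  .join("");
// text === "Hello!"
```

A production consumer would also buffer partial lines across network reads; this sketch assumes whole lines.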
Code Examples
Node.js / TypeScript
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.SKILLBOSS_API_KEY,
  baseURL: 'https://api.skillboss.co/v1'
});

const completion = await client.chat.completions.create({
  model: 'claude-4.5-sonnet',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing in simple terms.' }
  ],
  max_tokens: 500,
  temperature: 0.7
});

console.log(completion.choices[0].message.content);
```
Python
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-sb-YOUR_API_KEY",
    base_url="https://api.skillboss.co/v1"
)

response = client.chat.completions.create(
    model="claude-4.5-sonnet",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=500,
    temperature=0.7
)

print(response.choices[0].message.content)
```
cURL
```bash
curl https://api.skillboss.co/v1/chat/completions \
  -H "Authorization: Bearer sk-sb-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-4.5-sonnet",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms."
      }
    ],
    "max_tokens": 500,
    "temperature": 0.7
  }'
```
Go
```go
package main

import (
    "context"
    "fmt"

    "github.com/sashabaranov/go-openai"
)

func main() {
    config := openai.DefaultConfig("sk-sb-YOUR_API_KEY")
    config.BaseURL = "https://api.skillboss.co/v1"
    client := openai.NewClientWithConfig(config)

    resp, err := client.CreateChatCompletion(
        context.Background(),
        openai.ChatCompletionRequest{
            Model: "claude-4.5-sonnet",
            Messages: []openai.ChatCompletionMessage{
                {
                    Role:    openai.ChatMessageRoleSystem,
                    Content: "You are a helpful assistant.",
                },
                {
                    Role:    openai.ChatMessageRoleUser,
                    Content: "Explain quantum computing in simple terms.",
                },
            },
            MaxTokens:   500,
            Temperature: 0.7,
        },
    )
    if err != nil {
        panic(err)
    }
    fmt.Println(resp.Choices[0].Message.Content)
}
```
Available Models
Premium Models (Recommended)
| Model | ID | Context | Input Price | Output Price |
|---|---|---|---|---|
| Claude 4.5 Sonnet | claude-4.5-sonnet | 200K | $3/1M tokens | $15/1M tokens |
| GPT-5 | gpt-5 | 128K | $10/1M tokens | $30/1M tokens |
| Gemini 2.5 Flash | gemini-2.5-flash | 1M | $0.15/1M tokens | $0.60/1M tokens |
High-Performance Models
| Model | ID | Context | Best For |
|---|---|---|---|
| Claude 4.5 Haiku | claude-4.5-haiku | 200K | Fast responses, high throughput |
| GPT-4o mini | gpt-4o-mini | 128K | Cost-efficient chat |
| DeepSeek R1 | deepseek-r1 | 64K | Reasoning tasks |
Multimodal Support (Vision)
Send images in messages:
```typescript
const completion = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: "What's in this image?" },
        {
          type: 'image_url',
          image_url: {
            url: 'https://example.com/image.jpg'
          }
        }
      ]
    }
  ]
});
```
Supported models:
- gpt-5, gpt-4o, gpt-4o-mini
- claude-4.5-sonnet, claude-4.5-opus
- gemini-2.5-flash, gemini-2.5-pro
Advanced Features
Function Calling (Tools)
```typescript
const completion = await client.chat.completions.create({
  model: 'claude-4.5-sonnet',
  messages: [{ role: 'user', content: "What's the weather in SF?" }],
  tools: [
    {
      type: 'function',
      function: {
        name: 'get_weather',
        description: 'Get current weather for a location',
        parameters: {
          type: 'object',
          properties: {
            location: { type: 'string', description: 'City name' },
            unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
          },
          required: ['location']
        }
      }
    }
  ],
  tool_choice: 'auto'
});

// Handle tool calls
if (completion.choices[0].finish_reason === 'tool_calls') {
  const toolCall = completion.choices[0].message.tool_calls[0];
  console.log(toolCall.function.name);      // "get_weather"
  console.log(toolCall.function.arguments); // '{"location":"San Francisco","unit":"celsius"}'
}
```
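The tool itself runs in your code; you then append its result as a tool message before making the follow-up completion request. A hedged sketch of that dispatch step (`get_weather` here is a local stub returning canned data, and `runToolCall` is our own helper, not SDK API):

```typescript
// Illustrative tool dispatch: map tool-call names to local implementations.
// `get_weather` is a stub for demonstration only; it does not call a real service.
const toolImplementations: Record<string, (args: any) => string> = {
  get_weather: (args) =>
    JSON.stringify({ location: args.location, temp: 18, unit: args.unit ?? "celsius" }),
};

interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}

// Run one tool call and build the "tool" role message to append to the
// conversation before the follow-up request.
function runToolCall(call: ToolCall) {
  const impl = toolImplementations[call.function.name];
  if (!impl) throw new Error(`Unknown tool: ${call.function.name}`);
  const args = JSON.parse(call.function.arguments); // arguments arrive as a JSON string
  return { role: "tool" as const, tool_call_id: call.id, content: impl(args) };
}

const result = runToolCall({
  id: "call_1",
  function: { name: "get_weather", arguments: '{"location":"San Francisco","unit":"celsius"}' },
});
```

You would then push `result` onto the messages array (after the assistant message containing the tool calls) and call the API again so the model can produce its final answer.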
JSON Mode
Force model to output valid JSON:
```typescript
const completion = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'system', content: 'Extract user info as JSON.' },
    { role: 'user', content: "My name is John, I'm 30, from NYC." }
  ],
  response_format: { type: 'json_object' }
});

const data = JSON.parse(completion.choices[0].message.content);
// e.g. { "name": "John", "age": 30, "location": "NYC" }
```
Reproducible Outputs (Seed)
```typescript
const completion = await client.chat.completions.create({
  model: 'gpt-5',
  messages: [{ role: 'user', content: 'Generate a random number.' }],
  seed: 12345, // same seed + identical parameters yields (best-effort) identical output
  temperature: 1
});
```
Error Handling
HTTP Status Codes
| Code | Meaning | Solution |
|---|---|---|
| 200 | Success | - |
| 400 | Bad Request | Check request format |
| 401 | Unauthorized | Verify API key |
| 429 | Rate Limited | Reduce request rate |
| 500 | Server Error | Retry with exponential backoff |
Error Response Format
```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
```
Retry Logic Example
```typescript
async function chatWithRetry(messages, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await client.chat.completions.create({
        model: 'claude-4.5-sonnet',
        messages
      });
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        // Exponential backoff: 1s, 2s, 4s
        await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000));
        continue;
      }
      throw error;
    }
  }
}
```
Rate Limits
SkillBoss enforces the following rate limits:
| Tier | Requests/min | Tokens/min |
|---|---|---|
| Free | 60 | 90,000 |
| Paid | Unlimited* | 10M |
*Soft limit - contact support for higher limits
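To stay under the free tier's 60 requests/min on the client side, a simple sliding-window check works. An illustrative sketch (`RequestLimiter` is our own class, not part of any SDK; timestamps are passed explicitly to keep it deterministic and testable):

```typescript
// Illustrative client-side limiter for a requests-per-window cap,
// e.g. the free tier's 60 requests per 60-second window.
class RequestLimiter {
  private times: number[] = [];

  constructor(private maxPerWindow: number, private windowMs: number = 60_000) {}

  // Returns true if a request at time `now` (ms) is allowed, recording it;
  // returns false if the window is already full.
  tryAcquire(now: number = Date.now()): boolean {
    // Drop timestamps that have aged out of the window.
    this.times = this.times.filter((t) => now - t < this.windowMs);
    if (this.times.length >= this.maxPerWindow) return false;
    this.times.push(now);
    return true;
  }
}
```

When `tryAcquire` returns false, wait (or queue the request) instead of sending it; combined with the retry logic above, this keeps you clear of 429s.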
Best Practices
1. Use System Messages for Context
✅ Good:
```typescript
messages: [
  { role: 'system', content: 'You are a Python expert. Be concise.' },
  { role: 'user', content: 'Explain list comprehensions.' }
]
```
❌ Bad:
```typescript
messages: [
  { role: 'user', content: 'You are a Python expert. Explain list comprehensions.' }
]
```
2. Set max_tokens to Control Cost
```typescript
{
  model: 'claude-4.5-sonnet',
  messages: [...],
  max_tokens: 500 // limit output to 500 tokens
}
```
3. Use temperature Wisely
- 0.0-0.3: Factual, deterministic (code, math)
- 0.7-1.0: Creative, varied (stories, brainstorming)
- 1.5-2.0: Highly creative (experimental)
4. Handle Streaming for Better UX
```typescript
const stream = await client.chat.completions.create({
  model: 'claude-4.5-sonnet',
  messages: [...],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```
Frequently Asked Questions
Can I use the OpenAI SDK?
Yes! SkillBoss is 100% OpenAI-compatible. Just change the baseURL:
```typescript
const client = new OpenAI({
  apiKey: 'sk-sb-YOUR_KEY',
  baseURL: 'https://api.skillboss.co/v1' // Change this line
});
```
What's the difference between models?
- Claude: Best reasoning, long context (200K)
- GPT: Best general-purpose, fastest updates
- Gemini: Best multimodal, 1M context window
- DeepSeek: Best for coding & math
How is billing calculated?
You pay for:
- Input tokens: tokens in your messages
- Output tokens: tokens in the response
Example (Claude 4.5 Sonnet pricing):
```
Input:  100 tokens × $3/1M  = $0.0003
Output:  50 tokens × $15/1M = $0.00075
Total:                        $0.00105 per request
```
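The arithmetic above generalizes to a small helper (`requestCost` is our own function, not an API feature; prices are USD per million tokens, as in the model tables earlier on this page):

```typescript
// Cost of one request given per-million-token prices in USD.
function requestCost(
  inputTokens: number,
  outputTokens: number,
  inputPricePerM: number,
  outputPricePerM: number
): number {
  return (inputTokens * inputPricePerM + outputTokens * outputPricePerM) / 1_000_000;
}

// Claude 4.5 Sonnet example from above: $3/1M input, $15/1M output.
const cost = requestCost(100, 50, 3, 15);
// cost === 0.00105
```

The `usage` object in each non-streaming response supplies the actual `prompt_tokens` and `completion_tokens` to plug in.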
Can I use streaming with all models?
Yes! All models support streaming via stream: true.
Last updated: March 1, 2026