Chat (Gemini native)

Gemini 3.5 Flash API

Gemini 3.5 Flash — fast multimodal Google model with native text, image, and video understanding. Send a video as a content part (a YouTube/file URL via `fileData.fileUri`) alongside your prompt to summarize, transcribe, answer questions about, or pull timestamped moments out of it; also handles vision and plain chat. 1M-token context, $0.50/$3.00 per 1M tokens. Point the Gemini SDK at `https://api.skillboss.co/v1beta`.

URL model: gemini-3.5-flash1M tokens
Wholesale key required. This endpoint accepts wholesale keys only. Have access? Get your key on the wholesale dashboard. Not on wholesale yet? Email admin@skillboss.co with your use case — we'll get you provisioned.

Quickstart

This model uses the Gemini Developer API native format — the model id is in the URL path, not the request body. Compatible with the Gemini SDK (@google/generative-ai), Gemini CLI, and opencode by overriding the base URL.

bashcurl "https://api.skillboss.co/v1beta/models/gemini-3.5-flash:generateContent" \
  -H "x-goog-api-key: $SKILLBOSS_WHOLESALE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"parts": [{"text": "Hello"}]}
    ]
  }'

Your first 200 response is the fastest way to confirm setup. From there, swap in your real prompt and tune the model-specific parameters listed below.

Authentication

Every request must include your wholesale key. The header name depends on the endpoint — match the SDK you're using:

bashx-goog-api-key: $SKILLBOSS_WHOLESALE_KEY

Gemini's native API uses x-goog-api-key (Gemini SDK + opencode default) or ?key= as a query parameter (Gemini CLI). Both work — pick whichever matches your client. Standard (non-wholesale) console keys are rejected at the gateway with 401.

Code examples

Python
pythonimport os
from google import genai

client = genai.Client(
    api_key=os.environ["SKILLBOSS_WHOLESALE_KEY"],
    http_options={"base_url": "https://api.skillboss.co"},
)

resp = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="Hello",
)
print(resp.text)
JavaScript / TypeScript
typescriptimport { GoogleGenerativeAI } from "@google/generative-ai";

// Point the Gemini SDK at SkillBoss by overriding the base URL.
const genAI = new GoogleGenerativeAI(process.env.SKILLBOSS_WHOLESALE_KEY, {
  baseUrl: "https://api.skillboss.co",
});

const model = genAI.getGenerativeModel({ model: "gemini-3.5-flash" });
const result = await model.generateContent("Hello");
console.log(result.response.text());

Video & image understanding

This model natively understands video and images, not just text. Add a fileData content part (a public video/image URL, or a YouTube link) alongside your text prompt and the model will summarize, transcribe, answer questions about, or pull timestamped moments out of it. Video is billed by input token (VIDEO modality) — roughly 5–10k tokens per minute of video. The first request on a new video can take 1–2 minutes while it is ingested; repeat requests on the same video hit the cache and return in seconds.

bashcurl "https://api.skillboss.co/v1beta/models/gemini-3.5-flash:generateContent" \
  -H "x-goog-api-key: $SKILLBOSS_WHOLESALE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"parts": [
        {"text": "Summarize this video in 3 bullet points with timestamps."},
        {"fileData": {"fileUri": "https://www.youtube.com/watch?v=aqz-KE-bpKQ"}}
      ]}
    ]
  }'

For programmatic video editing, compose this understanding step with the Shotstack Edit timeline API: use Gemini to decide the cuts / captions / highlights, then render the result with a Shotstack timeline. No separate video-editing service required.

Parameters

Gemini Developer API parameters. The model id is in the URL path (not the body) — see the curl example above. Body fields documented here.

NameTypeRequiredDescription
contentsarrayrequiredConversation parts. Each entry: { role: "user"|"model", parts: [{ text: "..." } | { inlineData: { mimeType, data } } | { fileData: { mimeType, fileUri } }] }.
generationConfigobjectoptionalGeneration tuning. Fields: temperature (0-2), topP (0-1), topK (integer), maxOutputTokens, candidateCount (default 1), stopSequences (array), responseMimeType ("text/plain" | "application/json"), responseSchema (OpenAPI-subset schema for strict structured JSON output).
systemInstructionobjectoptionalSystem prompt as { parts: [{ text: "..." }] } (or just { text: "..." } — server accepts both).
toolsarrayoptionalGemini tool/function-calling declarations. Function schema: [{ functionDeclarations: [{ name, description, parameters: <JSON Schema> }] }]. Also supports googleSearch + codeExecution built-in tools.
toolConfigobjectoptional{ functionCallingConfig: { mode: "AUTO" | "ANY" | "NONE", allowedFunctionNames: [...] } } — control when/which functions can be called.
safetySettingsarrayoptionalPer-category harm thresholds (overrides defaults). [{ category: "HARM_CATEGORY_*", threshold: "BLOCK_NONE"|"BLOCK_LOW_AND_ABOVE"|… }].
cachedContentstringoptionalFull cached-content resource name (projects/<id>/locations/<region>/cachedContents/<id>) — reuses a pre-cached system prompt + long context for 75% discount on cached tokens. Server passes through as-is.

Endpoint

MethodPOST
URLhttps://api.skillboss.co/v1beta/models/gemini-3.5-flash:generateContent
Auth headerx-goog-api-key: $SKILLBOSS_WHOLESALE_KEY
Content-Typeapplication/json
StreamingUse the :streamGenerateContent path (instead of :generateContent) for SSE streaming.

Errors

The API uses standard HTTP status codes:

200OKRequest succeeded.
400Bad RequestInvalid model, missing required field, or malformed JSON.
401UnauthorizedMissing or invalid wholesale key. Non-wholesale console keys are rejected here.
402Insufficient CreditsWholesale balance too low — top up on the wholesale dashboard.
429Rate LimitedToo many requests. Back off with exponential delay.
500Server ErrorTransient upstream issue. Safe to retry.
503Upstream UnavailableDiscount pool capacity issue (lite tier). Retry or fall back to standard tier.

Pricing

Wholesale pricing is your account-specific discount × vendor list price. Discount rate depends on your contract — see the live numbers on the wholesale dashboard. The dashboard shows your current cost per 1M tokens (or per image / per second) for every model.

No platform markup on standard token billing. Volume tiers + monthly caps are configurable per key.