Speech-to-Text

GLM ASR 2512 API

GLM-ASR-2512 by Z.ai — speech-to-text transcription with a low character error rate (0.0717). POST /v1/run with `inputs.audio` set to a public audio URL (wav/mp3/m4a/ogg/flac/webm) and get back the transcribed `text`. Stereo files are auto-downmixed to mono. Token-billed on audio (input) + text (output) tokens — typically a fraction of a cent per minute of speech.

model: glm/glm-asr-2512
Wholesale key required. This endpoint accepts wholesale keys only. Have access? Get your key on the wholesale dashboard. Not on wholesale yet? Email admin@skillboss.co with your use case — we'll get you provisioned.

Quickstart

Speech-to-text uses /v1/run. Pass `inputs.audio` as a public URL to the audio file (wav/mp3/m4a/ogg/flac/webm) — stereo is auto-downmixed to mono. The response returns the transcribed `text`. Token-billed on audio (input) + text (output) tokens.

curl --max-time 120 https://api.skillboss.co/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_WHOLESALE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm/glm-asr-2512",
    "inputs": {
      "audio": "https://interactive-examples.mdn.mozilla.net/media/examples/t-rex-roar.mp3"
    }
  }'

Your first 200 response is the fastest way to confirm setup. From there, swap in your own audio URL and set the optional parameters listed below.

Authentication

Every request must include your wholesale key. For /v1/run, send it as a Bearer token in the Authorization header:

Authorization: Bearer $SKILLBOSS_WHOLESALE_KEY

Treat the wholesale key like a password — never commit it to source control or ship it in client-side bundles. Rotate from the wholesale dashboard if exposed. Standard (non-wholesale) console keys are rejected at the gateway with 401.
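One way to follow the advice above is to read the key from the environment at runtime instead of hard-coding it. The sketch below assumes the `SKILLBOSS_WHOLESALE_KEY` variable used elsewhere in this guide; `auth_headers` is a hypothetical helper name, not part of any SkillBoss SDK.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from the SKILLBOSS_WHOLESALE_KEY variable.

    Illustrative helper only; `env` is injectable so it can be tested
    without touching the real environment.
    """
    key = env.get("SKILLBOSS_WHOLESALE_KEY", "").strip()
    if not key:
        # Fail fast locally instead of sending a request that would get a 401.
        raise RuntimeError("SKILLBOSS_WHOLESALE_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as `headers=` to `requests.post`.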

Code examples

Python
import os

import requests

resp = requests.post(
    "https://api.skillboss.co/v1/run",
    headers={"Authorization": f"Bearer {os.environ['SKILLBOSS_WHOLESALE_KEY']}"},
    json={
        "model": "glm/glm-asr-2512",
        "inputs": {
            "audio": "https://interactive-examples.mdn.mozilla.net/media/examples/t-rex-roar.mp3",
        },
    },
    timeout=120,
)
resp.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
data = resp.json()
print(data["text"])  # transcribed text

JavaScript / TypeScript
const resp = await fetch("https://api.skillboss.co/v1/run", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.SKILLBOSS_WHOLESALE_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "glm/glm-asr-2512",
    inputs: {
      audio: "https://interactive-examples.mdn.mozilla.net/media/examples/t-rex-roar.mp3",
    },
  }),
});
const data = await resp.json();
console.log(data.text);  // transcribed text

Parameters

Pass the model identifier as a top-level model field, and the audio source under an inputs object. Set inputs.audio to a public URL of the audio file; the response returns the transcribed text.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | required | Model id. Use glm/glm-asr-2512. |
| inputs.audio | string | required | Public URL of the audio file to transcribe. Supports wav, mp3, m4a, ogg, flac, webm. Stereo files are automatically downmixed to mono. Token-billed on audio (input) + transcribed-text (output) tokens. |
| inputs.language | string | optional | ISO language hint, e.g. zh or en. Auto-detected when omitted. |
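The parameters above can be assembled into a request body with a small helper. This is a sketch only: `build_payload` is a hypothetical name, and the URL check is a local sanity check added for illustration, not documented API behavior.

```python
def build_payload(audio_url, language=None):
    """Assemble the JSON body for POST /v1/run (illustrative helper)."""
    if not audio_url.startswith(("http://", "https://")):
        # The guide asks for a public URL, so reject local paths early.
        raise ValueError("inputs.audio must be a public http(s) URL")
    inputs = {"audio": audio_url}
    if language is not None:
        # Optional ISO hint such as "zh" or "en"; leave it out to auto-detect.
        inputs["language"] = language
    return {"model": "glm/glm-asr-2512", "inputs": inputs}
```

For example, `build_payload("https://example.com/call.mp3", language="en")` produces a body that can be passed as `json=` to `requests.post`; the example.com URL is a placeholder.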

Endpoint

| Field | Value |
| --- | --- |
| Method | POST |
| URL | https://api.skillboss.co/v1/run |
| Auth header | Authorization: Bearer $SKILLBOSS_WHOLESALE_KEY |
| Content-Type | application/json |
| Streaming | No SSE streaming; transcription returns the complete text in a single response. |

Errors

The API uses standard HTTP status codes:

| Status | Name | Description |
| --- | --- | --- |
| 200 | OK | Request succeeded. |
| 400 | Bad Request | Invalid model, missing required field, or malformed JSON. |
| 401 | Unauthorized | Missing or invalid wholesale key. Non-wholesale console keys are rejected here. |
| 402 | Insufficient Credits | Wholesale balance too low; top up on the wholesale dashboard. |
| 429 | Rate Limited | Too many requests. Back off with exponential delay. |
| 500 | Server Error | Transient upstream issue. Safe to retry. |
| 503 | Upstream Unavailable | Discount pool capacity issue (lite tier). Retry or fall back to standard tier. |
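The retry guidance for 429, 500, and 503 can be sketched as a small wrapper with exponential backoff and jitter. The name `call_with_retry`, the attempt count, and the delay values are illustrative assumptions, not values published by SkillBoss.

```python
import random
import time

RETRYABLE = {429, 500, 503}  # statuses the error list above describes as retryable


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or attempts run out.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute, for example a lambda wrapping requests.post.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        resp = send()
        last_try = attempt == max_attempts - 1
        if resp.status_code not in RETRYABLE or last_try:
            return resp
        # Exponential backoff (about 1s, 2s, 4s, ...) plus up to 0.5s of jitter.
        sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

A call might look like `call_with_retry(lambda: requests.post(url, headers=headers, json=payload, timeout=120))`. Errors such as 400, 401, and 402 are returned immediately, since retrying them without changing the request or the account will not help.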

Pricing

Wholesale pricing is your account-specific discount × vendor list price. Discount rate depends on your contract — see the live numbers on the wholesale dashboard. The dashboard shows your current cost per 1M tokens (or per image / per second) for every model.

No platform markup on standard token billing. Volume tiers + monthly caps are configurable per key.
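As a worked illustration of the "discount × vendor list price" formula, the sketch below estimates the cost of one request. It assumes the discount is a multiplier applied to list price, as the formula is written, and that audio and text tokens may carry separate per-1M list prices. Every number passed in is a placeholder; real prices and discount rates come from the wholesale dashboard.

```python
def estimate_cost(audio_tokens, text_tokens, list_in_per_1m, list_out_per_1m, discount):
    """Estimate one request's cost as discount x vendor list price.

    All arguments are hypothetical inputs: take the real per-1M-token
    prices and your discount multiplier from the wholesale dashboard.
    """
    list_cost = (
        (audio_tokens / 1_000_000) * list_in_per_1m
        + (text_tokens / 1_000_000) * list_out_per_1m
    )
    return discount * list_cost
```

The result is in whatever currency the list prices use.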
