Speaker Diarization
Upload a multi-speaker audio file and the service returns timestamped, speaker-labeled segments. It suits meeting minutes, podcast transcription, call QA, and interview editing: a single request tells you who spoke and when.
Limits & Constraints
Exceeding any limit returns 413 Payload Too Large or 422 Validation Error;
see §05 Error Codes.
Note: audio is resampled internally to 16 kHz mono, so the sample rate, channel count, and bit depth
of the input are unrestricted. The model does not perform ASR, so the text field
is always an empty string.
Performance
Try It
Upload an audio file (or drag it into the dashed box). Once the result comes back, click the timeline or the segment list to jump to that point in playback.
Request & Response
Start with the request fields and response structure below, then go to the language tabs at the bottom for ready-to-run snippets. If you need diarization while the audio is still being spoken, see Speaker Diarization · Stream.
Upload a single audio file and receive a list of speaker segments (segments[]) sorted by start time in ascending order.
The request body is multipart/form-data. This endpoint does not perform ASR;
the text field is always an empty string.
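As a rough illustration of the multipart/form-data request described above, the sketch below builds the body with only the Python standard library. The endpoint URL (`API_URL`) and the form field name (`FILE_FIELD`) are not given in this section, so both are placeholders to be replaced with the real values; the helper names are likewise illustrative.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Assumptions: this section does not state the endpoint URL or the form
# field name, so both values below are hypothetical placeholders.
API_URL = "https://example.invalid/diarize"
FILE_FIELD = "file"


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def diarize(path: str, token: Optional[str] = None, url: str = API_URL) -> List[Dict]:
    """POST the audio file and return segments[] from the JSON response."""
    audio = Path(path)
    body, content_type = build_multipart(FILE_FIELD, audio.name, audio.read_bytes())
    headers = {"Content-Type": content_type}
    if token:  # only needed when the server runs with AUTH_ENABLED=True
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.load(response)["segments"]
```

A call would then look like `diarize("meeting.wav", token="...")`, where the token comes from POST /auth/login when authentication is enabled.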
Request fields
Audio file: wav / mp3 / flac / m4a / ogg. Any sample rate or channel layout is accepted; audio is resampled internally to 16 kHz mono.
Authorization header: required when AUTH_ENABLED=True. Format: Bearer <token>. Obtain the token via POST /auth/login.
Response
// segments[] is sorted by start_time in ascending order; text is always an empty string (this endpoint does not perform ASR)
{
  "segments": [
    { "start_time": 0.00, "end_time": 4.71, "speaker_id": 0, "text": "" },
    { "start_time": 5.21, "end_time": 11.22, "speaker_id": 1, "text": "" }
  ]
}
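One way to consume this response is to total the talk time per speaker, using end_time - start_time as the duration of each segment. The sketch below runs on the sample payload shown above; the function name `talk_time` is illustrative and not part of the service.

```python
import json
from collections import defaultdict
from typing import Dict, List

# The sample response from this section, without the leading comment line.
SAMPLE = """
{
  "segments": [
    { "start_time": 0.00, "end_time": 4.71, "speaker_id": 0, "text": "" },
    { "start_time": 5.21, "end_time": 11.22, "speaker_id": 1, "text": "" }
  ]
}
"""


def talk_time(segments: List[dict]) -> Dict[int, float]:
    """Sum segment durations (end_time - start_time) per speaker_id."""
    totals: Dict[int, float] = defaultdict(float)
    for seg in segments:
        totals[seg["speaker_id"]] += seg["end_time"] - seg["start_time"]
    return dict(totals)


segments = json.loads(SAMPLE)["segments"]
for speaker, seconds in sorted(talk_time(segments).items()):
    print(f"speaker {speaker}: {seconds:.2f} s")
# prints:
# speaker 0: 4.71 s
# speaker 1: 6.01 s
```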
Response fields
The duration of a segment is end_time - start_time.
Code snippets
Error Codes
A successful request returns 200 with segments[]. Error responses are
application/json in the form { "detail": "..." }.
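A client can turn this contract into one code path: return segments[] on 200 and otherwise raise an exception that carries the status code and the detail message. The sketch below assumes only the two response shapes described above; `DiarizationError` and `parse_response` are illustrative names, not part of the service.

```python
import json
from typing import List


class DiarizationError(Exception):
    """Raised for any non-200 response; keeps the status and detail text."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def parse_response(status: int, body: bytes) -> List[dict]:
    """Return segments[] on 200; raise DiarizationError otherwise."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:
        # Defensive: an intermediary may return a non-JSON body.
        raise DiarizationError(status, body.decode("utf-8", errors="replace"))
    if status == 200:
        return payload["segments"]
    # str() normalizes the value in case detail is not a plain string.
    raise DiarizationError(status, str(payload.get("detail", "")))
```

Callers can then branch on `exc.status`, for example to treat 413 and 422 as non-retryable input errors.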
The Authorization header is missing or the token does not match.
Use GET /readiness to probe service status.
AI Integration: Copy the Prompt
Paste the prompt below into Claude / Cursor / ChatGPT and let the AI write your integration code. The prompt covers the interface contract, authentication, retries, and error handling.
Integrate fast with AI
Paste it into the chat and add "implement in my stack". It already covers these edge cases: large-file chunking advice, rate-limit backoff, missing auth, and the model-loading state.
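For readers who want to handle the backoff case by hand, here is a minimal sketch of capped exponential backoff. This section does not say which responses are safe to retry, so that decision is left to a caller-supplied predicate (`is_retryable`, a hypothetical name); the delays and retry count are also illustrative defaults, not values published by the service.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int = 5, base: float = 1.0, cap: float = 30.0) -> Iterator[float]:
    """Yield delays base, 2*base, 4*base, ... capped at `cap` seconds."""
    for attempt in range(retries):
        yield min(cap, base * 2 ** attempt)


def call_with_retry(
    call: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`; on a retryable error, wait and try again (retries + 1 attempts in total)."""
    for delay in backoff_delays(retries):
        try:
            return call()
        except Exception as exc:
            if not is_retryable(exc):
                raise
            sleep(delay)
    return call()  # final attempt; any error propagates to the caller
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.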