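# Hyper-Humanlike Interaction API
# Preface
# Web Interface Notes
- Connections must conform to the WebSocket protocol specification (RFC 6455).
- After a successful WebSocket handshake, the server closes the connection if the client sends no request data within 10 seconds.
- Full-duplex interaction is the default mode, and a single connection can last at most 30 minutes.
- After the server returns an error code, the client should establish a new connection.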
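The timing rules above can be tracked on the client with a small helper. This is only a sketch: the class name `ConnectionClock` and its methods are hypothetical, and the only values taken from this document are the 10-second and 30-minute limits.

```python
from dataclasses import dataclass

FIRST_FRAME_TIMEOUT_S = 10   # data must follow the handshake within 10 seconds
MAX_CONNECTION_S = 30 * 60   # one connection lasts at most 30 minutes


@dataclass
class ConnectionClock:
    """Hypothetical helper that tracks the time limits of one connection."""
    opened_at: float               # time of the successful handshake, in seconds
    first_frame_sent: bool = False

    def first_frame_overdue(self, now: float) -> bool:
        # True once the 10-second window has passed without any data being sent.
        return (not self.first_frame_sent
                and now - self.opened_at >= FIRST_FRAME_TIMEOUT_S)

    def should_reconnect(self, now: float, error_code: int = 0) -> bool:
        # Reconnect after any non-zero error code or once 30 minutes have elapsed.
        return error_code != 0 or now - self.opened_at >= MAX_CONNECTION_S
```

In practice `opened_at` and `now` would come from a monotonic clock such as `time.monotonic()`.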
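# 1. API Overview
The ultra-fast multimodal virtual human (极速多模模拟人) is an end-to-end voice interaction solution that accepts audio input and outputs synthesized audio/video streams.
# 1.1 Enabling Authorization
- Claim or purchase an authorization package on the application management page of the xx platform.
- Contact technical support to enable authorization.
- The DeepSeek version requires a separate authorization request.
# 2. API Integration
# 2.1 Request
# 2.1.1 Request URL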
wss://sparkos.xfym.cn/v1/openapi/chat
# 2.1.2 Authentication
# 2.1.3 Request Protocol Example
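{
"header": {
"app_id": "", // application ID
"uid": "", // unique user identifier, used to associate the user's interaction history
"status": 0, // state of the data sent by the client: 0 = start, 1 = intermediate, 2 = end
"stmid": "1", // interaction turn counter maintained by the client. In continuous_vad (simplex) mode, stmid MUST be incremented for every turn within one ws connection. In continuous (duplex) mode, stmid stays fixed.
"scene": "sos_app", // fixed default value
"interact_mode":"continuous" // interaction mode: continuous (duplex) or continuous_vad (simplex); default is continuous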
},
"parameter": {
"iat": {
"iat": {
"encoding": "utf8",
"compress": "raw",
"format": "json"
},
"vgap":50
},
"nlp": {
"nlp": {
"encoding": "utf8",
"compress": "raw",
"format": "json"
},
"new_session": "true",
"personal":"persona id",
"prompt":"prompt text, for example: You are Xiaoming, a primary school student who loves drawing"
},
"tts": {
"vcn": "x5_lingfeiyi_flow",
"res_id": "xxxx",
"res_gender":"",
"speed": 50,
"volume": 50,
"pitch": 50,
"tts": {
"encoding": "raw",
"sample_rate": 16000,
"channels": 1,
"bit_depth": 16,
"frame_size": 0
}
},
"avatar": {
"avatar_id": "",
"image": "",
"encoding": "",
"width": 512,
"height": 512
}
},
"payload": {
"audio": {
"status": 0,
"audio": "base64-encoded audio data",
"encoding": "raw",
"sample_rate": 16000,
"channels": 1,
"bit_depth": 16,
"frame_size": 0
}
}
}
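# 2.1.4 Request Parameters
# Header Parameters
Parameter | Type | Required | Description | Constraints/Notes |
---|---|---|---|---|
app_id | string | Yes | Application ID | maxLength: 15 |
uid | string | Yes | Authorized user ID | maxLength: 64; must be unique within the appid |
status | int | Yes | Session status | 0: first frame, 1: intermediate frame, 2: last frame |
stmid | string | Yes | Session ID / interaction turn | maxLength: 32. Must increase in simplex mode (continuous_vad) and stay fixed in duplex mode (continuous). Must be a string that parses as an integer, such as "0" or "1"; see the examples below |
scene | string | Yes | Scene mode | maxLength: 16; created on the AIUI/飞云 platform |
msc.lat | float | No | Latitude | [-90,90] |
msc.lng | float | No | Longitude | [-180,180] |
interact_mode | string | Yes | Interaction mode | continuous (full duplex) or continuous_vad (simplex) |
bot_id | string | No | ID of the bot to use | maxLength: 64. Example: sos_app_deepseekv3 |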
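The constraints in the header table can be checked on the client before a frame is sent. The sketch below is one possible reading of them: the function name `build_header` is hypothetical, only the required fields are covered, and "parses as an integer" is interpreted as a string of non-negative decimal digits, which is an assumption.

```python
def build_header(app_id: str, uid: str, status: int, stmid: str,
                 scene: str = "sos_app", interact_mode: str = "continuous") -> dict:
    """Validate the required header fields and return them as a dict."""
    max_lengths = {"app_id": 15, "uid": 64, "stmid": 32, "scene": 16}
    values = {"app_id": app_id, "uid": uid, "stmid": stmid, "scene": scene}
    for name, value in values.items():
        if not value or len(value) > max_lengths[name]:
            raise ValueError(f"{name} must be 1 to {max_lengths[name]} characters long")
    if status not in (0, 1, 2):
        raise ValueError("status must be 0, 1 or 2")
    if interact_mode not in ("continuous", "continuous_vad"):
        raise ValueError("interact_mode must be continuous or continuous_vad")
    # Assumption: an integer-parseable stmid means non-negative decimal digits.
    if not (stmid.isascii() and stmid.isdigit()):
        raise ValueError("stmid must be a string of decimal digits")
    return {"app_id": app_id, "uid": uid, "status": status, "stmid": stmid,
            "scene": scene, "interact_mode": interact_mode}
```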
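# Parameter Fields
# Speech Recognition (iat)
Parameter | Type | Required | Description | Constraints |
---|---|---|---|---|
iat.encoding | string | Yes | Result encoding | Only utf8 is supported |
iat.compress | string | Yes | Compression type | Only raw is supported |
iat.format | string | Yes | Result format | Only json is supported |
vgap | int | No | Silence threshold for sentence segmentation | Default 80 (800 ms); range 40-1000 (400-10000 ms) |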
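The vgap row implies that one vgap unit corresponds to 10 ms (80 maps to 800 ms). A minimal sketch of that conversion, with a hypothetical helper name `vgap_from_ms` and integer division as an assumed rounding rule:

```python
def vgap_from_ms(silence_ms: int) -> int:
    """Convert a silence threshold in milliseconds to a vgap value."""
    vgap = silence_ms // 10          # one vgap unit is 10 ms, per the table above
    if not 40 <= vgap <= 1000:
        raise ValueError("vgap must be between 40 and 1000 (400 ms to 10000 ms)")
    return vgap
```

For instance, the `"vgap":50` used in the request protocol example corresponds to a 500 ms threshold.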
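# NLP (nlp)
Parameter | Type | Required | Description | Constraints |
---|---|---|---|---|
nlp.encoding | string | Yes | Result encoding | Only utf8 is supported |
nlp.compress | string | Yes | Compression type | Only raw is supported |
nlp.format | string | Yes | Result format | Only json is supported |
new_session | string | No | Whether to start a new session | "true"/"global": clear the history; "false": keep the history |
prompt | string | No | Reply requirements | Sets the style, format, and other requirements for the model's replies |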
# Speech Synthesis (tts)
Parameter | Type | Required | Description | Constraints |
---|---|---|---|---|
tts.encoding | string | Yes | Output audio encoding | raw/lame/opus-wb/opus-swb; default raw. raw: PCM audio; lame: MP3 audio |
sample_rate | int | Yes | Sample rate | 16000/24000; default 16000 Hz |
channels | int | Yes | Number of channels | Default 1 (mono) |
bit_depth | int | Yes | Bit depth | Default 16 bit |
vcn | string | Yes | Voice | x5_lingxiaoyue_flow (聆小玥, female assistant); x5_lingfeiyi_flow (聆飞逸, male assistant) |
speed | int | No | Speaking rate | 0-100, default 50 |
volume | int | No | Volume | 0-100, default 50 |
pitch | int | No | Pitch | 0-100, default 50 |
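# Payload Fields
# Audio Data (audio)
Parameter | Type | Required | Description | Constraints |
---|---|---|---|---|
audio | string | Yes | Audio data | Base64-encoded; each frame covers ≤40 ms of audio |
status | int | Yes | Audio status | 0: first frame, 1: intermediate frame, 2: last frame |
frame_size | int | Yes | Compressed frame size | Pass 0 for uncompressed audio |
encoding | string | Yes | Encoding format | raw/opus; default raw |
sample_rate | int | Yes | Sample rate | 8000/16000; default 16000 Hz |
channels | int | Yes | Number of channels | 1/2; default 1 |
bit_depth | int | Yes | Bit depth | 8/16; default 16 bit |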
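With the default format (16000 Hz, 16 bit, mono), 40 ms of raw PCM is 16000 × 2 × 0.04 = 1280 bytes. The sketch below splits one utterance into frames of that size and fills in the `payload.audio` fields from the table. The function name `audio_payloads` is hypothetical, and the requirement of at least two frames is a simplification of this sketch, not a rule stated by the API.

```python
import base64
from typing import Iterator


def audio_payloads(pcm: bytes, frame_ms: int = 40, sample_rate: int = 16000,
                   channels: int = 1, bit_depth: int = 16) -> Iterator[dict]:
    """Yield payload.audio objects for one utterance of raw PCM."""
    frame_bytes = sample_rate * channels * (bit_depth // 8) * frame_ms // 1000
    chunks = [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]
    if len(chunks) < 2:
        raise ValueError("this sketch needs at least two frames of audio")
    for index, chunk in enumerate(chunks):
        if index == 0:
            status = 0                      # first frame
        elif index == len(chunks) - 1:
            status = 2                      # last frame
        else:
            status = 1                      # intermediate frame
        yield {
            "status": status,
            "audio": base64.b64encode(chunk).decode("ascii"),
            "encoding": "raw",
            "sample_rate": sample_rate,
            "channels": channels,
            "bit_depth": bit_depth,
            "frame_size": 0,                # 0 because the audio is uncompressed
        }
```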
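# 2.1.5 Request Examples
Simplex mode (continuous_vad): the client controls when audio upload starts. It is mainly intended for noisy environments, where it prevents noise from being taken as input.
- Every turn MUST use a new header.stmid value that has not been used before, for example "0", "1", "2", and so on.
- header.status: session status. The first packet has header.status 0, all subsequent packets have 1, and 2 is sent to end the session.
- payload.audio.status: audio status. Within each turn the audio status runs 0, 1, ..., 1, 2.
- The first packet of each turn carries all fields. Subsequent packets may be simplified (the parameter section can be omitted, and the header needs only a few key fields).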
Establish the ws connection
# [First turn: header.stmid="0"]
# First packet of the first turn: send all fields (header.status=0, payload.audio.status=0)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":0,"stmid":"0","scene":"ai-personality-1","os_sys":"android","interact_mode":"continuous_vad","pers_param":"{\"appid\":\"879230fc\",\"uid\":\"thzhang4\"}"},"parameter":{"iat":{"vgap":60,"dwa":"wpgs","iat":{"encoding":"utf8","compress":"raw","format":"json"},"eos":"800","domain":"sms"},"nlp":{"nlp":{"encoding":"utf8","compress":"raw","format":"json"},"new_session":"false"},"tts":{"vcn":"x5_lingxiaoyue_flow","speed":50,"volume":50,"pitch":50,"tts":{"encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}},"payload":{"audio":{"status":0,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Subsequent packets (header.status=1, payload.audio.status=1)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
...
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Last packet of the first turn (header.status=1, payload.audio.status=2)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":2,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# [Second turn: update header.stmid="1"]
# First packet of the second turn: send all fields (header.status=1, payload.audio.status=0)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"1","scene":"ai-personality-1","os_sys":"android","interact_mode":"continuous_vad","pers_param":"{\"appid\":\"879230fc\",\"uid\":\"thzhang4\"}"},"parameter":{"iat":{"vgap":60,"dwa":"wpgs","iat":{"encoding":"utf8","compress":"raw","format":"json"},"eos":"800","domain":"sms"},"nlp":{"nlp":{"encoding":"utf8","compress":"raw","format":"json"},"new_session":"false"},"tts":{"vcn":"x5_lingxiaoyue_flow","speed":50,"volume":50,"pitch":50,"tts":{"encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}},"payload":{"audio":{"status":0,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Subsequent packets (header.status=1, payload.audio.status=1)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"1","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
...
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"1","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Last packet of the second turn (header.status=1, payload.audio.status=2)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"1","scene":"ai-personality-1"},"payload":{"audio":{"status":2,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# [Third turn: update header.stmid="2"]
# First packet of the third turn: send all fields (header.status=1, payload.audio.status=0)
# Subsequent packets (header.status=1, payload.audio.status=1)
# Last packet of the third turn (header.status=1, payload.audio.status=2)
# End the session
header.status=2
# Close the ws connection
ws.close()
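The simplex rules above (a new stmid per turn, header.status 0 only on the very first packet of the connection, and payload.audio.status running 0, 1, ..., 2 inside each turn) can be sketched as a small generator. The name `simplex_schedule` is hypothetical; the sketch only produces the status values and leaves out the final header.status=2 that ends the session.

```python
from typing import Iterator, List, Tuple


def simplex_schedule(frames_per_turn: List[int]) -> Iterator[Tuple[str, int, int]]:
    """Yield (stmid, header.status, payload.audio.status) for each audio packet."""
    first_packet = True
    for turn, frame_count in enumerate(frames_per_turn):
        if frame_count < 2:
            raise ValueError("each turn needs at least two packets in this sketch")
        for index in range(frame_count):
            # Only the very first packet of the connection carries header.status 0.
            header_status = 0 if first_packet else 1
            first_packet = False
            if index == 0:
                audio_status = 0
            elif index == frame_count - 1:
                audio_status = 2
            else:
                audio_status = 1
            # stmid is the turn counter, sent as a string: "0", "1", "2", ...
            yield str(turn), header_status, audio_status
```

Calling `list(simplex_schedule([3, 2]))` reproduces the status pattern of the first two turns in the walkthrough above, shortened to three and two packets.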
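Duplex mode (continuous): audio is uploaded continuously, as in a voice call.
- The value of header.stmid stays FIXED.
- header.status: session status. The first packet has header.status 0, all subsequent packets have 1, and the final packet has 2.
- payload.audio.status: audio status. The first packet has payload.audio.status 0, all subsequent packets have 1, and the final packet has 2.
- The first packet carries all fields. Subsequent packets may be simplified (the parameter section can be omitted, and the header needs only a few key fields).
- Send audio CONTINUOUSLY: keep sending audio even when the user is not speaking. The server determines on its own whether the user is speaking.
Establish the ws connection
# First packet: send all fields (header.status=0, payload.audio.status=0)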
{"header":{"app_id":"879230fc","uid":"thzhang4","status":0,"stmid":"0","scene":"ai-personality-1","os_sys":"android","interact_mode":"continuous","test":"孙悟空","pers_param":"{\"appid\":\"879230fc\",\"uid\":\"thzhang4\"}"},"parameter":{"iat":{"vgap":60,"dwa":"wpgs","iat":{"encoding":"utf8","compress":"raw","format":"json"}},"nlp":{"nlp":{"encoding":"utf8","compress":"raw","format":"json"},"new_session":"false"},"tts":{"vcn":"x5_lingxiaoyue_flow","speed":50,"volume":50,"pitch":50,"tts":{"encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}},"payload":{"audio":{"status":0,"audio":"base64的音频","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Subsequent packets (header.status=1, payload.audio.status=1)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
...
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
...
# Keep sending audio even when the user is not speaking
...
# End the session (header.status=2, payload.audio.status=2)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":2,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":2,"audio":"base64的音频数据","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Close the ws connection
ws.close()
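In duplex mode the stream must not pause, so a client needs something to send even when the capture layer hands it nothing. The sketch below builds one simplified follow-up packet with a fixed stmid. The name `duplex_packet` is hypothetical, and substituting zero-filled PCM for missing audio is an assumption of this sketch rather than documented behaviour.

```python
import base64
from typing import Optional

FRAME_BYTES = 16000 * 2 * 40 // 1000   # 40 ms of 16 kHz, 16-bit, mono PCM (1280 bytes)


def duplex_packet(header: dict, chunk: Optional[bytes], status: int = 1) -> dict:
    """Build one follow-up packet; falls back to silence when no audio was captured."""
    # Assumption: zero-filled PCM stands in for silence.
    pcm = chunk if chunk else bytes(FRAME_BYTES)
    return {
        # The caller's header (including its fixed stmid) is copied, not modified.
        "header": {**header, "status": status},
        "payload": {"audio": {
            "status": status,
            "audio": base64.b64encode(pcm).decode("ascii"),
            "encoding": "raw",
            "sample_rate": 16000,
            "channels": 1,
            "bit_depth": 16,
        }},
    }
```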
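# 3. Response
# 3.1 Response Examples
Response to a successfully sent first frame // may be omitted when there is no error; typically used to return error codes for parameter, connection, or authentication errors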
{
"header": {
"code": 0,
"message": "success",
"sid": "xgo00010205@dx192743899eb0001822-audio-1",
"status": 0,
"stmid": "1"
}
}
Example of a normal result
{
"header": {
"code": 0,
"message": "success",
"sid": "xgo00010205@dx192743899eb0001822-audio-1",
"status": 0,
"stmid":"1"
},
"payload": {
"event": {
"compress": "",
"encoding": "",
"format": "",
"seq": 0,
"status": 0,
"text": "eyJ0eXBlIjoiVmFkIiwiZGF0YSI6IiIsImtleSI6IkJvcyIsImRlc2MiOnt9fQ=="
},
"iat": {
"compress": "raw",
"encoding": "utf8",
"format": "json",
"seq": 1,
"status": 2,
"text": "eyJ0ZXh0Ijp7InNuIjoxLCJscyI6ZmFsc2UsImJnIjowLCJyZyI6bnVsbCwiZWQiOjAsInBncyI6IiIsInJzdCI6IiIsInNpZ24iOiIiLCJ3cyI6W3siYmciOjAsImN3IjpbeyJzYyI6MCwidyI6IuS9oCIsInBoIjoiIn1dfSx7ImJnIjowLCJjdyI6W3sic2MiOjAsInciOiLlj6siLCJwaCI6IiJ9XX0seyJiZyI6MCwiY3ciOlt7InNjIjowLCJ3Ijoi5LuA5LmIIiwicGgiOiIifV19LHsiYmciOjAsImN3IjpbeyJzYyI6MCwidyI6IuWQjeWtlyIsInBoIjoiIn1dfV19fQ=="
},
"nlp": {
"compress": "",
"encoding": "",
"format": "",
"seq": 0,
"status": 0,
"text": "5bGV56S65LqG5LiA5Liq"
},
"tts": {
"compress": "",
"encoding": "",
"format": "",
"seq": 0,
"status": 0,
"audio": "Base64 audio data"
},
"cbm_vms": {
"compress": "",
"encoding": "",
"format": "",
"seq": 0,
"status": 0,
"text": "Base64 vms connect/start/stop info"
}
}
}
Error result
{
"header": {
"code": 10110,
"message": "server licence error",
"sid": "xgo00010205@dx192743899eb0001822-audio-1",
"status": 2,
"stmid": "1"
}
}
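A client can separate the two cases above by checking `header.code` first. A minimal sketch, in which the names `ApiError` and `parse_response` are hypothetical:

```python
import json


class ApiError(Exception):
    """Hypothetical exception carrying the server's code, message and sid."""

    def __init__(self, code: int, message: str, sid: str):
        super().__init__(f"{code}: {message} (sid={sid})")
        self.code = code
        self.sid = sid


def parse_response(raw: str) -> dict:
    """Return the payload dict of a response frame, or raise ApiError."""
    msg = json.loads(raw)
    header = msg["header"]
    if header["code"] != 0:
        # A non-zero code means failure; the client should then reconnect.
        raise ApiError(header["code"], header.get("message", ""), header.get("sid", ""))
    # A first-frame acknowledgement has no payload, so default to an empty dict.
    return msg.get("payload", {})
```

Keeping the sid on the exception makes it easier to quote when reporting a failed session to technical support.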
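# 3.2 Response Parameters
# 3.2.1 Protocol Fields
# Fields
Field | Type | Description | Notes |
---|---|---|---|
header | object | - | |
header.code | int | Service error code | 0 means success; any other value means failure |
header.sid | string | Session sid | |
header.status | int | Session status | Example sequence: 0, 1, 1, ..., 1, 2 |
header.stmid | string | Session ID | Unique ID of each session within a long-lived connection; after the client sends a request, the cloud returns it iteratively (for example 0-1→0-2→0-3) |
payload | object | Data field | Contains the detailed business data |
payload.event | object | Event data | |
payload.event.text | base64 | Data content | The cloud recognizer detects break points in the user's audio and sends the corresponding events. There are currently three event types: speech started, speech ended, and no speech |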
payload.event.status | int | Data status | One of 0, 1, 2: 0 is the first-frame result, 1 an intermediate frame, 2 the last frame. |
payload.event.encoding | string | Data encoding format | -- |
payload.event.seq | int | Sequence number of the data segment | -- |
payload.event.format | string | Format of the data content | -- |
payload.event.compress | string | Compression method of the data content | -- |
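payload.iat | object | Recognition result of the user's speech input | See the parameter examples below |
payload.nlp | object | Text of the model's reply | See the parameter examples below |
payload.tts | object | Audio synthesized by the model | See the parameter examples below |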
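# 3.2.2 Data Fields in Detail
Event data
The cloud recognizer detects break points in the user's audio and sends the corresponding events. There are currently three event types: speech started, speech ended, and no speech.
Event results are carried in payload.event.text.
The three events are described below:
// The results below show the text content after base64 decoding
Bos event: returned when speech is detected in the audio
{"type":"Vad","data":"","key":"Bos","desc":{}}
Eos event: returned when a clause in the audio is detected to have ended
{"type":"Vad","data":"","key":"Eos","desc":{}}
Silence event: no speech is detected in the audio. This is the final end event; end the session when you receive it. It is usually triggered when silence lasts too long or when status=2 is uploaded.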
{"type":"Vad","data":"","key":"Silence","desc":{}}
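Decoding an event takes one base64 step and one JSON step. A minimal sketch, with a hypothetical function name `decode_event_key`; the sample value is the `payload.event.text` string from the normal result example in section 3.1.

```python
import base64
import json


def decode_event_key(text_b64: str) -> str:
    """Return the event key ("Bos", "Eos" or "Silence") from payload.event.text."""
    event = json.loads(base64.b64decode(text_b64).decode("utf-8"))
    if event.get("type") != "Vad":
        raise ValueError(f"unexpected event type: {event.get('type')!r}")
    return event["key"]


# payload.event.text taken from the response example in section 3.1
sample = "eyJ0eXBlIjoiVmFkIiwiZGF0YSI6IiIsImtleSI6IkJvcyIsImRlc2MiOnt9fQ=="
print(decode_event_key(sample))
```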
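Recognition data
Recognition results are in the payload.iat.text field (note that the example shows streaming results; streaming recognition is currently the default).
// The results below show the text content after base64 decoding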
{"sn":1,"ls":false,"bg":0,"ed":0,"pgs":"apd","ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]}]}
{"sn":2,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,1],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"1385"}]}]}
{"sn":3,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,2],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856"}]}]}
{"sn":4,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,3],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"138569"}]}]}
{"sn":5,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,4],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"1385690"}]}]}
{"sn":6,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,5],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901"}]}]}
{"sn":7,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,6],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"138569012"}]}]}
{"sn":8,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,7],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"1385690123"}]}]}
{"sn":9,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,8],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]}]}
{"sn":10,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,9],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]}]}
{"sn":11,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,10],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"重"}]},{"bg":0,"cw":[{"sc":0.00,"w":"六"}]}]}
{"sn":12,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,11],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"六"}]}]}
{"sn":13,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,12],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"六"}]},{"bg":0,"cw":[{"sc":0.00,"w":"千"}]}]}
{"sn":14,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,13],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6800"}]}]}
{"sn":15,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,14],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6880"}]}]}
{"sn":16,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,15],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]}]}
{"sn":17,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,16],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"点"}]}]}
{"sn":18,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,17],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"."}]},{"bg":0,"cw":[{"sc":0.00,"w":"8"}]}]}
{"sn":19,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,18],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"."}]},{"bg":0,"cw":[{"sc":0.00,"w":"8"}]},{"bg":0,"cw":[{"sc":0.00,"w":"元"}]}]}
{"sn":20,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,19],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"."}]},{"bg":0,"cw":[{"sc":0.00,"w":"8"}]},{"bg":0,"cw":[{"sc":0.00,"w":"元"}]},{"bg":0,"cw":[{"sc":0.00,"w":"话费"}]}]}
{"sn":21,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,20],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"."}]},{"bg":0,"cw":[{"sc":0.00,"w":"8"}]},{"bg":0,"cw":[{"sc":0.00,"w":"元"}]},{"bg":0,"cw":[{"sc":0.00,"w":"话费"}]}]}
{"sn":22,"ls":true,"bg":0,"ed":0,"pgs":"apd","ws":[{"bg":0,"cw":[{"sc":0.00,"w":"。"}]}]}
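Field meanings:
JSON field | Type | Description |
---|---|---|
pgs | string | Present when wpgs is enabled. "apd" means this segment is appended to the preceding final results; "rpl" means it replaces part of the preceding results, with the range given by the rg field |
rg | array | Replacement range; present when wpgs is enabled. A value of [2,5] means the results from the 2nd through the 5th response are to be replaced |
**The backend guarantees that returned pgs results never overlap or nest. Applying the field descriptions to the example above yields the final dictation result "给13856901234充6888.8元话费。"**
Parsing rules:
- Plain results: each recognition result is independent and can be parsed directly.
- PGS results: all results between the Bos and Eos events must be parsed and concatenated; the final result is usually marked with rst=rlt.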
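The apd/rpl rules can be applied by keeping each decoded result under its `sn` and dropping the replaced range whenever an "rpl" result arrives. The sketch below assumes the input is the already base64-decoded and JSON-parsed `payload.iat.text`, and takes the first candidate in each `cw` list; the class name `PgsAssembler` is hypothetical.

```python
class PgsAssembler:
    """Accumulates wpgs results keyed by sn and applies "rpl" replacements."""

    def __init__(self) -> None:
        self._segments: dict = {}   # sn -> text of that result

    def feed(self, result: dict) -> str:
        """Add one decoded result and return the transcript so far."""
        # Take the first candidate word of every ws entry.
        text = "".join(ws["cw"][0]["w"] for ws in result["ws"])
        if result.get("pgs") == "rpl":
            first, last = result["rg"]
            # Drop every earlier result inside the replaced range [first, last].
            for sn in range(first, last + 1):
                self._segments.pop(sn, None)
        self._segments[result["sn"]] = text
        return "".join(self._segments[sn] for sn in sorted(self._segments))
```

Feeding the 22 example results in order would leave only sn 21 and sn 22 in the buffer, which concatenate to the final dictation result quoted above.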
Reply text data
The model's reply is in payload.nlp.text and is delivered as a stream.
// The results below show the text content after base64 decoding
今天
天气晴朗
,气温 10~12℃,
东北风微风
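Because the reply arrives in pieces, the client has to decode each `payload.nlp.text` value and append it to what it already has. A minimal sketch, assuming each piece is base64-encoded UTF-8 text as the decoded sample above suggests; the class name `NlpBuffer` is hypothetical.

```python
import base64


class NlpBuffer:
    """Collects streamed payload.nlp.text pieces into one reply string."""

    def __init__(self) -> None:
        self._parts: list = []

    def feed(self, text_b64: str) -> str:
        """Decode one piece and return the reply accumulated so far."""
        self._parts.append(base64.b64decode(text_b64).decode("utf-8"))
        return "".join(self._parts)
```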
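Synthesized audio data
Synthesized audio is in payload.tts.audio. The client must base64-decode it and can then play it according to the audio parameters encoding, sample_rate, and bit_depth.
An example of synthesized audio data follows. The audio is large, so it is delivered as a stream: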
{
"payload": {
"tts": {
"audio": "Base64 data ",
"bit_depth": 16,
"channels": 1,
"encoding": "raw",
"frame_size": 0,
"sample_rate": 24000,
"seq": 1,
"status": 0
}
}
}
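For raw PCM output, the streamed chunks can be joined and wrapped in a WAV container using the audio parameters that accompany them. The sketch below uses Python's standard `wave` module; the function name `tts_chunks_to_wav` is hypothetical, and ordering the chunks by `seq` is an assumption about how that field is meant to be used.

```python
import base64
import io
import wave


def tts_chunks_to_wav(chunks: list) -> bytes:
    """Combine raw-PCM payload.tts objects into one in-memory WAV file."""
    first = chunks[0]
    if first.get("encoding", "raw") != "raw":
        raise ValueError("this sketch only handles raw PCM")
    # Assumption: seq gives the playback order of the chunks.
    ordered = sorted(chunks, key=lambda c: c["seq"])
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(first["channels"])
        wav.setsampwidth(first["bit_depth"] // 8)   # bytes per sample
        wav.setframerate(first["sample_rate"])
        for chunk in ordered:
            wav.writeframes(base64.b64decode(chunk["audio"]))
    return buffer.getvalue()
```

Compressed encodings such as lame or opus would need a decoder instead and are outside this sketch.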
# 4. Sample Code