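# Hyper-Anthropomorphic Interaction (超拟人交互) API

# Preface

# Web Interface Notes

  1. Requests must conform to the WebSocket protocol specification (RFC 6455).
  2. If the client sends no request data within 10 seconds of a successful WebSocket handshake, the server closes the connection.
  3. Full-duplex interaction is the default mode; a single connection can last at most 30 minutes.
  4. After the server returns an error code, the client should re-establish the connection.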

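# 1. API Overview

The ultra-fast multimodal virtual human (极速多模模拟人) is an end-to-end voice interaction solution. It accepts audio input and outputs synthesized audio and video streams.

# 1.1 Enabling Authorization

  1. Claim or purchase an authorization package on the application management page of the xx platform.
  2. Contact technical support to enable the authorization.
  3. The DeepSeek version requires a separate authorization request.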

# 2. API Integration

# 2.1 Request Description

# 2.1.1 Request URL

wss://sparkos.xfym.cn/v1/openapi/chat

# 2.1.2 Authentication

See the general authentication guide.

# 2.1.3 Request Protocol Example

{
    "header": {
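        "app_id": "", // application ID
        "uid": "", // unique user identifier, used to associate the user's interaction history
        "status": 0, // state of the data sent by the client: 0 = start, 1 = intermediate, 2 = end
        "stmid": "1", // interaction round counter maintained by the client. In continuous_vad (simplex) mode, stmid MUST be incremented for every round within one WebSocket connection. In continuous (duplex) mode, stmid stays fixed.
        "scene": "sos_app", // fixed default value
        "interact_mode":"continuous" // interaction mode: continuous (duplex) or continuous_vad (simplex); defaults to continuous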
    },
    "parameter": {
        "iat": {
            "iat": {
                "encoding": "utf8",
                "compress": "raw",
                "format": "json"
            },
            "vgap":50
        },
        "nlp": {
            "nlp": {
                "encoding": "utf8",
                "compress": "raw",
                "format": "json"
            },
            "new_session": "true",
            "personal":"persona ID",
            "prompt":"prompt text, e.g.: You are Xiao Ming, a primary school student who loves drawing"
        },
        "tts": {
            "vcn": "x5_lingfeiyi_flow",
            "res_id": "xxxx",
            "res_gender":"",
            "speed": 50,
            "volume": 50,
            "pitch": 50,
            "tts": {
                "encoding": "raw",
                "sample_rate": 16000,
                "channels": 1,
                "bit_depth": 16,
                "frame_size": 0
            }
        },
        "avatar": {
            "avatar_id": "",
            "image": "",
            "encoding": "",
            "width": 512,
            "height": 512
        }
    },
    "payload": {
        "audio": {
            "status": 0,
            "audio": "base64-encoded audio data",
            "encoding": "raw",
            "sample_rate": 16000,
            "channels": 1,
            "bit_depth": 16,
            "frame_size": 0
        }
    }
}
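
The protocol example above is annotated with `//` comments, which a strict JSON parser rejects, so a client normally builds the message as a native structure and serializes it. The following is a minimal sketch of that step, not the official client: the function name `build_first_frame` and the sample argument values are hypothetical, the field values are copied from the example, and the `avatar` section is left out for brevity.

```python
import base64
import json


def build_first_frame(app_id: str, uid: str, pcm: bytes, stmid: str = "0",
                      interact_mode: str = "continuous") -> str:
    """Serialize a first frame (header.status=0, payload.audio.status=0)."""
    text_format = {"encoding": "utf8", "compress": "raw", "format": "json"}
    audio_format = {"encoding": "raw", "sample_rate": 16000, "channels": 1,
                    "bit_depth": 16, "frame_size": 0}
    message = {
        "header": {
            "app_id": app_id,
            "uid": uid,
            "status": 0,
            "stmid": stmid,
            "scene": "sos_app",
            "interact_mode": interact_mode,
        },
        "parameter": {
            "iat": {"iat": dict(text_format), "vgap": 50},
            "nlp": {"nlp": dict(text_format), "new_session": "true"},
            "tts": {"vcn": "x5_lingfeiyi_flow", "speed": 50, "volume": 50,
                    "pitch": 50, "tts": dict(audio_format)},
        },
        "payload": {
            # Raw audio bytes travel as a base64 string inside the JSON body.
            "audio": {"status": 0,
                      "audio": base64.b64encode(pcm).decode("ascii"),
                      **audio_format},
        },
    }
    return json.dumps(message, ensure_ascii=False)
```

The returned string is what would be sent as a WebSocket text message once the connection is authenticated.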

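# 2.1.4 Request Parameters

# Header Parameters

| Parameter | Type | Required | Description | Constraints / Notes |
| --- | --- | --- | --- | --- |
| app_id | string | | Application ID | maxLength: 15 |
| uid | string | | Authorized user ID | maxLength: 64; must be unique within the app_id |
| status | int | | Session status | 0: first frame, 1: intermediate frame, 2: last frame |
| stmid | string | | Session ID / interaction round | maxLength: 32. Simplex mode (continuous_vad): must be incremented. Duplex mode (continuous): stays fixed. Must be a string that parses as an integer, such as "0" or "1"; see the request examples below |
| scene | string | | Scene | maxLength: 16; created on the AIUI / Feiyun (飞云) platform |
| msc.lat | float | | Latitude | [-90, 90] |
| msc.lng | float | | Longitude | [-180, 180] |
| interact_mode | string | | Interaction mode | continuous (full duplex) or continuous_vad (simplex) |
| bot_id | string | | Bot ID to use | maxLength: 64. Example: sos_app_deepseekv3 |
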
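# Parameter Parameters

# Speech Recognition (iat)

| Parameter | Type | Required | Description | Constraints |
| --- | --- | --- | --- | --- |
| iat.encoding | string | | Result encoding | utf8 only |
| iat.compress | string | | Compression type | raw only |
| iat.format | string | | Result format | json only |
| vgap | int | | Silence threshold for sentence segmentation | In units of 10 ms. Default 80 (800 ms); range 40-1000 (400-10000 ms) |
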
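
The `vgap` values map to milliseconds by a factor of 10 (80 corresponds to 800 ms), which is easy to get wrong when a product requirement is stated in milliseconds. A small helper along these lines can make the conversion and range check explicit. The function name `vgap_from_ms` is hypothetical, and rejecting values that are not multiples of 10 ms is a choice made for this sketch rather than documented server behavior.

```python
def vgap_from_ms(silence_ms: int) -> int:
    """Convert a silence threshold in milliseconds to a vgap value (10 ms units)."""
    if silence_ms % 10 != 0:
        raise ValueError("vgap is expressed in 10 ms steps")
    vgap = silence_ms // 10
    # Documented range: 40-1000, i.e. 400-10000 ms.
    if not 40 <= vgap <= 1000:
        raise ValueError("vgap must be between 40 and 1000 (400-10000 ms)")
    return vgap
```

For instance, the request examples in this document use `"vgap":60`, which is what `vgap_from_ms(600)` returns.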
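# NLP (nlp)

| Parameter | Type | Required | Description | Constraints |
| --- | --- | --- | --- | --- |
| nlp.encoding | string | | Result encoding | utf8 only |
| nlp.compress | string | | Compression type | raw only |
| nlp.format | string | | Result format | json only |
| new_session | string | | Whether to start a new session | "true" / "global": clear history; "false": keep history |
| prompt | string | | Reply requirements | Sets the model's reply style, format, and other answer requirements |
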
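# Speech Synthesis (tts)

| Parameter | Type | Required | Description | Constraints |
| --- | --- | --- | --- | --- |
| tts.encoding | string | | Output audio encoding | raw / lame / opus-wb / opus-swb. Default raw (PCM audio); lame produces MP3 audio |
| sample_rate | int | | Sample rate | 16000 / 24000; default 16000 Hz |
| channels | int | | Number of channels | Default 1 (mono) |
| bit_depth | int | | Bit depth | Default 16 bit |
| vcn | string | | Voice | x5_lingxiaoyue_flow (聆小玥, female assistant); x5_lingfeiyi_flow (聆飞逸, male assistant) |
| speed | int | | Speaking rate | 0-100, default 50 |
| volume | int | | Volume | 0-100, default 50 |
| pitch | int | | Pitch | 0-100, default 50 |
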
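# Payload Parameters

# Audio Data (audio)

| Parameter | Type | Required | Description | Constraints |
| --- | --- | --- | --- | --- |
| audio | string | | Audio data | Base64-encoded; each frame carries at most 40 ms of audio |
| status | int | | Audio status | 0: first frame, 1: intermediate frame, 2: last frame |
| frame_size | int | | Compressed frame size | Pass 0 for uncompressed audio |
| encoding | string | | Encoding format | raw / opus; default raw |
| sample_rate | int | | Sample rate | 8000 / 16000; default 16000 Hz |
| channels | int | | Number of channels | 1 / 2; default 1 |
| bit_depth | int | | Bit depth | 8 / 16; default 16 bit |
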
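
To illustrate the audio payload parameters, the sketch below splits a fully recorded PCM buffer into 40 ms frames and assigns the 0 / 1 / 2 status sequence. It is an illustration under stated assumptions rather than the reference client: the names `frame_bytes` and `audio_payloads` are hypothetical, and refusing input shorter than two frames is a simplification, since the documentation does not say how a single-frame utterance should be marked.

```python
import base64
from typing import Iterator


def frame_bytes(sample_rate: int = 16000, channels: int = 1,
                bit_depth: int = 16, frame_ms: int = 40) -> int:
    """Number of PCM bytes in one frame of frame_ms milliseconds."""
    return sample_rate * channels * (bit_depth // 8) * frame_ms // 1000


def audio_payloads(pcm: bytes, sample_rate: int = 16000, channels: int = 1,
                   bit_depth: int = 16) -> Iterator[dict]:
    """Yield payload.audio objects with status 0, 1, ..., 1, 2."""
    size = frame_bytes(sample_rate, channels, bit_depth)
    chunks = [pcm[i:i + size] for i in range(0, len(pcm), size)]
    if len(chunks) < 2:
        # Simplification for this sketch: a first and a last frame are required.
        raise ValueError("need at least two frames of audio")
    last = len(chunks) - 1
    for index, chunk in enumerate(chunks):
        status = 0 if index == 0 else 2 if index == last else 1
        yield {
            "status": status,
            "audio": base64.b64encode(chunk).decode("ascii"),
            "encoding": "raw",
            "sample_rate": sample_rate,
            "channels": channels,
            "bit_depth": bit_depth,
            "frame_size": 0,  # 0 for uncompressed audio
        }
```

With the default 16 kHz, mono, 16-bit settings, one 40 ms frame is 16000 × 2 × 0.04 = 1280 bytes.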
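# 2.1.5 Request Examples

Simplex mode (continuous_vad): the client controls when audio upload starts. It is mainly intended for noisy environments, where it prevents noise from being treated as input.


  • Every round MUST update header.stmid to a value that has not been used before, for example "0", "1", "2", ...
  • header.status: session status. It is 0 in the first packet, 1 in all subsequent packets, and 2 when the session ends.
  • payload.audio.status: audio status. Within each round it runs 0, 1, ..., 1, 2.
  • The first packet of each round carries all fields. Subsequent packets may be simplified: the parameter section can be omitted, and header only needs a few key fields.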


Open the WebSocket connection
# [Round 1: header.stmid="0"]
# First packet of round 1: send all fields (header.status=0, payload.audio.status=0)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":0,"stmid":"0","scene":"ai-personality-1","os_sys":"android","interact_mode":"continuous_vad","pers_param":"{\"appid\":\"879230fc\",\"uid\":\"thzhang4\"}"},"parameter":{"iat":{"vgap":60,"dwa":"wpgs","iat":{"encoding":"utf8","compress":"raw","format":"json"},"eos":"800","domain":"sms"},"nlp":{"nlp":{"encoding":"utf8","compress":"raw","format":"json"},"new_session":"false"},"tts":{"vcn":"x5_lingxiaoyue_flow","speed":50,"volume":50,"pitch":50,"tts":{"encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}},"payload":{"audio":{"status":0,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Subsequent packets (header.status=1, payload.audio.status=1)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
...
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Last packet of round 1 (header.status=1, payload.audio.status=2)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":2,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}


# [Round 2: update header.stmid="1"]
# First packet of round 2: send all fields (header.status=1, payload.audio.status=0)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"1","scene":"ai-personality-1","os_sys":"android","interact_mode":"continuous_vad","pers_param":"{\"appid\":\"879230fc\",\"uid\":\"thzhang4\"}"},"parameter":{"iat":{"vgap":60,"dwa":"wpgs","iat":{"encoding":"utf8","compress":"raw","format":"json"},"eos":"800","domain":"sms"},"nlp":{"nlp":{"encoding":"utf8","compress":"raw","format":"json"},"new_session":"false"},"tts":{"vcn":"x5_lingxiaoyue_flow","speed":50,"volume":50,"pitch":50,"tts":{"encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}},"payload":{"audio":{"status":0,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Subsequent packets (header.status=1, payload.audio.status=1)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"1","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
...
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"1","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Last packet of round 2 (header.status=1, payload.audio.status=2)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"1","scene":"ai-personality-1"},"payload":{"audio":{"status":2,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}


# [Round 3: update header.stmid="2"]
# First packet of round 3: send all fields (header.status=1, payload.audio.status=0)
# Subsequent packets (header.status=1, payload.audio.status=1)
# Last packet of round 3 (header.status=1, payload.audio.status=2)


# End the session
header.status=2
# Close the WebSocket connection
ws.close()
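
The bookkeeping in the simplex example (header.status is 0 only for the very first packet of the connection, stmid increases by one per round) can be kept in a small state object. The class below is a sketch of that idea, not part of the official SDK: the name `SimplexSession` and its method names are hypothetical, and it produces only the simplified `header` part of each packet. What the final `header.status=2` packet must contain beyond the header is not spelled out in the example, so that is left to the caller.

```python
class SimplexSession:
    """Tracks header.status and header.stmid for continuous_vad rounds."""

    def __init__(self, app_id: str, uid: str, scene: str) -> None:
        self._base = {"app_id": app_id, "uid": uid, "scene": scene}
        self._round = -1        # no round started yet
        self._started = False   # True once header.status=0 has been sent

    def _header(self, status: int) -> dict:
        return {**self._base, "status": status, "stmid": str(self._round)}

    def begin_round(self) -> dict:
        """Header for the first packet of a new round (stmid is incremented)."""
        self._round += 1
        status = 1 if self._started else 0
        self._started = True
        return self._header(status)

    def continue_round(self) -> dict:
        """Header for every later packet of the current round."""
        if not self._started:
            raise RuntimeError("call begin_round() first")
        return self._header(1)

    def end_session(self) -> dict:
        """Header that ends the whole session (header.status=2)."""
        if not self._started:
            raise RuntimeError("call begin_round() first")
        return self._header(2)
```

The first packet of each round would still need the full `parameter` section merged in, as the bullets above require.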


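Duplex mode (continuous): audio is uploaded continuously, similar to a voice call.


  • header.stmid stays FIXED.
  • header.status: session status. It is 0 in the first packet, 1 in all subsequent packets, and 2 when the session ends.
  • payload.audio.status: audio status. It is 0 in the first packet, 1 in all subsequent packets, and 2 when the session ends.
  • The first packet carries all fields. Subsequent packets may be simplified: the parameter section can be omitted, and header only needs a few key fields.
  • Send audio CONTINUOUSLY: keep sending audio even when the user is not speaking. The server determines whether the user is speaking.

Open the WebSocket connection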

# First packet: send all fields (header.status=0, payload.audio.status=0)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":0,"stmid":"0","scene":"ai-personality-1","os_sys":"android","interact_mode":"continuous","test":"孙悟空","pers_param":"{\"appid\":\"879230fc\",\"uid\":\"thzhang4\"}"},"parameter":{"iat":{"vgap":60,"dwa":"wpgs","iat":{"encoding":"utf8","compress":"raw","format":"json"}},"nlp":{"nlp":{"encoding":"utf8","compress":"raw","format":"json"},"new_session":"false"},"tts":{"vcn":"x5_lingxiaoyue_flow","speed":50,"volume":50,"pitch":50,"tts":{"encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}},"payload":{"audio":{"status":0,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
# Subsequent packets (header.status=1, payload.audio.status=1)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
...
{"header":{"app_id":"879230fc","uid":"thzhang4","status":1,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":1,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}
...
# Keep sending audio even when the user is not speaking
...
# End the session (header.status=2, payload.audio.status=2)
{"header":{"app_id":"879230fc","uid":"thzhang4","status":2,"stmid":"0","scene":"ai-personality-1"},"payload":{"audio":{"status":2,"audio":"base64-encoded audio data","encoding":"raw","sample_rate":16000,"channels":1,"bit_depth":16}}}

# Close the WebSocket connection
ws.close()
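
In duplex mode the audio comes from a live source, so the client does not know in advance which chunk is the last one. One common way to still mark the final packet with status 2 is a one-item lookahead over the chunk stream, sketched below. The names `tag_stream` and `duplex_packet` are hypothetical; `duplex_packet` builds only the simplified follow-up form shown in the example, so the very first packet would additionally need the full header and `parameter` section.

```python
import base64
from typing import Iterable, Iterator, Tuple


def tag_stream(chunks: Iterable[bytes]) -> Iterator[Tuple[int, bytes]]:
    """Yield (status, chunk) pairs: 0 for the first chunk, 1 for middle, 2 for last."""
    iterator = iter(chunks)
    previous = next(iterator, None)
    if previous is None:
        raise ValueError("empty audio stream")
    sent_first = False
    for current in iterator:
        # 'previous' is known not to be the last chunk, so it can be emitted now.
        yield (1 if sent_first else 0), previous
        sent_first = True
        previous = current
    if not sent_first:
        raise ValueError("need at least a first and a last chunk")
    yield 2, previous


def duplex_packet(app_id: str, uid: str, scene: str, status: int,
                  chunk: bytes, stmid: str = "0") -> dict:
    """Simplified duplex packet; header.status mirrors payload.audio.status."""
    return {
        "header": {"app_id": app_id, "uid": uid, "status": status,
                   "stmid": stmid, "scene": scene},
        "payload": {"audio": {"status": status,
                              "audio": base64.b64encode(chunk).decode("ascii"),
                              "encoding": "raw", "sample_rate": 16000,
                              "channels": 1, "bit_depth": 16}},
    }
```

Because the lookahead holds back one chunk, the last packet is emitted only after the source is exhausted, which adds one frame of latency at the very end of the call and none during it.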

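# 3. Response Description

# 3.1 Response Examples

Response to a successfully sent first frame. It may be omitted when there is no error; it is typically used to return error codes for parameter, connection, or authentication failures.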

{
    "header": {
        "code": 0,
        "message": "success",
        "sid": "xgo00010205@dx192743899eb0001822-audio-1",
        "status": 0,
        "stmid": "1"
    }
}

Normal result example

{
    "header": {
        "code": 0,
        "message": "success",
        "sid": "xgo00010205@dx192743899eb0001822-audio-1",
        "status": 0,
        "stmid":"1"
    },
    "payload": {
        "event": {
            "compress": "",
            "encoding": "",
            "format": "",
            "seq": 0,
            "status": 0,
            "text": "eyJ0eXBlIjoiVmFkIiwiZGF0YSI6IiIsImtleSI6IkJvcyIsImRlc2MiOnt9fQ=="
        },
        "iat": {
            "compress": "raw",
            "encoding": "utf8",
            "format": "json",
            "seq": 1,
            "status": 2,
            "text": "eyJ0ZXh0Ijp7InNuIjoxLCJscyI6ZmFsc2UsImJnIjowLCJyZyI6bnVsbCwiZWQiOjAsInBncyI6IiIsInJzdCI6IiIsInNpZ24iOiIiLCJ3cyI6W3siYmciOjAsImN3IjpbeyJzYyI6MCwidyI6IuS9oCIsInBoIjoiIn1dfSx7ImJnIjowLCJjdyI6W3sic2MiOjAsInciOiLlj6siLCJwaCI6IiJ9XX0seyJiZyI6MCwiY3ciOlt7InNjIjowLCJ3Ijoi5LuA5LmIIiwicGgiOiIifV19LHsiYmciOjAsImN3IjpbeyJzYyI6MCwidyI6IuWQjeWtlyIsInBoIjoiIn1dfV19fQ=="
        },
         "nlp": {
            "compress": "",
            "encoding": "",
            "format": "",
            "seq": 0,
            "status": 0,
            "text": "5bGV56S65LqG5LiA5Liq"
        },
         "tts": {
            "compress": "",
            "encoding": "",
            "format": "",
            "seq": 0,
            "status": 0,
            "audio": "Base64 audio data"
        },
         "cbm_vms": {
            "compress": "",
            "encoding": "",
            "format": "",
            "seq": 0,
            "status": 0,
            "text": "Base64 vms connect/start/stop info"
        }
    }
}

Error result

{
    "header": {
        "code": 10110,
        "message": "server licence error",
        "sid": "xgo00010205@dx192743899eb0001822-audio-1",
        "status": 2,
        "stmid": "1"
    }
}
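
Since a non-zero `header.code` means the client should re-establish the connection, it can help to turn such responses into an exception at the point where messages are parsed. The sketch below shows one way to do that. The names `ApiError` and `check_response` are hypothetical, and treating a message without a `code` field as successful is an assumption of this sketch.

```python
import json


class ApiError(Exception):
    """Raised when the server reports a non-zero header.code."""

    def __init__(self, code: int, message: str, sid: str) -> None:
        super().__init__(f"code={code} message={message!r} sid={sid}")
        self.code = code
        self.message = message
        self.sid = sid


def check_response(raw: str) -> dict:
    """Parse one server message and raise ApiError if it reports a failure."""
    response = json.loads(raw)
    header = response.get("header", {})
    code = header.get("code", 0)  # assumption: a missing code means no error
    if code != 0:
        raise ApiError(code, header.get("message", ""), header.get("sid", ""))
    return response
```

A receive loop would catch `ApiError`, log the `sid` for support requests, close the socket, and open a new connection.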

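# 3.2 Response Parameters

# 3.2.1 Protocol Fields

# Fields

| Field | Type | Description | Notes |
| --- | --- | --- | --- |
| header | object | - | |
| header.code | int | Service error code | 0 means success; any other value means failure |
| header.sid | string | Session sid | |
| header.status | int | Session status | Example sequence: 0, 1, 1, ..., 1, 2 |
| header.stmid | string | Session ID | Unique ID of each session within a long-lived connection; after the client sends a request, the cloud returns it iteratively (e.g. 0-1 → 0-2 → 0-3) |
| payload | object | Data field | Contains the detailed business data |
| payload.event | object | Event data | |
| payload.event.text | base64 | Data content | The cloud recognizer detects breakpoints in the user's audio and sends the corresponding events: start of speech, end of speech, and no speech |
| payload.event.status | int | Data status | 0, 1, or 2: 0 is the first frame of the result, 1 an intermediate frame, 2 the last frame |
| payload.event.encoding | string | Data encoding | -- |
| payload.event.seq | int | Data segment number | -- |
| payload.event.format | string | Data content format | -- |
| payload.event.compress | string | Data compression method | -- |
| payload.iat | object | Recognition result of the user's speech input | See the examples below |
| payload.nlp | object | Text of the model's reply | See the examples below |
| payload.tts | object | Audio synthesized by the model | See the examples below |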

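# 3.2.2 Data Field Details

Event data

The cloud recognizer detects breakpoints in the user's audio and sends the corresponding events. There are currently three kinds of event: start of speech, end of speech, and no speech.
The event result is in payload.event.text.

The three events are described below:

// The results below are the text content after base64 decoding
Bos event: returned when speech is detected in the audio
{"type":"Vad","data":"","key":"Bos","desc":{}}

Eos event: returned when a clause in the audio has finished
{"type":"Vad","data":"","key":"Eos","desc":{}}

Silence event: no speech is detected in the audio. This is the final end event; end the session after receiving it. It is usually triggered when silence lasts too long or when status=2 is uploaded.
{"type":"Vad","data":"","key":"Silence","desc":{}}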
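
On the wire, payload.event.text carries these objects base64-encoded, so the client has to decode before it can react to an event. A minimal sketch of that step follows. The function name `event_key` is hypothetical, and returning `None` when a message has no event is a convention chosen for this example.

```python
import base64
import json
from typing import Optional


def event_key(message: dict) -> Optional[str]:
    """Return "Bos", "Eos" or "Silence" for a server message, or None if it has no event."""
    event = message.get("payload", {}).get("event")
    if not event or not event.get("text"):
        return None
    body = json.loads(base64.b64decode(event["text"]).decode("utf-8"))
    return body.get("key")
```

A client might stop local playback on `Bos` (the user started talking) and end the session on `Silence`, as described above.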

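Recognition data

The recognition result is in the payload.iat.text field. (The example shows streaming results, which is the current default.)

// The results below are the text content after base64 decoding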
{"sn":1,"ls":false,"bg":0,"ed":0,"pgs":"apd","ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]}]}
{"sn":2,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,1],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"1385"}]}]}
{"sn":3,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,2],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856"}]}]}
{"sn":4,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,3],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"138569"}]}]}
{"sn":5,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,4],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"1385690"}]}]}
{"sn":6,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,5],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901"}]}]}
{"sn":7,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,6],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"138569012"}]}]}
{"sn":8,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,7],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"1385690123"}]}]}
{"sn":9,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,8],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]}]}
{"sn":10,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,9],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]}]}
{"sn":11,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,10],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"重"}]},{"bg":0,"cw":[{"sc":0.00,"w":"六"}]}]}
{"sn":12,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,11],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"六"}]}]}
{"sn":13,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,12],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"六"}]},{"bg":0,"cw":[{"sc":0.00,"w":"千"}]}]}
{"sn":14,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,13],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6800"}]}]}
{"sn":15,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,14],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6880"}]}]}
{"sn":16,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,15],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]}]}
{"sn":17,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,16],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"点"}]}]}
{"sn":18,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,17],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"."}]},{"bg":0,"cw":[{"sc":0.00,"w":"8"}]}]}
{"sn":19,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,18],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"."}]},{"bg":0,"cw":[{"sc":0.00,"w":"8"}]},{"bg":0,"cw":[{"sc":0.00,"w":"元"}]}]}
{"sn":20,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,19],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"."}]},{"bg":0,"cw":[{"sc":0.00,"w":"8"}]},{"bg":0,"cw":[{"sc":0.00,"w":"元"}]},{"bg":0,"cw":[{"sc":0.00,"w":"话费"}]}]}
{"sn":21,"ls":false,"bg":0,"ed":0,"pgs":"rpl","rg":[1,20],"ws":[{"bg":0,"cw":[{"sc":0.00,"w":"给"}]},{"bg":0,"cw":[{"sc":0.00,"w":"13856901234"}]},{"bg":0,"cw":[{"sc":0.00,"w":"充"}]},{"bg":0,"cw":[{"sc":0.00,"w":"6888"}]},{"bg":0,"cw":[{"sc":0.00,"w":"."}]},{"bg":0,"cw":[{"sc":0.00,"w":"8"}]},{"bg":0,"cw":[{"sc":0.00,"w":"元"}]},{"bg":0,"cw":[{"sc":0.00,"w":"话费"}]}]}
{"sn":22,"ls":true,"bg":0,"ed":0,"pgs":"apd","ws":[{"bg":0,"cw":[{"sc":0.00,"w":"。"}]}]}

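Field meanings:

| JSON field | Type | Description |
| --- | --- | --- |
| pgs | string | Present when wpgs is enabled. "apd" means this result is appended to the preceding final results; "rpl" means it replaces part of the preceding results, with the range given by the rg field |
| rg | array | Replacement range; present when wpgs is enabled. For example, [2,5] means that the results returned 2nd through 5th are replaced |

**The backend guarantees that pgs results never overlap or nest. Applying the rules above to the example gives the final transcription "给13856901234充6888.8元话费。"**

Parsing rules:

  1. Plain results: each recognition result is independent and can be parsed directly.
  2. PGS results: all results between the Bos and Eos events must be parsed and concatenated; the final result is generally found in rst=rlt.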
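
The apd / rpl rules can be sketched as a small merge routine: keep the text of each result keyed by `sn`, drop the entries in the `rg` range when a result says `rpl`, and concatenate what remains in `sn` order. The function names below are hypothetical. The input is assumed to be the already base64-decoded objects in the form shown in the example (with `sn`, `pgs`, `rg`, and `ws` at the top level), and only the first candidate of each `cw` list is used.

```python
from typing import Dict, Iterable


def frame_text(frame: dict) -> str:
    """Concatenate the first candidate word of every ws entry."""
    return "".join(ws["cw"][0]["w"] for ws in frame["ws"])


def merge_pgs(frames: Iterable[dict]) -> str:
    """Apply apd/rpl updates in arrival order and return the resulting text."""
    parts: Dict[int, str] = {}
    for frame in frames:
        if frame.get("pgs") == "rpl":
            first, last = frame["rg"]
            # Discard every earlier result that this one replaces.
            for sn in range(first, last + 1):
                parts.pop(sn, None)
        parts[frame["sn"]] = frame_text(frame)
    return "".join(parts[sn] for sn in sorted(parts))
```

Calling `merge_pgs` on a growing prefix of the results gives the text to display at each step; calling it once `ls` is true gives the final sentence.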

Reply text data

The model output is in payload.nlp.text and is delivered as a stream.

// The results below are the text content after base64 decoding
今天
天气晴朗
,气温 10~12℃,
东北风微风
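
Because the reply arrives as a sequence of base64-encoded fragments, the client decodes each one and appends it to the text received so far. The sketch below assumes that every fragment is a complete UTF-8 string on its own and that fragments are processed in arrival order; the function names are hypothetical.

```python
import base64
from typing import Iterable


def decode_nlp_chunk(nlp: dict) -> str:
    """Decode the text of one payload.nlp object."""
    return base64.b64decode(nlp["text"]).decode("utf-8")


def collect_reply(nlp_payloads: Iterable[dict]) -> str:
    """Concatenate streamed reply fragments in the order they were received."""
    return "".join(decode_nlp_chunk(nlp) for nlp in nlp_payloads)
```

For a live display, `decode_nlp_chunk` can be called on each message as it arrives and the result appended to the on-screen text.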

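Synthesized audio data

The synthesized audio is in payload.tts.audio. The client must base64-decode it and can then play it according to the audio parameters encoding, sample_rate, and bit_depth.

An example follows. The audio is large, so it is delivered as a stream of segments: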

{
    "payload": {
        "tts": {
            "audio": "Base64 data ",
            "bit_depth": 16,
            "channels": 1,
            "encoding": "raw",
            "frame_size": 0,
            "sample_rate": 24000,
            "seq": 1,
            "status": 0
        }
    }
}
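
For raw (PCM) output, the decoded segments can be written straight into a WAV container using the parameters carried in the payload. The following sketch uses Python's standard `wave` module. The function name `tts_chunks_to_wav` is hypothetical, it handles only `encoding` equal to `raw`, and it assumes all segments of one reply share the format of the first segment.

```python
import base64
import wave
from typing import BinaryIO, Iterable


def tts_chunks_to_wav(tts_payloads: Iterable[dict], out: BinaryIO) -> int:
    """Write raw PCM tts segments to a WAV stream; return the number of PCM bytes."""
    payloads = list(tts_payloads)
    if not payloads:
        raise ValueError("no tts segments")
    first = payloads[0]
    if first.get("encoding", "raw") != "raw":
        raise ValueError("this sketch only handles raw PCM")
    total = 0
    with wave.open(out, "wb") as wav:
        wav.setnchannels(first.get("channels", 1))
        wav.setsampwidth(first.get("bit_depth", 16) // 8)
        wav.setframerate(first.get("sample_rate", 16000))
        for payload in payloads:
            pcm = base64.b64decode(payload["audio"])
            wav.writeframes(pcm)
            total += len(pcm)
    return total
```

Real-time playback would instead feed each decoded segment to an audio output device as it arrives; writing a file is simply the easiest way to check that the format parameters are interpreted correctly.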

# 4. Sample Code

Python sample demo

Java sample demo
