"Unsupported Media Type" error when calling IBM Speech to Text with curl



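I am currently running the following command in a terminal to try to transcribe a mono telephone speech sample (WAV, 16 bits per sample, 8 kHz sampling rate) with the IBM Speech to Text engine.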

curl -X POST \
-u "apikey goes here" \
--header "Content-Type: audio/wav", "model: en-US_NarrowbandModel" \
--data-binary @{path_url_goes_here)/OSR_us_000_0010_8k.wav \
"https://stream.watsonplatform.net/speech-to-text/api/v1/recognize"

However, the output indicates that the input WAV is not supported:

 curl: (3) Port number ended with ' '
{
   "code_description": "Unsupported Media Type", 
   "code": 415, 
   "error": "Unable to transcode from audio/wav, to one of: audio/x-float-array; rate=16000; channels=1, application/srgs, application/srgs+xml, application/jsgf, application/fsm, application/bnf"
}

Following IBM's list of supported audio formats, I changed my model to the narrowband model, which accepts a minimum input sampling rate of 8 kHz.

My question: is there something wrong with my request or with my audio file?

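Update: I tried converting the .wav to MP3 at constant sampling rates of 8 kHz and 48 kHz. After changing the header to "Content-Type: audio/mp3", I get the same output as above.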
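One way to rule out the audio file itself is to read the format fields directly from the WAV header. The following is only a sketch: `wav_info` is a hypothetical helper name, and it assumes a canonical RIFF/WAVE file whose `fmt ` chunk comes first (channels at byte offset 22, sample rate at 24, bits per sample at 34) and a little-endian machine. Files with extra chunks before `fmt ` would need a real parser.

```shell
# Hypothetical helper: print channel count, sample rate and bit depth
# from a canonical WAV header. Assumes a little-endian host and that the
# "fmt " chunk immediately follows the RIFF/WAVE header.
wav_info() {
  # Usage: wav_info FILE
  channels=$(od -An -tu2 -j22 -N2 "$1" | tr -d '[:space:]')
  rate=$(od -An -tu4 -j24 -N4 "$1" | tr -d '[:space:]')
  bits=$(od -An -tu2 -j34 -N2 "$1" | tr -d '[:space:]')
  printf 'channels=%s rate=%s bits=%s\n' "$channels" "$rate" "$bits"
}
```

For a mono, 8 kHz, 16-bit file such as the one described above, `wav_info OSR_us_000_0010_8k.wav` would be expected to report one channel, a rate of 8000 and 16 bits, which would point to the request rather than the file.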

Try passing model=en-US_NarrowbandModel as a query parameter in the URL. The following curl command works with my WAV file.

curl -X POST \
-u "apikey:XXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
-H "Content-Type:audio/wav" \
--data-binary @OSR_us_000_0010_8k.wav \
"https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?model=en-US_NarrowbandModel"
{
 "results": [
  {
     "alternatives": [
        {
           "confidence": 0.985, 
           "transcript": "the birch canoes slid on the smooth planks "
        }
     ], 
     "final": true
  }, 
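It may also help to see why the header line in the question does not do what it appears to. `--header` takes a single argument, so two quoted strings separated by a comma do not become two headers. The sketch below uses a hypothetical helper, `show_args` (not part of curl), to print the arguments exactly as the shell would pass them on.

```shell
# Hypothetical helper: print each received argument on its own line,
# wrapped in brackets so stray characters are easy to spot.
show_args() { printf '[%s]\n' "$@"; }

# Same header arguments as in the question's command.
show_args --header "Content-Type: audio/wav", "model: en-US_NarrowbandModel"
# prints:
# [--header]
# [Content-Type: audio/wav,]
# [model: en-US_NarrowbandModel]
```

The comma ends up glued to the Content-Type value, and the `model: ...` string becomes a separate positional argument. curl would presumably try to treat that argument as a URL, which would be consistent with the `curl: (3) Port number ended with ' '` message in the question. Passing the model in the query string, as above, avoids both problems.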
