I'm using the Azure Speech SDK and calling recognizeOnceAsync for speech-to-text transcription. My current code looks something like this:
var SpeechSDK, recognizer, synthesizer;
var speechConfig = SpeechSDK.SpeechConfig.fromSubscription('SUB_KEY', 'SUB_REGION');
var audioConfig = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
new Promise(function (resolve) {
    recognizer.onend = resolve;
    recognizer.recognizeOnceAsync(
        function (result) {
            recognizer.close();
            recognizer = undefined;
            resolve(result.text);
        },
        function (err) {
            alert(err);
            recognizer.close();
            recognizer = undefined;
        }
    );
}).then(r => {
    console.log(`Azure STT interpreted: ${r}`);
});
In the HTML file, I import the Azure package like this:
<script src="https://aka.ms/csspeech/jsbrowserpackageraw"></script>
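The problem is that I'd like to increase the "silence time" allowed before the recognizeOnceAsync method returns a result. (That is, you should be able to pause and take a breath without the method assuming you have finished speaking.) Is there a way to do this with fromDefaultMicrophoneInput? I've tried various things, such as: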
const SILENCE_UNTIL_TIMEOUT_MS = 5000;
speechConfig.SpeechServiceConnection_EndSilenceTimeoutMs = SILENCE_UNTIL_TIMEOUT_MS;
audioConfig.setProperty("Speech_SegmentationSilenceTimeoutMs", SILENCE_UNTIL_TIMEOUT_MS);
But none of them seems to extend the "silence time allowance" correctly.
This is the resource I've been looking at: https://learn.microsoft.com/en-us/javascript/api/microsoft-cognitiveservices-speech-sdk/propertyid?view=azure-node-latest
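Based on what you describe, you need to set the segmentation silence timeout. Unfortunately, there is currently a bug in the JS SDK: PropertyId.Speech_SegmentationSilenceTimeoutMs is not set correctly.
As a workaround, you can set the segmentation timeout as follows: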
// SpeechConfig, SpeechRecognizer and Connection come from the Speech SDK
// (with the browser bundle they are available as SpeechSDK.SpeechConfig, etc.).
const speechConfig = SpeechConfig.fromSubscription(subscriptionKey, subscriptionRegion);
speechConfig.speechRecognitionLanguage = "en-US";
const reco = new SpeechRecognizer(speechConfig);
const conn = Connection.fromRecognizer(reco);
// Send the segmentation settings as part of the speech.context message.
conn.setMessageProperty("speech.context", "phraseDetection", {
    "INTERACTIVE": {
        "segmentation": {
            "mode": "custom",
            "segmentationSilenceTimeoutMs": 5000
        }
    },
    mode: "Interactive"
});
reco.recognizeOnceAsync(
    (result) => {
        console.log("Recognition done!!!");
        // do something with the recognition
    },
    (error) => {
        console.log("Recognition failed. Error:" + error);
    });
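To connect this back to the question's setup, here is a rough sketch of the same workaround combined with fromDefaultMicrophoneInput and wrapped in a Promise. The helper name recognizeOnceWithSilence and its parameters are made up for illustration and are not part of the SDK; `sdk` is assumed to be the SpeechSDK object exposed by the browser bundle. Unlike the question's wrapper, this one rejects on error so the caller is not left waiting.

```javascript
// Hypothetical helper (not an SDK function). `sdk` is assumed to be the
// SpeechSDK global from the browser bundle, or the imported npm module.
function recognizeOnceWithSilence(sdk, speechConfig, silenceMs) {
  const audioConfig = sdk.AudioConfig.fromDefaultMicrophoneInput();
  const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);

  // Same workaround as above: pass the segmentation settings through
  // the speech.context message instead of the PropertyId setting.
  const conn = sdk.Connection.fromRecognizer(recognizer);
  conn.setMessageProperty("speech.context", "phraseDetection", {
    INTERACTIVE: {
      segmentation: {
        mode: "custom",
        segmentationSilenceTimeoutMs: silenceMs
      }
    },
    mode: "Interactive"
  });

  return new Promise((resolve, reject) => {
    recognizer.recognizeOnceAsync(
      (result) => {
        recognizer.close();
        resolve(result.text);
      },
      (err) => {
        recognizer.close();
        reject(err); // settle the promise on failure as well
      }
    );
  });
}

// Possible usage in the page from the question:
// recognizeOnceWithSilence(SpeechSDK, speechConfig, 5000)
//   .then((text) => console.log(`Azure STT interpreted: ${text}`))
//   .catch((err) => alert(err));
```

Passing the SDK object in as a parameter is only a convenience for this sketch; it keeps the helper independent of whether the SDK was loaded via a script tag or a module import.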
Note that the allowed range for the segmentation timeout is 100-5000 ms (inclusive).
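Since the timeout has to fall within that range, it may be worth clamping a requested value before sending it. The helper below is only an illustrative sketch; its name and constants are made up here, and how the service treats out-of-range values is not something this answer covers.

```javascript
// Hypothetical helper; the bounds are the 100-5000 ms range mentioned above.
const MIN_SEGMENTATION_SILENCE_MS = 100;
const MAX_SEGMENTATION_SILENCE_MS = 5000;

function clampSegmentationSilenceMs(requestedMs) {
  if (!Number.isFinite(requestedMs)) {
    throw new TypeError("timeout must be a finite number of milliseconds");
  }
  // Round to a whole number of milliseconds, then force it into the range.
  const rounded = Math.round(requestedMs);
  return Math.min(
    MAX_SEGMENTATION_SILENCE_MS,
    Math.max(MIN_SEGMENTATION_SILENCE_MS, rounded)
  );
}

console.log(clampSegmentationSilenceMs(8000)); // 5000
console.log(clampSegmentationSilenceMs(50));   // 100
console.log(clampSegmentationSilenceMs(2500)); // 2500
```

The clamped value can then be used for segmentationSilenceTimeoutMs in the phraseDetection settings.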