Changing the timeout after the keyword is spoken using Cognitive Services



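I want to create a voice-driven personal assistant similar to Siri or Alexa: you say a keyword, and the rest of the audio is processed into text. I have a working version that does this, but if you say the keyword and then wait too long, it times out. I can't say the keyword, wait 1 or 2 seconds, and then say the rest of the command.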

I'd like to be able to say the keyword and have it wait 10 or 15 seconds before timing out.

I tried setting these properties, but it didn't change anything.

SpeechConfig.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "15000");
SpeechConfig.SetProperty(PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "15000");

SpeechRecognizer.Properties.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "15000");
SpeechRecognizer.Properties.SetProperty(PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "15000");

I use

SpeechRecognizer.StartKeywordRecognitionAsync()

to do the recognition. I tried stopping it with

SpeechRecognizer.StopKeywordRecognitionAsync()

and then calling

SpeechRecognizer.StartContinuousRecognitionAsync()

in the SessionStarted, SessionStopped, Recognizing, or Recognized event. The Canceled event never gets called.

I expected it to wait after the keyword was spoken, but it doesn't. Does anyone know how to do this? What am I missing?

I was able to figure it out by reading the documentation here: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/

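The basic idea is to create a KeywordRecognizer first, then call its recognize function (RecognizeOnceAsync with the keyword model) to wait for the keyword. When the result's reason is RecognizedKeyword, create a SpeechRecognizer and call its recognize function to get the rest of the command. By default, it waits 30 seconds after the keyword is detected before timing out.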

Here is some sample code:

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

namespace SpeechRecognitionDemo
{
    class Program
    {
        static SpeechConfig speechConfig;
        static KeywordRecognitionModel keywordModel;
        static AudioConfig audioConfig;
        static TaskCompletionSource<int> stopRecognition;

        static async Task Main(string[] args)
        {
            // Creates an instance of a speech config with specified subscription key and service region.
            // Replace with your own subscription key and service region (e.g., "westus").
            speechConfig = SpeechConfig.FromSubscription("subscription key", "region");
            speechConfig.SpeechRecognitionLanguage = "en-US";

            // Set this property to allow more time between words in the command.
            speechConfig.SetProperty(PropertyId.Speech_SegmentationSilenceTimeoutMs, "2000");

            // Creates an instance of a keyword recognition model. Update this to
            // point to the location of your keyword recognition model.
            keywordModel = KeywordRecognitionModel.FromFile("keywords.table");
            audioConfig = AudioConfig.FromDefaultMicrophoneInput();

            await RunAssistant();
        }

        static async Task RunAssistant()
        {
            bool keepRunning = true;

            while (keepRunning)
            {
                // Starts recognizing.
                Console.WriteLine("Say something starting with the keyword 'Hey Assistant' followed by whatever you want...");

                stopRecognition = new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);

                using (var keywordRecognizer = new KeywordRecognizer(audioConfig))
                {
                    // Wait for the keyword.
                    KeywordRecognitionResult result = await keywordRecognizer.RecognizeOnceAsync(keywordModel);

                    if (result.Reason == ResultReason.RecognizedKeyword)
                    {
                        Console.WriteLine($"RECOGNIZED KEYWORD: Text={result.Text}");

                        using (var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig))
                        {
                            // Subscribes to events.
                            speechRecognizer.Recognizing += (s, e) =>
                            {
                                if (e.Result.Reason == ResultReason.RecognizingSpeech)
                                {
                                    Console.WriteLine($"RECOGNIZING: Text={e.Result.Text}");
                                }
                            };

                            speechRecognizer.Recognized += (s, e) =>
                            {
                                if (e.Result.Reason == ResultReason.RecognizedSpeech)
                                {
                                    Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
                                }
                                else if (e.Result.Reason == ResultReason.NoMatch)
                                {
                                    Console.WriteLine("NOMATCH: Speech could not be recognized.");
                                }
                            };

                            speechRecognizer.SessionStarted += (s, e) =>
                            {
                                Console.WriteLine("\nSession started event.\n");
                            };

                            speechRecognizer.SessionStopped += (s, e) =>
                            {
                                Console.WriteLine("\nSession stopped event.");
                                Console.WriteLine("\nStop recognition.");

                                stopRecognition.TrySetResult(0);
                            };

                            // Now recognize the command that follows the keyword.
                            await speechRecognizer.RecognizeOnceAsync();
                        }
                    }
                    else if (result.Reason == ResultReason.Canceled)
                    {
                        Console.WriteLine("CANCELLED KEYWORD");
                        stopRecognition.TrySetResult(0);
                    }
                    else if (result.Reason == ResultReason.NoMatch)
                    {
                        Console.WriteLine("NO MATCH KEYWORD");
                        // Complete the task here too, so the loop does not block forever.
                        stopRecognition.TrySetResult(0);
                    }

                    // Wait until the session has stopped or keyword recognition ended without a match.
                    await stopRecognition.Task;

                    Console.WriteLine("\n");
                }
            }

            audioConfig.Dispose();
        }
    }
}

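You'll need to create a keywords.table file using Speech Studio, which is fairly self-explanatory. You'll also need a subscription; once the keyword model has been created, download it so it can be used offline.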

This sample waits for the keyword, then waits for more text. It prints the result to the console and then loops back to do it all again.
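If the wait after the keyword needs to be closer to the 10 or 15 seconds asked about in the question, one option that might work is setting the initial-silence timeout on the SpeechConfig that gets passed to the SpeechRecognizer. The fragment below is only a sketch under that assumption and has not been verified against the sample above. The class SpeechTimeouts and the helper ApplyCommandTimeouts are hypothetical names, and the service may treat the value differently depending on the recognition mode.

```csharp
using Microsoft.CognitiveServices.Speech;

static class SpeechTimeouts
{
    // Hypothetical helper: call it on speechConfig before constructing the SpeechRecognizer.
    public static void ApplyCommandTimeouts(SpeechConfig config, int initialSilenceMs = 15000)
    {
        // Assumed effect: how long the recognizer waits for speech to begin
        // before giving up on the command.
        config.SetProperty(
            PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs,
            initialSilenceMs.ToString());
    }
}
```

In the sample, this would be called once in Main, right after the Speech_SegmentationSilenceTimeoutMs line, for example SpeechTimeouts.ApplyCommandTimeouts(speechConfig, 15000).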