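I want to create a voice-driven personal assistant similar to Siri or Alexa: say a keyword, then have the rest of the audio processed into text. I have a working version that does this, but if you say the keyword and then wait too long, it times out. I can't say the keyword, wait 1 or 2 seconds, and then say the rest of the command.
I would like to be able to say the keyword and have it wait 10 or 15 seconds before timing out.
I tried setting these properties, but it didn't change anything.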
SpeechConfig.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "15000");
SpeechConfig.SetProperty(PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "15000");
and
SpeechRecognizer.Properties.SetProperty(PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs, "15000");
SpeechRecognizer.Properties.SetProperty(PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "15000");
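I use
SpeechRecognizer.StartKeywordRecognitionAsync()
for recognition. I tried stopping it with
SpeechRecognizer.StopKeywordRecognitionAsync()
and then calling
SpeechRecognizer.StartContinuousRecognitionAsync()
in the SessionStarted, SessionStopped, Recognizing, or Recognized events. The Canceled event never gets called.
I expected it to wait after the keyword was spoken, but it doesn't. Does anyone know how to do this? What am I missing?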
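I was able to figure it out by reading the documentation here: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/
The basic premise is to create a KeywordRecognizer first, then call its RecognizeOnceAsync function to listen for the keyword. When the result's Reason is ResultReason.RecognizedKeyword, you create a SpeechRecognizer and call its RecognizeOnceAsync function to get the rest of the command. By default, it waits 30 seconds after the keyword is detected before timing out.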
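If the default wait does not suit you, the silence timeouts can be set on the SpeechConfig that the command-phase SpeechRecognizer is built from. The sketch below is a minimal illustration, not a confirmed fix: the AssistantConfig class and its Create method are hypothetical names, the 15000 ms value is simply the one from the question, and whether the service honors the initial silence timeout in this two-recognizer flow is an assumption worth verifying, since the question reports that it had no effect with StartKeywordRecognitionAsync.

```csharp
using Microsoft.CognitiveServices.Speech;

// Hypothetical helper class (not part of the Speech SDK).
static class AssistantConfig
{
    // Builds a SpeechConfig for the command phase that follows the keyword.
    // initialSilenceMs is an assumed value taken from the question (15 seconds).
    public static SpeechConfig Create(string key, string region, int initialSilenceMs = 15000)
    {
        var config = SpeechConfig.FromSubscription(key, region);
        config.SpeechRecognitionLanguage = "en-US";

        // Requested wait for speech to begin before the service gives up.
        // Assumption: this applies to RecognizeOnceAsync in this flow; verify it.
        config.SetProperty(
            PropertyId.SpeechServiceConnection_InitialSilenceTimeoutMs,
            initialSilenceMs.ToString());

        // Allow longer pauses between words inside the command itself.
        config.SetProperty(PropertyId.Speech_SegmentationSilenceTimeoutMs, "2000");

        return config;
    }
}
```

The returned object would be passed to `new SpeechRecognizer(config, audioConfig)` in place of a hand-built SpeechConfig.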
Here is a complete sample program:
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

namespace SpeechRecognitionDemo
{
    class Program
    {
        static SpeechConfig speechConfig;
        static KeywordRecognitionModel keywordModel;
        static AudioConfig audioConfig;
        static TaskCompletionSource<int> stopRecognition;

        static async Task Main(string[] args)
        {
            // Creates an instance of a speech config with specified subscription key and service region.
            // Replace with your own subscription key and service region (e.g., "westus").
            speechConfig = SpeechConfig.FromSubscription("subscription key", "region");
            speechConfig.SpeechRecognitionLanguage = "en-US";

            // Set this property to allow more time between words in the command.
            speechConfig.SetProperty(PropertyId.Speech_SegmentationSilenceTimeoutMs, "2000");

            // Creates an instance of a keyword recognition model. Update this to
            // point to the location of your keyword recognition model.
            keywordModel = KeywordRecognitionModel.FromFile("keywords.table");
            audioConfig = AudioConfig.FromDefaultMicrophoneInput();

            await RunAssistant();
        }

        static async Task RunAssistant()
        {
            // Never cleared in this sample; stop the program with Ctrl+C.
            bool keepRunning = true;
            while (keepRunning)
            {
                // Starts recognizing.
                Console.WriteLine("Say something starting with the keyword 'Hey Assistant' followed by whatever you want...");
                stopRecognition = new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);

                using (var keywordRecognizer = new KeywordRecognizer(audioConfig))
                {
                    // Recognize the keyword.
                    KeywordRecognitionResult result = await keywordRecognizer.RecognizeOnceAsync(keywordModel);

                    if (result.Reason == ResultReason.RecognizedKeyword)
                    {
                        Console.WriteLine($"RECOGNIZED KEYWORD: Text={result.Text}");

                        using (var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig))
                        {
                            // Subscribes to events.
                            speechRecognizer.Recognizing += (s, e) =>
                            {
                                if (e.Result.Reason == ResultReason.RecognizingSpeech)
                                {
                                    Console.WriteLine($"RECOGNIZING: Text={e.Result.Text}");
                                }
                            };
                            speechRecognizer.Recognized += (s, e) =>
                            {
                                if (e.Result.Reason == ResultReason.RecognizedSpeech)
                                {
                                    Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
                                }
                                else if (e.Result.Reason == ResultReason.NoMatch)
                                {
                                    Console.WriteLine("NOMATCH: Speech could not be recognized.");
                                }
                            };
                            speechRecognizer.SessionStarted += (s, e) =>
                            {
                                Console.WriteLine("\nSession started event.\n");
                            };
                            speechRecognizer.SessionStopped += (s, e) =>
                            {
                                Console.WriteLine("\nSession stopped event.");
                                Console.WriteLine("\nStop recognition.");
                                stopRecognition.TrySetResult(0);
                            };

                            // Now recognize the command.
                            await speechRecognizer.RecognizeOnceAsync();

                            // The command attempt is over; signal here too in case
                            // SessionStopped does not fire before the recognizer is disposed.
                            stopRecognition.TrySetResult(0);
                        }
                    }

                    if (result.Reason == ResultReason.Canceled)
                    {
                        Console.WriteLine("CANCELLED KEYWORD");
                        stopRecognition.TrySetResult(0);
                    }

                    if (result.Reason == ResultReason.NoMatch)
                    {
                        Console.WriteLine("NO MATCH KEYWORD");
                        // Signal completion so the wait below cannot block forever.
                        stopRecognition.TrySetResult(0);
                    }

                    // Wait until this iteration has been signaled as finished
                    // before listening for the keyword again.
                    await stopRecognition.Task;
                    Console.WriteLine("\n");
                }
            }

            audioConfig.Dispose();
        }
    }
}
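You need to create a keywords.table file using Speech Studio, which is fairly self-explanatory. You also need a subscription key, and you download the keyword model so it can be used offline.
This sample waits for the keyword, then waits for more speech. It prints the result to the console and then loops back to do it all again.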
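To find out why a command attempt produced no text (for example, because nobody spoke before the timeout), the result returned by RecognizeOnceAsync can be inspected directly instead of relying only on events. This is a hedged sketch rather than part of the sample above: CommandListener and ListenForCommandAsync are hypothetical names, and the method assumes it is handed a SpeechRecognizer that has already been constructed as in the sample.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

// Hypothetical helper class (not part of the Speech SDK).
static class CommandListener
{
    // Returns the recognized command text, or null when nothing usable was heard.
    public static async Task<string> ListenForCommandAsync(SpeechRecognizer recognizer)
    {
        SpeechRecognitionResult result = await recognizer.RecognizeOnceAsync();

        switch (result.Reason)
        {
            case ResultReason.RecognizedSpeech:
                return result.Text;

            case ResultReason.NoMatch:
                // The NoMatch reason distinguishes cases such as an initial
                // silence timeout from audio that could not be matched.
                var noMatch = NoMatchDetails.FromResult(result);
                Console.WriteLine($"NOMATCH: Reason={noMatch.Reason}");
                return null;

            case ResultReason.Canceled:
                var cancellation = CancellationDetails.FromResult(result);
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
                if (cancellation.Reason == CancellationReason.Error)
                {
                    // Typically a bad key, wrong region, or a network problem.
                    Console.WriteLine($"ErrorCode={cancellation.ErrorCode} ErrorDetails={cancellation.ErrorDetails}");
                }
                return null;

            default:
                return null;
        }
    }
}
```

In the sample's loop, a call such as `string command = await CommandListener.ListenForCommandAsync(speechRecognizer);` could stand in for the bare `await speechRecognizer.RecognizeOnceAsync();`, with a null return meaning "go back to waiting for the keyword".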