Usage for CreateSpeechRecognizerWithFileInput in Microsoft.CognitiveServices.Speech



In the speech sample application there is an example of CreateSpeechRecognizerWithFileInput, but it returns after the first utterance. I did notice that you can call RecognizeAsync multiple times, but it has some strange behavior:

  1. I get a RecognitionErrorRaised with a "NoMatch" error in the middle of the file.
  2. If there is a period of silence in the file, FinalResultsReceived fires with an empty result.
  3. There doesn't seem to be a consistent/trackable EOF event when recognition completes.

If I want to transcribe a 20-minute audio file, is there a better way to do this with the unified Speech SDK? The same file was fine under the old Oxford packages. Ideally, I would like to be able to get time offsets for the utterances and transcriptions.

You can use the SDK's StartContinuousRecognitionAsync() and StopContinuousRecognitionAsync() to start and stop recognition.

Here is an example:

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

namespace MicrosoftSpeechSDKSamples
{
    public class SpeechRecognitionSamples
    {
        // Speech recognition from microphone.
        public static async Task RecognitionWithMicrophoneAsync()
        {
            // <recognitionWithMicrophone>
            // Creates an instance of a speech factory with specified
            // subscription key and service region. Replace with your own subscription key
            // and service region (e.g., "westus").
            var factory = SpeechFactory.FromSubscription("YourSubscriptionKey", "westus");

            // Creates a speech recognizer using microphone as audio input. The default language is "en-us".
            using (var recognizer = factory.CreateSpeechRecognizer())
            {
                // Starts recognizing.
                Console.WriteLine("Say something...");

                // Starts recognition. It returns when the first utterance has been recognized.
                var result = await recognizer.RecognizeAsync().ConfigureAwait(false);

                // Checks result.
                if (result.RecognitionStatus != RecognitionStatus.Recognized)
                {
                    Console.WriteLine($"There was an error. Status:{result.RecognitionStatus.ToString()}, Reason:{result.RecognitionFailureReason}");
                }
                else
                {
                    Console.WriteLine($"We recognized: {result.RecognizedText}");
                }
            }
            // </recognitionWithMicrophone>
        }

        // Speech recognition in the specified spoken language.
        public static async Task RecognitionWithLanguageAsync()
        {
            // <recognitionWithLanguage>
            // Creates an instance of a speech factory with specified
            // subscription key and service region. Replace with your own subscription key
            // and service region (e.g., "westus").
            var factory = SpeechFactory.FromSubscription("YourSubscriptionKey", "westus");

            // Creates a speech recognizer for the specified language, using microphone as audio input.
            var lang = "en-us";
            using (var recognizer = factory.CreateSpeechRecognizer(lang))
            {
                // Starts recognizing.
                Console.WriteLine($"Say something in {lang} ...");

                // Starts recognition. It returns when the first utterance has been recognized.
                var result = await recognizer.RecognizeAsync().ConfigureAwait(false);

                // Checks result.
                if (result.RecognitionStatus != RecognitionStatus.Recognized)
                {
                    Console.WriteLine($"There was an error. Status:{result.RecognitionStatus.ToString()}, Reason:{result.RecognitionFailureReason}");
                }
                else
                {
                    Console.WriteLine($"We recognized: {result.RecognizedText}");
                }
            }
            // </recognitionWithLanguage>
        }

        // Speech recognition from file.
        public static async Task RecognitionWithFileAsync()
        {
            // <recognitionFromFile>
            // Creates an instance of a speech factory with specified
            // subscription key and service region. Replace with your own subscription key
            // and service region (e.g., "westus").
            var factory = SpeechFactory.FromSubscription("YourSubscriptionKey", "westus");

            // Creates a speech recognizer using file as audio input.
            // Replace with your own audio file name.
            using (var recognizer = factory.CreateSpeechRecognizerWithFileInput(@"YourAudioFile.wav"))
            {
                // Starts recognition. It returns when the first utterance is recognized.
                var result = await recognizer.RecognizeAsync().ConfigureAwait(false);

                // Checks result.
                if (result.RecognitionStatus != RecognitionStatus.Recognized)
                {
                    Console.WriteLine($"There was an error. Status:{result.RecognitionStatus.ToString()}, Reason:{result.RecognitionFailureReason}");
                }
                else
                {
                    Console.WriteLine($"We recognized: {result.RecognizedText}");
                }
            }
            // </recognitionFromFile>
        }

        // <recognitionCustomized>
        // Speech recognition using a customized model.
        public static async Task RecognitionUsingCustomizedModelAsync()
        {
            // Creates an instance of a speech factory with specified
            // subscription key and service region. Replace with your own subscription key
            // and service region (e.g., "westus").
            var factory = SpeechFactory.FromSubscription("YourSubscriptionKey", "westus");

            // Creates a speech recognizer using microphone as audio input.
            using (var recognizer = factory.CreateSpeechRecognizer())
            {
                // Replace with the CRIS deployment id of your customized model.
                recognizer.DeploymentId = "YourDeploymentId";
                Console.WriteLine("Say something...");

                // Starts recognition. It returns when the first utterance has been recognized.
                var result = await recognizer.RecognizeAsync().ConfigureAwait(false);

                // Checks results.
                if (result.RecognitionStatus != RecognitionStatus.Recognized)
                {
                    Console.WriteLine($"There was an error. Status:{result.RecognitionStatus.ToString()}, Reason:{result.RecognitionFailureReason}");
                }
                else
                {
                    Console.WriteLine($"We recognized: {result.RecognizedText}");
                }
            }
        }
        // </recognitionCustomized>

        // <recognitionContinuous>
        // Speech recognition with events
        public static async Task ContinuousRecognitionAsync()
        {
            // Creates an instance of a speech factory with specified
            // subscription key and service region. Replace with your own subscription key
            // and service region (e.g., "westus").
            var factory = SpeechFactory.FromSubscription("YourSubscriptionKey", "westus");

            // Creates a speech recognizer using microphone as audio input.
            using (var recognizer = factory.CreateSpeechRecognizer())
            {
                // Subscribes to events.
                recognizer.IntermediateResultReceived += (s, e) => {
                    Console.WriteLine($"\n    Partial result: {e.Result.RecognizedText}.");
                };
                recognizer.FinalResultReceived += (s, e) => {
                    if (e.Result.RecognitionStatus == RecognitionStatus.Recognized)
                    {
                        Console.WriteLine($"\n    Final result: Status: {e.Result.RecognitionStatus.ToString()}, Text: {e.Result.RecognizedText}.");
                    }
                    else
                    {
                        Console.WriteLine($"\n    Final result: Status: {e.Result.RecognitionStatus.ToString()}, FailureReason: {e.Result.RecognitionFailureReason}.");
                    }
                };
                recognizer.RecognitionErrorRaised += (s, e) => {
                    Console.WriteLine($"\n    An error occurred. Status: {e.Status.ToString()}, FailureReason: {e.FailureReason}");
                };
                recognizer.OnSessionEvent += (s, e) => {
                    Console.WriteLine($"\n    Session event. Event: {e.EventType.ToString()}.");
                };

                // Starts continuous recognition. Use StopContinuousRecognitionAsync() to stop recognition.
                Console.WriteLine("Say something...");
                await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
                Console.WriteLine("Press any key to stop");
                Console.ReadKey();
                await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
            }
        }
        // </recognitionContinuous>
    }
}

If you have a large amount of audio, using batch transcription is also a good option.
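As a rough illustration of what that looks like (not from the original answer): batch transcription is a REST API rather than part of the SDK. The sketch below submits a job against the v3.0 Speech-to-Text endpoint; the region, key, audio URL, and payload fields shown are placeholders/assumptions, so check the Batch Transcription documentation for the version you target.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class BatchTranscriptionSketch
{
    static async Task Main()
    {
        var region = "YourServiceRegion"; // e.g., "westus"
        var endpoint = $"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions";

        // The audio must be reachable by the service, e.g. via a blob SAS URL.
        var payload = @"{
            ""contentUrls"": [""https://example.com/YourAudioFile.wav""],
            ""locale"": ""en-US"",
            ""displayName"": ""My 20-minute file"",
            ""properties"": { ""wordLevelTimestampsEnabled"": true }
        }";

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YourSubscriptionKey");
            var response = await client.PostAsync(
                endpoint, new StringContent(payload, Encoding.UTF8, "application/json"));

            // On success the service returns 201 Created; poll the returned
            // transcription URL until the job reports the "Succeeded" status,
            // then download the result files, which include word-level offsets.
            Console.WriteLine(response.StatusCode);
        }
    }
}
```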

UPDATE

According to the documentation, the SpeechFactory class was recently removed as a breaking change and replaced by SpeechConfig starting with Speech SDK 1.0.0:

"The SpeechFactory class is removed. Instead, the class SpeechConfig is introduced to describe various settings of speech configuration and the class AudioConfig to describe different audio sources (microphone, file, or stream input). To create a SpeechRecognizer, use one of its constructors with SpeechConfig and AudioConfig as parameters."

The following C# code shows how to create a speech recognizer using the default microphone input.

// Creates an instance of speech config with specified subscription key and service region.
var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

// Creates a speech recognizer using microphone as audio input.
using (var recognizer = new SpeechRecognizer(config))
{
    // Performs recognition.
    var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);

    // Process result.
    // ...
}

The same concept applies to creating IntentRecognizer and TranslationRecognizer, except that SpeechTranslationConfig is required for creating TranslationRecognizer.
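For instance, a translation recognizer might be created like this (a sketch based on the 1.x API; the subscription key, region, language codes, and file name are placeholders):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;

class TranslationSketch
{
    static async Task Main()
    {
        // SpeechTranslationConfig (not SpeechConfig) is required for translation.
        var config = SpeechTranslationConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        config.SpeechRecognitionLanguage = "en-US";
        config.AddTargetLanguage("de");

        using (var audioInput = AudioConfig.FromWavFileInput(@"YourAudioFile.wav"))
        using (var recognizer = new TranslationRecognizer(config, audioInput))
        {
            var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);
            if (result.Reason == ResultReason.TranslatedSpeech)
            {
                Console.WriteLine($"Recognized: {result.Text}");
                Console.WriteLine($"Translated: {result.Translations["de"]}");
            }
        }
    }
}
```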

CreateSpeechRecognizerWithFileInput() is replaced by AudioConfig. The following C# code shows how to create a speech recognizer using file input.

// Creates an instance of speech config with specified subscription key and service region.
var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

// Creates a speech recognizer using file as audio input.
// Replace with your own audio file name.
using (var audioInput = AudioConfig.FromWavFileInput(@"whatstheweatherlike.wav"))
{
    using (var recognizer = new SpeechRecognizer(config, audioInput))
    {
        // Performs recognition.
        var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);

        // Process result.
        // ...
    }
}

As Ali said, StartContinuousRecognitionAsync() and StopContinuousRecognitionAsync() are the right methods if you want to recognize multiple utterances.
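Putting this together for the original scenario (a long file, per-utterance time offsets, and a usable end-of-file signal), a sketch against the 1.x API might look like the following. The OffsetInTicks/Duration properties and the SessionStopped event are the 1.x way to get offsets and detect EOF; the subscription key, region, and file name are placeholders.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class ContinuousFileSketch
{
    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        var stopRecognition = new TaskCompletionSource<int>();

        using (var audioInput = AudioConfig.FromWavFileInput(@"YourAudioFile.wav"))
        using (var recognizer = new SpeechRecognizer(config, audioInput))
        {
            recognizer.Recognized += (s, e) =>
            {
                if (e.Result.Reason == ResultReason.RecognizedSpeech)
                {
                    // Offsets are in 100-nanosecond ticks from the start of the audio.
                    var start = TimeSpan.FromTicks(e.Result.OffsetInTicks);
                    Console.WriteLine($"[{start} + {e.Result.Duration}] {e.Result.Text}");
                }
                // ResultReason.NoMatch here corresponds to the empty results
                // that silence gaps produced in the older API.
            };

            // SessionStopped fires once the whole file has been processed; use it as EOF.
            recognizer.SessionStopped += (s, e) => stopRecognition.TrySetResult(0);
            recognizer.Canceled += (s, e) => stopRecognition.TrySetResult(0);

            await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
            await stopRecognition.Task.ConfigureAwait(false);
            await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
        }
    }
}
```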

The latest samples for the Speech SDK are provided at https://github.com/Azure-Samples/cognitive-services-speech-sdk, including samples in different languages (currently C++ and C#; more will be added as new languages are supported) on different platforms (currently Windows and Linux; more platforms will be added).

Regarding question 3), the session stopped event is used to detect EOF. You can find a sample here: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/Windows/csharp_samples/speech_recognition_samples.cs#L194.

Thanks.
