Making Python speech recognition faster

I have been using Google speech recognition from Python. Here is my code:

import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
print(r.recognize_google(audio))

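Recognition is very accurate, but it takes about 4-5 seconds to return the recognized text. Since I am building a voice assistant, I would like to modify the code above to make speech recognition faster.

Is there any way to bring this down to about 1-2 seconds? If possible, I would like recognition to be as fast as services such as Siri and Ok Google.

I am very new to Python, so I apologize if my question has a simple answer.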

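You could use a different speech recognition service. For example, you can set up an account with IBM to use their Watson Speech to Text. If possible, try their websocket interface, because it actively transcribes what you are saying while you are still speaking.

An example (without websockets):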

import speech_recognition as sr

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Adjusting for background noise. One second")
    r.adjust_for_ambient_noise(source)
    print("Say something!")
    audio = r.listen(source)

# Note: depending on the installed SpeechRecognition version, recognize_ibm may
# expect an API key argument instead of username/password; check its documentation.
IBM_USERNAME = "INSERT IBM SPEECH TO TEXT USERNAME HERE"  # IBM Speech to Text usernames are strings of the form XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
IBM_PASSWORD = "INSERT IBM SPEECH TO TEXT PASSWORD HERE"  # IBM Speech to Text passwords are mixed-case alphanumeric strings
try:
    print("IBM Speech to Text thinks you said " + r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD))
except sr.UnknownValueError:
    print("IBM Speech to Text could not understand audio")
except sr.RequestError as e:
    print("Could not request results from IBM Speech to Text service; {0}".format(e))

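You could also try PocketSphinx, but personally I have not had a particularly good experience with it. It works offline (a plus), but for me it was not especially accurate. You may be able to tweak some detection settings and filter out some background noise. I believe there is also a training option to adapt it to your voice, but it does not look straightforward.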
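To judge whether switching engines actually helps, it can be useful to time each one on the same recording. The following is only a minimal sketch: `time_engines` is a hypothetical helper (not part of the SpeechRecognition library) that accepts any callables, and the commented usage assumes the `r` and `audio` objects from the example above.

```python
import time


def time_engines(audio, engines):
    """Run each recognizer callable on the same audio and record its latency.

    `engines` maps a label to a callable that takes the audio and returns text.
    Returns {label: (text_or_None, seconds)}.
    """
    results = {}
    for label, recognize in engines.items():
        start = time.perf_counter()
        try:
            text = recognize(audio)
        except Exception as exc:  # broad on purpose: the sketch avoids importing sr
            print(f"{label} failed: {exc}")
            text = None
        results[label] = (text, time.perf_counter() - start)
    return results


# Hypothetical usage with the SpeechRecognition objects from the example above:
# timings = time_engines(audio, {
#     "google": r.recognize_google,
#     "sphinx": r.recognize_sphinx,  # offline; needs the pocketsphinx package
# })
# for label, (text, seconds) in timings.items():
#     print(f"{label}: {seconds:.2f}s -> {text}")
```

Timing the recognition call separately from `r.listen` also shows whether the delay comes from the service or from waiting for the end of the phrase.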

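Some useful links:

Speech Recognition

Microphone recognition example

IBM Watson Speech to Text

Good luck. Once speech recognition is working, it is very useful and rewarding!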

Use the correct input channel and calibrate for ambient noise to get the best results:

import speech_recognition as sr


def speech_to_text():
    # Pick the PulseAudio input device if present; None falls back to the default microphone.
    required = None
    for index, name in enumerate(sr.Microphone.list_microphone_names()):
        if "pulse" in name:
            required = index
    r = sr.Recognizer()
    with sr.Microphone(device_index=required) as source:
        r.adjust_for_ambient_noise(source)
        print("Say something!")
        # Stop recording after at most 4 seconds of speech.
        audio = r.listen(source, phrase_time_limit=4)
    try:
        text = r.recognize_google(audio)
        print("You said: " + text)
        return text
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
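Part of the delay may come from the listening step rather than from the network. As far as I know, `Recognizer` has a `pause_threshold` attribute (seconds of silence that end a phrase, 0.8 by default) and a `non_speaking_duration` attribute, and `adjust_for_ambient_noise` samples for one second by default every time it is called. The sketch below is an assumption-laden illustration, not a measured fix: `tune_for_latency` is a hypothetical helper and the values are illustrative.

```python
def tune_for_latency(recognizer, pause_threshold=0.5):
    """Shorten the silence the recognizer waits for before ending a phrase.

    Works on any object with `pause_threshold` and `non_speaking_duration`
    attributes, such as speech_recognition.Recognizer.
    """
    if pause_threshold <= 0:
        raise ValueError("pause_threshold must be positive")
    recognizer.pause_threshold = pause_threshold
    # listen() expects non_speaking_duration <= pause_threshold, so cap it.
    recognizer.non_speaking_duration = min(
        recognizer.non_speaking_duration, pause_threshold
    )
    return recognizer


# Hypothetical usage: calibrate once at startup, then reuse the recognizer.
# import speech_recognition as sr
# r = tune_for_latency(sr.Recognizer())
# with sr.Microphone() as source:
#     r.adjust_for_ambient_noise(source, duration=0.5)  # once, not per phrase
#     while True:
#         audio = r.listen(source, phrase_time_limit=4)
#         ...
```

A lower threshold returns sooner after you stop talking, but it may also cut off slow speakers mid-sentence, so it is worth trying a few values.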
