Passing a queue of audio chunks to Google's asynchronous transcription option

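I am trying to pass audio chunks, obtained with chunks.get(in_data) from PyAudio's callback function, to Google Speech's asynchronous transcription.

In addition, I am using a ThreadPool from Python's multiprocessing module with a single worker to process the chunks one at a time: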

pool = ThreadPool(processes=1, initializer=initGoogleCloud, initargs=(audio_rate, credentials_json, lang_code, asr_narrowband, preferred_phrases, show_all))  
async_result = pool.apply_async(GoogleCloud, (self.detect_chunk_buffer.get()))
return_text = async_result.get()

import json

def initGoogleCloud(SAMPLERATE, credentials_json, lang_code, is_narrowband, preferred_phrases, show_all):
    # Publish the objects that GoogleCloud() uses at module level.
    global types, client, config
    assert isinstance(lang_code, str), "lang_code must be a string."
    try:
        from google.cloud import speech
        from google.cloud.speech import enums
        from google.cloud.speech import types
        from google.oauth2 import service_account
    except ImportError:
        print('google.cloud failed to import.')
    if is_narrowband is True:
        use_enhanced = True
        model = 'phone_call'
    else:
        use_enhanced = False
        model = 'default'
    # Configurations for Google Cloud
    with open('tmp_credentials.json', 'w') as fp:
        json.dump(credentials_json, fp)
    google_credentials = service_account.Credentials.from_service_account_file('tmp_credentials.json')
    client = speech.SpeechClient(credentials=google_credentials)
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=SAMPLERATE,
        language_code=lang_code,
        use_enhanced=use_enhanced,
        model=model)
    streaming_config = types.StreamingRecognitionConfig(config=config, interim_results=True)

def GoogleCloud(audio_chunk):
    # audio_chunk is a sequence of raw byte strings
    byte_chunk = b''.join(audio_chunk)
    # Protobuf message fields are passed by keyword
    audio = types.RecognitionAudio(content=byte_chunk)
    operation = client.long_running_recognize(config, audio)
    # Waiting for operation to complete...
    response = operation.result(timeout=90)
    # Processing response
    return listen_print_loop(response)

Output:

TypeError: GoogleCloud() takes 1 positional argument but 2048 were given
Abort trap: 6

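It seems as if chunk.get() is extracting the entire audio sequence as the arguments. Is there a way to pass a single chunk from the queue for processing?

My PyAudio format is pyaudio.paInt16.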
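
As a rough illustration of what paInt16 means for the data: each sample is a 16-bit signed integer, so a mono frame takes 2 bytes, which corresponds to the uncompressed 16-bit little-endian PCM that LINEAR16 describes. The sketch below fabricates a small `in_data` buffer with `struct` instead of reading from PyAudio; the sample values, and the assumption of mono little-endian audio, are mine.

```python
import struct

# Hypothetical stand-in for one callback buffer: four mono 16-bit samples,
# packed as little-endian signed shorts.
in_data = struct.pack("<4h", 0, 1000, -1000, 32767)

BYTES_PER_SAMPLE = 2  # paInt16 is 16 bits per sample
frame_count = len(in_data) // BYTES_PER_SAMPLE

# Decode the raw bytes back into integers to confirm the layout.
decoded = struct.unpack("<{}h".format(frame_count), in_data)
```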

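To "pack" the audio chunk into a single argument, I changed

async_result = pool.apply_async(GoogleCloud, (self.detect_chunk_buffer.get()))

to

audio_chunk = [self.detect_chunk_buffer.get()]
async_result = pool.apply_async(rttASR.GoogleCloud, args=(audio_chunk))

so that the chunk is wrapped in a list before it is sent as the argument.

It works, and it seems that the value returned by self.detect_chunk_buffer.get() (the paInt16 audio chunk from the PyAudio callback's in_data) does not need any additional base64 encoding.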
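
The reason this helps is that parentheses alone do not create a tuple: `(x)` is just `x`, while `(x,)` is a one-element tuple. apply_async calls the function as `func(*args)`, so whatever iterable is passed as `args` gets unpacked into separate positional arguments. Wrapping the chunk in a list, or in a one-element tuple, makes it arrive as a single argument. A minimal sketch, assuming the queue item is a list of byte strings (the helpers `count_args` and `join_chunk` and the chunk contents are hypothetical):

```python
from multiprocessing.pool import ThreadPool

def count_args(*args):
    # Reports how many positional arguments the worker actually received.
    return len(args)

def join_chunk(audio_chunk):
    # Mirrors the first step of GoogleCloud(): join byte strings into one buffer.
    return b"".join(audio_chunk)

# Hypothetical queue item: three small byte strings.
chunk = [b"\x00\x01", b"\x02\x03", b"\x04\x05"]

pool = ThreadPool(processes=1)
# (chunk) is just chunk, so its three elements become three arguments.
unpacked = pool.apply_async(count_args, (chunk)).get(timeout=5)
# (chunk,) is a one-element tuple, so the whole list is one argument.
wrapped = pool.apply_async(count_args, (chunk,)).get(timeout=5)
joined = pool.apply_async(join_chunk, (chunk,)).get(timeout=5)
pool.close()
pool.join()
```

With a queue item of 2048 elements, the first form would produce exactly the "2048 were given" error from the question.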
