谷歌云的速率和音高韵律属性

我是Google Cloud文本转语音服务的新手。文档中显示了具有rate和pitch属性的<prosody>标签。但这些对我的要求没有影响。例如，如果我使用rate="slow"或rate="fast"，或pitch="+2st"或pitch="-2st"，结果与文档上的示例相同且不同，其具有较慢的速率和较低的音调。

我确保了最新的版本:

python3 -m pip install --upgrade google-cloud-texttospeech

最小可复制示例:

import os
from google.cloud import texttospeech
AUDIO_CONFIG = texttospeech.AudioConfig(
audio_encoding=texttospeech.AudioEncoding.LINEAR16)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/file"
tts_client = texttospeech.TextToSpeechClient()
voice = texttospeech.VoiceSelectionParams(
language_code="en-US",
name=  "en-US-Wavenet-A"
)
ssml_input = texttospeech.SynthesisInput(
ssml='<prosody rate="fast" pitch="+2st">Can you hear me now?</prosody>'
# or this one:
#ssml='<prosody rate="slow" pitch="-2st">Can you hear me now?</prosody>'
)
response = tts_client.synthesize_speech(
input=ssml_input, voice=voice, audio_config=AUDIO_CONFIG
)
with open("/tmp/cloud.wav", 'wb') as out:
# Write the response to the output file.
out.write(response.audio_content)

我如何使用谷歌云的速度和音高韵律属性?

根据本文档，当您编写SSML脚本在文本到语音代码中，SSML脚本的格式应该类似于:

<speak>
<prosody rate="slow" pitch="low">Hi good morning have a nice day</prosody>
</speak>

你可以参考下面提到的一段代码，我在我的结束，它为我工作。

代码1:我使用音调低和速率慢.


from google.cloud import texttospeech
client = texttospeech.TextToSpeechClient()
# Sets the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(
ssml= '<speak><prosody rate="slow" pitch="low">Hi good morning have a nice day</prosody></speak>'
)
# Builds the voice request, selects the language code ("en-US") and
# the SSML voice gender ("MALE")
voice = texttospeech.VoiceSelectionParams(
language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.MALE
)
# Selects the type of audio file to return
audio_config = texttospeech.AudioConfig(
audio_encoding=texttospeech.AudioEncoding.MP3
)
# Performs the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
input=synthesis_input, voice=voice, audio_config=audio_config
)
# Writes the synthetic audio to the output file.
with open("output.mp3", "wb") as out:
# Write the response to the output file.
out.write(response.audio_content)
print('Audio content written to file "output.mp3"')

音频输出输出音频

代码2:我使用的速率为fast和节距+5st.


from google.cloud import texttospeech
client = texttospeech.TextToSpeechClient()
# Sets the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(
ssml= '<speak><prosody rate="fast" pitch="+5st">Hi good morning have a nice day</prosody></speak>'
)
# Builds the voice request, selects the language code ("en-US") and
# the SSML voice gender ("MALE")
voice = texttospeech.VoiceSelectionParams(
language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.MALE
)
# Selects the type of audio file to return
audio_config = texttospeech.AudioConfig(
audio_encoding=texttospeech.AudioEncoding.MP3
)
# Performs the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
input=synthesis_input, voice=voice, audio_config=audio_config
)
# Writes the synthetic audio to the output file.
with open("output.mp3", "wb") as out:
# Write the response to the output file.
out.write(response.audio_content)
print('Audio content written to file "output.mp3"')

音频输出输出音频

相关内容

最新更新

热门标签：