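I am using Google Colab to run Vakyansh's TTS model on my own input text. I have mounted the cloned Vakyansh git folder on my Drive and changed into that directory without any problem. However, the code below gives me this error:
SystemExit: Error: /src/glow_tts directory does not exist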
!git clone https://github.com/Open-Speech-EkStep/vakyansh-tts
# In Colab, use %cd: a !cd runs in its own subshell, so the directory change does not persist.
%cd vakyansh-tts
!bash install.sh
!python setup.py bdist_wheel
!pip install -e .
%cd tts_infer
!wget https://storage.googleapis.com/vakyansh-open-models/translit_models.zip && unzip -q translit_models.zip
from tts_infer.tts import TextToMel, MelToWav
from tts_infer.transliterate import XlitEngine
from tts_infer.num_to_word_on_sent import normalize_nums
import re
from scipy.io.wavfile import write
text_to_mel = TextToMel(glow_model_dir='/src/glow_tts', device='cuda')
mel_to_wav = MelToWav(hifi_model_dir='/src/hifi_gan', device='cuda')
def translit(text, lang):
    reg = re.compile(r'[a-zA-Z]')
    engine = XlitEngine(lang)
    words = [engine.translit_word(word, topk=1)[lang][0] if reg.match(word) else word for word in text.split()]
    updated_sent = ' '.join(words)
    return updated_sent

def run_tts(text, lang):
    text = text.replace('।', '.')  # only for hindi models
    text_num_to_word = normalize_nums(text, lang)  # converting numbers to words in lang
    text_num_to_word_and_transliterated = translit(text_num_to_word, lang)  # transliterating english words to lang
    mel = text_to_mel.generate_mel(text_num_to_word_and_transliterated)
    audio, sr = mel_to_wav.generate_wav(mel)
    write(filename='temp.wav', rate=sr, data=audio)  # for saving wav file, if needed
    return (sr, audio)
I can't figure out how to get a .wav file as the output.
I used this as my reference: Vakyansh models
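You need to download two models (a glow_tts model and a hifi_gan model) from https://github.com/Open-Speech-EkStep/vakyansh-models#tts-models, and then pass the directories where you saved them as glow_model_dir and hifi_model_dir. The path /src/glow_tts in your code does not exist on the Colab machine, which is why you get that error.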
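As a rough sketch of that last step, you could check each directory before handing it to the model classes, so a wrong path fails with a clear message. The helper name resolve_model_dir and the /content/models/... paths are my own placeholders, not part of Vakyansh; substitute wherever you actually unpacked the two downloads.

```python
from pathlib import Path


def resolve_model_dir(path):
    """Return the absolute path of a model directory, or raise if it is unusable.

    Hypothetical helper (not part of vakyansh-tts): it only checks that the
    directory exists and is not empty; it does not validate the model files.
    """
    model_dir = Path(path).expanduser().resolve()
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Model directory does not exist: {model_dir}")
    if not any(model_dir.iterdir()):
        raise FileNotFoundError(f"Model directory is empty: {model_dir}")
    return str(model_dir)


# Assumed locations, shown only as an illustration:
# glow_dir = resolve_model_dir("/content/models/glow")
# hifi_dir = resolve_model_dir("/content/models/hifi")
# text_to_mel = TextToMel(glow_model_dir=glow_dir, device='cuda')
# mel_to_wav = MelToWav(hifi_model_dir=hifi_dir, device='cuda')
```

Once both directories resolve, calling run_tts(text, lang) from the question should write temp.wav into the current working directory, since that function already calls write(filename='temp.wav', ...).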