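I have a Python program that uses the OpenAI Whisper module to transcribe audio to text. Unfortunately, even though I pass the module a fully resolved path, it crashes with an error saying it cannot find the file. I know the file exists in the directory because, as the code and output below show, the script itself can find it (see the code that prints the input file's timestamp, and the matching output). I am on a Windows 10 PC.
Why can't the imported module find the input file, and how can I fix this? I have read several posts about paths and the subprocess module, but none of the solutions worked for me.
The code is as follows: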
import whisper
import pandas as pd
import os
import sys
from datetime import datetime
# Show the current working directory
cwd = os.getcwd()
print("Current working directory: {0}\n".format(cwd))
# Transcribe a previously downloaded audio file.
# audio_file = "./audio.mp4"
# with open(os.path.join(sys.path[0], "audio.mp4"), "r") as f:
audio_file = os.path.join(cwd, "audio.mp4")
print("Using audio input file: {0}\n".format(audio_file))
# Get the timestamp for the file
timestamp = os.path.getmtime(audio_file)
# Convert the timestamp to a datetime object
dt = datetime.fromtimestamp(timestamp)
# Format the datetime object in the desired format
formatted_timestamp = dt.strftime("%m/%d/%Y")
# Print the formatted timestamp
print("Input file timestamp: {0}\n\n".format(formatted_timestamp))
# Load the OpenAI Whisper model
whisper_model = whisper.load_model("tiny")
# Transcribe the audio.
transcription = whisper_model.transcribe(audio_file)
# Display the transcription. This will display
# the transcription result in segments with
# start and end time. The full concatenated
# string is available as transcription['text']
# print as DataFrame
df = pd.DataFrame(transcription['segments'], columns=['start', 'end', 'text'])
print(df)
# or, print as String
print(transcription['text'])
The program output is as follows:
C:\Users\main\Documents\GitHub\ME\open-ai\whisper\python-utilities>python transcribe-audio.py
Current working directory: C:\Users\main\Documents\GitHub\ME\open-ai\whisper\python-utilities
Using audio input file: C:\Users\main\Documents\GitHub\ME\open-ai\whisper\python-utilities\audio.mp4
Input file timestamp: 12/27/2022
C:\Python310\lib\site-packages\whisper\transcribe.py:78: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Traceback (most recent call last):
  File "C:\Users\main\Documents\GitHub\ME\open-ai\whisper\python-utilities\transcribe-audio.py", line 36, in <module>
    transcription = whisper_model.transcribe(audio_file)
  File "C:\Python310\lib\site-packages\whisper\transcribe.py", line 84, in transcribe
    mel = log_mel_spectrogram(audio)
  File "C:\Python310\lib\site-packages\whisper\audio.py", line 111, in log_mel_spectrogram
    audio = load_audio(audio)
  File "C:\Python310\lib\site-packages\whisper\audio.py", line 42, in load_audio
    ffmpeg.input(file, threads=0)
  File "C:\Python310\lib\site-packages\ffmpeg\_run.py", line 313, in run
    process = run_async(
  File "C:\Python310\lib\site-packages\ffmpeg\_run.py", line 284, in run_async
    return subprocess.Popen(
  File "C:\Python310\lib\subprocess.py", line 966, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Python310\lib\subprocess.py", line 1435, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
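I found the problem. ffmpeg was neither in the script directory nor on the system PATH variable. Unfortunately, the error message does not indicate that this is the real problem: it leads you to believe that the input audio file cannot be found, when what actually cannot be found is ffmpeg.exe. I copied ffmpeg.exe into the script directory and it works fine.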
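To get a clearer error than the bare WinError 2, one option is to check for ffmpeg before calling Whisper. The traceback shows that the audio is loaded by launching ffmpeg through subprocess.Popen, so the missing file is the executable and not audio.mp4. Below is a minimal sketch using the standard library's shutil.which; the helper name require_executable is my own for illustration and is not part of Whisper.

```python
import shutil


def require_executable(name: str = "ffmpeg") -> str:
    """Return the full path of `name`, or raise a descriptive error.

    Hypothetical helper for illustration; not part of Whisper.
    """
    # shutil.which searches the directories listed in PATH; on Windows it
    # also applies PATHEXT, so "ffmpeg" can match "ffmpeg.exe".
    location = shutil.which(name)
    if location is None:
        raise FileNotFoundError(
            f"'{name}' was not found on PATH. Install it or add its folder "
            "to PATH before running the transcription."
        )
    return location


if __name__ == "__main__":
    try:
        print("Using ffmpeg at:", require_executable("ffmpeg"))
    except FileNotFoundError as err:
        print(err)
```

Calling require_executable("ffmpeg") at the top of the script, before whisper_model.transcribe(audio_file), should make a missing ffmpeg show up by name. Adding the ffmpeg folder to PATH is probably a tidier long-term fix than copying ffmpeg.exe next to every script.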