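I'm experimenting with some of the transcription methods in the SpeechRecognition module. Transcription with the Google API (recognize_google()) works, but when I try OpenAI's Whisper (recognize_whisper()), a temporary file is created at "%LocalAppData%\Temp\tmps_pfkh0z.wav" (the actual file name changes slightly on each run) and the script fails with a "Permission denied" error: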
Traceback (most recent call last):
  File "D:\Users\Renato\Documents\Code\projects\transcriber\.venv\lib\site-packages\whisper\audio.py", line 42, in load_audio
    ffmpeg.input(file, threads=0)
  File "D:\Users\Renato\Documents\Code\projects\transcriber\.venv\lib\site-packages\ffmpeg\_run.py", line 325, in run
    raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "d:\Users\Renato\Documents\Code\projects\transcriber\main.py", line 15, in <module>
    print("Transcription: " + r.recognize_whisper(audio_data=audio_data, model="medium", language="uk"))
  File "D:\Users\Renato\Documents\Code\projects\transcriber\.venv\lib\site-packages\speech_recognition\__init__.py", line 1697, in recognize_whisper
    result = self.whisper_model[model].transcribe(
  File "D:\Users\Renato\Documents\Code\projects\transcriber\.venv\lib\site-packages\whisper\transcribe.py", line 85, in transcribe
    mel = log_mel_spectrogram(audio)
  File "D:\Users\Renato\Documents\Code\projects\transcriber\.venv\lib\site-packages\whisper\audio.py", line 111, in log_mel_spectrogram
    audio = load_audio(audio)
  File "D:\Users\Renato\Documents\Code\projects\transcriber\.venv\lib\site-packages\whisper\audio.py", line 47, in load_audio
libavdevice 59. 7.100 / 59. 7.100
libavfilter 8. 44.100 / 8. 44.100
libswscale 6. 7.100 / 6. 7.100
libswresample 4. 7.100 / 4. 7.100
libpostproc 56. 6.100 / 56. 6.100
C:\Users\Renato\AppData\Local\Temp\tmps_pfkh0z.wav: Permission denied
The code itself is very simple:
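import speech_recognition as sr

r = sr.Recognizer()
# Read the whole WAV file into an AudioData object
with sr.AudioFile("audio.wav") as src:
    audio_data = r.record(src)
# Transcribe locally with Whisper; this is the call that fails
print("Transcription: " + r.recognize_whisper(audio_data=audio_data, model="medium", language="en"))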
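I have tried different ffmpeg installations (the prebuilt packages from gyan.dev and BtbN, and also an install through Chocolatey).
I also tried unchecking the "Read-only" option in the Temp folder's properties, but the error still occurs.
I'm running the script on a Windows machine, inside a virtual environment created with venv.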
1 Answer
I have the same problem as you. According to the documentation, recognize_whisper should accept an AudioData instance recorded from an AudioFile, but it does not work.
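A possible explanation, offered as a guess rather than a confirmed diagnosis: on Windows, a file created with tempfile.NamedTemporaryFile cannot be opened a second time while the original handle is still open. If recognize_whisper hands ffmpeg the name of such a file, that would match the "Permission denied" message. The sketch below shows the general workaround of writing the audio to a temp file, closing it, and only then passing the path on. The helper names write_closed_temp_wav and transcribe_via_closed_file are made up for illustration and are not part of any library.

```python
import os
import tempfile


def write_closed_temp_wav(wav_bytes: bytes) -> str:
    """Write WAV bytes to a temp file and return its path, with the handle closed."""
    # delete=False keeps the file on disk after the handle is closed,
    # so another process (such as ffmpeg) is able to open it on Windows.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        f.write(wav_bytes)
    return f.name


def transcribe_via_closed_file(wav_bytes: bytes, transcribe) -> str:
    """Call transcribe(path) on a closed temp file, then remove the file."""
    path = write_closed_temp_wav(wav_bytes)
    try:
        return transcribe(path)
    finally:
        # Clean up manually, because delete=False disables automatic removal.
        os.remove(path)
```

With the objects from the question, this might be used roughly as transcribe_via_closed_file(audio_data.get_wav_data(), lambda p: whisper.load_model("medium").transcribe(p, language="en")["text"]), assuming the whisper package is imported directly. I have not verified that this resolves the error on the asker's setup.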