I am trying to use pyannote's speaker diarization in a project, but I get the following error:
RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 80000 but got size 79659 for tensor number 11 in the list.
The code that produces this error is simple and follows the TL;DR at https://github.com/pyannote/pyannote-audio:
from pyannote.audio import Pipeline
from env import hugging_face_token
# from pyannote.audio.pipelines import SpeakerDiarization
print('loading pipeline')
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token=hugging_face_token)
print('diarizing')
diarization = pipeline("data/db short intro.ogg")
print('diarizing complete')
I am running it in a Docker container using the VS Code Dev Containers extension, with this devcontainer.json:
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/python
{
    "name": "Python 3",
    // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
    "image": "mcr.microsoft.com/devcontainers/python:0-3.10",
    // Features to add to the dev container. More info: https://containers.dev/features.
    // "features": {},
    "features": {
        "ghcr.io/devcontainers-contrib/features/ffmpeg-apt-get:1": {}
    },
    // Use 'postCreateCommand' to run commands after the container is created.
    "postCreateCommand": "pip3 install --user -r requirements.txt"
}
requirements.txt:
ipython
torch==1.11.0
torchvision==0.12.0
torchaudio==0.11.0
torchtext==0.12.0
speechbrain==0.5.12
pyannote.audio
git+https://github.com/openai/whisper.git#egg=openai-whisper
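I have already tried removing the pinned versions from requirements.txt and letting pip install the latest version of everything, but the problem persists.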
1 Answer
According to this issue, this may be a problem with converting other audio formats to .wav.
You can find a workaround at the end of that discussion.
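The linked discussion is not reproduced here, so the following is only a sketch of one way to act on the .wav suggestion: convert the file explicitly with ffmpeg (already installed by the devcontainer feature in the question) and pass the resulting WAV to the pipeline instead of the .ogg. The helper names build_ffmpeg_command and convert_to_wav are hypothetical, and the 16 kHz mono target is an assumption, chosen because the 80000 samples in the error message correspond to 5 seconds at 16 kHz. It may not be the exact workaround described in the issue.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Return the ffmpeg argument list that re-encodes src as mono PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input file, e.g. the .ogg from the question
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample; 16 kHz is an assumption
        "-c:a", "pcm_s16le",      # plain 16-bit PCM
        str(dst),
    ]


def convert_to_wav(src, sample_rate=16000):
    """Convert src to a WAV file next to it and return the new path."""
    src = Path(src)
    dst = src.with_suffix(".wav")
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return dst


# Usage (needs ffmpeg on PATH and the `pipeline` object from the question):
# wav_path = convert_to_wav("data/db short intro.ogg")
# diarization = pipeline(str(wav_path))
```

Passing the argument list to subprocess.run without a shell means the spaces in "data/db short intro.ogg" need no extra quoting.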