PyTorch: how to resolve a tensor size mismatch when using pyannote.audio

Posted by ecbunoof on 2024-01-09 in Other

I'm trying to use pyannote's speaker diarization in a project, but I get the following error:

RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 80000 but got size 79659 for tensor number 11 in the list.

The code that produces this error is simple and follows the TL;DR at https://github.com/pyannote/pyannote-audio:

from pyannote.audio import Pipeline
from env import hugging_face_token
# from pyannote.audio.pipelines import SpeakerDiarization

print('loading pipeline')
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token = hugging_face_token)

print('diarizing')
diarization = pipeline("data/db short intro.ogg")
print('diarizing complete')


I'm running it in a Docker container using the Dev Containers extension for VS Code. The devcontainer configuration is:

// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/python
{
    "name": "Python 3",
    // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
    "image": "mcr.microsoft.com/devcontainers/python:0-3.10",
    // Features to add to the dev container. More info: https://containers.dev/features.
    // "features": {},
    "features": {
        "ghcr.io/devcontainers-contrib/features/ffmpeg-apt-get:1": {}
    },
    // Use 'postCreateCommand' to run commands after the container is created.
    "postCreateCommand": "pip3 install --user -r requirements.txt"
}

requirements.txt:

ipython
torch==1.11.0 
torchvision==0.12.0 
torchaudio==0.11.0 
torchtext==0.12.0
speechbrain==0.5.12
pyannote.audio
git+https://github.com/openai/whisper.git#egg=openai-whisper

I have already tried removing the pinned versions from requirements.txt and letting pip install the latest version of everything, but the problem persists.

Answer #1, by stszievb

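According to this issue, the error may be caused by how other audio formats are converted to .wav.
You can find a workaround at the end of the discussion mentioned above.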
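
If the cause is indeed the format conversion, one thing to try is converting the file to .wav yourself before handing it to the pipeline. The sketch below is only an illustration of that idea, not the workaround from the linked discussion. The helper names `build_ffmpeg_command` and `convert_to_wav` are hypothetical, and the choice of 16 kHz mono output is an assumption (the expected size of 80000 in the error would be consistent with 5-second chunks at 16 kHz). It relies on the `ffmpeg` binary, which the devcontainer in the question already installs.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Build an ffmpeg command that re-encodes src as a mono WAV file.

    Hypothetical helper; 16 kHz mono is an assumption, not a documented
    requirement taken from the question or the answer.
    """
    return [
        "ffmpeg",
        "-y",                    # overwrite dst if it already exists
        "-i", str(src),          # input file (for example an .ogg)
        "-ac", "1",              # downmix to a single channel
        "-ar", str(sample_rate), # resample to the target rate
        str(dst),
    ]


def convert_to_wav(src, sample_rate=16000):
    """Convert src to a .wav file next to it and return the new path."""
    src = Path(src)
    dst = src.with_suffix(".wav")
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return dst


# Usage with the pipeline from the question (not executed here):
#   wav_path = convert_to_wav("data/db short intro.ogg")
#   diarization = pipeline(str(wav_path))
```

Whether this removes the size mismatch depends on the actual cause, so treat it as a first diagnostic step rather than a confirmed fix.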
