ludwig audio feature: ValueError: could not broadcast input array from shape (360000,2,1) into shape (360000,1)

guykilcj posted 2 months ago in Other

I'm trying to use the audio input feature, but I get this error: ValueError: could not broadcast input array from shape (360000,2,1) into shape (360000,1)
I have tested with both wav and ogg/vorbis files, and neither works.
I tested on ludwig 0.4.1 with Python 3.8.10.
Stack trace:

Traceback (most recent call last):
  File "model.py", line 30, in <module>
    model.train(
  File "env/lib/python3.8/site-packages/ludwig/api.py", line 415, in train
    preprocessed_data = self.preprocess(
  File "env/lib/python3.8/site-packages/ludwig/api.py", line 1337, in preprocess
    preprocessed_data = preprocess_for_training(
  File "env/lib/python3.8/site-packages/ludwig/data/preprocessing.py", line 1517, in preprocess_for_training
    processed = data_format_processor.preprocess_for_training(
  File "env/lib/python3.8/site-packages/ludwig/data/preprocessing.py", line 235, in preprocess_for_training
    return _preprocess_file_for_training(
  File "env/lib/python3.8/site-packages/ludwig/data/preprocessing.py", line 1649, in _preprocess_file_for_training
    data, training_set_metadata = build_dataset(
  File "env/lib/python3.8/site-packages/ludwig/data/preprocessing.py", line 1151, in build_dataset
    proc_cols = build_data(
  File "env/lib/python3.8/site-packages/ludwig/data/preprocessing.py", line 1306, in build_data
    proc_cols = add_feature_data(
  File "env/lib/python3.8/site-packages/ludwig/features/audio_feature.py", line 372, in add_feature_data
    audio_features = AudioFeatureMixin._process_in_memory(
  File "env/lib/python3.8/site-packages/ludwig/features/audio_feature.py", line 159, in _process_in_memory
    processed_audio = df_engine.map_objects(
  File "env/lib/python3.8/site-packages/ludwig/data/dataframe/pandas.py", line 50, in map_objects
    return series.map(map_fn)
  File "env/lib/python3.8/site-packages/pandas/core/series.py", line 4237, in map
    new_values = self._map_values(arg, na_action=na_action)
  File "env/lib/python3.8/site-packages/pandas/core/base.py", line 880, in _map_values
    new_values = map_f(values, mapper)
  File "pandas/_libs/lib.pyx", line 2870, in pandas._libs.lib.map_infer
  File "env/lib/python3.8/site-packages/ludwig/features/audio_feature.py", line 161, in <lambda>
    lambda row: AudioFeatureMixin._transform_to_feature(
  File "env/lib/python3.8/site-packages/ludwig/features/audio_feature.py", line 249, in _transform_to_feature
    audio_feature_padded[:broadcast_feature_length, :] = audio_feature[
ValueError: could not broadcast input array from shape (360000,2,1) into shape (360000,1)

The dataset contains the file paths, and the model is trained with the following code:

from ludwig.api import LudwigModel

# One audio input (a column of file paths) and one categorical output.
model = LudwigModel({
    'input_features': [{
        'name': 'audio_path',
        'type': 'audio',
    }],
    'output_features': [{
        'name': 'track_artist',
        'type': 'category',
    }]
})
model.train(
    training_set='./dataset/dataset.csv',
    test_set='./dataset/test.csv',
)
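Since the dataset only stores file paths, it can help to check how many channels each file has before training. The sketch below assumes (this is not confirmed in the thread) that the size-2 axis in the error corresponds to two audio channels. It uses only the standard-library `wave` module, so it covers WAV files but not ogg/vorbis. The helper names `wav_channel_count` and `find_multichannel` are made up for illustration.

```python
import wave


def wav_channel_count(path):
    """Return the number of channels recorded in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels()


def find_multichannel(paths):
    """Return (path, channels) pairs for files with more than one channel."""
    flagged = []
    for path in paths:
        channels = wav_channel_count(path)
        if channels > 1:
            flagged.append((path, channels))
    return flagged
```

Passing the values of the `audio_path` column to `find_multichannel` would list any files that are not mono.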

93ze6v8z1#

Hi @modernlearner. I'll take a look at this. I should have an answer for you in a few hours, and I'll keep you posted on my progress.


lvjbypge2#

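Hi @modernlearner. Could you try upgrading Ludwig to 0.5rc2? We fixed a lot of issues in this release candidate, so a quick version bump may resolve the problem you're seeing. If it persists, I'll keep troubleshooting, but I'd like to see whether this quick fix helps first. Let me know :)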


oknwwptz3#

Hi @connor-mccorm. I managed to install 0.5rc2, but ran into a missing module problem (pytorch). Here is the setup I'm currently using:
requirements.txt
Here is the stack trace:

Traceback (most recent call last):
  File "model_ludwig.py", line 1, in <module>
    from ludwig.api import LudwigModel
  File "env/lib/python3.8/site-packages/ludwig/api.py", line 38, in <module>
    from ludwig.backend import Backend, initialize_backend
  File "env/lib/python3.8/site-packages/ludwig/backend/__init__.py", line 20, in <module>
    from ludwig.backend.base import Backend, LocalBackend
  File "env/lib/python3.8/site-packages/ludwig/backend/base.py", line 24, in <module>
    from ludwig.models.ecd import ECD
  File "env/lib/python3.8/site-packages/ludwig/models/ecd.py", line 10, in <module>
    from ludwig.combiners.combiners import Combiner, get_combiner_class
  File "env/lib/python3.8/site-packages/ludwig/combiners/combiners.py", line 28, in <module>
    from ludwig.encoders.sequence_encoders import ParallelCNN, StackedCNN, StackedCNNRNN, StackedParallelCNN, StackedRNN
  File "env/lib/python3.8/site-packages/ludwig/encoders/__init__.py", line 8, in <module>
    import ludwig.encoders.image_encoders
  File "env/lib/python3.8/site-packages/ludwig/encoders/image_encoders.py", line 24, in <module>
    from ludwig.modules.convolutional_modules import Conv2DStack, ResNet
  File "env/lib/python3.8/site-packages/ludwig/modules/convolutional_modules.py", line 22, in <module>
    from ludwig.utils.image_utils import get_img_output_shape
  File "env/lib/python3.8/site-packages/ludwig/utils/image_utils.py", line 26, in <module>
    import torchvision.transforms.functional as F
ModuleNotFoundError: No module named 'torchvision'

e5njpo684#

Hi @modernlearner, here are the PyTorch requirements:

torch==1.9.1 ; platform_system == "Windows"  # https://github.com/pytorch/pytorch/issues/65473
torch>=1.10.0 ; platform_system != "Windows"

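I believe installing ludwig 0.5rc2 should have installed all of its dependencies. So if you run into more dependency problems, we can try installing them all at once rather than resolving them one by one.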
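To see every missing module in one pass instead of hitting them one ImportError at a time, a small check like the following may help. It is only a sketch: the module names are the ones mentioned in this thread, and `find_missing` is a hypothetical helper name.

```python
import importlib.util


def find_missing(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Module names taken from this thread; extend the list as needed.
candidates = ["torch", "torchvision", "psutil"]
missing = find_missing(candidates)
print("missing modules:", missing)
```

Anything it prints can then be installed together in a single step.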


oxosxuxt5#

I added the torch dependency, but that did not work, so I then added the following dependencies to get it running:

torchvision==0.12.0
psutil==5.9.0

It ran, but ended with the same error:

audio_feature_padded[:broadcast_feature_length, :] = audio_feature[:max_length, :]
ValueError: could not broadcast input array from shape (360000,2,1) into shape (360000,1)

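I added a breakpoint in IntelliJ to compare the values, and it looks like audio_feature holds a pair of duplicated values that are not being merged into the correct shape.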
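The mismatch can be reproduced outside Ludwig with plain NumPy. This is a minimal sketch that uses small stand-in arrays with the same layout as the shapes in the error message. The variable names mirror the traceback, and `try_assign` is a hypothetical helper.

```python
import numpy as np

# Small stand-ins for the shapes in the traceback: (samples, 2, 1) into (samples, 1).
num_samples = 8
audio_feature = np.zeros((num_samples, 2, 1), dtype=np.float32)
audio_feature_padded = np.zeros((num_samples, 1), dtype=np.float32)


def try_assign(target, source):
    """Attempt the same kind of slice assignment; return the error message, if any."""
    try:
        target[: source.shape[0], :] = source
    except ValueError as exc:
        return str(exc)
    return None


error_message = try_assign(audio_feature_padded, audio_feature)
print(error_message)
```

The assignment fails because the source has an extra axis of size 2 that NumPy cannot drop when broadcasting into a two-dimensional target.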


oyjwcjzk6#

I think this is the fix.

# 0.5rc2
audio_feature_padded[:broadcast_feature_length, :] = audio_feature[:max_length, 1]

I then ran into a CUDA out-of-memory problem. However, it did get past the ValueError 🎉
For the other ludwig version I was using, this is the fix I applied:

# 0.4.1
audio_feature_padded[:broadcast_feature_length, :] = audio_feature[
                                                             :max_length, 1]

And it ran correctly.
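For reference, indexing with `1` on the second axis keeps only one of the two values per sample. The sketch below compares that with averaging over the size-2 axis. It assumes that axis holds two audio channels, which this thread does not confirm. The helper names `pick_channel` and `downmix` are made up for illustration and are not part of Ludwig.

```python
import numpy as np


def pick_channel(audio_feature, channel=1):
    """Keep a single entry of the size-2 axis, as the patched line does with index 1."""
    return audio_feature[:, channel]


def downmix(audio_feature):
    """Average over the size-2 axis instead of discarding one entry."""
    return audio_feature.mean(axis=1)


num_samples = 4
stereo = np.arange(num_samples * 2, dtype=np.float32).reshape(num_samples, 2, 1)

picked = np.zeros((num_samples, 1), dtype=np.float32)
picked[:num_samples, :] = pick_channel(stereo)  # shape (4, 1), no broadcast error

mixed = np.zeros((num_samples, 1), dtype=np.float32)
mixed[:num_samples, :] = downmix(stereo)  # shape (4, 1), no broadcast error

print(picked.ravel(), mixed.ravel())
```

If the two entries really are duplicates, as the breakpoint inspection suggested, both approaches give the same result. They differ only when the two entries hold different data.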
