Donut: size mismatch error when loading the pretrained model

Posted by 6jygbczu 9 months ago in Other

When I run "python3 app.py" for the demo, the pretrained model naver-clova-ix/donut-base-finetuned-docvqa fails to load with a size mismatch error.

The error message is:

  RuntimeError: Error(s) in loading state_dict for DonutModel:
  size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  size mismatch for encoder.model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  size mismatch for encoder.model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
  size mismatch for encoder.model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
  size mismatch for encoder.model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
  size mismatch for encoder.model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
  You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

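The error occurs because some encoder layer parameters in the pretrained checkpoint have different shapes from the corresponding parameters in the model being instantiated. As the message suggests, you can pass ignore_mismatched_sizes=True to from_pretrained, but this only skips the mismatched weights: the affected layers keep their freshly initialized values instead of the pretrained ones, so the error goes away without the underlying mismatch being fixed.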
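To make the effect of ignoring mismatched sizes concrete, here is a minimal sketch in plain Python. It is not the transformers implementation, only an illustration of the idea: parameters whose checkpoint shape differs from the model's shape are set aside rather than loaded. The function name `split_by_shape` and the key `hypothetical.matching.weight` are made up for this example; the other names and shapes are taken from the error message.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def split_by_shape(
    checkpoint: Dict[str, Shape], model: Dict[str, Shape]
) -> Tuple[List[str], List[str]]:
    """Return (loadable, mismatched) parameter names.

    A parameter is loadable only if the model has a parameter of the same
    name with exactly the same shape; anything missing or differently
    shaped ends up in the mismatched list.
    """
    loadable: List[str] = []
    mismatched: List[str] = []
    for name, ckpt_shape in checkpoint.items():
        if model.get(name) == ckpt_shape:
            loadable.append(name)
        else:
            mismatched.append(name)
    return loadable, mismatched


# Shapes copied from the error message, plus one hypothetical matching entry.
checkpoint_shapes = {
    "encoder.model.layers.1.downsample.norm.weight": (1024,),
    "encoder.model.layers.1.downsample.reduction.weight": (512, 1024),
    "hypothetical.matching.weight": (128,),  # invented for illustration
}
model_shapes = {
    "encoder.model.layers.1.downsample.norm.weight": (512,),
    "encoder.model.layers.1.downsample.reduction.weight": (256, 512),
    "hypothetical.matching.weight": (128,),  # invented for illustration
}

loadable, mismatched = split_by_shape(checkpoint_shapes, model_shapes)
print("would be loaded:", loadable)
print("would be skipped (left at initialization):", mismatched)
```

Anything in the skipped list would not receive pretrained values, which is why silencing the error this way can degrade results for a fine-tuned checkpoint.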


zy1mlcev  #1

I reproduced this error in "colab-demo-for-donut-base-finetuned-docvqa.ipynb".

Loading the pretrained model fails with the following error:

  RuntimeError: Error(s) in loading state_dict for DonutModel:
  size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  size mismatch for encoder.model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  size mismatch for encoder.model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
  size mismatch for encoder.model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
  size mismatch for encoder.model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
  size mismatch for encoder.model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
  You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

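The error is raised while the checkpoint's state dict is being loaded: the downsample layers in encoder stages 1 and 2 are constructed with different parameter shapes from those stored in the checkpoint. Editing the model's forward method would not help, because parameter shapes are fixed when the layers are constructed, not when forward runs. The model has to be built with the same architecture the checkpoint was saved from.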
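To see the pattern in the mismatch, the error text itself can be parsed. The sketch below uses only the standard library and assumes the message format shown above; `parse_mismatches` and `dimension_ratios` are names made up for this example. The sample text is two lines copied from the error.

```python
import re
from typing import List, Tuple

Shape = Tuple[int, ...]

# Matches one "size mismatch for ..." entry in the PyTorch error text.
PATTERN = re.compile(
    r"size mismatch for (?P<name>\S+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]*)\]\)"
)


def _to_shape(text: str) -> Shape:
    """Turn '512, 1024' into (512, 1024)."""
    return tuple(int(part) for part in text.split(",") if part.strip())


def parse_mismatches(message: str) -> List[Tuple[str, Shape, Shape]]:
    """Return (parameter name, checkpoint shape, model shape) per entry."""
    return [
        (m.group("name"), _to_shape(m.group("ckpt")), _to_shape(m.group("model")))
        for m in PATTERN.finditer(message)
    ]


def dimension_ratios(ckpt: Shape, model: Shape) -> Tuple[float, ...]:
    """Per-dimension ratio of checkpoint size to current model size."""
    return tuple(c / m for c, m in zip(ckpt, model))


MESSAGE = (
    "size mismatch for encoder.model.layers.1.downsample.norm.weight: "
    "copying a param with shape torch.Size([1024]) from checkpoint, "
    "the shape in current model is torch.Size([512]). "
    "size mismatch for encoder.model.layers.1.downsample.reduction.weight: "
    "copying a param with shape torch.Size([512, 1024]) from checkpoint, "
    "the shape in current model is torch.Size([256, 512])."
)

rows = parse_mismatches(MESSAGE)
for name, ckpt, model in rows:
    print(name, ckpt, model, dimension_ratios(ckpt, model))
```

For the entries in this thread, every checkpoint dimension is exactly twice the corresponding dimension in the current model, and only the downsample layers are affected. That points to the encoder being constructed differently rather than to a corrupted checkpoint.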


r9f1avp5  #2

I reproduced this error in "colab-demo-for-donut-base-finetuned-docvqa.ipynb".

A runtime error is raised at line 3530 of "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py":

  RuntimeError: Error(s) in loading state_dict for DonutModel:
  size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  size mismatch for encoder.model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  size mismatch for encoder.model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
  size mismatch for encoder.model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
  size mismatch for encoder.model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
  size mismatch for encoder.model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
  You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

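The likely cause is a structural mismatch between the pretrained checkpoint and the model built in the current environment. Changing the model's forward method would not resolve it, since the mismatch is in the parameter shapes created at construction time; the constructed architecture needs to match the checkpoint.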
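When the same checkpoint fails in a fresh Colab environment, it can help to record which library versions are installed before digging further. The sketch below only reports versions and does not diagnose anything by itself. The package list is an assumption: torch and transformers appear in the traceback, while timm and donut-python are guesses at the other packages involved and are not confirmed in this thread.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumed package list; adjust it to whatever the environment actually uses.
PACKAGES = ["torch", "transformers", "timm", "donut-python"]

for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Including this output in a bug report makes it easier for others to compare a failing environment with a working one.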


xwbd5t1u  #3

This may be related to #206.
