I know others have posted similar questions, but I couldn't find a suitable solution.
I wrote a custom Keras layer to average the DistilBert output based on a mask. That is, I have an input of dim=[batch_size, n_tokens_out, 768], which is masked along n_tokens_out based on a mask of dim=[batch_size, n_tokens_out]. The output should be dim=[batch_size, 768]. Here is the code for the layer:
```python
class CustomPool(tf.keras.layers.Layer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(CustomPool, self).__init__(**kwargs)

    def call(self, x, mask):
        masked = tf.cast(tf.boolean_mask(x, mask=mask, axis=0), tf.float32)
        mn = tf.reduce_mean(masked, axis=1, keepdims=True)
        return tf.reshape(mn, (tf.shape(x)[0], self.output_dim))

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)
```
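The size mismatch can be reproduced in isolation (a toy tensor, not the real DistilBert output): `tf.boolean_mask` with a 2-D mask flattens the batch and token axes together, so the number of rows it returns depends on how many mask entries are True rather than on the batch size.

```python
import tensorflow as tf

# Toy shapes, chosen for illustration only: [batch=2, tokens=4, hidden=3].
x = tf.random.normal([2, 4, 3])
mask = tf.constant([[True, True, False, False],
                    [True, True, True, False]])

# boolean_mask with a 2-D mask collapses the first two axes and keeps
# only the True positions: 5 True entries -> shape [5, 3].
masked = tf.boolean_mask(x, mask)
print(masked.shape)  # (5, 3), not [batch, tokens, hidden]
```

Because the row count varies with the mask contents, a later `tf.reshape` to a fixed `[batch_size, 768]` cannot work in general.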
The model compiles without errors, but as soon as training starts I get this error:
```
InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Input to reshape is a tensor with 967 values, but the requested shape has 12288
     [[node pooled_distilBert/CustomPooling/Reshape (defined at <ipython-input-245-a498c2817fb9>:13) ]]
     [[assert_greater_equal/Assert/AssertGuard/pivot_f/_3/_233]]
  (1) Invalid argument: Input to reshape is a tensor with 967 values, but the requested shape has 12288
     [[node pooled_distilBert/CustomPooling/Reshape (defined at <ipython-input-245-a498c2817fb9>:13) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_211523]

Errors may have originated from an input operation.
Input Source operations connected to node pooled_distilBert/CustomPooling/Reshape:
 pooled_distilBert/CustomPooling/Mean (defined at <ipython-input-245-a498c2817fb9>:11)
```
The size I get is smaller than the expected size, which seems strange to me.
Here is what the model looks like (TFDistilBertModel is from the huggingface transformers library):
```python
dbert_layer = TFDistilBertModel.from_pretrained('distilbert-base-uncased')

in_id = tf.keras.layers.Input(shape=(seq_max_length,), dtype='int32', name="input_ids")
in_mask = tf.keras.layers.Input(shape=(seq_max_length,), dtype='int32', name="input_masks")
dbert_inputs = [in_id, in_mask]

dbert_output = dbert_layer(dbert_inputs)[0]
x = CustomPool(output_dim=dbert_output.shape[2], name='CustomPooling')(dbert_output, in_mask)
dense1 = tf.keras.layers.Dense(256, activation='relu', name='dense256')(x)
pred = tf.keras.layers.Dense(n_classes, activation='softmax', name='MODEL_OUT')(dense1)

model = tf.keras.models.Model(inputs=dbert_inputs, outputs=pred, name='pooled_distilBert')
```
Any help here would be *greatly* appreciated, as I have looked through the existing questions and most of them end up being solved by specifying an input shape (which does not apply in my case).
1 Answer
Using tf.reshape before a pooling layer
I know my answer is a bit late, but I want to share my solution to this problem. The problem arises when you try to reshape a tensor whose size changes during training to a fixed size. Here, the output of the masking step contains a different number of elements on each batch, so a fixed reshape like tf.reshape(updated_inputs, shape=fixed_shape) raises exactly this error. It was my problem too :)) Hope it helps.
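One way to avoid the data-dependent reshape entirely (a sketch, not the original poster's exact fix) is to keep the `[batch, tokens, hidden]` shape and use the mask as averaging weights, so every intermediate tensor has a batch-stable shape:

```python
import tensorflow as tf

class MaskedMeanPool(tf.keras.layers.Layer):
    """Mean over the token axis, counting only positions where mask == 1."""

    def call(self, x, mask):
        # x:    [batch, n_tokens, hidden]
        # mask: [batch, n_tokens], 1 for real tokens, 0 for padding
        mask = tf.cast(mask, x.dtype)[:, :, tf.newaxis]   # [batch, n_tokens, 1]
        summed = tf.reduce_sum(x * mask, axis=1)          # [batch, hidden]
        # Guard against all-zero masks to avoid division by zero.
        counts = tf.maximum(tf.reduce_sum(mask, axis=1), 1.0)
        return summed / counts                            # [batch, hidden]

# Toy usage with made-up shapes: all-ones input averages back to ones.
x = tf.ones([2, 4, 3])
mask = tf.constant([[1, 1, 0, 0],
                    [1, 1, 1, 0]], dtype=tf.float32)
pooled = MaskedMeanPool()(x, mask)
print(pooled.shape)  # (2, 3)
```

Because the multiply-and-divide never changes the batch dimension, no `tf.reshape` (and no `compute_output_shape` trickery) is needed, and the layer works with variable numbers of real tokens per example.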