Cannot get a multi-output CNN to work (TensorFlow and Keras)

p8ekf7hl  posted on 2023-03-23  in Other

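I am currently working on a fiber tip tracking task for endoscopic video. For this I have two models:

  • a classifier that decides whether the image contains the fiber (is_visible)
  • a regressor that predicts the fiber tip position (x, y)

Both use a ResNet18 pretrained on ImageNet, and they work well. However, I ran into performance problems, so I decided to merge the two models into a single multi-output model. So far I have not been able to get it to work.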

TENSORFLOW:

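TensorFlow version: 2.10.1

Dataset:

My dataset is stored in HDF5 format. Each sample has:

  • an image (224, 224, 3)
  • a uint8 for the visibility flag
  • two floats for the fiber tip position (x, y)

I load this dataset with a custom generator, as follows: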

output_types = (tf.float32, tf.uint8, tf.float32)
output_shapes = (
    tf.TensorShape((None, image_height, image_width, number_of_channels)),  # image
    tf.TensorShape((None, 1)),                                              # is_visible
    tf.TensorShape((None, 1, 1, 2)),                                        # x, y
)

train_dataset = tf.data.Dataset.from_generator(
    generator, output_types=output_types, output_shapes=output_shapes,
)

Model:

My model is defined as follows:

model = ResNet18(input_shape=(224, 224, 3), weights="imagenet", include_top=False)
inputLayer = model.input
innerLayer = tf.keras.layers.Flatten()(model.output)

is_visible = tf.keras.layers.Dense(1, activation="sigmoid", name="is_visible")(innerLayer)

position = tf.keras.layers.Dense(2)(innerLayer)
position = tf.keras.layers.Reshape((1, 1, 2), name="position")(position)

model = tf.keras.Model(inputs=[inputLayer], outputs=[is_visible, position])
adam = tf.keras.optimizers.Adam(1e-4)
model.compile(
    optimizer=adam,
    loss={
        "is_visible": "binary_crossentropy",
        "position": "mean_squared_error",
    },
    loss_weights={
        "is_visible": 1.0,
        "position": 1.0
    },
    metrics={
        "is_visible": "accuracy",
        "position": "mean_squared_error"
    },
)

Problem:

The dataset itself works fine, and I can iterate over every batch. But when it comes to training:

model.fit(
    train_dataset,
    validation_data=validation_dataset,
    epochs=100000,
    callbacks=callbacks,
)

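1. I get the following error:

  • ValueError: Can not squeeze dim[3], expected a dimension of 1, got 2 for '{{node mean_squared_error/weighted_loss/Squeeze}} = Squeeze[T=DT_FLOAT, squeeze_dims=[-1]]' with input shapes: [?,1,1,2].

2. I tried changing the dataset format like this: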

output_types = (tf.float32, tf.uint8, tf.float32, tf.float32)
output_shapes = (
    tf.TensorShape((None, image_height, image_width, number_of_channels)),  # image
    tf.TensorShape((None, 1)),                                              # is_visible
    tf.TensorShape((None, 1)),                                              # x
    tf.TensorShape((None, 1)),                                              # y
)

But this leads to another error:

  • ValueError: Data is expected to be in format `x`, `(x,)`, `(x, y)`, or `(x, y, sample_weight)`, found: (<tf.Tensor 'IteratorGetNext:0' shape=(None, 224, 224, 3) dtype=float32>, <tf.Tensor 'IteratorGetNext:1' shape=(None, 1) dtype=uint8>, <tf.Tensor 'IteratorGetNext:2' shape=(None, 1) dtype=float32>, <tf.Tensor 'IteratorGetNext:3' shape=(None, 1) dtype=float32>)

I also tried packing the is_visible and (x, y) values returned by train_dataset into a dictionary, like this:

yield image_batch, {"is_visible": is_visible_batch, "position": position_batch}

I also tried these variants:

yield image_batch, (is_visible_batch, position_batch)
yield image_batch, [is_visible_batch, position_batch]

But none of this helped.
Can anyone tell me what I am doing wrong? I am completely stuck.

mpgws1up  #1

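Answering my own question: I managed to get this working.
I have changed my code slightly (removed the redundant dimensions), but don't let that distract you. Below are the exact changes that fixed the problem. My dataset now looks like this: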

output_types = (
    tf.float32,
    (
        tf.float32,
        tf.uint8
    )
)
output_shapes = (
    tf.TensorShape((256, 256, 3)),
    (
        tf.TensorShape((2,)),
        tf.TensorShape((1,)),
    )
)
train_dataset = tf.data.Dataset.from_generator(
    generator, output_types=output_types, output_shapes=output_shapes,
)

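Note that each dataset element consists of two parts:

  • the first is the tensor for the input image
  • the second is a tuple holding the multiple output targets (the pixel coordinates and the visibility flag)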
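As for why the flat tuples in the question failed, my reading of the two error messages (the original post does not state this) is that Keras treats a three-item element as `(x, y, sample_weight)` and rejects a four-item element outright. The sketch below illustrates this with `tf.keras.utils.unpack_x_y_sample_weight`, the public helper that Keras documents for unpacking data in a custom `train_step`. The array contents are placeholders.

```python
import numpy as np
import tensorflow as tf

# Placeholder arrays with the shapes used in the question.
image = np.zeros((1, 224, 224, 3), dtype=np.float32)
is_visible = np.ones((1, 1), dtype=np.uint8)
position = np.zeros((1, 1, 1, 2), dtype=np.float32)

# Three top-level items: the third is taken as sample_weight,
# not as a second target.
x, y, sample_weight = tf.keras.utils.unpack_x_y_sample_weight(
    (image, is_visible, position)
)
weight_shape_for_flat_tuple = sample_weight.shape
print(weight_shape_for_flat_tuple)

# Four top-level items: no valid interpretation, so a ValueError is raised.
try:
    tf.keras.utils.unpack_x_y_sample_weight(
        (image, is_visible, position, position)
    )
    four_items_rejected = False
except ValueError:
    four_items_rejected = True

# Nested targets: still two top-level items, so both targets end up in y.
x, y, sample_weight = tf.keras.utils.unpack_x_y_sample_weight(
    (image, (position, is_visible))
)
print(len(y), sample_weight)
```

If that reading is right, the `Squeeze` error in the question came from the `[?,1,1,2]` position tensor being used as a sample weight for the loss.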

If you have multiple targets, you need to wrap them in a tuple, like this:

output_types = (
    #input, in my case I have a single input image
    tf.float32,     #image type

    #output, in my case I have a multioutput dataset (and multioutput model),
    #so we need to wrap target types into tuple
    (
        tf.float32, #type for regression task - fiber tip position (pixel coordinates in range [0; 1])
        tf.uint8    #type for classification task - is fiber visible (0 - not visible, 1 - visible)
    )
)

The dataset shapes follow the same structure:

output_shapes = (
    #single tensor for input image
    tf.TensorShape((256, 256, 3)),

    #tuple of tensors for multiple output
    (
        tf.TensorShape((2,)),   #two coordinates for x, y position
        tf.TensorShape((1,)),   #single value for classification task (visibility flag)
    )
)
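To check the structure end to end, here is a minimal runnable sketch that uses the same nested types and shapes. `dummy_generator` is a hypothetical stand-in for the real HDF5 generator and yields random data one sample at a time. The `.batch(2)` call is my assumption, since the shapes above are per-sample and the post does not show where batching happens.

```python
import numpy as np
import tensorflow as tf


def dummy_generator(num_samples=4):
    """Hypothetical stand-in for the HDF5 generator (random data)."""
    rng = np.random.default_rng(0)
    for _ in range(num_samples):
        image = rng.random((256, 256, 3), dtype=np.float32)
        position = rng.random(2, dtype=np.float32)   # (x, y) in [0, 1)
        is_visible = np.array([1], dtype=np.uint8)   # visibility flag
        # Two top-level items: the input, then a tuple with both targets.
        yield image, (position, is_visible)


dataset = tf.data.Dataset.from_generator(
    dummy_generator,
    output_types=(tf.float32, (tf.float32, tf.uint8)),
    output_shapes=(
        tf.TensorShape((256, 256, 3)),
        (tf.TensorShape((2,)), tf.TensorShape((1,))),
    ),
).batch(2)  # assumption: batching is done after the generator

image_spec, target_specs = dataset.element_spec
print(image_spec.shape, [spec.shape for spec in target_specs])

# Each batch unpacks as (image, (position, is_visible)).
for image_batch, (position_batch, visible_batch) in dataset.take(1):
    print(image_batch.shape, position_batch.shape, visible_batch.shape)
```

The order of the targets in the tuple has to match the order of the model's `outputs` list, here position first and visibility second.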

Here is my model:

#I've changed my DNN architecture to VGG16, but that is not the point; this should work for any network
#as long as the dataset types and shapes are configured properly
model = tf.keras.applications.VGG16(input_shape=(224, 224, 3), weights="imagenet", include_top=False)

model.trainable = False
for layer in model.layers[-3:]:
    layer.trainable = True

inputLayer = model.input

hiddenLayers = tf.keras.layers.Flatten(name="flatten")(model.output)

position = tf.keras.layers.Dense(2, activation="sigmoid", name="position")(hiddenLayers)
is_visible = tf.keras.layers.Dense(1, activation="sigmoid", name="is_visible")(hiddenLayers)

model = tf.keras.Model(inputs=[inputLayer], outputs=[position, is_visible])
adam = tf.keras.optimizers.Adam(1e-4)
model.compile(
    optimizer=adam,
    loss={
        "position": "mean_squared_error",
        "is_visible": "binary_crossentropy"
    },
    loss_weights={
        "position": 1.0,
        "is_visible": 1.0
    },
    metrics={
        "position": "mean_squared_error",
        "is_visible": "accuracy"
    }
)
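Since both heads are named and the losses are keyed by those names, a possible alternative to relying on tuple order is to key the targets by output name as well. This is not part of the original answer. It is a sketch that assumes Keras matches dict targets to the output layers named `position` and `is_visible`, and it uses small placeholder arrays instead of the real data.

```python
import numpy as np
import tensorflow as tf

# Placeholder data: 4 tiny images with a position and a visibility flag each.
images = np.zeros((4, 8, 8, 3), dtype=np.float32)
positions = np.zeros((4, 2), dtype=np.float32)
visibility = np.ones((4, 1), dtype=np.uint8)

# Same nesting as in the answer: (input, (position, is_visible)).
tuple_dataset = tf.data.Dataset.from_tensor_slices(
    (images, (positions, visibility))
)


def name_targets(image, targets):
    """Turn the target tuple into a dict keyed by the output layer names."""
    position, is_visible = targets
    return image, {"position": position, "is_visible": is_visible}


named_dataset = tuple_dataset.map(name_targets).batch(2)
print(named_dataset.element_spec[1])
```

With named targets, swapping the order of the model's outputs should no longer silently pair a target with the wrong head.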

Finally, call the fit method to train:

model.fit(
    train_dataset,
    validation_data=validation_dataset,
    epochs=100000,
    callbacks=callbacks,
)


I hope this helps other newcomers who go down the same path.
