Keras transfer learning: setting a layer's trainable to True has no effect

Posted by c6ubokkw on 2023-01-13 in Other

I want to fine-tune EfficientNet with tf.keras (TensorFlow 2.3), but I cannot properly change the trainable state of the layers.

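import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB3

img_size = (512, 512)  # inferred from the input shape in the model summary below

data_augmentation_layers = tf.keras.Sequential([
 keras.layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
 keras.layers.experimental.preprocessing.RandomRotation(0.8)])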

efficientnet = EfficientNetB3(weights="imagenet", include_top=False,
                                input_shape=(*img_size, 3))

#Setting to not trainable as described in the standard keras FAQ
efficientnet.trainable = False

inputs = keras.layers.Input(shape=(*img_size, 3))
augmented = data_augmentation_layers(inputs)
base = efficientnet(augmented, training=False)
pooling = keras.layers.GlobalAveragePooling2D()(base)
outputs = keras.layers.Dense(5, activation="softmax")(pooling)

model = keras.Model(inputs=inputs, outputs=outputs)

model.compile(loss="categorical_crossentropy", optimizer=keras_opt, metrics=["categorical_accuracy"])

I do this so that my custom top, with its randomly initialized weights, does not immediately destroy the pretrained weights.

Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 512, 512, 3)]     0         
_________________________________________________________________
sequential (Sequential)      (None, 512, 512, 3)       0         
_________________________________________________________________
efficientnetb3 (Functional)  (None, 16, 16, 1536)      10783535  
_________________________________________________________________
global_average_pooling2d (Gl (None, 1536)              0         
_________________________________________________________________
dense (Dense)                (None, 5)                 7685      
=================================================================
Total params: 10,791,220
Trainable params: 7,685
Non-trainable params: 10,783,535

Up to this point everything seemed to work. I trained my model for two epochs and then wanted to start fine-tuning the EfficientNet base, so I called:

for l in model.get_layer("efficientnetb3").layers:
  if not isinstance(l, keras.layers.BatchNormalization):
    l.trainable = True

model.compile(loss="categorical_crossentropy", optimizer=keras_opt, metrics=["categorical_accuracy"])

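I recompiled and printed the summary again, only to find that the number of non-trainable weights stayed the same. Fitting also gave no better results than keeping the base frozen.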

dense (Dense)                (None, 5)                 7685      
=================================================================
Total params: 10,791,220
Trainable params: 7,685
Non-trainable params: 10,783,535

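P.S.: I also tried efficientnet3.trainable = True, but that had no effect either.
Could it be because I am using a Sequential model and a functional model at the same time?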
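The symptom can be reproduced without downloading any weights. The following is a minimal sketch, not the asker's actual code: the tiny `base` model is a hypothetical stand-in for EfficientNetB3, and the names `frozen`, `children_only`, `parent_flag` and `parent_first` are made up for illustration. As far as I understand the TF 2.x implementation, a nested model whose own `trainable` flag is False reports no trainable weights regardless of its sublayers' flags; newer Keras releases may count them differently, so the sketch prints the numbers instead of assuming them.

```python
from tensorflow import keras


def make_base():
    # Hypothetical tiny stand-in for EfficientNetB3 (no weights are downloaded).
    inp = keras.layers.Input(shape=(32, 32, 3))
    x = keras.layers.Conv2D(4, 3, name="conv")(inp)
    x = keras.layers.BatchNormalization(name="bn")(x)
    return keras.Model(inp, x, name="base")


base = make_base()
base.trainable = False  # recursively sets trainable=False on every sublayer

inputs = keras.layers.Input(shape=(32, 32, 3))
x = base(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(5, activation="softmax")(x)
model = keras.Model(inputs, outputs)

# Only the Dense kernel and bias are trainable at this point.
frozen = len(model.trainable_weights)

# What the question does: flip the sublayers, leave the parent model untouched.
for l in model.get_layer("base").layers:
    if not isinstance(l, keras.layers.BatchNormalization):
        l.trainable = True
children_only = len(model.trainable_weights)
parent_flag = model.get_layer("base").trainable  # the parent's own flag is not updated

# Alternative order: unfreeze the parent first, then re-freeze BatchNormalization.
base.trainable = True
for l in base.layers:
    if isinstance(l, keras.layers.BatchNormalization):
        l.trainable = False
parent_first = len(model.trainable_weights)

print("frozen:", frozen)
print("children only:", children_only, "| parent flag:", parent_flag)
print("parent first:", parent_first)
```

If `children only` equals `frozen` on your installation, the parent's flag is masking the sublayers, and the parent-first order is worth trying. Changes to `trainable` only take effect after `model.compile(...)` is called again.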

kzipqqlq (answer #1)

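For me, the problem was using the Sequential API for part of the model. When I switched to the Sequential API, my model.summary() displayed all the sublayers, and it was possible to set some of them as trainable and others not.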
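One way to read this answer is that the base network's layers should sit directly in the outer model rather than behind a single nested layer. The sketch below is an assumption about what that could look like, not the answerer's code: the head is attached to `base.output` instead of calling `base(...)`, so `summary()` lists every sublayer and each `trainable` flag counts directly. The tiny `base` is again a hypothetical stand-in for EfficientNetB3, and `flat_model`, `layer_names` and `trainable_count` are illustrative names.

```python
from tensorflow import keras

# Hypothetical tiny stand-in for EfficientNetB3 (no weights are downloaded).
base_in = keras.layers.Input(shape=(32, 32, 3))
h = keras.layers.Conv2D(4, 3, name="conv")(base_in)
h = keras.layers.BatchNormalization(name="bn")(h)
base = keras.Model(base_in, h, name="base")
base.trainable = False  # freeze everything for the first training phase

# Attach the head to base.output instead of calling base(...) as a layer,
# so the outer model contains the base's layers directly.
x = keras.layers.GlobalAveragePooling2D()(base.output)
outputs = keras.layers.Dense(5, activation="softmax")(x)
flat_model = keras.Model(base.input, outputs)

layer_names = [l.name for l in flat_model.layers]

# Fine-tuning phase: unfreeze everything except BatchNormalization.
for l in flat_model.layers:
    if not isinstance(l, keras.layers.BatchNormalization):
        l.trainable = True

# Conv kernel and bias plus Dense kernel and bias.
trainable_count = len(flat_model.trainable_weights)
flat_model.summary()
```

With the real network, the augmentation layers would have to be wired in differently, for example through the `input_tensor` argument of `EfficientNetB3`. The model still needs to be recompiled after the flags change.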
