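How can I average the weights of Keras models when I train several models with the same architecture but different initializations?
Right now my code looks like this: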
datagen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=2.0/28,
                             height_shift_range=2.0/28
                             )
epochs = 40
lr = (1.234e-3)
optimizer = Adam(lr=lr)
main_input = Input(shape= (28,28,1), name='main_input')
sub_models = []
for i in range(5):
    x = Conv2D(32, kernel_size=(3,3), strides=1)(main_input)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPool2D(pool_size=2)(x)
    x = Conv2D(64, kernel_size=(3,3), strides=1)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPool2D(pool_size=2)(x)
    x = Conv2D(64, kernel_size=(3,3), strides=1)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Flatten()(x)
    x = Dense(1024)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Dropout(0.1)(x)
    x = Dense(256)(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Dropout(0.4)(x)
    x = Dense(10, activation='softmax')(x)
    sub_models.append(x)
main_output = keras.layers.average(sub_models)
model = Model(inputs=[main_input], outputs=[main_output])
model.compile(loss='categorical_crossentropy', metrics=['accuracy'],
              optimizer=optimizer)
print(model.summary())
plot_model(model, to_file='model.png')
filepath="weights.best.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
tensorboard = TensorBoard(log_dir='./Graph', histogram_freq=0, write_graph=True, write_images=True)
callbacks = [checkpoint, tensorboard]
model.fit_generator(datagen.flow(X_train, y_train, batch_size=128),
                    steps_per_epoch=len(X_train) / 128,
                    epochs=epochs,
                    callbacks=callbacks,
                    verbose=1,
                    validation_data=(X_test, y_test))
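So right now I only average the outputs of the last layer, but I want to average the weights of all layers after training each model separately.
Thanks!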
2 answers
Answer 1 (rta7y2nd)
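Suppose `models` is a collection of your models. First, collect the weights of every model. Next, build a new set of weights by averaging them element-wise across the models. Finally, set those averaged weights on a new model.
Averaging weights may well be a bad idea, but if you want to try it, this is the approach to follow.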
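The three steps above can be sketched as follows. This is a minimal sketch rather than the answerer's original code: the helper name `average_weights` is hypothetical, and it assumes every model has the same architecture, so that `get_weights()` returns arrays with matching shapes in the same order.

```python
import numpy as np


def average_weights(weight_lists):
    """Average weight arrays position by position across several models.

    weight_lists holds one entry per model; each entry is a list of arrays
    such as the one returned by Keras `model.get_weights()`.
    (`average_weights` is a hypothetical helper name, not a Keras API.)
    """
    averaged = []
    # zip(*weight_lists) groups the i-th array of every model together.
    for arrays in zip(*weight_lists):
        # Stack along a new leading "model" axis, then average over it.
        averaged.append(np.mean(np.stack(arrays, axis=0), axis=0))
    return averaged
```

With trained Keras models this would typically be used as `weights = [m.get_weights() for m in models]` followed by `new_model.set_weights(average_weights(weights))`, where `new_model` is assumed to be a freshly built model with the same architecture.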
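Answer 2 (nzrxty8p)
I can't comment on the accepted answer, but to get it working with `tf.keras` on TensorFlow 2.0, I had to turn the list built inside the loop into a numpy array.
If the input models need to be weighted differently, replace `np.array(w).mean(axis=0)` with `np.average(np.array(w), axis=0, weights=relative_weights)`, where `relative_weights` is an array with one weighting factor per model.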
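The weighted variant described above could look roughly like this. It is a sketch under the same assumptions as before (identical architectures, weights obtained from `get_weights()`); the helper name `weighted_average_weights` is hypothetical, while `relative_weights` follows the name used in the answer.

```python
import numpy as np


def weighted_average_weights(weight_lists, relative_weights):
    """Weighted average of weight arrays across several models.

    weight_lists: one list of arrays per model (e.g. from `get_weights()`).
    relative_weights: one weighting factor per model.
    (`weighted_average_weights` is a hypothetical helper name.)
    """
    relative_weights = np.asarray(relative_weights, dtype=float)
    averaged = []
    for arrays in zip(*weight_lists):
        # Convert the tuple of per-model arrays into one numpy array,
        # with the models along axis 0.
        stacked = np.stack(arrays, axis=0)
        # np.average normalizes by the sum of the weighting factors.
        averaged.append(np.average(stacked, axis=0, weights=relative_weights))
    return averaged
```

Because `np.average` divides by the sum of the factors, `relative_weights` does not need to sum to 1; passing equal factors reproduces the plain mean.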