keras: How to match the input and output shapes of a Conv2D autoencoder

lymgl2op  posted on 2022-11-13  in Other

I have a set of white images of shape (1000, 11, 1). I am trying to adapt the Keras MNIST example to my data, so I wrote the following code:

from tensorflow.keras import layers
from tensorflow.keras.models import Model

input_img = layers.Input(shape=(1000, 11, 1))

x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)

x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, (3, 3), activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

When I print the summary, I can see that the output shape differs from the input shape:

Model: "model_16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_18 (InputLayer)        [(None, 1000, 11, 1)]     0         
_________________________________________________________________
conv2d_119 (Conv2D)          (None, 1000, 11, 16)      160       
_________________________________________________________________
max_pooling2d_51 (MaxPooling (None, 500, 6, 16)        0         
_________________________________________________________________
conv2d_120 (Conv2D)          (None, 500, 6, 8)         1160      
_________________________________________________________________
max_pooling2d_52 (MaxPooling (None, 250, 3, 8)         0         
_________________________________________________________________
conv2d_121 (Conv2D)          (None, 250, 3, 8)         584       
_________________________________________________________________
max_pooling2d_53 (MaxPooling (None, 125, 2, 8)         0         
_________________________________________________________________
conv2d_122 (Conv2D)          (None, 125, 2, 8)         584       
_________________________________________________________________
up_sampling2d_51 (UpSampling (None, 250, 4, 8)         0         
_________________________________________________________________
conv2d_123 (Conv2D)          (None, 250, 4, 8)         584       
_________________________________________________________________
up_sampling2d_52 (UpSampling (None, 500, 8, 8)         0         
_________________________________________________________________
conv2d_124 (Conv2D)          (None, 498, 6, 16)        1168      
_________________________________________________________________
up_sampling2d_53 (UpSampling (None, 996, 12, 16)       0         
_________________________________________________________________
conv2d_125 (Conv2D)          (None, 996, 12, 1)        145       
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
_________________________________________________________________

As a result, training fails with this error:

ValueError: logits and labels must have the same shape ((None, 996, 12, 1) vs (None, 1000, 11, 1))

What am I doing wrong? How can I fix my code so that it works with my image dimensions?

mpbci0fu (answer #1)

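You can modify the decoder as shown below so that its output shape matches the encoder's input shape. The Cropping2D layer crops along the spatial dimensions, i.e. height and width. Here the last convolution outputs (1000, 16), so cropping 3 columns from one side of the width and 2 from the other brings it back to (1000, 11).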
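To see where the mismatch comes from, the shape arithmetic can be traced by hand. The sketch below is plain Python, not Keras; the helper names (`pool_same`, `upsample`, `conv_valid`, `encode`, `question_decoder`, `answer_decoder`) are made up for illustration. It assumes the usual rules: pooling with padding='same' rounds up, upsampling multiplies, and a 3x3 convolution without padding='same' removes 2 pixels per dimension.

```python
import math


def pool_same(n, k=2):
    # MaxPooling2D((k, k), padding='same') rounds up: ceil(n / k)
    return math.ceil(n / k)


def upsample(n, k=2):
    # UpSampling2D((k, k)) multiplies the size by k
    return n * k


def conv_valid(n, kernel=3):
    # Conv2D with the default padding='valid' shrinks the size by kernel - 1
    return n - (kernel - 1)


def encode(n, stages=3):
    # Three pooling stages; the 'same' convolutions in between keep the size
    for _ in range(stages):
        n = pool_same(n)
    return n


def question_decoder(n):
    # Decoder from the question: the third Conv2D lacks padding='same'
    n = upsample(n)
    n = upsample(n)
    n = conv_valid(n)
    n = upsample(n)
    return n


def answer_decoder(n):
    # Decoder from this answer: UpSampling2D((2, 2)) then UpSampling2D((4, 4))
    n = upsample(n, 2)
    n = upsample(n, 4)
    return n


for name, size in (("height", 1000), ("width", 11)):
    code = encode(size)
    print(name, size, "-> encoded", code,
          "| question decoder:", question_decoder(code),
          "| answer decoder:", answer_decoder(code),
          "| columns to crop:", answer_decoder(code) - size)
```

The width is the awkward dimension: 11 rounds up to 6, 3, and then 2, so upsampling by a total factor of 8 gives 16 rather than 11, and the surplus of 5 is what the cropping layer removes. The height divides cleanly (1000, 500, 250, 125), so it needs no cropping once every convolution uses padding='same'.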

from tensorflow.keras import layers
from tensorflow.keras.models import Model

input_img = layers.Input(shape=(1000, 11, 1))

# Encoder: three conv + pool stages, (1000, 11) -> (125, 2)
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)

# Decoder: upsample by 2, then by 4, (125, 2) -> (1000, 16)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((4, 4))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
# Add a cropping layer: keep the height, trim the width from 16 to 11
decoded = layers.Cropping2D(cropping=((0, 0), (3, 2)))(decoded)

model = Model(input_img, decoded)
model.summary()

Output of model.summary():

Model: "model_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_9 (InputLayer)        [(None, 1000, 11, 1)]     0         
                                                                 
 conv2d_49 (Conv2D)          (None, 1000, 11, 16)      160       
                                                                 
 max_pooling2d_24 (MaxPoolin  (None, 500, 6, 16)       0         
 g2D)                                                            
                                                                 
 conv2d_50 (Conv2D)          (None, 500, 6, 8)         1160      
                                                                 
 max_pooling2d_25 (MaxPoolin  (None, 250, 3, 8)        0         
 g2D)                                                            
                                                                 
 conv2d_51 (Conv2D)          (None, 250, 3, 8)         584       
                                                                 
 max_pooling2d_26 (MaxPoolin  (None, 125, 2, 8)        0         
 g2D)                                                            
                                                                 
 conv2d_52 (Conv2D)          (None, 125, 2, 8)         584       
                                                                 
 up_sampling2d_24 (UpSamplin  (None, 250, 4, 8)        0         
 g2D)                                                            
                                                                 
 conv2d_53 (Conv2D)          (None, 250, 4, 8)         584       
                                                                 
 up_sampling2d_25 (UpSamplin  (None, 1000, 16, 8)      0         
 g2D)                                                            
                                                                 
 conv2d_54 (Conv2D)          (None, 1000, 16, 1)       73        
                                                                 
 cropping2d_6 (Cropping2D)   (None, 1000, 11, 1)       0         
                                                                 
=================================================================
Total params: 3,145
Trainable params: 3,145
Non-trainable params: 0
_________________________________________________________________
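For other image sizes, the cropping amounts can be worked out the same way instead of by trial and error. The following is a minimal sketch in plain Python, assuming an encoder made of `stages` 2x2 poolings with padding='same', a decoder that upsamples by the same total factor, and padding='same' on every convolution. The function names `cropping_for` and `cropping2d_argument` are hypothetical helpers, not part of Keras, and putting the larger share of an odd surplus on the first side is an arbitrary choice that happens to match the (3, 2) used above.

```python
import math


def cropping_for(size, stages, pool=2):
    """Return (before, after) crop amounts for one spatial dimension."""
    reduced = size
    for _ in range(stages):
        # Pooling with padding='same' rounds up
        reduced = math.ceil(reduced / pool)
    # Size after the decoder has upsampled by the same total factor
    restored = reduced * pool ** stages
    excess = restored - size
    after = excess // 2
    before = excess - after
    return (before, after)


def cropping2d_argument(input_shape, stages=3):
    """Build the nested tuple expected by Cropping2D(cropping=...)."""
    height, width = input_shape[:2]
    return (cropping_for(height, stages), cropping_for(width, stages))


print(cropping2d_argument((1000, 11, 1)))  # ((0, 0), (3, 2))
```

The result can be passed straight to `layers.Cropping2D(cropping=...)`. Cropping discards the extra border pixels that upsampling produced, so which side they are removed from does not change the output shape.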
