I'm having trouble training an autoencoding CNN. My goal is to cluster document images (receipts, letters, etc.) in an unsupervised way (by the way, do you know of any algorithms for this other than autoencoders?).
So I tried to build an autoencoder, but I always get strange decoded outputs and I can't figure out what the problem is. I started with a very simple model without much compression:
Layer (type) Output Shape Param #
=================================================================
conv2d_62 (Conv2D) (None, 100, 76, 16) 448
_________________________________________________________________
activation_62 (Activation) (None, 100, 76, 16) 0
_________________________________________________________________
conv2d_63 (Conv2D) (None, 50, 38, 32) 4640
_________________________________________________________________
activation_63 (Activation) (None, 50, 38, 32) 0
_________________________________________________________________
conv2d_64 (Conv2D) (None, 50, 38, 32) 9248
_________________________________________________________________
activation_64 (Activation) (None, 50, 38, 32) 0
_________________________________________________________________
up_sampling2d_26 (UpSampling (None, 100, 76, 32) 0
_________________________________________________________________
conv2d_65 (Conv2D) (None, 100, 76, 16) 4624
_________________________________________________________________
activation_65 (Activation) (None, 100, 76, 16) 0
_________________________________________________________________
up_sampling2d_27 (UpSampling (None, 200, 152, 16) 0
_________________________________________________________________
conv2d_66 (Conv2D) (None, 200, 152, 3) 435
_________________________________________________________________
activation_66 (Activation) (None, 200, 152, 3) 0
=================================================================
Total params: 19,395
Trainable params: 19,395
Non-trainable params: 0
I train on a small number of inputs (~200) so that training is fast and I can debug more quickly.
After 20 epochs with a batch size of 32, the model seems to converge:
Epoch 1/20
4/4 [==============================] - 5s 1s/step - loss: 0.4359
Epoch 2/20
4/4 [==============================] - 5s 1s/step - loss: 0.4290
Epoch 3/20
4/4 [==============================] - 4s 904ms/step - loss: 0.4192
Epoch 4/20
4/4 [==============================] - 5s 1s/step - loss: 0.4045
Epoch 5/20
4/4 [==============================] - 3s 783ms/step - loss: 0.3886
Epoch 6/20
4/4 [==============================] - 3s 797ms/step - loss: 0.3706
Epoch 7/20
4/4 [==============================] - 5s 1s/step - loss: 0.3393
Epoch 8/20
4/4 [==============================] - 3s 777ms/step - loss: 0.3165
Epoch 9/20
4/4 [==============================] - 3s 850ms/step - loss: 0.2786
Epoch 10/20
4/4 [==============================] - 3s 780ms/step - loss: 0.2436
Epoch 11/20
4/4 [==============================] - 3s 817ms/step - loss: 0.2036
Epoch 12/20
4/4 [==============================] - 3s 771ms/step - loss: 0.1745
Epoch 13/20
4/4 [==============================] - 5s 1s/step - loss: 0.1347
Epoch 14/20
4/4 [==============================] - 3s 820ms/step - loss: 0.1150
Epoch 15/20
4/4 [==============================] - 5s 1s/step - loss: 0.1017
Epoch 16/20
4/4 [==============================] - 3s 792ms/step - loss: 0.0886
Epoch 17/20
4/4 [==============================] - 3s 789ms/step - loss: 0.0868
Epoch 18/20
4/4 [==============================] - 3s 842ms/step - loss: 0.0844
Epoch 19/20
4/4 [==============================] - 3s 762ms/step - loss: 0.0797
Epoch 20/20
4/4 [==============================] - 3s 779ms/step - loss: 0.0768
But the output images all look like this:
[image: output of autoencoder (example)]
For the loss I used mean absolute error with the SGD optimizer (other optimizers converged less well).
I tried increasing the number of epochs, but the loss stagnates around 0.07 and does not go down any further.
What am I doing wrong? Any ideas for improvement? Thanks in advance.
Edit: here is the code
import numpy as np
import matplotlib.pyplot as plt
import tensorflow.keras
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Activation, Flatten, Input
from tensorflow.keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Reshape
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Data pipeline: rescale pixels to [0, 1] and apply very light augmentation.
# class_mode='input' makes the generator yield (image, image) pairs,
# which is what an autoencoder needs.
datagen = ImageDataGenerator(rescale=1./255,
                             zca_whitening=False,
                             rotation_range=0.2,
                             width_shift_range=0.005,
                             height_shift_range=0.005,
                             zoom_range=0.005)
train_generator = datagen.flow_from_directory('fp_img',
                                              class_mode='input',
                                              target_size=image_dims,
                                              batch_size=batch_size,
                                              shuffle=True)

input_shape = image_rgb_dims

# Define the model
model = Sequential()
# Encoder: two strided convolutions halve the spatial dimensions twice.
model.add(Conv2D(16, (3, 3), strides=2, padding='same', input_shape=image_rgb_dims))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3), strides=2, padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3), padding='same'))
model.add(Activation('relu'))
# Decoder: upsample back to the original resolution.
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(16, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(3, (3, 3), padding='same'))
model.add(Activation('sigmoid'))
model.summary()

# Compile the model
model.compile(optimizer='adagrad', loss='mean_absolute_error')

# Train the model
model.fit(train_generator,
          steps_per_epoch=n_images // batch_size,
          epochs=20)
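For reference, the clustering step I have in mind once the reconstruction works would look roughly like the sketch below. It only illustrates the idea: it assumes scikit-learn's KMeans is available, takes the bottleneck to be the activation after the third Conv2D (layer index 5 in the Sequential model above), and n_clusters=5 is an arbitrary placeholder.

from sklearn.cluster import KMeans

# Reuse the trained layers up to the bottleneck as a standalone encoder
# (output shape (50, 38, 32) for the 200x152 inputs above).
encoder = Model(inputs=model.input, outputs=model.layers[5].output)

# Encode one batch of images and flatten the feature maps into vectors.
images, _ = next(train_generator)
features = encoder.predict(images).reshape(len(images), -1)

# Cluster the encoded documents.
kmeans = KMeans(n_clusters=5, random_state=0).fit(features)
print(kmeans.labels_)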
2 Answers
v6ylcynt1#
A few suggestions:
w6mmgewl2#
Are the images you are trying to display all matrices of uint8 values in the 0-255 range?
I've seen this kind of white output before when displaying images: the values were in the correct range, but the array was still float32.
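A minimal sketch of the check suggested here, assuming matplotlib/NumPy and the model and train_generator from the question (decoded and images are just illustrative names):

import numpy as np
import matplotlib.pyplot as plt

images, _ = next(train_generator)
decoded = model.predict(images)[0]                  # one reconstructed image
print(decoded.dtype, decoded.min(), decoded.max())  # inspect dtype and value range

# matplotlib expects either floats in [0, 1] or uint8 values in 0-255.
# A float32 array whose values lie in 0-255 gets clipped at 1.0 and
# renders as an almost entirely white image; casting fixes the display.
if decoded.dtype != np.uint8 and decoded.max() > 1.0:
    decoded = np.clip(decoded, 0, 255).astype(np.uint8)

plt.imshow(decoded)
plt.show()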