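I'm training a segmentation model with TensorFlow and ran into an error during training. After about 6 seconds, training stops and shows the following error message: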
Epoch 1/100
2023-07-17 08:14:20.618828: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape inmodel_3/dropout_15/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
8278/8278 [==============================] - 6s 198us/step - loss: 2.1831 - accuracy: 0.8421 - val_loss: 2.2880 - val_accuracy: 0.8349
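I'm using a custom data generator (DataGen) to load and preprocess the input images and masks. The error seems to be related to the model's layout, specifically the dropout layer. I'm not sure why the size of the values doesn't match the size of the permutation. I think it may have something to do with the data generator.
I've included the relevant code snippets below: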
# Data generator
class DataGen(tf.keras.utils.Sequence):
    def __init__(self, path_input, path_mask, class_name='person', batch_size=8, image_size=128):
        self.ids = os.listdir(path_mask)
        self.path_input = path_input
        self.path_mask = path_mask
        self.class_name = class_name
        self.batch_size = batch_size
        self.image_size = image_size
        self.on_epoch_end()

    def __load__(self, id_name):
        image_path = os.path.join(self.path_input, id_name)
        mask_path = os.path.join(self.path_mask, id_name)
        image = cv2.imread(image_path, 1)  # flag 1 = cv2.IMREAD_COLOR (OpenCV returns channels in BGR order)
        image = cv2.resize(image, (self.image_size, self.image_size))  # resizing before inserting into the network
        mask = cv2.imread(mask_path, -1)
        mask = cv2.resize(mask, (self.image_size, self.image_size))
        mask = mask.reshape((self.image_size, self.image_size, 1))
        # normalize image
        image = image / 255.0
        mask = mask / 255.0
        return image, mask

    def __getitem__(self, index):
        id_name = self.ids[index]
        image, mask = self.__load__(id_name)
        if image is not None and mask is not None:
            images = np.expand_dims(image, axis=0)
            masks = np.expand_dims(mask, axis=0)
        else:
            images = np.empty((self.image_size, self.image_size, 3))
            masks = np.empty((self.image_size, self.image_size, 1))
        return images, masks

    def on_epoch_end(self):
        pass

    def __len__(self):
        return len(self.ids)

# Configure model
image_size = 128
epochs = 100
batch_size = 10
# Create data generators
train_gen = DataGen(path_input="/kaggle/input/coco-2014-dataset-for-yolov3/coco2014/images/train2014",
                    path_mask="/kaggle/working/mask_train_2014",
                    batch_size=batch_size,
                    image_size=image_size)
val_gen = DataGen(path_input="/kaggle/input/coco-2014-dataset-for-yolov3/coco2014/images/val2014",
                  path_mask="/kaggle/working/mask_val_2014",
                  batch_size=batch_size,
                  image_size=image_size)
# Define model architecture
inputs = Input(shape=(128, 128, 3))
# ...
# Compile and train the model
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)  # `lr` is a deprecated alias of `learning_rate`
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_gen, validation_data=val_gen, steps_per_epoch=train_steps, epochs=epochs)
Any insights or suggestions on how to resolve this would be greatly appreciated.
I'm using the COCO 2014 dataset with TensorFlow version 2.12.0.
1 Answer
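I got a very similar error message during the first epoch of my training run.
I learned that, in my case, I could ignore it. The fit routine kept going and worked correctly.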
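
Separately from the layout message, one detail of the DataGen in the question may be worth checking: it accepts batch_size, but __getitem__ loads a single file per index and __len__ returns the number of files, so every step receives a batch of one image. Below is a minimal sketch of the index arithmetic a batched Sequence would typically use. The helper names num_batches and batch_ids are hypothetical (they are not part of Keras), and the file names are made up for illustration.

```python
import math


def num_batches(num_samples: int, batch_size: int) -> int:
    """Batches per epoch, counting a final partial batch if there is one."""
    return math.ceil(num_samples / batch_size)


def batch_ids(ids: list, index: int, batch_size: int) -> list:
    """Return the IDs that belong to batch number `index`."""
    start = index * batch_size
    return ids[start:start + batch_size]


# Hypothetical file names standing in for os.listdir(path_mask).
ids = [f"img_{i}.png" for i in range(25)]

print(num_batches(len(ids), 10))      # 3
print(len(batch_ids(ids, 0, 10)))     # 10
print(batch_ids(ids, 2, 10))          # the last 5 file names
```

In a Sequence subclass, __len__ would return the first value, and __getitem__ would load each file returned by the second and stack the results (for example with np.stack) into arrays of shape (batch, height, width, channels).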