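I have 4 datasets, each in its own DataFrame. Two of them are for predicting image aesthetic scores, and the other two are for predicting image quality scores. I want to train a single model that predicts these scores separately, taking 4 separate inputs and producing 4 separate output scores. I am using InceptionResNetV2 as the base model.
So I decided to use ImageDataGenerators to feed images from 4 different directories. This is how I prepared them for all 4 datasets. Note that although they all use the same x_col, the ID column, the IDs follow different naming formats because they come from different datasets.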
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Assumption: preprocess_input is the InceptionResNetV2 one, matching the base model.
from tensorflow.keras.applications.inception_resnet_v2 import preprocess_input

# preprocess the images in train-validation-test, do for all dataset
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
val_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

# ava
ava_train_generator = train_datagen.flow_from_dataframe(
    dataframe=ava_train_df,
    directory=ava_images,
    x_col="ID",
    y_col="scaled_MOS_aesthetic",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
ava_test_generator = test_datagen.flow_from_dataframe(
    dataframe=ava_test_df,
    directory=ava_images,
    x_col="ID",
    y_col="scaled_MOS_aesthetic",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
ava_val_generator = val_datagen.flow_from_dataframe(
    dataframe=ava_val_df,
    directory=ava_images,
    x_col="ID",
    y_col="scaled_MOS_aesthetic",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
print('AVA generators complete\n')

# para
para_train_generator = train_datagen.flow_from_dataframe(
    dataframe=para_train_df,
    directory=para_images,
    x_col="ID",
    y_col="scaled_MOS_aesthetic",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
para_test_generator = test_datagen.flow_from_dataframe(
    dataframe=para_test_df,
    directory=para_images,
    x_col="ID",
    y_col="scaled_MOS_aesthetic",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
para_val_generator = val_datagen.flow_from_dataframe(
    dataframe=para_val_df,
    directory=para_images,
    x_col="ID",
    y_col="scaled_MOS_aesthetic",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
print('PARA generators complete\n')

# koniq
koniq_train_generator = train_datagen.flow_from_dataframe(
    dataframe=koniq_train_df,
    directory=koniq_images,
    x_col="ID",
    y_col="scaled_MOS_quality",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
koniq_test_generator = test_datagen.flow_from_dataframe(
    dataframe=koniq_test_df,
    directory=koniq_images,
    x_col="ID",
    y_col="scaled_MOS_quality",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
koniq_val_generator = val_datagen.flow_from_dataframe(
    dataframe=koniq_val_df,
    directory=koniq_images,
    x_col="ID",
    y_col="scaled_MOS_quality",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
print('KoNIQ generators complete\n')

# spaq
spaq_train_generator = train_datagen.flow_from_dataframe(
    dataframe=spaq_train_df,
    directory=spaq_images,
    x_col="ID",
    y_col="scaled_MOS_quality",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
spaq_test_generator = test_datagen.flow_from_dataframe(
    dataframe=spaq_test_df,
    directory=spaq_images,
    x_col="ID",
    y_col="scaled_MOS_quality",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
spaq_val_generator = val_datagen.flow_from_dataframe(
    dataframe=spaq_val_df,
    directory=spaq_images,
    x_col="ID",
    y_col="scaled_MOS_quality",
    class_mode="raw",
    target_size=(224, 224),
    batch_size=32
)
print('SPAQ generators complete\n')
This is the output of the ImageDataGenerators:
Found 2040 validated image filenames.
Found 2039 validated image filenames.
AVA generators complete
Found 4482 validated image filenames.
Found 961 validated image filenames.
Found 961 validated image filenames.
PARA generators complete
Found 5756 validated image filenames.
Found 1234 validated image filenames.
Found 1233 validated image filenames.
KoNIQ generators complete
Found 8243 validated image filenames.
Found 1767 validated image filenames.
Found 1767 validated image filenames.
SPAQ generators complete
Then I used zip() to merge these generators so that each step combines data from all of them, and passed the result to model.fit():
history = model.fit(
    x=zip(ava_train_generator, para_train_generator, koniq_train_generator, spaq_train_generator),
    steps_per_epoch=max(steps_per_epoch1, steps_per_epoch2, steps_per_epoch3, steps_per_epoch4),
    epochs=config.epoch,
    validation_data=zip(ava_val_generator, para_val_generator, koniq_val_generator, spaq_val_generator),
    validation_steps=max(val_steps1, val_steps2, val_steps3, val_steps4),
    callbacks=[
        model_checkpoint_callback,
        early_stopping_callback
    ])
But an error occurred:
"name": "ValueError",
"message": "Data is expected to be in format `x`, `(x,)`, `(x, y)`, or `(x, y, sample_weight)`
I checked all the target sizes and they are identical. What could be going wrong here? How should I create ImageDataGenerators from different directories in this situation?
1 answer
Answer #1 (wz8daaqr)
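I found that by using yield inside a function, I can create a combined generator for the training, test, and validation sets. Everything works the way I wanted.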
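A minimal sketch of such a function, assuming each underlying generator yields (images, scores) batches, as flow_from_dataframe does with class_mode="raw". The names combined_generator and fake_batches are hypothetical and not taken from the original answer; small NumPy arrays stand in for the real Keras generators so the snippet runs without TensorFlow.

```python
import numpy as np


def combined_generator(*generators):
    """Yield (inputs, targets), each a list with one entry per generator.

    Assumes every generator yields (images, scores) batches.
    """
    iterators = [iter(g) for g in generators]
    while True:
        try:
            # Pull one batch from every dataset for this step.
            batches = [next(it) for it in iterators]
        except StopIteration:
            # Stop as soon as any finite generator runs out.
            return
        inputs = [batch[0] for batch in batches]
        targets = [batch[1] for batch in batches]
        yield inputs, targets


def fake_batches(n_steps, batch_size, fill):
    """Hypothetical stand-in for a Keras generator: tiny images, constant scores."""
    for _ in range(n_steps):
        images = np.full((batch_size, 8, 8, 3), fill, dtype=np.float32)
        scores = np.full((batch_size,), fill, dtype=np.float32)
        yield images, scores


# Four stand-in generators, one per dataset.
combined_train_gen = combined_generator(
    *[fake_batches(n_steps=2, batch_size=2, fill=float(i)) for i in range(4)]
)
first_inputs, first_targets = next(combined_train_gen)
print(len(first_inputs), len(first_targets))
```

The real Keras iterators loop indefinitely, so with them the StopIteration branch would likely never trigger; it is there only so that finite generators end cleanly.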
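With this function, each step yields a list of image batches mapped to their respective score batches. For example, the images in batch1[0] are mapped to the scores in batch1[1], the images in batch2[0] are mapped to the scores in batch2[1], and so on.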
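To make that mapping concrete, here is a small illustration with hand-built stand-in batches (the array shapes and score values are invented for the example). It also suggests why the plain zip() attempt was rejected: zip() yields one 4-element tuple per step, while the error message only lists formats with one to three elements.

```python
import numpy as np

# Stand-ins for one (images, scores) batch from each dataset.
batch1 = (np.zeros((2, 8, 8, 3)), np.array([0.41, 0.57]))
batch2 = (np.zeros((2, 8, 8, 3)), np.array([0.63, 0.22]))
batch3 = (np.zeros((2, 8, 8, 3)), np.array([0.75, 0.48]))
batch4 = (np.zeros((2, 8, 8, 3)), np.array([0.30, 0.91]))

# What zip() over the four generators produces for one step:
zipped_step = (batch1, batch2, batch3, batch4)

# Regrouped so that position i of inputs pairs with position i of targets.
inputs = [batch[0] for batch in zipped_step]
targets = [batch[1] for batch in zipped_step]
combined_step = (inputs, targets)

print(len(zipped_step))    # 4
print(len(combined_step))  # 2
```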
All that is left is to pass combined_train_gen and combined_val_gen to model.fit().
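A hedged sketch of that final call, wrapped in a hypothetical helper so the step counts are computed in one place. The names fit_combined and steps_for are illustrative, and taking the maximum over the datasets simply mirrors the steps_per_epoch expression in the question.

```python
import math


def steps_for(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


def fit_combined(model, combined_train_gen, combined_val_gen,
                 train_sizes, val_sizes, batch_size=32, epochs=10,
                 callbacks=None):
    """Call model.fit() on the combined generators.

    train_sizes / val_sizes are the row counts of the individual
    DataFrames, one entry per dataset.
    """
    steps_per_epoch = max(steps_for(n, batch_size) for n in train_sizes)
    validation_steps = max(steps_for(n, batch_size) for n in val_sizes)
    return model.fit(
        x=combined_train_gen,
        steps_per_epoch=steps_per_epoch,
        epochs=epochs,
        validation_data=combined_val_gen,
        validation_steps=validation_steps,
        callbacks=callbacks or [],
    )
```

The sizes would come from the DataFrames themselves, for example len(para_train_df). Because the Keras iterators restart once they are exhausted, using the maximum means the smaller datasets are revisited within an epoch.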