keras: How to get y_true for confusion_matrix in a CustomDataGenerator

Posted by lyfkaqu1 on 2023-10-19 in Other

I want to build a confusion_matrix, but I keep getting this error:

ValueError
Found input variables with inconsistent numbers of samples: [0, 62]
  File "C:\Labbb\inceptionResnetV2\InceptionResnetV2_1.py", line 216, in <module>
    sns.heatmap(confusion_matrix(y_true, y_pred),
ValueError: Found input variables with inconsistent numbers of samples: [0, 62]

How can I get y_true from the CustomDataGenerator?
I tried appending to y_true in get_data and returning it from a get_y_true method, but that does not work.
Here is the CustomDataGenerator code.

class CustomDataGenerator(Sequence):
    def __init__(self, image_folders, label_folders, dir, dim=(512,512),  batch_size=1,n_classes=7,n_channels=8,shuffle=True):
        self.image_folders = image_folders
        ...
        self.image_paths = []
        self.label_paths = []
        self.y_true = []
        self.on_epoch_end()

    def __len__(self):
        return int(np.ceil(len(self.image_paths) / self.batch_size))  

    def __getitem__(self, index):
        batch_image_paths = self.image_paths[index * self.batch_size: (index + 1) * self.batch_size]
        batch_label_paths = self.label_paths[index * self.batch_size: (index + 1) * self.batch_size]
        batch = zip(batch_image_paths, batch_label_paths)
        return self.get_data(batch)

    def on_epoch_end(self):
        self.image_paths = []
        self.label_paths = []
        for folder in self.image_folders:
            image_folder_path = os.path.join(self.dir, folder)
            image_files = os.listdir(image_folder_path)
            for file_name in image_files:
                self.image_paths.append(os.path.join(image_folder_path, file_name))
        for folder in self.label_folders:
            label_folder_path = os.path.join(self.dir, folder)
            label_files = os.listdir(label_folder_path)
            for file_name in label_files:
                self.label_paths.append(os.path.join(label_folder_path, file_name))
        if self.shuffle:
            np.random.shuffle(self.image_paths)
            np.random.shuffle(self.label_paths)

    def get_data(self, batch):
        X = np.empty((self.batch_size, *self.dim, self.n_channels))
        y = np.empty((self.batch_size, self.n_classes))
        y_true = []

        for i, (image_path, label_path) in enumerate(batch):
            image = np.load(image_path)
            with open(label_path, 'r') as f:
                line = f.readline().strip()
                filepath, label = line.rsplit(' ', 1)
                label = int(label)
                y_true.append(label)
            label_one_hot = to_categorical(label, num_classes=self.n_classes)

            X[i,] = image
            y[i,] = label_one_hot

        return X, y
    
    def get_y_true(self):
        return self.y_true

Here is where I get y_true and y_pred and build the confusion_matrix:

train_datagen = CustomDataGenerator(image_folders, label_folders, train_dir, **params, shuffle = True)
val_datagen = CustomDataGenerator(image_folders, label_folders, valid_dir, **params, shuffle = True)

y_true = CustomDataGenerator.get_y_true(val_datagen)
Y_pred = model.predict(val_datagen)
y_pred = np.argmax(Y_pred, axis=1) 
sns.heatmap(confusion_matrix(y_true, y_pred),annot=True, fmt="d", cmap='Greens',ax = ax)
tct7dpnv (answer #1)

There are a few things I would like to comment on.
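On your first question, y_true is empty: self.y_true = [] is set in the class's __init__() and is never filled. There is a y_true in get_data(..), but it is a local variable, not self.y_true, so it is not stored and is lost when the method returns. The sample counts [0, 62] in the error point to the same thing: y_true has 0 entries, so it is empty, while y_pred has 62.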
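A minimal sketch of the difference between the local list and the instance attribute. ToyGenerator, get_y_true_as_posted and the in-memory label list are hypothetical stand-ins, not the code from the question; the class deliberately does not subclass the Keras Sequence so that it runs without TensorFlow.

```python
class ToyGenerator:
    """Framework-free stand-in for the generator; only label bookkeeping is shown."""

    def __init__(self, labels, batch_size=2):
        self.labels = list(labels)   # hypothetical: integer labels, already parsed
        self.batch_size = batch_size
        self.y_true = []             # instance attribute; nothing ever appends to it

    def __len__(self):
        return -(-len(self.labels) // self.batch_size)  # ceiling division

    def __getitem__(self, index):
        start = index * self.batch_size
        batch = self.labels[start:start + self.batch_size]
        y_true = []                  # local list, unrelated to self.y_true
        for label in batch:
            y_true.append(label)
        return batch                 # the local y_true is discarded here

    def get_y_true_as_posted(self):
        return self.y_true           # still the empty list from __init__

    def get_y_true(self):
        # One possible fix: derive the labels from the stored order on demand
        return list(self.labels)


gen = ToyGenerator([3, 0, 6, 1, 2])
for i in range(len(gen)):
    gen[i]                           # simulate a consumer pulling every batch

as_posted = gen.get_y_true_as_posted()
fixed = gen.get_y_true()
print(len(as_posted), len(fixed))
```

In the real generator this would probably mean reading the label files in the same order as self.label_paths, and creating the validation generator with shuffle=False so that this order matches the order in which model.predict consumes the batches.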
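Here are some tips on code quality. on_epoch_end(..) does too much. You do not need to rebuild the image paths on every epoch. Do that initialization in a separate method and only shuffle in on_epoch_end().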
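One way that split could look, as a sketch only. PathIndex, pair and the seed parameter are hypothetical names, and the path lists are passed in directly instead of being read with os.listdir. The sketch shuffles a single index list: calling np.random.shuffle separately on image_paths and label_paths, as the posted on_epoch_end does, applies two different permutations and would generally break the image/label pairing.

```python
import random


class PathIndex:
    """Collects paths once; on_epoch_end only reshuffles the visiting order."""

    def __init__(self, image_paths, label_paths, shuffle=True, seed=None):
        # Assumption: matching image and label files sort into the same order
        self.image_paths = sorted(image_paths)
        self.label_paths = sorted(label_paths)
        self.shuffle = shuffle
        self._rng = random.Random(seed)
        self.order = list(range(len(self.image_paths)))
        self.on_epoch_end()

    def on_epoch_end(self):
        # Only the index list is shuffled, so pairs stay aligned
        if self.shuffle:
            self._rng.shuffle(self.order)

    def pair(self, i):
        j = self.order[i]
        return self.image_paths[j], self.label_paths[j]


index = PathIndex(
    ["img/2.npy", "img/1.npy", "img/3.npy"],
    ["lab/1.txt", "lab/3.txt", "lab/2.txt"],
    seed=0,
)
pairs = [index.pair(i) for i in range(len(index.order))]
print(pairs)
```

In __getitem__, the batch would then be built from self.order[index * batch_size:(index + 1) * batch_size] rather than from slices of two separately shuffled lists.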
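You should also be careful with the dir parameter in __init__(). dir is a built-in function of Python, and you should not shadow built-ins unless you know what you are doing. That is why it is highlighted in orange in the code. In this particular code it does no harm, but be aware of it.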
You can call y_true = val_datagen.get_y_true() instead of y_true = CustomDataGenerator.get_y_true(val_datagen). It works the same way and is (in my opinion) clearer. Honestly, I have never seen your notation before.
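A small illustration of why the two spellings are equivalent, using a hypothetical Demo class rather than the generator from the question: calling a method on the instance passes that instance as self automatically, while calling it through the class requires passing it by hand.

```python
class Demo:
    def __init__(self):
        self.y_true = [1, 2, 3]

    def get_y_true(self):
        return self.y_true


d = Demo()
via_instance = d.get_y_true()      # usual form: d is bound to self implicitly
via_class = Demo.get_y_true(d)     # same function, self passed explicitly
print(via_instance is via_class)
```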
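One last point: your example is not reproducible. I tried to run your code, but you seem to have left out parts of it; I got errors and had to guess to fix them. It really helps when you post the whole (relevant) code and comment it.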
