keras: TensorFlow model predictions on a dataset look like guessing even though evaluation is good

xghobddn  posted on 2023-04-30  in Other

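I am training an image classifier to detect rodents. When I fit the model and evaluate it, the metrics strongly indicate that it performs well. However, when I predict with a tf dataset, the model appears to be just guessing. When I predict on single images, it is accurate again.
Here is the function that performs all of the evaluation: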

def evaluate_model(hist, model_fp, model, train, test):
    # plot basic metrics
    metrics_path, _ = os.path.splitext(model_fp)
    train_labels = np.concatenate([y for x, y in train], axis=0)
    test_labels = np.concatenate([y for x, y in test], axis=0)

    plt.figure()
    ... # plot accuracy metrics from history

    metrics = ['loss', 'prc', 'precision', 'recall']
    ... # plot more metrics from history

    results = model.evaluate(test, verbose=0)
    for name, value in zip(model.metrics_names, results):
        print(name, ': ', value)
    print('\n')

    results = model.evaluate(train, verbose=0)
    for name, value in zip(model.metrics_names, results):
        print(name, ': ', value)
    print('\n')

    train_predictions = model.predict(train)
    test_predictions = model.predict(test)

    # PRC, ROC, CM GENERATED FROM THESE PREDICTIONS

Here is the function that loads and batches the images:

def prepare_images(dir):
    training_data = image_dataset_from_directory(
        dir,
        validation_split=0.2,
        subset="training",
        image_size=IMG_SHAPE,
        batch_size=BATCH_SIZE,
        shuffle=True,
        seed=123
    )

    validation_data = image_dataset_from_directory(
        dir,
        validation_split=0.2,
        subset="validation",
        image_size=IMG_SHAPE,
        batch_size=BATCH_SIZE,
        shuffle=True,
        seed=456
    )

    # create test data set using 20% of validation data
    val_batches = tf.data.experimental.cardinality(validation_data)
    test_data = validation_data.take(val_batches // 5)
    validation_data = validation_data.skip(val_batches // 5)
    return training_data, validation_data, test_data

Fitting the model:

def train_model(model, train, val, test, epochs, cb, class_weights, model_fp):
    history = model.fit(
        train,
        validation_data=val,
        epochs=epochs,
        batch_size=BATCH_SIZE,
        callbacks=cb,
        class_weight=class_weights,
        verbose=1
    )

So, when I evaluate the model on the test dataset, the output looks like this:

loss :  0.1926884800195694
tp :  197.0
fp :  20.0
tn :  159.0
fn :  8.0
accuracy :  0.9270833134651184
precision :  0.9078341126441956
recall :  0.9609755873680115
auc :  0.979574978351593
prc :  0.9817492961883545

However, when predicting on the same batches...

58/58 [==============================] - 6s 91ms/step
3/3 [==============================] - 1s 142ms/step
Legitimate Transactions Detected (True Negatives):  98
Legitimate Transactions Incorrectly Detected (False Positives):  92
Fraudulent Transactions Missed (False Negatives):  83
Fraudulent Transactions Detected (True Positives):  111
Total Fraudulent Transactions:  194

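When predicting 10 images individually, 9 of them are correct.
Why does model.predict seem to guess randomly when given a batched dataset, but not with individual images? Is there a workaround? Thanks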

93ze6v8z  #1

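OK, I figured it out. I'm not sure whether this is an error in the documentation or a bug in TensorFlow, since the documentation explicitly states that you can use a tf.data dataset. I even made sure not to specify a batch size, as it says. The workaround is to split the dataset into image data and label data, like so: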

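x_batches, y_batches = [], []
for image_batch, labels_batch in test:  # one pass keeps images and labels paired
    x_batches.append(image_batch.numpy())
    y_batches.append(labels_batch.numpy())
x_test = np.concatenate(x_batches, axis=0)  # keep every batch, not just the last
y_test = np.concatenate(y_batches, axis=0)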

Then call predict with only x_test, and plot the metrics from those predictions against y_test:

test_predictions = model.predict(x_test)
plot_cm(test_predictions, y_test, p, metrics_path) # function to plot confusion matrix
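
A likely explanation, though not confirmed in this thread: prepare_images builds the datasets with shuffle=True, and a shuffled tf.data dataset by default yields a new order every time it is iterated. evaluate_model iterates test once to collect the labels and model.predict(test) iterates it again, so the predictions and labels may no longer line up. model.evaluate would be unaffected because it reads each image together with its label. The toy sketch below reproduces that effect in plain Python; ReshufflingDataset and perfect_model are hypothetical stand-ins for illustration, not TensorFlow APIs.

```python
import random


class ReshufflingDataset:
    """Toy stand-in for a shuffled input pipeline: new order on every pass."""

    def __init__(self, samples, seed=0):
        self._samples = list(samples)
        self._rng = random.Random(seed)

    def __iter__(self):
        order = self._samples[:]
        self._rng.shuffle(order)  # reshuffle at the start of each iteration
        return iter(order)


def perfect_model(x):
    # Hypothetical classifier that is always right: the true label is x % 2.
    return x % 2


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# 1000 (feature, label) pairs, balanced between classes 0 and 1.
data = ReshufflingDataset([(x, x % 2) for x in range(1000)])

# Two passes, as in evaluate_model: labels from one pass, predictions from another.
labels_pass1 = [y for _, y in data]
preds_pass2 = [perfect_model(x) for x, _ in data]
misaligned = accuracy(preds_pass2, labels_pass1)

# One pass: each prediction stays paired with the label that came with it.
pairs = [(perfect_model(x), y) for x, y in data]
aligned = accuracy([p for p, _ in pairs], [y for _, y in pairs])

print(f"two passes: {misaligned:.2f}, one pass: {aligned:.2f}")
```

Even with a classifier that never makes a mistake, the two-pass comparison lands near chance while the single-pass comparison is perfect. If this is indeed the cause, collecting images and labels in one loop (as in the workaround above) or building the test dataset with shuffle=False should both keep predictions and labels aligned.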
