Keras: how to identify images with "Possibly corrupt EXIF data"

ycl3bljg  posted on 2022-11-13  in Other

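I am working on an image classification Kaggle competition and downloaded some training images from Kaggle.com. I then process these images using transfer learning with ResNet50, in Keras 2.0 with TensorFlow (and Python 3).
However, out of 1281 training images in total, 258 have "Possibly corrupt EXIF data" and are ignored when loaded into the ResNet model, most likely because of a Pillow issue.
The output messages look like this: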

/home/shi/anaconda3/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:692: UserWarning: Possibly corrupt EXIF data.  Expecting to read 524288 bytes but only got 0. Skipping tag 3
  "Skipping tag %s" % (size, len(data), tag))
/home/shi/anaconda3/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:692: UserWarning: Possibly corrupt EXIF data.  Expecting to read 393216 bytes but only got 0. Skipping tag 3
  "Skipping tag %s" % (size, len(data), tag))
/home/shi/anaconda3/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:692: UserWarning: Possibly corrupt EXIF data.  Expecting to read 33554432 bytes but only got 0. Skipping tag 4
  "Skipping tag %s" % (size, len(data), tag))
/home/shi/anaconda3/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:692: UserWarning: Possibly corrupt EXIF data.  Expecting to read 25165824 bytes but only got 0. Skipping tag 4
  "Skipping tag %s" % (size, len(data), tag))
/home/shi/anaconda3/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:692: UserWarning: Possibly corrupt EXIF data.  Expecting to read 131072 bytes but only got 0. Skipping tag 3
  "Skipping tag %s" % (size, len(data), tag))
(more to come ...)

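From the output messages I only know that these images exist, not which files they are...
My question is: how can I identify these 258 images so that I can remove them from the dataset manually?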

nhaq1z21 #1

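Edit: to turn the warnings into errors that you can catch, see Justas's comment below.
Even though this question is more than a year old, I want to show my solution because I ran into the same problem.
I edited the code that emits the message. The warning output shows where that file is on your system and the line number. For example, I changed the first snippet below into the second: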

if len(data) != size:
    warnings.warn("Possibly corrupt EXIF data.  "
                  "Expecting to read %d bytes but only got %d."
                  " Skipping tag %s" % (size, len(data), tag))
    continue

if len(data) != size:
    raise ValueError('Corrupt Exif data')
    warnings.warn("Possibly corrupt EXIF data.  "
                  "Expecting to read %d bytes but only got %d."
                  " Skipping tag %s" % (size, len(data), tag))
    continue

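Below is my code for catching the ValueError. Its advantage is that PIL is interrupted and the useless messages are no longer shown. You can also act on the caught error, for example by deleting the offending file in the "except" block.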

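import os
from PIL import Image

# Placeholder: replace with the directory that holds the training images.
imageFolder = "/Path/To/Image/Folder"
listImages = os.listdir(imageFolder)

for fileName in listImages:
    imgPath = os.path.join(imageFolder, fileName)

    try:
        # With the patched TiffImagePlugin.py, corrupt EXIF raises ValueError here.
        with Image.open(imgPath) as img:
            exif_data = img._getexif()
    except ValueError as err:
        print(err)
        # Print the path, not the loop variable, so the file can be located.
        print("Error on image: ", imgPath)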

I know that adding the ValueError line is quick and dirty, but it beats facing all those useless warning messages.
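An alternative that avoids editing Pillow's source is to promote the warning to an exception with the standard `warnings` module. The following is a minimal sketch, not the answerer's method: the names `find_files_raising` and `open_and_read_exif` are made up for illustration, and it assumes that opening the file and calling `getexif()` (Pillow 6.0 or later) is enough to trigger the EXIF parsing that emits the warning.

```python
import warnings


def open_and_read_exif(path):
    """Default check (assumption): open the file with Pillow and parse its EXIF."""
    from PIL import Image  # imported lazily; only needed for the default check

    with Image.open(path) as img:
        img.getexif()


def find_files_raising(paths, check=open_and_read_exif):
    """Return (path, message) pairs for files whose check emits a UserWarning."""
    flagged = []
    for path in paths:
        with warnings.catch_warnings():
            # Inside this block every UserWarning is raised as an exception;
            # the previous filter settings are restored on exit.
            warnings.simplefilter("error", UserWarning)
            try:
                check(path)
            except UserWarning as err:
                flagged.append((str(path), str(err)))
    return flagged
```

It could be called as `find_files_raising(list_of_image_paths)`, where `list_of_image_paths` is whatever list of file paths you build for your dataset. Files that are not images at all would still raise Pillow's own exceptions, which this sketch deliberately does not swallow.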

pinkon5k #2

In case this helps anyone in the future, here is how I removed all EXIF data from my dataset, which got rid of the PIL warnings.

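# remove corrupt exif data

from PIL import Image

# get_image_files and path are not defined in the post; get_image_files is
# presumably the fastai helper that lists the image files under a directory.
file_names = get_image_files(path)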

def remove_exif(image_name):
    image = Image.open(image_name)
    if not image.getexif():
        return
    print('removing EXIF from', image_name, '...')
    data = list(image.getdata())
    image_without_exif = Image.new(image.mode, image.size)
    image_without_exif.putdata(data)

    image_without_exif.save(image_name)

for file in file_names:
    remove_exif(file)
print('done')

7nbnzgx9 #3

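The simplest approach that comes to mind is to modify the code so that it processes one image at a time, then iterate over the images and check which ones produce the warning.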
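One way to sketch this per-image check is to record the warnings raised while each file is processed, using `warnings.catch_warnings(record=True)` from the standard library. The helper names `collect_warnings` and `read_exif_with_pillow` are hypothetical, and the default check assumes that `Image.open` followed by `getexif()` is what triggers the message in your Pillow version.

```python
import warnings


def read_exif_with_pillow(path):
    """Default check (assumption): open the image and parse its EXIF block."""
    from PIL import Image  # imported lazily; only needed for the default check

    with Image.open(path) as img:
        img.getexif()


def collect_warnings(paths, check=read_exif_with_pillow):
    """Map each path that triggered UserWarnings to the list of messages."""
    report = {}
    for path in paths:
        with warnings.catch_warnings(record=True) as caught:
            # "always" keeps repeated messages from being suppressed.
            warnings.simplefilter("always")
            check(path)
        messages = [
            str(w.message) for w in caught if issubclass(w.category, UserWarning)
        ]
        if messages:
            report[str(path)] = messages
    return report
```

The keys of the returned dictionary are the files to inspect or remove by hand; unlike turning the warning into an error, this keeps processing each file to the end and collects every message it produced.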
