How can I read HDF5 files stored as 1-D arrays and view them as images?

Posted by dgenwo3n on 2024-01-04 in HDFS


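I have a large image classification dataset stored in .hdf5 format. Both the labels and the images are stored in the .hdf5 file. I cannot view the images because they are stored as arrays. The code I used to read the dataset is as follows: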

import h5py
import numpy
f = h5py.File('data/images.hdf5', 'r')
print(list(f.keys()))

['datasets']

group = f['datasets']
list(group.keys())

['car']

Now when I read 'car', I get the following output:

data = group['car']
data.shape,data[0].shape,data[1].shape

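((51,), (383275,), (257120,))

So it looks like the label car has 51 images, and the images are stored as arrays of 383275 and 257120 elements, with no information about their height and width. I want to save the images as RGB again. Next, following the code here, I tried to read an image.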

import numpy as np
from PIL import Image

# hdf = h5py.File("Sample.h5",'r')
array = data[0]
img = Image.fromarray(array.astype('uint8'), 'RGB')
img.save("yourimage.thumbnail", "JPEG")
img.show()

Unfortunately, I get the following error:

File /usr/local/lib/python3.8/dist-packages/PIL/Image.py:784, in Image.frombytes(self, data, decoder_name, *args)
    781 s = d.decode(data)
    783 if s[0] >= 0:
--> 784     raise ValueError("not enough image data")
    785 if s[1] != 0:
    786     raise ValueError("cannot decode image data")

ValueError: not enough image data

I have already checked references such as the HDF Group help library. Any help would be much appreciated. Thanks.


Source: https://stackoverflow.com/questions/77589501/how-can-i-read-hdf5-files-stored-as-1-d-array-and-view-them-as-images

3 Answers


mv1qrgav1#

First, the f['datasets']['car'] object is a dataset, not a group. Second, based on this output, I think your dataset has variable-length array rows (also known as a "ragged" array).

# this is group object reference:
group = f['datasets']
# these are equivalent dataset object references:
data = group['car']  
data = f['datasets']['car']  
# this gives the dataset shape (# of rows), then the shape for data on row 0 and  row 1:
data.shape,data[0].shape,data[1].shape
# output: ((51,), (383275,), (257120,))

So I think you have 51 rows, each holding a 1-D array of a different size.
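To reconstruct each image, you need the image's original shape. Hopefully it is documented somewhere in the file. Do you have a schema definition for this file? If not, you will have to determine it somehow. [If there is none, whoever wrote the file did you no favors. This is an example of being "clever" in a way that does not help future users much.]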
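As a quick first check on any candidate shape, the row length has to factor as height × width × channels. The sketch below is illustrative only: candidate_shapes is a hypothetical helper, not part of h5py or NumPy, and it assumes raw, uncompressed pixel values with a fixed channel count.

```python
def candidate_shapes(n_values, channels=3):
    """Return every (height, width) with height * width * channels == n_values."""
    if n_values % channels != 0:
        return []  # cannot be a flat height x width x channels buffer
    n_pixels = n_values // channels
    shapes = []
    for height in range(1, n_pixels + 1):
        if n_pixels % height == 0:
            shapes.append((height, n_pixels // height))
    return shapes


# Row lengths reported in the question.
for n in (383275, 257120):
    print(n, candidate_shapes(n, channels=3))
```

Both row lengths from the question give an empty list for channels=3, because neither 383275 nor 257120 is a multiple of 3. Under the assumption above, these rows cannot be flat RGB pixel buffers, so they may hold something other than raw RGB pixels.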
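Hopefully they left you some "breadcrumbs" for getting the size of each image. As @Manoj Bhosle suggested, the sizes might be in another dataset. Alternatively, they could be saved as attributes. Either way, you will have to interrogate the file to figure this out. The easiest way is to open it with HDFView and inspect it. HDFView can be downloaded from The HDF Group.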
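If HDFView is not at hand, a rough equivalent in h5py is to walk the file and list every group and dataset with its shape, dtype, and attributes. This is a minimal sketch: summarize is a hypothetical helper name, while visititems and attrs are standard h5py features.

```python
import h5py


def summarize(h5_obj):
    """Return (name, kind, shape, dtype, attrs) for every object below h5_obj."""
    rows = []

    def visitor(name, obj):
        if isinstance(obj, h5py.Dataset):
            rows.append((name, "dataset", obj.shape, str(obj.dtype), dict(obj.attrs)))
        else:
            rows.append((name, "group", None, None, dict(obj.attrs)))
        # Returning None tells visititems to keep walking.

    h5_obj.visititems(visitor)
    return rows


# Usage with the file from the question (path taken from the question):
# with h5py.File('data/images.hdf5', 'r') as h5f:
#     for row in summarize(h5f):
#         print(row)
```

Any dataset whose first dimension is also 51, or any attribute holding height and width values, would be a natural place to look for the image sizes.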
You can access any attributes on the 'car' dataset with the following code:

with h5py.File('data/images.hdf5') as h5f:
    ds = h5f['datasets']['car']
    for k in ds.attrs.keys():
        print(f"{k} => {ds.attrs[k]}")

You can check the size of the array stored in each row of the 'car' dataset with the following code:

with h5py.File('data/images.hdf5') as h5f:
    ds = h5f['datasets']['car']
    for row in ds:
        print(row.shape, row[0].shape)
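
If neither the attributes nor another dataset reveals a shape, one possibility worth testing (an assumption, not something the question confirms) is that each row holds the bytes of an encoded image file such as JPEG or PNG rather than raw pixels. Row lengths that are not multiples of 3, as here, would be consistent with that. The sketch below uses a hypothetical helper, decode_row, assumes the rows are uint8, and checks itself on synthetic data instead of the real file.

```python
import io

import numpy as np
from PIL import Image


def decode_row(row):
    """Try to decode a 1-D uint8 array as an encoded image file.

    Returns a PIL image, or None if Pillow cannot read the bytes.
    """
    raw = np.asarray(row, dtype=np.uint8).tobytes()
    try:
        img = Image.open(io.BytesIO(raw))
        img.load()  # force decoding now so errors surface here
    except OSError:  # PIL.UnidentifiedImageError is a subclass of OSError
        return None
    return img


# Self-check on synthetic data: encode a tiny RGB image as PNG,
# keep the file bytes as a 1-D uint8 array, then decode them again.
buf = io.BytesIO()
Image.new("RGB", (4, 2), (255, 0, 0)).save(buf, format="PNG")
row = np.frombuffer(buf.getvalue(), dtype=np.uint8)
img = decode_row(row)
print(img.size, img.mode)
```

With the real file, calling decode_row(ds[0]) inside the with block would return either an image whose size attribute gives the width and height, or None if the rows are not encoded image files.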


d8tt03nd2#

import h5py
import numpy as np
from PIL import Image

# Open the HDF5 file
with h5py.File('data/images.hdf5', 'r') as f:
    # Access the dataset containing images
    data = f['datasets']['car']

    # Assuming there's a corresponding dataset for dimensions
    dimensions = f['datasets']['dimensions']

    # Iterate through each image stored as a 1-D array
    for i in range(len(data)):
        # Retrieve the dimensions for the current image
        height, width, channels = dimensions[i]

        # Reshape the 1-D array into a 3-D array with the correct shape
        image_array = np.reshape(data[i], (height, width, channels))

        # Convert the numpy array into a PIL image
        img = Image.fromarray(image_array.astype('uint8'), 'RGB')

        # Save or display the image
        img.save(f'image_{i}.png')
        # Remove the img.show() line if you don't want to display the image
        img.show()

Hope this helps.


jv2fixgn3#

import h5py
import numpy as np
from PIL import Image

# Open the HDF5 file
with h5py.File('data/images.hdf5', 'r') as f:
    # Access the dataset containing images
    data = f['datasets']['car']

    # Iterate through each image stored as a 1-D array
    for i in range(len(data)):
        # Assuming you know the correct dimensions. For example, 256x256 with 3 color channels
        height, width, channels = 256, 256, 3

        # Reshape the 1-D array into a 3-D array with the shape (height, width, channels)
        image_array = np.reshape(data[i], (height, width, channels))

        # Convert the numpy array into a PIL image
        img = Image.fromarray(image_array.astype('uint8'), 'RGB')

        # Save or display the image
        img.save(f'image_{i}.png')
        img.show()  # Remove this line if you don't want to display the image
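
A fixed shape only works when every row has exactly height × width × channels values. For 256 × 256 × 3 that is 196608 values, whereas the first row in the question has 383275, so np.reshape would raise a ValueError on it. A minimal sketch of checking the size first, using a hypothetical helper named reshape_if_possible:

```python
import numpy as np


def reshape_if_possible(flat, height, width, channels=3):
    """Reshape a 1-D array to (height, width, channels), or return None on a size mismatch."""
    flat = np.asarray(flat)
    if flat.size != height * width * channels:
        return None
    return flat.reshape(height, width, channels)


# 196608 values match 256 x 256 x 3; 383275 (row 0 in the question) does not.
ok = reshape_if_possible(np.zeros(196608, dtype=np.uint8), 256, 256)
bad = reshape_if_possible(np.zeros(383275, dtype=np.uint8), 256, 256)
print(ok.shape, bad)
```

Returning None instead of raising lets a loop over all 51 rows report which rows fit the assumed shape and which do not.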

