Python: How do I create and return a Zarr file from an xarray Dataset?

Posted by jchrr9hc on 2023-01-04 in Python

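How can I create and return a file new_zarr.zarr from an xarray Dataset?
I know that xarray.Dataset.to_zarr() exists, but it returns a ZarrStore, and I have to return a bytes-like object.
I tried using the tempfile module but am not sure how to proceed. How can I write an xarray.Dataset to a bytes-like object so that I can return a downloadable .zarr file?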

mf98qq94 #1

Zarr supports several storage backends (DirectoryStore, ZipStore, etc.). If you are looking for a single file object, it sounds like ZipStore is what you want.

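import xarray as xr
import zarr

# Load the sample dataset (downloaded on first use).
ds = xr.tutorial.open_dataset('air_temperature')
# Write the whole dataset into a single zip archive.
store = zarr.storage.ZipStore('./new_zarr.zip')
ds.to_zarr(store)
# Close the store so that the zip central directory is written and the archive is complete.
store.close()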

The zip file can be treated as a single-file Zarr store, which can be downloaded (or moved around as a single unit).
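Since the question asks for a bytes-like object, one option is to write the archive to a temporary path and read it back. The following is only a sketch under stated assumptions: archive_to_bytes and write_demo_archive are hypothetical helper names, and the stand-in writer uses the standard-library zipfile module so that the sketch runs without xarray or zarr. In practice the callback would create the ZipStore at the given path, call ds.to_zarr(store), and close the store.

```python
import tempfile
import zipfile
from pathlib import Path


def archive_to_bytes(write_archive):
    """Run write_archive(path) on a temporary path and return the file's bytes."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "new_zarr.zip"
        write_archive(str(path))
        # Read everything before the temporary directory is deleted.
        return path.read_bytes()


def write_demo_archive(path):
    # Hypothetical stand-in for the ZipStore / to_zarr / close sequence.
    with zipfile.ZipFile(path, mode="w") as zf:
        zf.writestr(".zgroup", '{"zarr_format": 2}')


payload = archive_to_bytes(write_demo_archive)
print(type(payload).__name__, len(payload) > 0)
```

The returned bytes object can be handed to whatever mechanism serves the download; nothing is left on disk afterwards.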

Update 1

If you want to do all of this in memory, you can extend zarr.ZipStore so that it also accepts a BytesIO object:

import os
import zipfile
from threading import RLock

import zarr


class MyZipStore(zarr.ZipStore):

    def __init__(self, path, compression=zipfile.ZIP_STORED, allowZip64=True, mode='a',
                 dimension_separator=None):

        # store properties
        if isinstance(path, str):  # this is the only change needed to make this work
            path = os.path.abspath(path)
        self.path = path
        self.compression = compression
        self.allowZip64 = allowZip64
        self.mode = mode
        self._dimension_separator = dimension_separator

        # Current understanding is that zipfile module in stdlib is not thread-safe,
        # and so locking is required for both read and write. However, this has not
        # been investigated in detail, perhaps no lock is needed if mode='r'.
        self.mutex = RLock()

        # open zip file
        self.zf = zipfile.ZipFile(path, mode=mode, compression=compression,
                                  allowZip64=allowZip64)
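The isinstance check is enough because zipfile.ZipFile accepts either a path or a file-like object as its first argument; only the os.path.abspath call requires a filesystem path. A small standard-library sketch of that behaviour, independent of zarr (the entry name and its contents are made up for illustration):

```python
import io
import zipfile

buffer = io.BytesIO()

# Same arguments that the store passes to zipfile.ZipFile, but with an
# in-memory buffer instead of a filesystem path.
with zipfile.ZipFile(buffer, mode="a", compression=zipfile.ZIP_STORED,
                     allowZip64=True) as zf:
    zf.writestr("air/.zarray", "{}")  # hypothetical entry

# Re-open the written bytes to confirm that the archive is readable.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as zf:
    names = zf.namelist()

print(names)
```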

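You can then create the zip file in memory:

import io

zip_buffer = io.BytesIO()

store = MyZipStore(zip_buffer)

ds.to_zarr(store)
# Close the store so that the zip central directory is written to the buffer.
store.close()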

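You will notice that zip_buffer contains a valid zip file:

zip_buffer.seek(0)  # rewind to the start of the buffer before reading
zip_buffer.read(10)
b'PK\x03\x04\x14\x00\x00\x00\x00\x00'

PK\x03\x04 is the zip file magic number.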
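
Beyond inspecting the first bytes, the standard library can check the archive more thoroughly. A minimal sketch, assuming a buffer filled in the same way as zip_buffer; here a stand-in archive is built with zipfile so that the sketch runs without xarray or zarr, and describe_zip_buffer is a hypothetical helper name:

```python
import io
import zipfile


def describe_zip_buffer(buffer):
    """Return (is_zip, entry_names, raw_bytes) for an in-memory buffer."""
    raw = buffer.getvalue()  # full contents, regardless of the current position
    is_zip = zipfile.is_zipfile(io.BytesIO(raw))
    names = []
    if is_zip:
        with zipfile.ZipFile(io.BytesIO(raw)) as zf:
            names = zf.namelist()
    return is_zip, names, raw


# Stand-in for a buffer that was written through the store.
demo_buffer = io.BytesIO()
with zipfile.ZipFile(demo_buffer, mode="w") as zf:
    zf.writestr(".zattrs", "{}")

is_zip, names, raw = describe_zip_buffer(demo_buffer)
print(is_zip, names, raw[:4])
```

The raw value is a plain bytes object, which is the bytes-like result the question asks for.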
