How do I read a shapefile from Azure Blob Storage into an Azure Databricks notebook?

Posted by 2guxujil on 2023-10-22 in Other


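As the title says, I have a shapefile (.shp) in an Azure Blob Storage container, and I am trying to read it directly into an Azure Databricks notebook without downloading it to a local drive first.
I can read CSV files from Blob Storage, but I am running into problems with the shapefile. I have not been able to find a solution in earlier Stack Overflow questions.
Here is the code I am using: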

from io import BytesIO
from azure.storage.blob import BlobClient

blob_data = BlobClient(
    account_url=ACCOUNT_URL,
    container_name=CONTAINER_NAME,
    blob_name=BLOB_NAME, 
    credential=BLOB_STORAGE_CREDENTIAL,
)
blob_data = blob_data.download_blob().readall()
shapefile = BytesIO(blob_data)

shapefile

This returns <_io.BytesIO at 0x7f1ad4bbbcc0>.
I then tried to read the shapefile with GeoPandas and with Fiona:

import fiona
import geopandas as gpd

# open with geopandas
gdf = gpd.read_file(shapefile)

# open with fiona
with fiona.open(shapefile) as shp:
    first_feature = next(iter(shp))
    print(first_feature)

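GeoPandas raises the error DriverError: '/vsimem/31debcdbc2b0480b9f0567aea3a687d7' not recognized as a supported file format.
Fiona raises a similar error: DriverError: '/vsimem/04e527ecf5324605bdcf3643ea3b4bd2/04e527ecf5324605bdcf3643ea3b4bd2' not recognized as a supported file format.
The file itself seems to be fine. I uploaded the shapefile to my Azure workspace and it reads correctly from there, but because the file is meant for a workflow that runs in the cloud, I cannot use that approach.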
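One possible explanation, offered as an assumption rather than a confirmed diagnosis: a shapefile is a set of sibling files (.shp, .shx, .dbf, and often .prj and .cpg), and a buffer holding only the .shp bytes carries neither a file extension nor those sidecar files. The sketch below copies every component into one directory so a path-based reader can find them together. The names sidecar_blob_names, download_shapefile, and fetch_bytes are hypothetical and not part of any library.

```python
from pathlib import Path, PurePosixPath
import tempfile

# Common shapefile components; .shp, .shx and .dbf are the mandatory ones.
SHAPEFILE_EXTENSIONS = (".shp", ".shx", ".dbf", ".prj", ".cpg")
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def sidecar_blob_names(shp_blob_name):
    """Return the blob names of the files that make up one shapefile."""
    stem = PurePosixPath(shp_blob_name).with_suffix("")
    return [f"{stem}{ext}" for ext in SHAPEFILE_EXTENSIONS]


def download_shapefile(fetch_bytes, shp_blob_name, target_dir):
    """Copy every available component into target_dir; return the local .shp path.

    fetch_bytes is a hypothetical callable: blob name -> bytes, or None if the
    blob does not exist.
    """
    target = Path(target_dir)
    for name in sidecar_blob_names(shp_blob_name):
        data = fetch_bytes(name)
        if data is None:
            # Optional components may be absent; mandatory ones may not.
            if PurePosixPath(name).suffix in REQUIRED_EXTENSIONS:
                raise FileNotFoundError(f"missing required component: {name}")
            continue
        (target / PurePosixPath(name).name).write_bytes(data)
    return str(target / PurePosixPath(shp_blob_name).name)
```

With the Azure SDK, fetch_bytes could wrap a BlobClient built for each blob name and return download_blob().readall(), returning None when the blob is not found. The returned path, inside a directory created with tempfile.TemporaryDirectory() for example, could then be passed to gpd.read_file.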


Source: https://stackoverflow.com/questions/77299224/how-to-read-a-shapefile-from-azure-blob-storage-into-an-azure-databricks-noteboo

1 Answer


ujv3wf0j1#

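You can mount the storage account in Databricks and read the shapefile (.shp) from the mount point. In my case the shapefile is samp.shp, in the container's spatial folder.

Code for mounting: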

dbutils.fs.mount(
    source="wasbs://<container-name>@<storage-account-name>.blob.core.windows.net",
    mount_point="/mnt/blob/",
    extra_configs={
        "fs.azure.account.key.<storage-account-name>.blob.core.windows.net": "<Account_key>"
    },
)

With the code below, you can read it.

import geopandas

gdf = geopandas.read_file("/dbfs/mnt/blob/spatial/samp.shp")
gdf

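Note that I prefixed the path with /dbfs. geopandas.read_file resolves the path from the root of the local filesystem; it does not run in the Spark context, so the mount point /mnt/blob/ has to be addressed as /dbfs/mnt/blob/.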
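The prefix rule can be captured in a small helper. This is a minimal sketch, assuming the cluster exposes DBFS on the driver under /dbfs; to_local_path is a hypothetical name, not a Databricks API.

```python
def to_local_path(path):
    """Map a DBFS path to the local /dbfs path used by non-Spark libraries."""
    if path.startswith("/dbfs/"):
        return path  # already a local path
    if path.startswith("dbfs:/"):
        path = path[len("dbfs:"):]  # drop the URI scheme, keep the leading slash
    if not path.startswith("/"):
        path = "/" + path
    return "/dbfs" + path
```

For example, to_local_path("dbfs:/mnt/blob/spatial/samp.shp") gives the same /dbfs/mnt/blob/spatial/samp.shp path that is passed to geopandas.read_file above.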

Answered 2023-10-22
