Python: loop through folders to read shapefiles

66bbxpm5  posted on 2023-06-20  in Python

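I have a function that reads multiple shapefiles from one specific folder, and I want to adapt it to read multiple shapefiles from multiple folders.
Here is the function that reads multiple shapefiles from a single folder.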

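import glob
import os

import geopandas as gpd
import pandas as pd

def import_shapes_list(path_to_data: str, shapes_folder: str, crs: str, current_crs: str) -> gpd.GeoDataFrame:
    """Read every shapefile in one folder and combine them into a single GeoDataFrame."""
    # The original referenced an undefined `path_`; presumably it was the folder
    # built from path_to_data and shapes_folder, which is what is assumed here.
    files = glob.iglob(os.path.join(path_to_data, shapes_folder, '*.shp'))

    gdfs = []
    for file in files:
        print(file)
        # read_gdf is the asker's own helper (not shown in the question).
        gdf = read_gdf(file, crs, current_crs=current_crs)
        gdf.columns = map(str.lower, gdf.columns)
        gdfs.append(gdf)

    geomap = gpd.GeoDataFrame(pd.concat(gdfs, ignore_index=True))
    return geomap

geomap_nord = import_shapes_list(path_to_data=path_to_data, shapes_folder=shapes_folder_nord, crs='EPSG:4326', current_crs='EPSG:26191')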
The output is this:


    ./Source data/...../shapefile1.shp
    ./Source data/...../shapefile2.shp
    ./Source data/...../shapefile3.shp

I have tried to adapt it so that it loops through multiple folders. Here is what I tried:

import glob
import os

path_to_data = './Source data/'
rootdir = path_to_data + '...2021/'
files = glob.iglob(rootdir + '*.shp')  # never used: the loop below rebinds `files`
gdfs = []
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        print(os.path.join(subdir, file))

The output is:

folder1/xxxx.cpg
folder1/xxxx.dbf
folder1/xxxx.prj
folder1/xxxx.qmd
folder1/xxxx.shp
folder1/xxxx.shx
folder2/yyyy.cpg
folder2/yyyy.dbf
folder2/yyyy.prj
folder2/yyyy.shp

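My problem is that it reads everything in each folder, when it should only read the shapefiles (.shp).
How can I adapt the function above so that it reads the shapefiles in every folder?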

Answer #1 (kkbh8khc)

I got this to work:

# Fragment from inside the function, so crs and current_crs are its parameters.
print('rootdir is', rootdir)
gdfs = []
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        # os.walk lists every file, so keep only the .shp ones
        if file.endswith('.shp'):
            print('file is', file)
            path_to = os.path.join(subdir, file)
            gdf = read_gdf(path_to, crs, current_crs=current_crs)
            gdf.columns = map(str.lower, gdf.columns)
            gdfs.append(gdf)

file is x.shp
file is y.shp
file is z.shp
file is aa.shp
file is ab.shp
file is ab.shp
file is ac.shp
file is ad.shp
file is ae.shp
file is af.shp
file is az.shp
file is ad.shp

The code works and reads every folder, but is there a better way to write it? I don't think it is optimal, because it takes a long time.

Answer #2 (qncylg1j)

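Someone suggested the following answer:
Unless you are using an ancient version of Python (pathlib has been in the standard library since Python 3.4), this should be ideal: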

from pathlib import Path

rootdir = ...
gdfs = []

# rglob searches rootdir and all of its subfolders for matching files
for file in Path(rootdir).rglob("*.shp"):
    print('file is', file)
    gdf = read_gdf(file, crs, current_crs=current_crs)
    gdf.columns = map(str.lower, gdf.columns)
    gdfs.append(gdf)
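
To check the file discovery on its own, without geopandas, the rglob approach above can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the names find_shapefiles and load_shapefiles are hypothetical and not from the thread, and the reader argument stands in for the asker's own read_gdf helper, which is not shown. The sorting is an addition to make the order predictable.

```python
from pathlib import Path
from typing import Callable, List, TypeVar

T = TypeVar("T")


def find_shapefiles(rootdir: str) -> List[Path]:
    """Return every .shp file under rootdir, at any depth, in sorted order.

    Hypothetical helper name; it only wraps Path.rglob from the answer above.
    """
    return sorted(p for p in Path(rootdir).rglob("*.shp") if p.is_file())


def load_shapefiles(rootdir: str, reader: Callable[[Path], T]) -> List[T]:
    """Apply reader to each shapefile found under rootdir.

    reader is a placeholder for whatever loads a single file, for example a
    small wrapper around the asker's read_gdf helper (an assumption here).
    """
    return [reader(path) for path in find_shapefiles(rootdir)]
```

With the helper from the question, this would presumably be called as load_shapefiles(rootdir, lambda p: read_gdf(p, crs, current_crs=current_crs)), followed by the same pd.concat step as in the original function. Keeping discovery separate from loading also makes it easy to see whether the slow part is walking the folders or reading the files.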
