python-3.x: Split an Excel file into multiple files with a specified number of rows in Pandas

wko9yo5t posted on 2022-11-26 in Python

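Suppose I have an Excel file with 101 rows. I need to split it and write it out as 11 Excel files, each containing 10 rows, except for the last file, which holds only the one remaining row.
This is the code I tried, but it raises KeyError: 11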

df = pd.DataFrame(data=np.random.rand(101, 3), columns=list('ABC'))
groups = df.groupby(int(len(df.index)/10) + 1)
for i, g in groups:
    g.to_excel("%s.xlsx" % i, index = False, index_lable = False)

Can anyone help me solve this? Many thanks.
Related: Split pandas dataframe into multiple dataframes with equal numbers of rows

mcvgt66p #1

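I think you need np.arange. Passing a single integer to groupby makes pandas treat it as a column label, and since there is no column named 11, it raises KeyError: 11. An array with one group label per row works instead: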

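import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.random.rand(101, 3), columns=list('ABC'))
# Integer-divide each row position by 10: rows 0-9 get label 0, rows 10-19 get label 1, ...
groups = df.groupby(np.arange(len(df.index))//10)
for i, g in groups:
    print(g)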
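
The snippet above only prints each group. To get from there to the files the question asks for, something like the following sketch might work. The helper names `split_by_rows` and `write_chunks` and the `part_` file prefix are made up for illustration, and writing `.xlsx` files assumes an Excel engine such as openpyxl is installed.

```python
import numpy as np
import pandas as pd


def split_by_rows(df, rows_per_file=10):
    """Return a list of (chunk_number, chunk) pairs with at most rows_per_file rows each."""
    # Row positions 0..9 map to label 0, 10..19 to label 1, and so on.
    labels = np.arange(len(df)) // rows_per_file
    return [(int(i), g) for i, g in df.groupby(labels)]


def write_chunks(df, rows_per_file=10, prefix="part_"):
    """Write each chunk to '<prefix><n>.xlsx' (hypothetical naming scheme)."""
    for i, chunk in split_by_rows(df, rows_per_file):
        chunk.to_excel(f"{prefix}{i}.xlsx", index=False)


df = pd.DataFrame(np.random.rand(101, 3), columns=list("ABC"))
chunks = split_by_rows(df)
print(len(chunks), len(chunks[-1][1]))  # 11 chunks, the last one holding 1 row
```

Calling `write_chunks(df)` would then produce `part_0.xlsx` through `part_10.xlsx` in the current directory.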

ibrsph3r #2

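I solved a similar problem. The background: I had created an Azure Function with an HTTP trigger, but the endpoint was overwhelmed when iterating over requests for 2k rows. So I split the original file into chunks of 50 rows: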

import logging
import pandas as pd

CHUNK_SIZE = 50  # rows per output file

INXL = pd.read_excel('split/031022.xlsx', engine="openpyxl")
row_count = len(INXL.index)

def extract(rc):
    # Walk backwards from the last row, writing CHUNK_SIZE rows per file
    while rc > 0:
        # Clamp the start at 0 so the first rows of the table are not dropped
        rs = max(rc - CHUNK_SIZE, 0)
        # Set the extraction to be between the row start and ending index
        row_extract = INXL.iloc[rs:rc]
        with pd.ExcelWriter(f'output_{rc}.xlsx') as writer:
            row_extract.to_excel(writer, index=False)
        rc -= CHUNK_SIZE

extract(row_count)
logging.info("extract completed")
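
The same fixed-size chunking can also be done from the top of the table downwards, which avoids tracking two counters. Here is a minimal sketch under assumptions: `iter_chunks` is a hypothetical helper, and the 120-row frame is made-up data standing in for the workbook.

```python
import pandas as pd


def iter_chunks(df, chunk_size=50):
    """Yield (start_row, chunk) pairs, walking the frame from top to bottom."""
    for start in range(0, len(df), chunk_size):
        # iloc stops at the end of the frame, so the last chunk may be shorter.
        yield start, df.iloc[start:start + chunk_size]


df = pd.DataFrame({"id": range(120)})  # made-up data for illustration
sizes = [len(chunk) for _, chunk in iter_chunks(df)]
print(sizes)  # [50, 50, 20]
```

Each chunk could then be written with `chunk.to_excel(f"output_{start}.xlsx", index=False)`, given an installed Excel engine.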
