Splitting an Excel worksheet in Python pandas without losing the source formatting

d7v8vwbk  posted on 2023-09-29  in  Python

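I am trying to split an Excel file with pandas in PyCharm. My code splits the worksheet by key value and copies the text into the target files correctly, but the formatting of the source file is not copied. I want the worksheet to be split along with its source formatting. A sample file is available at the link, and screenshots are under Master file and Split file.
My code is: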

    import os
    import pandas as pd

    df = pd.read_excel(r'mypath\myfile.xlsx', sheet_name='Sheet2', engine='openpyxl')
    target_folder = 'my target folder'
    keys = df.groupby('STD')
    key_value = keys.groups.keys()
    for key in key_value:
        # One output file per distinct STD value; only the values are written.
        splitdf = keys.get_group(key)
        splitdf.to_excel(os.path.join(target_folder, str(key) + ".xlsx"), sheet_name=str(key), index=False)
    print("Files created in " + str(target_folder) + " folder")

7qhs6swi  #1

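AFAIK, this can't be done with pandas: a DataFrame only holds the cell values, so the formatting is dropped as soon as the sheet is read. You have to use openpyxl.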
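To illustrate the point, here is a minimal sketch of a pandas round trip. The file names and the sample data are made up for the demonstration; the only claim is that a bold data cell in the source is no longer bold after `read_excel` followed by `to_excel`.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font

# Hypothetical file names inside a throwaway directory.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "styled.xlsx")
dst = os.path.join(tmp_dir, "roundtrip.xlsx")

# Build a small styled workbook: a header row plus one bold data cell.
wb = Workbook()
ws = wb.active
ws.append(["NAME", "STD"])
ws.append(["Alice", "A"])
ws["A2"].font = Font(bold=True)
wb.save(src)

# Round trip through pandas: only the values travel in the DataFrame.
df = pd.read_excel(src, engine="openpyxl")
df.to_excel(dst, index=False, engine="openpyxl")

# Compare the font of the same data cell before and after.
src_bold = load_workbook(src).active["A2"].font.bold
dst_bold = load_workbook(dst).active["A2"].font.bold
print("bold in source:", bool(src_bold), "| bold after pandas:", bool(dst_bold))
```

The values survive the round trip, but the cell style does not, which is why the split has to be done at the openpyxl level.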
Here is a suggestion based on your master file, using itertools.groupby.

    from copy import copy
    from itertools import groupby

    from openpyxl import load_workbook

    wb = load_workbook("myfile.xlsx")
    ws = wb.active  # or wb["Sheet2"]

    data = ws.rows  # generator over the rows of the sheet
    # Keep every cell as a (style, value) pair; the first row is the header.
    header = [(h._style, h.value) for h in next(data)]
    pairs = [[(c._style, c.value) for c in row] for row in data]

    def std_of(row):
        # Value of the second cell in the row (the STD column in the master file).
        return row[1][1]

    # groupby only merges adjacent items, so the rows are sorted by key first.
    groups = {std: list(g) for std, g in groupby(sorted(pairs, key=std_of), key=std_of)}

    for std, vals in groups.items():
        std_ws = wb.create_sheet(str(std))  # sheet titles must be strings
        for idxr, row in enumerate([header] + vals, 1):
            for idxc, (s, v) in enumerate(row, 1):
                cell = std_ws.cell(idxr, idxc)
                cell.value = v
                cell._style = copy(s)
        # Copy the column widths: https://stackoverflow.com/a/52736133/16120011
        for idx, rd in ws.column_dimensions.items():
            std_ws.column_dimensions[idx] = copy(rd)

    wb.remove(ws)  # drop the master sheet; could be unnecessary
    wb.save("newfile.xlsx")
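The code above writes every group to a new sheet of the same workbook. If one file per key is needed instead, as in the question, a possible variation is sketched below. This is not part of the original answer: `copy_row` and `split_by_column` are hypothetical helper names, and the demo workbook is built in memory purely for illustration. The private `_style` attribute holds indices into the style tables of its own workbook, so across workbooks the sketch copies the public style attributes (`font`, `fill`, `border`, `alignment`, `number_format`) instead.

```python
from copy import copy

from openpyxl import Workbook
from openpyxl.styles import Font


def copy_row(cells, dst_ws, dst_row):
    """Copy one row of cells (values and public style attributes) into dst_ws."""
    for src in cells:
        dst = dst_ws.cell(row=dst_row, column=src.column, value=src.value)
        if src.has_style:
            dst.font = copy(src.font)
            dst.fill = copy(src.fill)
            dst.border = copy(src.border)
            dst.alignment = copy(src.alignment)
            dst.number_format = src.number_format


def split_by_column(src_ws, key_col):
    """Return {key: Workbook}, one workbook per distinct value in key_col (1-based)."""
    rows = list(src_ws.iter_rows())
    header, body = rows[0], rows[1:]
    books = {}
    for row in body:
        key = row[key_col - 1].value
        if key not in books:
            books[key] = Workbook()
            out = books[key].active
            out.title = str(key)  # assumes the key is a valid sheet title
            copy_row(header, out, 1)
            # Carry over the column widths that were set on the source sheet.
            for letter, dim in src_ws.column_dimensions.items():
                out.column_dimensions[letter].width = dim.width
        out = books[key].active
        copy_row(row, out, out.max_row + 1)
    return books


# Demo on a small in-memory master sheet (made-up data).
master = Workbook()
master_ws = master.active
for values in (["NAME", "STD"], ["Alice", "A"], ["Bob", "B"], ["Carol", "A"]):
    master_ws.append(values)
master_ws["A2"].font = Font(bold=True)
master_ws.column_dimensions["A"].width = 25

books = split_by_column(master_ws, key_col=2)
for key, book in books.items():
    print(key, [[c.value for c in r] for r in book.active.iter_rows()])
```

Each resulting workbook can then be written out with `book.save(...)` under a file name of your choice. Merged cells, row heights and conditional formatting are not handled by this sketch.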

