pandas: split a DataFrame by column value and export to different Excel sheets

fnatzsnv  posted on 2024-01-04  in  Other

This builds on two earlier questions:
Pandas: Iterate through a list of DataFrames and export each to excel sheets
Splitting dataframe into multiple dataframes
I managed to get this far:

    # sort the dataframe (df.sort() was removed from pandas; use sort_values)
    df.sort_values(by='name', inplace=True)
    # set the index to be this and don't drop
    df.set_index(keys=['name'], drop=False, inplace=True)
    # get a list of names
    names = df['name'].unique().tolist()
    # now we can perform a lookup on a 'view' of the dataframe
    joe = df.loc[df.name == 'joe']
    # now you can query all 'joes'

I got this part working: joe = df.loc[df.name=='joe'] gives exactly the result I am looking for.
To make it work for a large amount of data, I found this potential solution:

    writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
    List = [Data , ByBrand]
    for i in List:
        i.to_excel(writer, sheet_name= i)
    writer.save()


Currently I have:

    teacher_names = ['Teacher A', 'Teacher B', 'Teacher C']


DF =

                   ID Teacher_name Student_name
    Teacher_name
    Teacher A     1.0    Teacher A    Student 1
    Teacher A     NaN    Teacher A    Student 2
    Teacher B     0.0    Teacher B    Student 3
    Teacher C     2.0    Teacher C    Student 4


If I use test = df.loc[df.Teacher_name=='Teacher A'], I get exactly the right result.

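**Question:** How can I optimise this so that the "test" result is saved automatically to an Excel file, separately for each teacher (.to_excel(writer, sheet_name=Teacher_name)) and named after the teacher, and so that this is done for every teacher present in the data?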

4szc88ey1#

This should work for you. You were almost there; you just need to rebuild the names list and filter your DataFrame on each iteration.

    import pandas

    names = df['name'].unique().tolist()
    writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
    for myname in names:
        # keep only the rows for this name and write them to their own sheet
        mydf = df.loc[df.name == myname]
        mydf.to_excel(writer, sheet_name=myname)
    writer.close()


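**Edit:** As of pandas 1.5, use writer.close() instead of writer.save(); save() was deprecated in that release and is no longer available in pandas 2.0.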
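Applied to the Teacher_name data from the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: the helper names split_by_column and export_sheets are hypothetical, the sample frame keeps Teacher_name as an ordinary column (groupby rejects a label that is both an index level and a column as ambiguous), and an Excel engine such as openpyxl or xlsxwriter is installed for the export step.

```python
import pandas as pd


def split_by_column(df, column):
    """Return a dict mapping each unique value in `column` to its rows."""
    # groupby yields (key, sub-frame) pairs, one per unique value
    return {key: group for key, group in df.groupby(column)}


def export_sheets(df, column, path):
    """Write one worksheet per unique value of `column` into one workbook."""
    # The context manager saves and closes the workbook on exit,
    # so no explicit writer.save() or writer.close() call is needed.
    with pd.ExcelWriter(path) as writer:
        for key, group in split_by_column(df, column).items():
            # Excel limits worksheet names to 31 characters
            group.to_excel(writer, sheet_name=str(key)[:31], index=False)


# Sample data shaped like the frame in the question (values are illustrative)
df = pd.DataFrame(
    {
        "ID": [1.0, None, 0.0, 2.0],
        "Teacher_name": ["Teacher A", "Teacher A", "Teacher B", "Teacher C"],
        "Student_name": ["Student 1", "Student 2", "Student 3", "Student 4"],
    }
)
parts = split_by_column(df, "Teacher_name")
```

Calling export_sheets(df, "Teacher_name", "MyData.xlsx") would then write the three teacher sheets into a single file. If the frame was indexed with set_index(..., drop=False) as in the question, calling reset_index(drop=True) first avoids the ambiguity.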

pexxcrt22#

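@jpp, the text "sheetname" should be replaced with "sheet_name". Also, once the "name" variable has been converted to a list, running the for loop to create multiple worksheets from the unique name values gives me the following error: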

    InvalidWorksheetName: Invalid Excel character '[]:*?/\' in sheetname '['.
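The '[' in that message suggests that the string form of a whole list, rather than a single name, reached sheet_name. Independently of that, real names can contain characters Excel rejects, so a small guard may help. The helper below is a hypothetical sketch, not part of pandas or XlsxWriter: it replaces the characters listed in the error with underscores and truncates to Excel's 31-character limit.

```python
import re

# Characters Excel rejects in worksheet names, as listed in the error message
_INVALID_SHEET_CHARS = re.compile(r"[\[\]:*?/\\]")


def safe_sheet_name(value, max_len=31):
    """Turn an arbitrary value into a string usable as a worksheet name."""
    # Replace every rejected character with an underscore
    name = _INVALID_SHEET_CHARS.sub("_", str(value))
    # Excel caps worksheet names at 31 characters; fall back if nothing is left
    return name[:max_len] or "Sheet"
```

It could be used as sheet_name=safe_sheet_name(myname) inside the loop. Note that two different names may collapse to the same sanitized string, which this sketch does not handle.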

An alternative way to create multiple worksheets (in a single Excel file) based on a column value, using a function:

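    import pandas
    writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')

    def writesheet(g):
        # name the sheet after the group's 'name' value
        a = g['name'].tolist()[0]
        g.to_excel(writer, sheet_name=str(a), index=False)

    df.groupby('name').apply(writesheet)
    writer.close()  # writer.save() before pandas 1.5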


Source: How to split a large excel file into multiple worksheets based on their given ip address using pandas python
