Wrong format when writing a DataFrame to Excel without the index

cigdeys3  posted on 2023-02-05  in Other

Here is my DataFrame:

df
                     year        2022       2021
0  return on equity (roe)     160.90%    144.10%
1          average equity  62027.9677  65704.372
2       net profit margin      0.2531     0.2588
3                turnover      1.1179     1.0422
4                leverage       5.687     5.3421

I want to write it to Excel without the index:

df.to_excel('/tmp/test.xlsx',index=False)

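Why is there an empty cell in the top-left corner of test.xlsx?

How can I get the following format with the to_excel method?

Adding the header argument does not help.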

df.to_excel('/tmp/test.xlsx', index=False, header=True)

Now read it back from Excel:

new_df = pd.read_excel('/tmp/test.xlsx',index_col=False)
new_df
               Unnamed: 0        year       2022  2021
0  return on equity (roe)     160.90%    144.10%   NaN
1          average equity  62027.9677  65704.372   NaN
2       net profit margin      0.2531     0.2588   NaN
3                turnover      1.1179     1.0422   NaN
4                leverage       5.687     5.3421   NaN

Passing header=True when reading is not allowed:

new_df = pd.read_excel('/tmp/test.xlsx',index_col=False,header=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/debian/.local/lib/python3.9/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/home/debian/.local/lib/python3.9/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/home/debian/.local/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 490, in read_excel
    data = io.parse(
  File "/home/debian/.local/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 1734, in parse
    return self._reader.parse(
  File "/home/debian/.local/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 732, in parse
    validate_header_arg(header)
  File "/home/debian/.local/lib/python3.9/site-packages/pandas/io/common.py", line 203, in validate_header_arg
    raise TypeError(
TypeError: Passing a bool to header is invalid. Use header=None for no header or header=int or list-like of ints to specify the row(s) making up the column names
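
The error message itself lists the accepted values: `header` in `read_excel` takes a row number, a list of row numbers, or `None`, but not a bool. A minimal sketch, assuming the first row of the sheet holds the column names and that an Excel engine such as openpyxl is installed; the helper name `read_with_header_row` is only illustrative:

```python
import pandas as pd


def read_with_header_row(source, header_row=0):
    """Read an Excel sheet, using row `header_row` as the column names.

    `header_row` should be an int, a list of ints, or None (no header row).
    Passing True or False raises the TypeError shown in the traceback above.
    """
    return pd.read_excel(source, header=header_row, index_col=None)


# Usage, with the path from the question:
# new_df = read_with_header_row("/tmp/test.xlsx")
```

Note that `header=0` is already the default, so this only makes the choice explicit; it does not by itself remove the `Unnamed: 0` column.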

wswtfjt7 1#

Pass the header argument as True:

df.to_excel('test.xlsx', index=False, header=True)

To read it back into a DataFrame, set the index_col argument to None:

new_df = pd.read_excel('test.xlsx',index_col=None)
print(new_df)

                     year        2022       2021
0  return on equity (roe)     160.90%    144.10%
1          average equity  62027.9677  65704.372
2       net profit margin      0.2531     0.2588
3                turnover      1.1179     1.0422
4                leverage       5.687     5.3421

c3frrgcw 2#

I found the cause. The example DataFrame is a special case:

df.columns
MultiIndex([('year',),
            ('2022',),
            ('2021',)],
           )

The columns are a MultiIndex, not a single-level Index.

df.columns = ['year', '2022', '2021']
df.to_excel('/tmp/test.xlsx',index=False)

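The odd behavior finally went away. In my case, a DataFrame whose columns are the MultiIndex [('year',),('2022',),('2021',)] prints exactly the same as one whose columns are the single-level Index ['year', '2022', '2021'].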
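
Because the two kinds of columns print identically, it can help to check the column type before writing. The sketch below is one possible way to detect a one-level MultiIndex and flatten it; the helper name `flatten_single_level_columns` and the small sample frame are illustrative assumptions, not part of the original post:

```python
import pandas as pd


def flatten_single_level_columns(df):
    """Return a copy whose one-level MultiIndex columns become a plain Index.

    Frames with ordinary columns, or with a genuine multi-level header,
    come back as an unchanged copy.
    """
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex) and out.columns.nlevels == 1:
        out.columns = out.columns.get_level_values(0)
    return out


# Build a frame whose columns look ordinary but are a one-level MultiIndex.
columns = pd.MultiIndex.from_tuples([("year",), ("2022",), ("2021",)])
df = pd.DataFrame(
    [["turnover", "1.1179", "1.0422"], ["leverage", "5.687", "5.3421"]],
    columns=columns,
)

flat = flatten_single_level_columns(df)
print(type(df.columns).__name__)    # MultiIndex
print(type(flat.columns).__name__)  # Index

# Writing needs an Excel engine such as openpyxl, so it is left commented out:
# flat.to_excel("/tmp/test.xlsx", index=False)
```

The helper works on a copy, so the original frame keeps its MultiIndex columns; assigning `df.columns = [...]` directly, as above, is the in-place equivalent.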
