pandas: how to pivot one column into multiple columns, with the remaining columns nested as sub-columns under each pivoted column

Posted by nr7wwzry on 2023-01-28 in Other

I have data like this:

    year     energy_products  consumption_ktoe    value_ktoe
0   2009       Coal and Peat               3.0      3.300000
1   2009           Crude Oil               0.0  49079.900000
2   2009         Electricity            3338.1   3594.203691
3   2009         Natural Gas             867.8   6656.700000
4   2009              Others               0.0      0.000000
..  ...            .......           .......       .........

I want to pivot energy_products into columns, with consumption_ktoe and value_ktoe as sub-columns under each pivoted column.
So my desired output looks like this:

energy_products  Coal and Peat                 Crude Oil                     \
    year          consumption_ktoe  value_ktoe  consumption_ktoe  value_ktoe
0   2009                       3.0    3.300000                 3           4

energy_products   Electricity                   Natural Gas                   \
    year          consumption_ktoe  value_ktoe  consumption_ktoe  value_ktoe  
0   2009                       3.0    3.300000                 3           4

energy_products   Others
    year          consumption_ktoe  value_ktoe  
0   2009                       3.0    3.300000

After pivoting it, I got this:
[pivot code and output missing from the source]
I then tried swapping the column levels:

finalConImportMerge = finalConImportMerge.swaplevel(axis=1)

But the result looks like this:

energy_products    Coal and Peat        Crude Oil ...... Coal and Peat   Crude Oil .......
year            consumption_ktoe  consumption_ktoe  ......   value_ktoe  value_ktoe .....
2009                           3                 3                    3           3

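Then I thought about merging the top-level columns, but if I use groupby I have to aggregate, so the consumption and value columns get summed together, which is not what I want.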

finalConImportMerge = finalConImportMerge.groupby(level=[0], axis=1).sum()

In the end I got this:

energy_products Coal and Peat   Crude Oil   Electricity Natural Gas Others  \
year                        
2009                      6.3     49079.9   6932.303691      7524.5    0.0  
....                     ....         ....              ....            ...         ..
energy_products     Petroleum Products
year                        
2009                           96774.9 
....                             ....

Is there a way to get the result I want? Or a way to group the columns without summing them?

Answer #1 (icnyk63a)

This feels a bit clumsy, but I don't know of another way. (I'm using the example from the pandas pivot documentation, since you didn't provide code to recreate your example DataFrame.)

import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two',
                           'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': ['x', 'y', 'z', 'q', 'w', 't']})

pivoted_df = df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])
goal_df = pivoted_df.swaplevel(axis=1)[sorted(pivoted_df.swaplevel(axis=1))]

goal_df looks like this:

bar A       B       C
    baz zoo baz zoo baz zoo
foo                     
one 1   x   2   y   3   z
two 4   q   5   w   6   t
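
The same idea can be applied to the data from the question. The following is only a minimal sketch: it assumes the DataFrame holds exactly the five 2009 rows shown in the question, and the names df, pivoted and result are illustrative. It uses sort_index(axis=1) in place of the sorted(...) lookup above, which should give the same column order.

```python
import pandas as pd

# Assumption: only the five 2009 rows printed in the question.
df = pd.DataFrame({
    "year": [2009] * 5,
    "energy_products": ["Coal and Peat", "Crude Oil", "Electricity",
                        "Natural Gas", "Others"],
    "consumption_ktoe": [3.0, 0.0, 3338.1, 867.8, 0.0],
    "value_ktoe": [3.3, 49079.9, 3594.203691, 6656.7, 0.0],
})

# With no `values` argument, pivot keeps every remaining column.
# Column levels at this point: (measure, energy_products).
pivoted = df.pivot(index="year", columns="energy_products")

# swaplevel only relabels the levels; the columns stay in their old order,
# so sort them to bring each product's two measures next to each other.
result = pivoted.swaplevel(axis=1).sort_index(axis=1)

print(result)
```

Because the columns remain a MultiIndex, result["Crude Oil"] selects just that product's consumption_ktoe and value_ktoe sub-columns.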

Answer #2 (2izufjch)

How about pivot, reset index, unstack, swap levels, reset index, drop, set index, transpose, and reset index:

finalConImportMerge.pivot(index=['year'], columns=['energy_products']).reset_index().unstack().swaplevel(1,0).reset_index().drop('level_2',axis=1).set_index('energy_products').T.reset_index(drop=True)
Out[47]: 
energy_products             Coal_and_Peat         Crude_Oil       Electricity       Natural_Gas            Others Coal_and_Peat   Crude_Oil  Electricity Natural_Gas      Others
0                  year  consumption_ktoe  consumption_ktoe  consumption_ktoe  consumption_ktoe  consumption_ktoe    value_ktoe  value_ktoe   value_ktoe  value_ktoe  value_ktoe
1                2009.0               3.0               0.0            3338.1             867.8               0.0           3.3     49079.9  3594.203691      6656.7         0.0
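
If the goal is a true two-level column header with the products in a chosen order rather than alphabetical order, one possible alternative is to build the target column index explicitly and reindex to it. This is a hedged sketch, not the method used in the answer above: the sample rows are copied from the question, and the names wide, products, measures, target and ordered are illustrative.

```python
import pandas as pd

# Assumption: only the five 2009 rows printed in the question.
df = pd.DataFrame({
    "year": [2009] * 5,
    "energy_products": ["Coal and Peat", "Crude Oil", "Electricity",
                        "Natural Gas", "Others"],
    "consumption_ktoe": [3.0, 0.0, 3338.1, 867.8, 0.0],
    "value_ktoe": [3.3, 49079.9, 3594.203691, 6656.7, 0.0],
})

# Pivot, then put the product level on top of the measure level.
wide = df.pivot(index="year", columns="energy_products").swaplevel(axis=1)

# Target layout: products in order of first appearance, each followed by
# both measures in a fixed order.
products = df["energy_products"].unique()
measures = ["consumption_ktoe", "value_ktoe"]
target = pd.MultiIndex.from_product([products, measures],
                                    names=wide.columns.names)

# Reindexing the columns rearranges them without aggregating anything.
ordered = wide.reindex(columns=target)

print(ordered)
```

No values are summed here: each (product, measure) pair stays a separate column, so this avoids the aggregation that groupby required.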
