Renaming pandas DataFrame columns based on a nested dictionary

Posted by 91zkwejq on 2023-06-04 in Other

I have a dictionary:

groupings = {'Group_A':{'species':['plantain', 'banana'], 'shape':'oblong'},
'Group_B':{'species':['lemon', 'orange'], 'shape':'round'}
}

I also have a DataFrame:

import pandas as pd
d = pd.DataFrame(data={'orange': [1, 2], 'banana': [3, 4], 'lemon': [3, 4]})

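I want to rename the DataFrame's columns according to the dictionary and then sum the columns that end up with the same name: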

#something like this:
d.rename(columns=groupings, inplace=True)
d = d.groupby(by=d.columns, axis=1).sum()

Expected output:

Group_A  Group_B
      3        4
      4        6

How can I reshape the dictionary so that pandas.DataFrame.rename() recognizes it?
Thanks!

ryevplcw  #1

You can use a dictionary comprehension to flatten the species values into a species-to-group mapping:

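# Here d is the flattened mapping and df is the question's DataFrame
# (which the question itself calls d).
d = {species: k
     for k, v in groupings.items()
     for species in v['species']}

out = (df.rename(columns=d)
       .groupby(level=0, axis=1).sum())

$ print(d)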

{'plantain': 'Group_A', 'banana': 'Group_A', 'lemon': 'Group_B', 'orange': 'Group_B'}

$ print(out)

   Group_A  Group_B
0        3        4
1        4        6
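
Recent pandas releases (2.1 and later) deprecate `axis=1` in `groupby`, so the chained call above may emit a warning or fail there. Below is a minimal sketch of the same idea that transposes instead. It assumes the question's data with the column spelled `banana`, and the name `mapping` is only illustrative:

```python
import pandas as pd

groupings = {
    'Group_A': {'species': ['plantain', 'banana'], 'shape': 'oblong'},
    'Group_B': {'species': ['lemon', 'orange'], 'shape': 'round'},
}
df = pd.DataFrame({'orange': [1, 2], 'banana': [3, 4], 'lemon': [3, 4]})

# Flatten the nested dict into {species: group}.
mapping = {species: group
           for group, attrs in groupings.items()
           for species in attrs['species']}

# Rename, then group the (now duplicated) labels on the transposed frame
# and transpose back, which avoids groupby(axis=1).
out = df.rename(columns=mapping).T.groupby(level=0).sum().T
print(out)
```

Keys in the mapping that are not columns of the frame (here `plantain`) are simply ignored by `rename`.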

6psbrbz9  #2

You need to build a mapping from the DataFrame's column names to the desired group names specified in the dictionary:

groupings = {
    'Group_A': {'species': ['plantain', 'banana'], 'shape': 'oblong'},
    'Group_B': {'species': ['lemon', 'orange'], 'shape': 'round'}
}

d = pd.DataFrame(data={'orange': [1, 2], 'banana': [3, 4], 'lemon': [3, 4]})

column_mapping = {}
for group, attributes in groupings.items():
    species_list = attributes['species']
    for species in species_list:
        column_mapping[species] = group

d.rename(columns=column_mapping, inplace=True)
d = d.groupby(by=d.columns, axis=1).sum()
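
Note that `rename` silently leaves alone any column that has no entry in the mapping, so a misspelled column name (for example `bannana` instead of `banana`) keeps its name and ends up as its own group. Here is a small sketch of a sanity check to run before renaming. The helper `unmapped_columns` is hypothetical and not part of pandas:

```python
import pandas as pd

def unmapped_columns(frame, column_mapping):
    """Return the columns of `frame` that have no key in `column_mapping`."""
    return [col for col in frame.columns if col not in column_mapping]

column_mapping = {'plantain': 'Group_A', 'banana': 'Group_A',
                  'lemon': 'Group_B', 'orange': 'Group_B'}

# Deliberately misspelled column to show what the check catches.
typo_df = pd.DataFrame({'orange': [1, 2], 'bannana': [3, 4], 'lemon': [3, 4]})
print(unmapped_columns(typo_df, column_mapping))
```

An empty list means every column will be assigned to a group.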

8iwquhpp  #3

Try this.
It should also work when several groups share the same species.

df2 = pd.DataFrame({k1:df.reindex(v1.get('species'),axis=1).sum(axis=1) for k1,v1 in groupings.items()}).astype(int)

Output:

   Group_A  Group_B
0        3        4
1        4        6
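
To illustrate the overlap case: a flat species-to-group mapping can assign each species to only one group, whereas selecting columns per group lets one species count toward several groups. The sketch below uses a third group, `Group_C`, which is made up for this example and does not appear in the question:

```python
import pandas as pd

df = pd.DataFrame({'orange': [1, 2], 'banana': [3, 4], 'lemon': [3, 4]})

# Group_C is hypothetical; it shares species with both other groups.
groupings = {
    'Group_A': {'species': ['plantain', 'banana'], 'shape': 'oblong'},
    'Group_B': {'species': ['lemon', 'orange'], 'shape': 'round'},
    'Group_C': {'species': ['banana', 'lemon'], 'shape': 'mixed'},
}

# reindex adds species missing from df (e.g. plantain) as NaN columns;
# sum(axis=1) skips NaN by default, and astype(int) restores integer dtype.
df2 = pd.DataFrame({
    group: df.reindex(attrs['species'], axis=1).sum(axis=1)
    for group, attrs in groupings.items()
}).astype(int)
print(df2)
```

Here `banana` contributes to both `Group_A` and `Group_C`, and `lemon` to both `Group_B` and `Group_C`.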
