pandas groupby contents of list

Posted by jrcvhitl on 2023-04-04 in Other


I have the following DataFrame:

import pandas as pd
d1 = {'id': ["car", "car", "bus", "plane", "plane"], 'value': [["a","b"], ["b","a"], ["a","b"], ["c","d"], ["d","c"]]}
df1 = pd.DataFrame(data=d1)
df1

      id   value
0    car  [a, b]
1    car  [b, a]
2    bus  [a, b]
3  plane  [c, d]
4  plane  [d, c]

I want to group the ids by the contents of the value lists. The order of the elements should not matter. Afterwards, I want to sort the groups by their size.
I tried using Counter() to convert my lists into dictionaries and then get the size of each group. However, I got the following error:

import collections

df1["temp"] = list(map(collections.Counter,  df1["value"]))
df1 = df1.groupby('temp').size().sort_values(ascending = True)

TypeError: unhashable type: 'Counter'
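For reference, the error can be reproduced without pandas: Counter is a dict subclass and therefore not hashable, whereas a sorted tuple is hashable and ignores element order. A minimal sketch (the variable names are illustrative and not from the original post):

```python
import collections

counter = collections.Counter(["a", "b"])

# Counter inherits from dict, so hash() raises TypeError
try:
    hash(counter)
    counter_is_hashable = True
except TypeError:
    counter_is_hashable = False

# Sorting first makes the key independent of element order;
# converting to a tuple makes it hashable
key1 = tuple(sorted(["a", "b"]))
key2 = tuple(sorted(["b", "a"]))
same_key = key1 == key2 and hash(key1) == hash(key2)

print(counter_is_hashable, same_key)
```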

pandas

Source: https://stackoverflow.com/questions/75890260/groupby-contents-of-list

2 Answers


jdzmm42g1#

You can sort the lists so that the order is ignored. The list type is unhashable, so convert each sorted list to a tuple; then you can group by it.

# Group by an order-insensitive, hashable key: the sorted list as a tuple
for _, g in df1.groupby(df1['value'].map(lambda x: tuple(sorted(x)))):
    print(g)

Output:

    id   value
0  car  [a, b]
1  car  [b, a]
2  bus  [a, b]
      id   value
3  plane  [c, d]
4  plane  [d, c]
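Building on the same key, the group sizes asked about in the question can be obtained with size() and sort_values(). A minimal sketch, assuming the df1 from the question; the names key and sizes are only illustrative:

```python
import pandas as pd

d1 = {'id': ["car", "car", "bus", "plane", "plane"],
      'value': [["a", "b"], ["b", "a"], ["a", "b"], ["c", "d"], ["d", "c"]]}
df1 = pd.DataFrame(data=d1)

# Hashable, order-insensitive grouping key: the sorted list as a tuple
key = df1['value'].map(lambda x: tuple(sorted(x))).rename('key')

# Number of rows per key, smallest group first
sizes = df1.groupby(key).size().sort_values(ascending=True)
print(sizes)
```

With the sample data, the key ('c', 'd') covers two rows and ('a', 'b') covers three.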

2023-04-04

55ooxyrt2#

Sort each list in the value column, join it into a string, and then use that string as the grouping key:

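# Helper column val_str holds the sorted list joined into one string, e.g. 'a,b'
groups = df1.assign(val_str=df1['value'].apply(sorted).str.join(',')).groupby('val_str')

for _, g in groups:  # separate groups
    g = g.drop('val_str', axis=1)  # drop the helper column before printing
    print(g)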

    id   value
0  car  [a, b]
1  car  [b, a]
2  bus  [a, b]
      id   value
3  plane  [c, d]
4  plane  [d, c]
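If only the group sizes are needed, the same string key can be counted directly. A minimal sketch, assuming the df1 from the question; val_str and sizes are illustrative names:

```python
import pandas as pd

d1 = {'id': ["car", "car", "bus", "plane", "plane"],
      'value': [["a", "b"], ["b", "a"], ["a", "b"], ["c", "d"], ["d", "c"]]}
df1 = pd.DataFrame(data=d1)

# Sort each list and join it into one string, e.g. ['b', 'a'] -> 'a,b'
val_str = df1['value'].apply(sorted).str.join(',')

# Count rows per string key, smallest group first
sizes = val_str.value_counts(ascending=True)
print(sizes)
```

Note that the string key assumes no element contains a comma; otherwise different lists could collapse to the same key.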

2023-04-04
