python-3.x Create a new column grouped by the values of another column

Posted by yyhrrdl8 on 2022-11-26 in Python


I have the following DataFrame:

import pandas as pd

df1 = pd.DataFrame({
    'sentence': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'entity': ['Stay home', 'Stay home', 'WAY', 'WAY', 'Stay home', 'Go outside', 'Go outside', 'purpose'],
    'token': ['Severe weather', 'raining', 'smt', 'SMT0', 'Windy', 'Sunny', 'Good weather', 'smt'],
})

    sentence        entity      token
0   A               Stay home   Severe weather
1   A               Stay home   raining
2   A               WAY         smt
3   A               WAY         SMT0
4   A               Stay home   Windy
5   B               Go outside  Sunny
6   B               Go outside  Good weather
7   B               purpose     smt

I want to group by the values in the sentence column and, where WAY and purpose appear in the entity column, move their tokens into new columns.
Expected result:

    sentence  entity      token                           WAY        purpose
0   A         Stay home   Severe weather, raining, Windy  smt, SMT0  NaN
1   B         Go outside  Sunny, Good weather             NaN        smt

python-3.x

Source: https://stackoverflow.com/questions/74532513/create-new-column-group-by-values-of-other-column

1 answer


dauxcl2d1#

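Build a mask with Series.isin and use it in boolean indexing to split the rows. The rows that do not match the list (select them by inverting the mask with ~) are grouped and their tokens aggregated with join. The rows that do match the list are reshaped with DataFrame.pivot_table. The two results are then combined with DataFrame.join: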

vals = ['WAY','purpose']

m = df1['entity'].isin(vals)

df2 = df1[m].pivot_table(index='sentence',columns='entity',values='token', aggfunc=','.join)
df3 = df1[~m].groupby(['sentence','entity'])['token'].agg(', '.join).reset_index()

df = df3.join(df2, on='sentence')
print(df)
  sentence      entity                           token       WAY purpose
0        A   Stay home  Severe weather, raining, Windy  smt,SMT0     NaN
1        B  Go outside             Sunny, Good weather       NaN     smt
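
The steps above can also be wrapped in a small reusable function. The following is only a sketch under some assumptions: the helper name pivot_selected_entities and its sep parameter are hypothetical (they are not part of pandas), and the column names sentence, entity and token are taken from the question. Unlike the answer's code, it uses the same separator for both aggregations, so the pivoted columns are joined with ", " as in the expected result.

```python
import pandas as pd


def pivot_selected_entities(df, selected, sep=", "):
    """Hypothetical helper: group by sentence and turn the entities listed
    in `selected` into their own columns, joining tokens with `sep`."""
    mask = df["entity"].isin(selected)

    # Rows whose entity is in `selected`: one column per entity.
    wide = df[mask].pivot_table(
        index="sentence", columns="entity", values="token", aggfunc=sep.join
    )

    # Remaining rows: one row per (sentence, entity) pair.
    rest = (
        df[~mask]
        .groupby(["sentence", "entity"])["token"]
        .agg(sep.join)
        .reset_index()
    )

    # Match the `sentence` column of `rest` against the index of `wide`.
    return rest.join(wide, on="sentence")


df1 = pd.DataFrame({
    "sentence": ["A", "A", "A", "A", "A", "B", "B", "B"],
    "entity": ["Stay home", "Stay home", "WAY", "WAY", "Stay home",
               "Go outside", "Go outside", "purpose"],
    "token": ["Severe weather", "raining", "smt", "SMT0", "Windy",
              "Sunny", "Good weather", "smt"],
})

result = pivot_selected_entities(df1, ["WAY", "purpose"])
print(result)
```

Note that if a sentence has more than one entity outside the selected list, it produces several rows, and each of those rows receives the same values in the pivoted columns.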

2022-11-26
