python-3.x: Create new columns grouped by the values of another column

yyhrrdl8 posted on 2022-11-26 in Python

I have the following DataFrame:

import pandas as pd

df1 = pd.DataFrame({
    'sentence': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'entity': ['Stay home', 'Stay home', 'WAY', 'WAY', 'Stay home', 'Go outside', 'Go outside', 'purpose'],
    'token': ['Severe weather', 'raining', 'smt', 'SMT0', 'Windy', 'Sunny', 'Good weather', 'smt'],
})

    sentence        entity      token
0   A               Stay home   Severe weather
1   A               Stay home   raining
2   A               WAY         smt
3   A               WAY         SMT0
4   A               Stay home   Windy
5   B               Go outside  Sunny
6   B               Go outside  Good weather
7   B               purpose     smt

When the entity column contains WAY or purpose, I want to group by the values of sentence and turn those entities into new columns.
Expected result:

    sentence  entity      token                           WAY        purpose
0   A         Stay home   Severe weather, raining, Windy  smt, SMT0  NaN
1   B         Go outside  Sunny, Good weather             NaN        smt

Answer #1 (dauxcl2d)

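Use boolean indexing with a mask built by Series.isin to split the rows in two. Rows whose entity is in the list are reshaped with DataFrame.pivot_table, so each listed entity becomes a column. The remaining rows are selected with ~, which inverts the mask, and their tokens are aggregated with join. Finally, DataFrame.join attaches the pivoted columns to the aggregated rows.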

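# entity values that should become columns
vals = ['WAY','purpose']

# True where entity is one of vals
m = df1['entity'].isin(vals)

# matching rows: one column per entity
df2 = df1[m].pivot_table(index='sentence',columns='entity',values='token', aggfunc=','.join)
# other rows (~m inverts the mask): join tokens per group
df3 = df1[~m].groupby(['sentence','entity'])['token'].agg(', '.join).reset_index()

# match df3['sentence'] against the index of df2
df = df3.join(df2, on='sentence')
print(df)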
  sentence      entity                           token       WAY purpose
0        A   Stay home  Severe weather, raining, Windy  smt,SMT0     NaN
1        B  Go outside             Sunny, Good weather       NaN     smt
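
For convenience, the same steps can be collected into one self-contained script. This is only a sketch of the approach above, not a different method: the helper name `split_and_pivot` and its `sep` parameter are hypothetical and do not come from pandas or from the answer. It assumes the column names `sentence`, `entity` and `token` from the question, and it uses the same separator for both aggregations.

```python
import pandas as pd


def split_and_pivot(df, vals, sep=", "):
    """Hypothetical helper: turn the entities listed in `vals` into columns."""
    # Boolean mask: True where the entity is one of the listed values
    m = df["entity"].isin(vals)

    # Matching rows: one column per entity, tokens joined per sentence
    wide = df[m].pivot_table(
        index="sentence", columns="entity", values="token", aggfunc=sep.join
    )

    # Non-matching rows: tokens joined per (sentence, entity) pair
    rest = (
        df[~m]
        .groupby(["sentence", "entity"])["token"]
        .agg(sep.join)
        .reset_index()
    )

    # Attach the pivoted columns by matching rest["sentence"] to wide's index
    return rest.join(wide, on="sentence")


df1 = pd.DataFrame({
    "sentence": ["A", "A", "A", "A", "A", "B", "B", "B"],
    "entity": ["Stay home", "Stay home", "WAY", "WAY", "Stay home",
               "Go outside", "Go outside", "purpose"],
    "token": ["Severe weather", "raining", "smt", "SMT0", "Windy",
              "Sunny", "Good weather", "smt"],
})

result = split_and_pivot(df1, ["WAY", "purpose"])
print(result)
```

With a single separator, the WAY cell for sentence A reads "smt, SMT0", which matches the expected result in the question. Note that if a sentence has more than one non-matching entity, it produces one row per entity, and the pivoted values are repeated on each of those rows.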
