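I have a DataFrame with information about movies. I am interested in the "Production Company" and "Award" columns. I need to group the DataFrame by company, count the awards each company has won, and then show the top three companies. I did this and got a Series: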
mv['Award'] = mv['Award'].astype('string')
mv['Winners'] = mv['Award'] == 'Winner'
mv['Winners'] = mv['Winners'].astype('string')
mv['Winners'] = mv['Winners'].str.replace('False', '')
mv['Winners'] = mv['Winners'].notnull()
mv['Winners'] = mv['Winners'].astype('string')
mv['Winners'] = mv['Winners'].str.replace('True', 'Oscar')
Number_Award = mv.groupby(['Production Company'])['Winners'].value_counts().sort_values(ascending=False).head(3)
print(Number_Award)
**Result:**
Production Company      Winners
Paramount Pictures      Oscar      33
Warner Bros. Pictures   Oscar      29
MGM Home Entertainment  Oscar      28
Name: Winners, dtype: int64
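Then I tried to convert it to a DataFrame and assign names to the columns, but I could not, because there is only one column, named "Winners". It looks like this: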
Number_Award = Number_Award.to_frame()
Number_Award
**Result:**
                                Winners
Production Company      Winners
Paramount Pictures      Oscar        33
Warner Bros. Pictures   Oscar        29
MGM Home Entertainment  Oscar        28
How can I rebuild the DataFrame so that it looks like this:
Production Company      Award  Number of award
Paramount Pictures      Oscar  33
Warner Bros. Pictures   Oscar  29
MGM Home Entertainment  Oscar  28
1 Answer
You can pass the column name to `Series.to_frame`, then reset the index and rename the column.
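A minimal sketch of those three steps. The small `mv` frame here is a made-up stand-in, since the original data is not shown, and it assumes the `Winners` column already holds the string `Oscar`. The `counts` and `result` names are only illustrative. Naming the value column in `to_frame` keeps it from clashing with the `Winners` index level when the index is reset.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (the real `mv` is not shown).
mv = pd.DataFrame({
    "Production Company": (
        ["Paramount Pictures"] * 3
        + ["Warner Bros. Pictures"] * 2
        + ["MGM Home Entertainment"]
    ),
    "Winners": ["Oscar"] * 6,
})

# Same grouping as in the question: a Series indexed by
# (Production Company, Winners) holding the counts.
counts = (
    mv.groupby("Production Company")["Winners"]
      .value_counts()
      .sort_values(ascending=False)
      .head(3)
)

result = (
    counts.to_frame("Number of award")         # name the value column
          .reset_index()                       # index levels become columns
          .rename(columns={"Winners": "Award"})
)
print(result)
```

If you prefer, `counts.rename_axis(["Production Company", "Award"]).reset_index(name="Number of award")` should give the same layout in one chain.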