This question already has answers here:
Pandas percentage of total with groupby (16 answers)
Closed 4 days ago.
This is easier to explain with an example. Say I have a sample DataFrame with the columns year, cc_rating, and number_x.
import pandas as pd
df = pd.DataFrame({"year":{"0":2005,"1":2005,"2":2005,"3":2006,"4":2006,"5":2006,"6":2007,"7":2007,"8":2007},"cc_rating":{"0":"2","1":"2a","2":"2b","3":"2","4":"2a","5":"2b","6":"2","7":"2a","8":"2b"},"number_x":{"0":9368,"1":21643,"2":107577,"3":10069,"4":21486,"5":110326,"6":10834,"7":21566,"8":111082}})
df
   year cc_rating  number_x
0  2005         2      9368
1  2005        2a     21643
2  2005        2b    107577
3  2006         2     10069
4  2006        2a     21486
5  2006        2b    110326
6  2007         2     10834
7  2007        2a     21566
8  2007        2b    111082
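Question
How can I get number_x as a percentage of each year's total? That is, each row's number_x divided by the sum of number_x for the same year, times 100.
Dividing directly doesn't work, because year can't be set as the index of the original df: its values aren't unique.
Right now I'm doing the following, but it is inefficient, and I'm sure there is a better way.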
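# per-year totals, renamed so the merged column is number_y rather than a suffixed number_x
df = pd.merge(df, df.groupby('year')['number_x'].sum().rename('number_y'), left_on='year', right_index=True)
df['%'] = round((df['number_x'] / df['number_y']) * 100, 2)
df = df.drop('number_y', axis=1)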
Thanks!
1 Answer
7eumitmz #1
Possible solution:
Output:
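The answer's code and output are not preserved on this page. As a minimal sketch of one common approach (an assumption, not necessarily what the answer used), groupby(...).transform('sum') returns each year's total aligned to the original rows, so the division works without a merge or a unique index. The DataFrame is rebuilt here from the question's data with a default integer index.

```python
import pandas as pd

# Sample data copied from the question (default integer index for brevity)
df = pd.DataFrame({
    "year": [2005, 2005, 2005, 2006, 2006, 2006, 2007, 2007, 2007],
    "cc_rating": ["2", "2a", "2b"] * 3,
    "number_x": [9368, 21643, 107577, 10069, 21486, 110326, 10834, 21566, 111082],
})

# transform("sum") broadcasts each year's total back onto that year's rows,
# so the result has the same index as df and divides element-wise
year_total = df.groupby("year")["number_x"].transform("sum")

# Percentage of the year's total, rounded to two decimals as in the question
df["%"] = (df["number_x"] / year_total * 100).round(2)

print(df)
```

Within each year the values in the % column add up to roughly 100; rounding to two decimals can leave a small remainder.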