Python pandas sort_values not working properly

dwbf0jvd  posted on 2023-03-11  in Python

When I try to sort a DataFrame by a column's values and print it with the head() function, it shows repeated rows instead of the result I want.

regions = country_features['world_region']
happines = []
counts = []
reg = []

for region in regions:
    hap = country_features.loc[country_features['world_region'] == region, 'happiness_score'].mean()
    count = len(country_features[country_features['world_region'] == region])
    happines.append(hap)
    counts.append(count)
    reg.append(region)

region_happines = pd.DataFrame({'region':reg,
                                'happiness_score' : happines,
                                'country_count':counts})

region_happines
region_happines.happiness_score = pd.to_numeric(region_happines.happiness_score)
sorted = region_happines.sort_values(by='happiness_score', ascending=False)

sorted.head(5)

I want to sort the DataFrame by a column's values and have it come out in the correct order.

1tu0hz3e1#

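The loop iterates over every row of world_region rather than over the distinct regions, so each region is appended once per country, and that is where the repeated rows come from. The first part of your solution can be simplified: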

print (country_features)
  world_region  happiness_score
0         reg1                5
1         reg1                1
2         reg2               10
3         reg2                1
4         reg2                3

region_happines = (country_features.groupby('world_region',as_index=False)
                                   .agg(happiness_score= ('happiness_score','mean'),
                                        country_count= ('happiness_score','size'))
                                   .rename(columns={'world_region':'region'}))
print (region_happines)
  region  happiness_score  country_count
0   reg1         3.000000              2
1   reg2         4.666667              3

Because the happiness_score column is already the mean of each group, converting it to numeric is not necessary:

out = region_happines.sort_values(by='happiness_score', ascending=False)
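
Putting the pieces above together, here is a minimal end-to-end sketch. It assumes the same toy data shown in the answer (the real country_features presumably has more columns), so treat the values as illustrative only.

```python
import pandas as pd

# Assumed sample data, copied from the toy frame printed above
country_features = pd.DataFrame({
    "world_region": ["reg1", "reg1", "reg2", "reg2", "reg2"],
    "happiness_score": [5, 1, 10, 1, 3],
})

# One row per region: mean score and number of countries
region_happines = (
    country_features.groupby("world_region", as_index=False)
    .agg(happiness_score=("happiness_score", "mean"),
         country_count=("happiness_score", "size"))
    .rename(columns={"world_region": "region"})
)

# The mean is already a float column, so pd.to_numeric is not required
is_float = pd.api.types.is_float_dtype(region_happines["happiness_score"])

# Sort by mean score, highest first
out = region_happines.sort_values(by="happiness_score", ascending=False)
print(out)
```

If you would rather keep the original loop, iterating over regions.unique() instead of regions should also avoid the repeated rows, since each region is then visited only once.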
