dataframe-rank的答案不同于sql-rank

evrscar2  于 2021-07-24  发布在  Java
关注(0)|答案(1)|浏览(382)

我在学习sql中的rank函数时发现,它使用的排序方法与pandas方法不同。如何得到相同的答案?
问题链接:https://www.windowfunctions.com/questions/ranking/1

sql代码

  1. select rank() over (order by weight desc) as ranking, weight, name from cats order by ranking, name

我的Pandas代码

  1. df[['weight','name']].assign(ranking=df['weight'].rank(method='dense',ascending=False)).sort_values('weight',ascending=False)
  2. # I am getting a different answer than SQL
  3. # How to get the same answer?

数据

  1. df = pd.DataFrame({'name': ['Molly', 'Ashes', 'Felix', 'Smudge', 'Tigger', 'Alfie', 'Oscar', 'Millie', 'Misty', 'Puss', 'Smokey', 'Charlie'],
  2. 'breed': ['Persian', 'Persian', 'Persian', 'British Shorthair', 'British Shorthair', 'Siamese', 'Siamese', 'Maine Coon', 'Maine Coon', 'Maine Coon', 'Maine Coon', 'British Shorthair'],
  3. 'weight': [4.2, 4.5, 5.0, 4.9, 3.8, 5.5, 6.1, 5.4, 5.7, 5.1, 6.1, 4.8],
  4. 'color': ['Black', 'Black', 'Tortoiseshell', 'Black', 'Tortoiseshell', 'Brown', 'Black', 'Tortoiseshell', 'Brown', 'Tortoiseshell', 'Brown', 'Black'],
  5. 'age': [1, 5, 2, 4, 2, 5, 1, 5, 2, 2, 4, 4]})

我的输出

  1. weight name ranking
  2. 6 6.1 Oscar 1.0
  3. 10 6.1 Smokey 1.0
  4. 8 5.7 Misty 2.0 # this should be 3 and so on
  5. 5 5.5 Alfie 3.0
  6. 7 5.4 Millie 4.0
  7. 9 5.1 Puss 5.0
  8. 2 5.0 Felix 6.0
  9. 3 4.9 Smudge 7.0
  10. 11 4.8 Charlie 8.0
  11. 1 4.5 Ashes 9.0
  12. 0 4.2 Molly 10.0
  13. 4 3.8 Tigger 11.0
6yt4nkrj

6yt4nkrj1#

如注解所示,我们可以使用以下方法在sql和pandas中得到相同的答案:

  1. df[['weight','name']].assign(ranking=df['weight'].rank(method='min',ascending=False)).sort_values('weight',ascending=False)

相关问题