pandas: Simple way to find the 3 largest values in a given column, with the corresponding value from another column, in my own format

Posted by hiz5n14c on 2023-10-14 in Other


Suppose I have a DataFrame...

import pandas as pd

data = {'PVOL': [190, 105, 100, 150, 100, 170], 'STKS': [2000, 2500, 3000, 3500, 4000, 4500],
        'CVOL': [64, 179, 98, 281, 86, 90]}
df = pd.DataFrame(data)

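Now I want to find the 3 largest values of every other column (PVOL, CVOL, etc., since my df may have several more such columns) together with the corresponding values of the STKS column, in the following form (as a string / printed):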

PVOL - 2000[190], 4500[170], 3500[150]
CVOL - 3500[281], 2500[179], 3000[98]

I tried to get the 2 largest values in DataFrame form.

columns_name = list(df.columns)
columns_name.remove('STKS')
data_dict = {}
for col in columns_name:
    temp = []
    # Top 2 rows by this column, keeping only the value and its STKS
    top_rows = df.sort_values(col, ascending=False)[:2][[col, 'STKS']].values
    for row in top_rows:
        temp.append(row[1])  # STKS
        temp.append(row[0])  # value
    data_dict[col] = temp
new_df1 = pd.DataFrame(data_dict, index="STK VOL STK VOL".split())
# set_axis no longer accepts inplace=True in pandas 2.x; assign the result instead
new_df1 = new_df1.set_axis(["PVOL", "CVOL"], axis='columns')
Vol_df = new_df1[["PVOL", "CVOL"]]
print(Vol_df)

Is there a simple way to do this? I have also read about...

df.nlargest()

Thanks.

pandas

Source: https://stackoverflow.com/questions/77229442/simple-way-to-find-3-largest-values-for-a-given-column-with-corresponding-value

2 Answers


Answer 1 (dz6r00yl)

Yes, you can use the nlargest method like this:

import pandas as pd
data = {'PVOL': [190, 105, 100, 150, 100, 170],
        'STKS': [2000, 2500, 3000, 3500, 4000, 4500],
        'CVOL': [64, 179, 98, 281, 86, 90]}
df = pd.DataFrame(data)
result = {}
for col in df.columns:
    if col != 'STKS':
        top_values = df.nlargest(3, col)
        result[col] = list(zip(top_values['STKS'], top_values[col]))
result_df = pd.DataFrame(result)
print(result_df)

Output:

          PVOL         CVOL
0  (2000, 190)  (3500, 281)
1  (4500, 170)  (2500, 179)
2  (3500, 150)   (3000, 98)

[Edit]:
To get the specific output format you asked for:

import pandas as pd
data = {'PVOL': [190, 105, 100, 150, 100, 170],
        'STKS': [2000, 2500, 3000, 3500, 4000, 4500],
        'CVOL': [64, 179, 98, 281, 86, 90]}
df = pd.DataFrame(data)
result = {}
for col in df.columns:
    if col != 'STKS':
        top_values = df.nlargest(3, col)
        result[col] = ', '.join([f"{stk}[{val}]" for stk, val in zip(top_values['STKS'], top_values[col])])
result_df = pd.Series(result)
print(result_df)

Output:

PVOL    2000[190], 4500[170], 3500[150]
CVOL     3500[281], 2500[179], 3000[98]
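
The loop above can also be wrapped in a small helper so that the key column and the number of values are parameters, and so that each line comes out in the "COL - key[value], ..." form from the question. This is only a sketch: the name top_n_lines and its arguments are made up for illustration and are not part of pandas.

```python
import pandas as pd


def top_n_lines(df, key_col="STKS", n=3):
    """Return one 'COL - key[value], ...' line per column other than key_col.

    Hypothetical helper for illustration; not a pandas function.
    """
    lines = []
    for col in df.columns.drop(key_col):
        top = df.nlargest(n, col)  # rows holding the n largest values of col
        pairs = ", ".join(f"{k}[{v}]" for k, v in zip(top[key_col], top[col]))
        lines.append(f"{col} - {pairs}")
    return lines


data = {'PVOL': [190, 105, 100, 150, 100, 170],
        'STKS': [2000, 2500, 3000, 3500, 4000, 4500],
        'CVOL': [64, 179, 98, 281, 86, 90]}
df = pd.DataFrame(data)

lines = top_n_lines(df)
print("\n".join(lines))
```

Calling top_n_lines(df, n=2) would give the top-2 variant that the question first attempted.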


Answered on 2023-10-14

Answer 2 (hmtdttj4)

Use nlargest inside a custom function:

def f(s, n=3):
    x = s.nlargest(n)
    return ', '.join(f'{a}[{b}]' for a,b in zip(x.index, x))
df.set_index('STKS').apply(f)

Output:

PVOL    2000[190], 4500[170], 3500[150]
CVOL     3500[281], 2500[179], 3000[98]
dtype: object
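
Why this works may be easier to see on a single column. Once STKS is the index, Series.nlargest returns the largest values with their index labels still attached, so the STKS value travels along with each volume. A minimal sketch on the sample data (the variable names top_pvol and pairs are chosen here for illustration):

```python
import pandas as pd

data = {'PVOL': [190, 105, 100, 150, 100, 170],
        'STKS': [2000, 2500, 3000, 3500, 4000, 4500],
        'CVOL': [64, 179, 98, 281, 86, 90]}
df = pd.DataFrame(data)

# With STKS as the index, each PVOL value is labelled by its STKS.
top_pvol = df.set_index('STKS')['PVOL'].nlargest(3)
print(top_pvol)

# Pair every index label (STKS) with its value (PVOL).
pairs = list(zip(top_pvol.index, top_pvol))
print(pairs)
```

DataFrame.apply then simply runs the same step once per column.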

If you want printed strings instead:

for key, col in df.set_index('STKS').items():
    x = col.nlargest(3)
    s = ', '.join(f'{a}[{b}]' for a,b in zip(x.index, x))
    print(f'{key} - {s}')

Output:

PVOL - 2000[190], 4500[170], 3500[150]
CVOL - 3500[281], 2500[179], 3000[98]
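
One thing the thread does not cover is ties at the cutoff. nlargest takes a keep argument ('first' by default, or 'last' or 'all'); with keep='all' every row tied with the last selected value is returned, so the result can be longer than n. The sample data has no tie among the top 3, so the sketch below asks for 5 values of PVOL, which cuts through the two rows equal to 100. Whether you want this behaviour depends on your data.

```python
import pandas as pd

data = {'PVOL': [190, 105, 100, 150, 100, 170],
        'STKS': [2000, 2500, 3000, 3500, 4000, 4500],
        'CVOL': [64, 179, 98, 281, 86, 90]}
s = pd.DataFrame(data).set_index('STKS')['PVOL']

# Default keep='first': exactly 5 rows, only one of the two 100s is kept.
first_only = s.nlargest(5)

# keep='all': rows tied at the cutoff value are all kept, so 6 rows here.
with_ties = s.nlargest(5, keep='all')

print(len(first_only), len(with_ties))
```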

For only a subset of columns (replace the names in cols with columns that exist in your own DataFrame):

cols = ['CCHOI', 'PIV']
for key, col in df.set_index('STKS')[cols].items():
    x = col.nlargest(3)
    s = ', '.join(f'{a}[{b}]' for a,b in zip(x.index, x))
    print(f'{key} - {s}')
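
Selecting a column that is not in the DataFrame raises a KeyError, and the names 'CCHOI' and 'PIV' do not exist in the sample data from the question. A possible variant, sketched under the assumption that missing names should just be skipped, filters the wanted list first. The names wanted and present are made up for this example, and 'PIV' is left in on purpose to show the filtering.

```python
import pandas as pd

data = {'PVOL': [190, 105, 100, 150, 100, 170],
        'STKS': [2000, 2500, 3000, 3500, 4000, 4500],
        'CVOL': [64, 179, 98, 281, 86, 90]}
df = pd.DataFrame(data)

wanted = ['CVOL', 'PIV']  # 'PIV' is deliberately absent from this sample frame
indexed = df.set_index('STKS')

# Keep only the requested columns that actually exist.
present = [c for c in wanted if c in indexed.columns]

lines = []
for key, col in indexed[present].items():
    x = col.nlargest(3)
    s = ', '.join(f'{a}[{b}]' for a, b in zip(x.index, x))
    lines.append(f'{key} - {s}')

print('\n'.join(lines))
```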


Answered on 2023-10-14
