Find the first indices of the N largest values of a numpy array without sorting in place

Posted by z9gpfhce on 2024-01-08 in Other


I need to find the first indices of the N largest values of a numpy array, without sorting it in place. Examples:

1) Find indices of 2 largest values in an array: [0, 0.5, 1, 0.5]
   result: [1,2] (or [2,1]; the order of the returned indices doesn't matter)

2) Find indices of 2 largest values in an array: [0.5, 0.5, 0, 0.5, 0.5]
   result: [0,1]

3) Find indices of 3 largest values in an array: [0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5]
   result: [1, 7, 3]

4) Find indices of 6 largest values in an array: [0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5]
   result: [1, 7, 3, 5, 6, 2]

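If the smallest value to be returned is not unique, it is important that the function returns the index of the occurrence closest to the beginning of the array.
I tried doing it like this: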

import numpy as np
N = 2
example = np.array([0.5, 0.5, 0., 0.5, 0.5])
ind = np.argpartition(example, -N)[-N:]

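But for N=2 it returns ind = [1, 4]. np.argpartition() has an order parameter, but I don't know how it should be used.
90% of the time the array size will be around 25, 9% of the time it will be no larger than 100, and edge cases are limited to 1000 elements.
Because the arrays are always small, I built a simple solution that works, but it is certainly suboptimal: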

import numpy as np

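def nlar_idx(array, n):
    # Map each distinct value to the list of indices where it occurs,
    # in ascending index order.
    d = dict()
    for idx, value in enumerate(array):
        if value not in d:
            d[value] = [idx]
        else:
            d[value].append(idx)
    # Visit values from largest to smallest and flatten their index lists.
    do = dict(sorted(d.items(), reverse=True))
    out = [x for v in do.values() for x in v][:n]
    return out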

array = np.array([0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5])
n = 3
res = nlar_idx(array, n)
print(res)
# [1, 7, 3]

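Is there a way to make this work with argpartition(), or to build a better version of the function? We only want to optimize execution time; memory doesn't matter.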

numpy

Source: https://stackoverflow.com/questions/77517247/find-first-indices-of-the-n-largest-values-of-a-numpy-array-without-sorting-in-p

4 Answers


#1 t9aqgxwy

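You should use np.argsort, but on the negated array: an ascending argsort followed by taking the last N entries returns the last occurrence among tied values, not the first. Request a stable sort (kind="stable") so that tied values are guaranteed to keep their original order; the default sort kind does not guarantee this.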

import numpy as np

examples = [
    [2, [0, 0.5, 1, 0.5]],
    [2, [0.5, 0.5, 0, 0.5, 0.5]],
    [3, [0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5]],
    [6, [0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5]]
]

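def compute(n_max, data):
    # Negate so that an ascending sort puts the largest values first.
    data = -np.asarray(data)
    # kind="stable" keeps tied values in their original (first-index-first) order.
    return np.argsort(data, kind="stable")[:n_max]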

for ex in examples:
    print(compute(ex[0], ex[1]))

The result is:

[2 1]
[0 1]
[1 7 3]
[1 7 3 5 6 2]
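
As a possible variation on the same idea, the tie-breaking rule can be written out explicitly instead of being left to the sort's handling of equal values. The sketch below uses np.lexsort with the negated values as the primary key and the positions as the secondary key; the function name first_indices_of_n_largest is hypothetical and not part of the answer above.

```python
import numpy as np

def first_indices_of_n_largest(data, n):
    # Hypothetical helper name, used here for illustration only.
    a = np.asarray(data, dtype=float)
    # np.lexsort treats the LAST key as the primary one:
    # sort by -a (largest value first), then by position (smallest index first).
    order = np.lexsort((np.arange(a.size), -a))
    return order[:n]

print(first_indices_of_n_largest([0.5, 0.5, 0, 0.5, 0.5], 2))
# [0 1]
print(first_indices_of_n_largest([0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5], 6))
# [1 7 3 5 6 2]
```

This still sorts the whole array, which should be acceptable for the array sizes mentioned in the question.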


Posted 2024-01-08

#2 zhte4eai

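For n = 4, this returns [1, 7, 3, 5] on the example array. Note that argpartition makes no guarantee about which of several tied values it selects.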

import numpy as np

def n_largest(arr, n):
    # Use partition to get the indices of the n largest elements
    indices = np.argpartition(arr, -n)[-n:]

    # Sort the selected indices based on their corresponding values
    # (you can skip this step if you don't want them in order)
    indices = indices[np.argsort(arr[indices])[::-1]]

    # Return the n largest elements and their indices
    return arr[indices], indices

# Example usage:
arr = np.array([0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5])
n = 4
result, indices = n_largest(arr, n)

print(f"The {n} largest elements are: {result}")
print(f"Their indices are: {indices}")

The 4 largest elements are: [0.9 0.8 0.75 0.75]
Their indices are: [1 7 3 5]
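
Since a partition alone does not control which tied elements are picked, one way to keep the partition step and still prefer the earliest positions is to use it only to find the threshold value (the n-th largest), then select positions by comparison. This is a minimal sketch under that assumption, not the answer author's method; the name n_largest_first is hypothetical, and the indices come back unordered by value, which the question allows.

```python
import numpy as np

def n_largest_first(arr, n):
    # Hypothetical helper; assumes 1 <= n <= len(arr).
    arr = np.asarray(arr)
    # Value of the n-th largest element, found without a full sort.
    kth = np.partition(arr, -n)[-n]
    # Every position strictly above the threshold must be included.
    above = np.flatnonzero(arr > kth)
    # Fill the remaining slots with the earliest positions equal to the threshold.
    ties = np.flatnonzero(arr == kth)[: n - above.size]
    return np.concatenate([above, ties])

print(n_largest_first([0.5, 0.5, 0, 0.5, 0.5], 2))
# [0 1]
print(n_largest_first([0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5], 3))
# [1 7 3]
```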

Posted 2024-01-08

#3 6tqwzwtp

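Given that your lists are relatively short, I would be inclined to use a plain Python approach. The heapq module offers a relatively efficient way to collect the n extreme values.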

import heapq

def nlargest(lst, n):
    what = zip(lst, range(0, -len(lst), -1))
    return [-i for _, i in heapq.nlargest(n, what)]

It is slightly complicated by the fact that you want the largest values but the smallest indices.
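
The same "largest value, smallest index" ordering can also be expressed through the key argument of heapq.nlargest, which avoids building the zipped pairs. The following is an illustrative variant, with the hypothetical name nlargest_first_idx, shown on two of the question's examples.

```python
import heapq

def nlargest_first_idx(values, n):
    # Hypothetical helper name, used here for illustration only.
    # Rank positions by (value, -position): a larger value wins, and among
    # equal values the smaller position wins because its negation is larger.
    return heapq.nlargest(n, range(len(values)), key=lambda i: (values[i], -i))

print(nlargest_first_idx([0.5, 0.5, 0, 0.5, 0.5], 2))
# [0, 1]
print(nlargest_first_idx([0.3, 0.9, 0.6, 0.75, 0.55, 0.75, 0.75, 0.8, 0.5], 3))
# [1, 7, 3]
```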

Posted 2024-01-08

#4 jmo0nnb3

np.argsort seems to be a solution.

idx = np.argsort(array)[-N:]

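np.argsort returns an array of indices that would sort the array in ascending order. We then slice it so that only the last N indices remain.
I think this is allowed: yes, we do perform a sort, but not in place (the target array stays unchanged).
We can then use np.where to find the elements closest to the beginning of the array.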

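import numpy as np

arr = np.array([0.5, 0.5, 0, 0.5, 0.5])
N = 2
# Indices of the N largest values, largest first (tied values in arbitrary order).
idx = np.flip(np.argsort(arr)[-N:])
out = np.zeros(0, dtype=int)
for ind in idx:
    # All positions holding this value, in ascending index order.
    sub_out = np.where(arr == arr[ind])[0]
    out = np.concatenate([out, sub_out])
    if len(out) >= N:
        out = out[:N]
        break
print(out)
# [0 1]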

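The code iterates over the idx array, so each arr[ind] is one of the largest values. We then use np.where to scan the target array for that value and concatenate the results (the same value, but at positions closer to the beginning). Once we have collected enough elements, we truncate the array to the required N values and break.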

Posted 2024-01-08
