Python NumPy: given a set of ranges, is there an efficient way to find the ranges that are disjoint from all the others?

Posted by icomxhvb on 2023-01-24 in Python

Is there an elegant way in numpy to find, from a set of ranges, the ranges that are disjoint from all the others?

ranges = [[0,3], [2,4],[5,10]] # there are about 50 000 elements
disjoint_ranges = [] # these are all disjoint
adjoint_ranges = [] # these do not all have to be mutually adjoint
for index, range_1 in enumerate(ranges):
    i, j = range_1 # all ranges are ordered s.t. i<j
    for range_2 in ranges[index+1:]: # the list of ranges is ordered by increasing i
        a, b = range_2
        if i < a < j: # range_2 starts inside range_1
            adjoint_ranges.append(range_1)
            adjoint_ranges.append(range_2)
    # any earlier overlapping range has already added range_1 by this point
    if range_1 not in adjoint_ranges:
        disjoint_ranges.append(range_1)
print(adjoint_ranges)
print(disjoint_ranges)
tf7tbtn2  #1

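Looping over a numpy array somewhat defeats the purpose of using numpy. You can detect the disjoint ranges by using the accumulate method of numpy's ufuncs.
Once the ranges are sorted by lower bound, you can accumulate the running maximum of the upper bounds to determine how far the preceding ranges reach. Comparing each range's lower bound with that coverage tells you whether it overlaps an earlier range. You then only need to compare each range's upper bound with the next range's lower bound to detect an overlap with a later range. Combining the two checks flags every range that overlaps another one, and by elimination leaves the ranges that are completely disjoint from all others: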

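import numpy as np

ranges = np.array( [ [1,8], [10,15], [2,5], [18,24], [7,10] ] )
# reorder whole rows by lower bound so each [lower, upper] pair stays intact
ranges = ranges[np.argsort(ranges[:,0], kind="stable")]

overlaps       = np.zeros(ranges.shape[0],dtype=bool)
# overlap with an earlier range: lower bound below the running max of earlier upper bounds
overlaps[1:]   = ranges[1:,0] < np.maximum.accumulate(ranges[:-1,1])
# overlap with a later range: the next lower bound is below this upper bound
overlaps[:-1] |= ranges[1:,0] < ranges[:-1,1]

disjoints = ranges[~overlaps]

print(disjoints)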
   
[[10 15]
 [18 24]]
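
The same idea can be wrapped in a reusable function. The following is only a sketch under a few assumptions that are not part of the answer above: the names `find_disjoint` and `find_disjoint_bruteforce` are made up for illustration, every range satisfies start < end, ranges that merely touch (one ends where the next begins) count as disjoint, and the result is returned in the original input order. The quadratic helper exists only to cross-check the vectorized version on small inputs.

```python
import numpy as np


def find_disjoint(ranges):
    """Return the rows of `ranges` that overlap no other row (hypothetical helper).

    `ranges` is an (n, 2) array-like of [start, end] pairs with start < end.
    """
    ranges = np.asarray(ranges)
    if len(ranges) < 2:
        return ranges.copy()  # zero or one range: nothing to overlap with

    # Sort whole rows by start; a stable sort keeps ties in input order.
    order = np.argsort(ranges[:, 0], kind="stable")
    starts, ends = ranges[order, 0], ranges[order, 1]

    overlaps = np.zeros(len(ranges), dtype=bool)
    # Overlap with an earlier range: start is below the running max of earlier ends.
    overlaps[1:] = starts[1:] < np.maximum.accumulate(ends[:-1])
    # Overlap with a later range: the next start is below this end.
    overlaps[:-1] |= starts[1:] < ends[:-1]

    # Translate the surviving sorted positions back to input positions.
    keep = np.sort(order[~overlaps])
    return ranges[keep]


def find_disjoint_bruteforce(ranges):
    """O(n^2) reference implementation, intended only for cross-checking."""
    ranges = np.asarray(ranges)
    keep = []
    for k, (i, j) in enumerate(ranges):
        others = np.delete(ranges, k, axis=0)
        # Two ranges overlap when each one starts before the other ends.
        if not np.any((others[:, 0] < j) & (i < others[:, 1])):
            keep.append(k)
    return ranges[np.array(keep, dtype=int)]
```

Calling `find_disjoint([[1,8], [10,15], [2,5], [18,24], [7,10]])` returns `[[10 15] [18 24]]`, matching the output above. Sorting with `argsort` and indexing rows keeps each pair together, and the work stays at one sort plus a few linear passes, which should be comfortable for about 50 000 ranges.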
byqmnocz  #2

I'm not sure about numpy, but with pandas you can take the symmetric difference of the ranges:

from functools import reduce
import pandas as pd

ranges = [
    pd.RangeIndex(10, 20),
    pd.RangeIndex(15, 25),
    pd.RangeIndex(30, 50),
    pd.RangeIndex(40, 60),
]

disjoints = reduce(lambda x, y : x.symmetric_difference(y), ranges)
disjoints
Int64Index([10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31, 32, 33, 34, 35, 36,
            37, 38, 39, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
           dtype='int64')
