Python:计算用于查找匹配项的比较总数

0s0u357o  于 2022-10-30  发布在  Python
关注(0)|答案(3)|浏览(130)

我有两个数组,一个有10个索引,一个有2个索引。
我想检查大数组是否具有小数组的精确值。
总共需要进行9次比较。
如何为不同大小的数组计算此值?
我需要这个值来操作控制流程。

largeArr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
smallArr = [9, 10]

在第九次比较时,这将是真的。

bzzcjhmw

bzzcjhmw1#

如果没有找到,则需要进行len(largeArr) - len(smallArr) + 1次比较;如果找到了,则平均需要一半的时间,但这取决于条目的统计信息,所以这是O(n),其中n = len(largeArr).
然而,如果largeArr按照示例所示排序,那么对smallArr[0]进行二分查找会更有效,这将使检查结果为O(log(n))
另一种方法,如果你想检查一个给定的largeArr的许多不同的smallArr,它会快得多:对从largeArr中提取的每个长度为n = len(smallArr)的连续切片生成一个哈希值,并将这些哈希值放入一个setdict中,然后通过计算特定smallArr的哈希值并检查预先计算的setdict中的成员关系,可以非常快速地检查该特定smallArr是否存在。
下面是后一种方法的示例:

largeArr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
smallArr = [9, 10]
n = len(smallArr)
match = set()
for i in range(0, len(largeArr) - n + 1):
    match.add(tuple(largeArr[i:i+n]))
print(tuple(smallArr) in match)

现在检查的时间复杂度接近O(1),或者至少和set测试成员关系的速度一样快(实际上,n测试成员关系的速度会很慢,具体取决于实现)。

0wi1tuuw

0wi1tuuw2#

这是另一个解决方案,上面的解决方案是完美的,我的解决方案恰好在恒定空间和线性时间复杂度下运行。
那就是;
时间复杂度:O(N)
时间复杂度O(1)

from typing import List  # for types annotation

# You can solve this in a linear fashion like this...

def mapable(universe: List[int], elements: List[int], from_indx: int) -> bool:
  # tries to address worst case
  last_mapping_indx: int = from_indx + (len(elements) - 1)
  if last_mapping_indx >= len(universe) or not(elements[-1] == universe[last_mapping_indx]):
    return False

  # why use a loop? using a loop is more dynamic, in case elements can change in size
  # tries to match a subset in a set
  for num in elements:
    if from_indx >= len(universe) or not (num == universe[from_indx]):
      return False
    from_indx += 1
  return True

# T = typedVar('T')

# you can find a creative way to use zip(itr[T], itr[T]) here to achieve the same

def a_in_b(larger: List[int], smaller: List[int]) -> bool:
  for indx, num in enumerate(larger):
    if num == smaller[0]:
      if mapable(larger, smaller, indx):
        return True
        # return indx + (len(smaller))  # this is the solution if you only care about how many comparison were made
  return False

# this code will check that a list is found in a larger one-dimentional list in linear faction. If you look at the helper-method(mapable) the worst case scenario would be the following

# larger: [8, 8, 8, 8, 8, 8, 8, 8, 8, 9]

# smaller: [8, 9]

# where it tries to iterate through smaller n-1 times. This would drop our complexity from O(N) to O(n * m) where m = len(smaller). Hence why we have an if statement at the beginning of mapable.

largeArr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
smallArr = [8, 9, 10]
print(a_in_b(largeArr, smallArr))  # True
dhxwm5r4

dhxwm5r43#

由于您有numpy标记,请使用一种简单的方法:

from numpy.lib.stride_tricks import sliding_window_view as swv

largeArr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
smallArr = [9, 10]

out = (swv(largeArr, len(smallArr)) == smallArr).any()

# True

中级:

swv(largeArr, len(smallArr))

array([[ 1,  2],
       [ 2,  3],
       [ 3,  4],
       [ 4,  5],
       [ 5,  6],
       [ 6,  7],
       [ 7,  8],
       [ 8,  9],
       [ 9, 10]])
重复比较

如果需要进行多次比较:

from numpy.lib.stride_tricks import sliding_window_view as swv

existing = set(map(tuple, swv(largeArr, len(smallArr))))
tuple(smallArr) in existing

# True

tuple([12, 4]) in existing

# False

相关问题