python-3.x: Cryptographically secure, exact weighted sampling

kuuvgm7e  posted on 2022-12-01  in Python

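How can I select k elements with replacement and with weights, under the following conditions?

  • The randomness must be cryptographically secure, like the randomness used by the secrets module.
  • The weights must be exact, i.e. handled with integer rather than floating-point arithmetic.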
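
To make the two conditions concrete, here is a minimal sketch. The weight value 2**53 + 1 is only an illustrative assumption, chosen because it is the smallest positive integer a double-precision float cannot represent. It shows that a float round trip silently changes the weight, while integer sums stay exact, and that secrets.randbelow returns an integer in the half-open range [0, total).

```python
import secrets

# Illustrative weight (an assumption, not from the question):
# the smallest positive integer that a 64-bit float cannot hold exactly.
big = 2**53 + 1

# Converting to float rounds to the nearest representable value.
float_roundtrip = int(float(big))
print(float_roundtrip == big)   # False: the weight was silently altered

# Integer arithmetic keeps cumulative weights exact, whatever their size.
weights = [big, 1]
total = sum(weights)
print(total - big)              # 1

# secrets.randbelow(total) draws a uniform integer r with 0 <= r < total
# from the operating system's cryptographically secure source.
r = secrets.randbelow(total)
print(0 <= r < total)           # True
```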

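Self-written code is likely to be less secure and less efficient than an existing implementation. As far as I can tell, though, the existing implementations I have looked at do not meet my requirements.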

xdnvmnnf1#

I would lift the choices implementation out of the random module, something like:

from random import SystemRandom
from itertools import accumulate as _accumulate, repeat as _repeat
from bisect import bisect as _bisect

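def choices(population, weights, *, k=1):
    """Return k items drawn with replacement, using exact integer weights."""
    # SystemRandom uses the operating system's randomness source (os.urandom).
    randrange = SystemRandom().randrange
    n = len(population)
    # Running totals, e.g. weights [1, 2, 3] -> cum_weights [1, 3, 6].
    cum_weights = list(_accumulate(weights))
    if len(cum_weights) != n:
        raise ValueError('The number of weights does not match the population')
    total = cum_weights[-1]
    # Reject float (or other non-int) weights so the arithmetic stays exact.
    if not isinstance(total, int):
        raise ValueError('Weights must be integer values')
    if total <= 0:
        raise ValueError('Total of weights must be greater than zero')
    bisect = _bisect
    hi = n - 1
    # randrange(total) is uniform on 0..total-1; bisect finds the first
    # cumulative weight strictly greater than that integer.
    return [population[bisect(cum_weights, randrange(total), 0, hi)]
            for _ in _repeat(None, k)]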
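
To see why the bisect lookup yields exact proportions, the sketch below replaces the random draw with an enumeration of every integer that randrange(total) could return. The names mapping and counts are illustrative only. Each item is hit by exactly as many integers as its weight.

```python
from bisect import bisect
from collections import Counter
from itertools import accumulate

population = [1, 2, 3]
weights = [1, 2, 3]
cum_weights = list(accumulate(weights))   # [1, 3, 6]
total = cum_weights[-1]                   # 6
hi = len(population) - 1

# Enumerate every possible value of randrange(total) instead of sampling.
mapping = {r: population[bisect(cum_weights, r, 0, hi)] for r in range(total)}
print(mapping)   # {0: 1, 1: 2, 2: 2, 3: 3, 4: 3, 5: 3}

# Number of integers that map to each item equals that item's weight.
counts = dict(Counter(mapping.values()))
print(counts)    # {1: 1, 2: 2, 3: 3}
```

Because the draw is a uniform integer and the lookup is pure integer comparison, no rounding enters at any step.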

It can be tested with:

from collections import Counter

draws = choices([1, 2, 3], [1, 2, 3], k=1_000_000)
print(dict(sorted(Counter(draws).items())))

which gives me:

{1: 166150, 2: 333614, 3: 500236}

which looks about right.
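
As a rough sanity check on "about right", the sketch below compares the reported counts with the expected counts n * w / total and expresses each deviation in standard deviations. Treating each count as binomial is my assumption for this back-of-the-envelope check, not part of the answer.

```python
from fractions import Fraction
from math import sqrt

weights = {1: 1, 2: 2, 3: 3}
observed = {1: 166150, 2: 333614, 3: 500236}   # counts reported above
n = sum(observed.values())                      # 1_000_000 draws
total = sum(weights.values())

z_scores = {}
for item, w in weights.items():
    p = Fraction(w, total)            # exact probability of this item
    expected = n * p                  # exact expected count
    sigma = sqrt(n * p * (1 - p))     # binomial standard deviation (assumption)
    z_scores[item] = float(observed[item] - expected) / sigma
    print(item, float(expected), round(z_scores[item], 2))
```

All three deviations come out below two standard deviations in magnitude, which is consistent with correctly weighted sampling.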
Update: I just wanted to check for a possible bug, and it seems fine here:

print(
    choices([1, 2, 3], [1, 0, 0], k=5),
    choices([1, 2, 3], [0, 1, 0], k=5),
    choices([1, 2, 3], [0, 0, 1], k=5),
)

gives:

[1, 1, 1, 1, 1] [2, 2, 2, 2, 2] [3, 3, 3, 3, 3]

which also looks right.
