scipy: How to implement the Polya-Aeppli distribution in Python using rv_discrete?

Posted by qnzebej0 on 2023-05-17 in Python


I am trying to implement the Polya-Aeppli (geometric Poisson) distribution in Python by subclassing scipy.stats.rv_discrete and specifying the PMF.
Here is my implementation:

from scipy.stats import rv_discrete
from scipy.special import comb  # binomial coefficient used in _pmf
import numpy as np
import math

class PolyaAeppli(rv_discrete):

    def _pmf(self, k: np.ndarray, lambda_, theta_) -> np.ndarray:
        """
        Probability mass function for the Polya-Aeppli distribution. Extension of the Poisson
        distribution for modelling group arrivals.

        :param k: internal parameter of rv_discrete
        :param lambda_: arrival rate param of Poisson dist: [0, inf). The higher the lambda, the more arrivals
        :param theta_: probability param of Geometric dist: [0, 1]. The LOWER the theta, the more arrivals
        :return: probability values for each k
        """

        # rv_discrete may pass the shape parameters either as scalars or as arrays
        # holding one repeated value, so reduce them to a scalar in both cases
        if isinstance(lambda_, (np.ndarray, list)):
            lambda_ = np.ravel(lambda_)[0]
        if isinstance(theta_, (np.ndarray, list)):
            theta_ = np.ravel(theta_)[0]
        k = np.atleast_1d(np.asarray(k, dtype=int))
        res = np.zeros(len(k))
        for ix, k_ in enumerate(k):  # sorry, cannot vectorize
            if k_ == 0:
                res[ix] = np.exp(-lambda_)
            else:
                res[ix] = np.exp(-lambda_) * np.sum([(np.power(lambda_, i) / math.factorial(i)) *
                                                     ((1 - theta_) ** (k_ - i)) * (theta_ ** i) *
                                                     (comb(k_ - 1, i - 1)) for i in range(1, k_ + 1)])
        return res
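
The per-element loop above can probably be avoided with broadcasting. The following is only a sketch, not a drop-in replacement for `_pmf`: `polya_aeppli_pmf` is a hypothetical standalone helper, it assumes scalar `lambda_ > 0` and `0 < theta_ <= 1`, and it uses the same formula as the loop. It builds a (len(k), max(k)) grid of summands, so it suits moderate values of k.

```python
import numpy as np
from scipy.special import comb, factorial


def polya_aeppli_pmf(k, lambda_, theta_):
    """Vectorized Polya-Aeppli PMF (hypothetical helper, same formula as the loop)."""
    k = np.asarray(k, dtype=int).ravel()
    kmax = max(int(k.max(initial=0)), 1)
    i = np.arange(1, kmax + 1)          # summation index, shape (kmax,)
    kk = k[:, None]                     # shape (n, 1), broadcasts against i
    valid = i <= kk                     # only terms with 1 <= i <= k contribute
    exponent = np.where(valid, kk - i, 0)   # keep the masked cells harmless
    terms = (
        (lambda_ ** i / factorial(i))   # Poisson part: lambda^i / i!
        * comb(kk - 1, i - 1)           # binomial coefficient C(k-1, i-1)
        * theta_ ** i
        * (1 - theta_) ** exponent
    )
    total = np.where(valid, terms, 0.0).sum(axis=1)
    # k == 0 has no summands; its probability is exp(-lambda)
    return np.where(k == 0, np.exp(-lambda_), np.exp(-lambda_) * total)


print(polya_aeppli_pmf([0, 1, 2], lambda_=0.5, theta_=0.5))
```

Inside the subclass, `_pmf` could delegate to a helper like this once the shape parameters have been reduced to scalars.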

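This is a bit awkward, because k arrives in _pmf as an np.ndarray, which does not work with range(), hence the explicit loop. The mathematical formula seems correct, but when I plot samples drawn with different parameters, the results look wrong: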

import matplotlib.pyplot as plt

PA = PolyaAeppli(name='polya_aeppli')

# sample from distribution
resa = PA.rvs(lambda_=0.5, theta_=0.5, size=1000)
resb = PA.rvs(lambda_=0.5, theta_=0.9, size=1000)
resc = PA.rvs(lambda_=0.9, theta_=0.5, size=1000)
resd = PA.rvs(lambda_=0.9, theta_=0.9, size=1000)

# plot histogram
plt.hist(resa, label='a', alpha=0.5)
plt.hist(resb, label='b', alpha=0.5)
plt.hist(resc, label='c', alpha=0.5)
plt.hist(resd, label='d', alpha=0.5)
plt.legend()
plt.show()

[Image: Polya-Aeppli results]
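I expected d to have the highest average number of arrivals, because both the group size and the arrival rate are highest there. But that does not seem to be the case.
Is something wrong with my code, or am I misunderstanding the distribution or the results?
Thanks!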
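
One way to check that expectation independently of rv_discrete is to simulate the compound process directly. The sketch below assumes the convention implied by the PMF above: theta_ is the success probability of a geometric group size on {1, 2, ...}, so the mean group size is 1/theta_ and the overall mean is lambda_/theta_. Under that assumption a higher theta_ means smaller groups, which would make c rather than d the case with the most arrivals. `sample_polya_aeppli` and `means` are hypothetical names introduced for illustration.

```python
import numpy as np


def sample_polya_aeppli(lambda_, theta_, size, rng):
    """Draw samples as a Poisson number of groups with geometric group sizes."""
    counts = rng.poisson(lambda_, size=size)            # groups per sample
    sizes = rng.geometric(theta_, size=counts.sum())    # group sizes on {1, 2, ...}
    owner = np.repeat(np.arange(size), counts)          # sample index of each group
    # sum the group sizes per sample; samples with zero groups stay at 0
    return np.bincount(owner, weights=sizes, minlength=size).astype(int)


rng = np.random.default_rng(0)
params = {'a': (0.5, 0.5), 'b': (0.5, 0.9), 'c': (0.9, 0.5), 'd': (0.9, 0.9)}
means = {}
for label, (lam, theta) in params.items():
    draws = sample_polya_aeppli(lam, theta, size=100_000, rng=rng)
    means[label] = draws.mean()
    print(f"{label}: sample mean = {means[label]:.3f}, lambda/theta = {lam / theta:.3f}")
```

Comparing these sample means with the means of the rvs() output is a quick way to tell a coding problem apart from a parameterization misunderstanding.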

scipy

Source: https://stackoverflow.com/questions/76262984/how-to-implement-the-polya-aeppli-distribution-in-python-using-rv-discrete