numpy 如何利用“重叠”窗口阵列切片计算大小阵列的Pearson相关系数

zbdgwd5y 于 2023-01-26 发布在其他

关注(0)|答案(1)|浏览(133)

假设我有两个非常简单的数组numpy：

import numpy as np
reference=np.array([0,1,2,3,0,0,0,7,8,9,10])
probe=np.zeros(3)

我想找出数组reference的哪一个切片与数组probe具有最高的皮尔逊相关系数，为此，我想使用在for循环中重叠的某种子数组对数组reference进行切片，这意味着我每次移动reference的一个元素，并将其与数组probe进行比较。我使用下面的非优雅代码进行切片：

from statistics import correlation
for i in range(0,len(reference)):
  #get the slice of the data 
  sliced_data=reference[i:i+len(probe)]
  #only calculate the correlation when probe and reference have the same number of elements 
  if len(sliced_data)==len(probe):
      my_rho = correlation(sliced_data, probe)

我对这样一个准则有一个问题：
1-一旦我运行代码，我有下面的错误：

my_rho = correlation(sliced_data, probe)
  File "/usr/lib/python3.10/statistics.py", line 919, in correlation
    raise StatisticsError('at least one of the inputs is constant')
statistics.StatisticsError: at least one of the inputs is constant

2-有没有一种更优雅的方法来使用python进行这种切片？

numpy

来源：https://stackoverflow.com/questions/75221654/how-to-slice-and-calculate-the-pearson-correlation-coefficient-between-one-big-a

1条答案

按热度按时间

j2datikz1#

您可以使用sliding_window_view获取连续值，对于相关性的矢量化计算，请使用custom function：

from numpy.lib.stride_tricks import sliding_window_view as swv

def np_corr(X, y):
    # adapted from https://stackoverflow.com/a/71253141
    denom = (np.sqrt((len(y) * np.sum(X**2, axis=-1) - np.sum(X, axis=-1) ** 2)
                       * (len(y) * np.sum(y**2) - np.sum(y)**2)))
    return np.divide((len(y) * np.sum(X * y[None, :], axis=-1) - (np.sum(X, axis=-1) * np.sum(y))),
                     denom, where=denom!=0
                    )

corr = np_corr(swv(reference, len(probe)), probe)

输出：

array([ 1.        ,  1.        , -0.65465367, -0.8660254 ,  0.        ,
        0.8660254 ,  0.91766294,  1.        ,  1.        ])

赞(0）回复(0）举报 2023-01-26

我来回答

numpy 如何利用“重叠”窗口阵列切片计算大小阵列的Pearson相关系数

1条答案

相关问题

热门标签

最新问答