pandas: How do I sum multiple values in a numpy array?

cgvd09ve  posted on 2023-03-28  in  Other

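I have a numpy array that looks like this:
test = numpy.array([0, 0, 1, 3, 5, 0, 0, 0, 15, 16, 2, 0, 0])
I want to get the sum of each "cluster of numbers".
The result should look like this: [0 0 0 9 0 0 0 0 0 33 0 0 0]
I am looking for something in the pandas or numpy modules that does this. Are there any suggestions or options?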

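Edit:

To clarify my question: I have this section of a real dataset (see below):
I am trying to sum all the values that lie between zeros and place the sum in the middle of that run of numbers; the values around the sum should become zero. I am doing this to get a stem plot in which every stem sits in the middle of its curve.
Fully reproducible example: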

numpy.array([0.14615127632512193, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.029740091488616338, 0.09063089178836162, 0.1380136511666047, 0.17187288438267243, 0.19248433518089703, 0.2003245168693058, 0.19614351292272647, 0.18088710080137402, 0.15564787198250443, 0.1216984367737226, 0.08039857005072917, 0.033215997285686215, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.025134855935682095, 0.10513055014366987, 0.18598353864085884, 0.26609407961465364, 0.34387046651092235, 0.4177887209889943, 0.4863527667154762, 0.5482299378679611, 0.6021788910022772, 0.6472089430838289, 0.6825618586582046, 0.7076575672260416, 0.7222015685948489, 0.726197524625621, 0.7199170030883423, 0.7038805041890686, 0.6788731622372104, 0.64591328542169, 0.6061815461069726, 0.5610267432268025, 0.5119138691552906, 0.46032794110916975, 0.4077539662033596, 0.3556375992681274, 0.3052450205826604, 0.2577024305616884, 0.21393481676710369, 0.1746001292236276, 0.14010959314624474, 0.11057572712877385, 0.08582868020007114, 0.06542672933651948, 0.04873479433640581, 0.03487701609889477, 0.022821459765064396, 0.011439409983895869, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.005721667561715918, 0.0449689855458949, 0.08893101739902065, 0.13746419804461318, 0.19022591227262747, 0.24667121039905493, 0.30610548480575106, 0.36754094275306776, 0.42986242949461123, 0.4916586003062767, 0.5514777126712833, 0.6077363354125331, 0.6587203970535883, 0.7027794706280279, 0.7383152350047645, 0.7638602773480453, 0.7781542289716459, 0.7801776822283781, 0.7692426483788086, 0.7450439442534996, 0.7076142017184802, 0.6574157716388517, 0.5953064199112776, 0.5225129829154233, 0.4406541001312699, 0.3516257544870717, 
0.25753393194644153, 0.1606954827162166, 0.06348731824600296, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

#1 mccptt67

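You can identify the start/stop of each cluster, compute the sums with np.add.reduceat together with the cluster sizes, then assign each sum to the middle of its cluster: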

import numpy as np

test = np.array([1, 1, 1, 3, 5, 0, 0, 0, 15, 16, 2, 0, 99, 0, 1, 2, 3, 4, 5, 0, 10, 20, 30, 40, 0, 1])

# identify zero values
m = test == 0

# get positions where the array switches between zero and non-zero
# (the prepended True makes a leading non-zero value count as a switch)
idx = np.flatnonzero(np.diff(np.r_[True, m]))

# set up output array
out = np.zeros_like(test)

# compute size of each non-zero cluster
cluster_size = np.diff(np.r_[idx, len(test)])[::2]

# assign their sum to the middle point of each cluster
out[idx[::2]+(cluster_size)//2] = np.add.reduceat(test, idx)[::2]

out
# array([  0,   0,  11,   0,   0,   0,   0,   0,   0,  33,   0,   0,  99,
#          0,   0,   0,  15,   0,   0,   0,   0,   0, 100,   0,   0,   1])
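As a quick check, the same steps can be wrapped in a function and run on the question's original array. This is only a sketch: the name sum_clusters_to_middle and the early return for an all-zero input are assumptions added here for illustration, not part of the answer above.

```python
import numpy as np

def sum_clusters_to_middle(a):
    """Hypothetical helper: put the sum of each non-zero run at the run's middle index."""
    a = np.asarray(a)
    out = np.zeros_like(a)

    # True where the value is zero
    m = a == 0

    # positions where the array switches between zero and non-zero;
    # even entries start non-zero runs, odd entries start zero runs
    idx = np.flatnonzero(np.diff(np.r_[True, m]))
    if idx.size == 0:
        # no non-zero run at all (assumed edge case, not covered in the answer)
        return out

    # length of each non-zero run
    sizes = np.diff(np.r_[idx, len(a)])[::2]

    # sum of each non-zero run, written at start + size // 2
    out[idx[::2] + sizes // 2] = np.add.reduceat(a, idx)[::2]
    return out

test = np.array([0, 0, 1, 3, 5, 0, 0, 0, 15, 16, 2, 0, 0])
result = sum_clusters_to_middle(test)
print(result)
```

For the question's array this places 9 at index 3 and 33 at index 9, which is the layout asked for. For runs of even length, `size // 2` picks the right-hand one of the two central positions.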

Visual output on the test array:


#2 bq3bfh9z

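import numpy as np

test = np.array([0, 0, 1, 3, 5, 0, 0, 0, 15, 16, 2, 0, 0])

# output array (float, because np.zeros defaults to float64)
test2 = np.zeros(len(test))

# indices of the non-zero entries, plus a trailing 0 as a sentinel
# so that the last cluster is written out inside the loop
ind = np.concatenate((np.argwhere(test).squeeze(), [0]))

S = 0
for i in range(len(ind)):
    # at i == 0, ind[i-1] wraps around to the sentinel
    now, prv = ind[i], ind[i-1]
    # add the previous non-zero value to the running sum
    S += test[prv]
    if now != prv+1:
        # the cluster ended at prv: store its sum there and reset
        # (the sum lands on the last element of the cluster, not the middle)
        test2[prv] = S
        S = 0

print(test2)
# [ 0.  0.  0.  0.  9.  0.  0.  0.  0.  0. 33.  0.  0.]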
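The question also asks about pandas. One possible groupby-based sketch is shown below; it is an alternative to the numpy answers, not taken from them, and the variable names (run_id, pos, mid) are illustrative assumptions. It labels each run of zero or non-zero values, sums the non-zero runs, and writes each sum at the middle position of its run.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 0, 1, 3, 5, 0, 0, 0, 15, 16, 2, 0, 0])

# True where the value is non-zero
nonzero = s.ne(0)

# a new run id starts wherever the zero/non-zero state changes
run_id = nonzero.ne(nonzero.shift()).cumsum()

# sum of each non-zero run
totals = s[nonzero].groupby(run_id[nonzero]).sum()

# positional start and length of each non-zero run
pos = pd.Series(np.arange(len(s)), index=s.index)
grouped_pos = pos[nonzero].groupby(run_id[nonzero])
mid = grouped_pos.min() + grouped_pos.count() // 2

# write each sum at the middle position of its run, zeros elsewhere
out = pd.Series(0, index=s.index, dtype=s.dtype)
out.iloc[mid.to_numpy()] = totals.to_numpy()

print(out.to_numpy())
```

For large arrays the np.add.reduceat approach is likely faster; the pandas version is mainly convenient when the data already lives in a Series.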
