Fast interpolation of nodal values in Python

Posted by o4tp2gmn on 2023-03-21 in Python


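I have a dataset like this:
Time | Node Label | Values
The same node can have two different sets of values at the same time. I want to compare the values of these two rows and replace the first row with a single new row; the second row must be deleted.
Example with only two values: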

Time | Node Label | Values

 1           3       10    5
 1           5       15    11
 1           3       -6    7
 2           3        8    4
 2           5        3    9
 2           3        1    1

It becomes:

Time | Node Label | Values

 1           3       2    6
 1           5       15   11
 2           3       4.5  2.5
 2           5        3   9

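In the end, I need one row per unique node label at each time, sorted in ascending order. To compare the arrays and create the new row to insert, I simply used the np.mean function.
I came up with this solution: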

import numpy as np

# data: 2D array with columns [time, node label, value 1, value 2, ...]
time_col = data[:, 0]
label_col = data[:, 1]

unique_labels, label_indices = np.unique(label_col, return_inverse=True)
unique_times, time_indices = np.unique(time_col, return_inverse=True)

# combine the time index and the label index into a single group id per row
grouped_indices = np.ravel_multi_index((time_indices, label_indices), dims=(len(unique_times), len(unique_labels)))

# one boolean mask per possible (time, label) pair, including pairs that never occur
grouped_data = [data[grouped_indices == i] for i in range(len(unique_times) * len(unique_labels))]

# average the rows of each group column-wise
group_means = np.array([np.mean(group, 0) for group in grouped_data])

# rebuild the array from the key columns and the averaged value columns
data = np.concatenate([group_means[:, :2], group_means[:, 2:]], axis=1)

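It works, but it is very slow. Obviously, this is because I have multiple explicit for loops, and I am also looping over unnecessary elements. I can only use the numpy library.
For example, with this dataset it can take hours: https://shorturl.at/myIY9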


Source: https://stackoverflow.com/questions/75793329/fast-interpolation-of-nodal-values

1 answer


c3frrgcw1#

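First, you could try this solution.
Once the groups are created, you can compute the mean of each group with an explicit for loop (this is still better than computing [data[grouped_indices == i] for i in range(len(unique_times) * len(unique_labels))]).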
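A minimal sketch of that idea, assuming column 0 holds the time, column 1 the node label, and the remaining columns the values. The function name group_means_loop and the sort-then-split grouping are illustrative choices, not the method from the linked solution: the rows are sorted once, cut wherever the (time, label) pair changes, and only the groups that actually exist are averaged.

```python
import numpy as np

def group_means_loop(data):
    """Average rows sharing the same (time, label) pair (hypothetical helper)."""
    # Sort rows by time, then by label; np.lexsort treats the LAST key as primary.
    order = np.lexsort((data[:, 1], data[:, 0]))
    sorted_data = data[order]

    # A new group starts wherever the (time, label) pair differs from the previous row.
    keys = sorted_data[:, :2]
    change = np.any(keys[1:] != keys[:-1], axis=1)
    starts = np.flatnonzero(change) + 1

    # np.split returns contiguous, non-empty groups, so no empty pairs are visited.
    groups = np.split(sorted_data, starts)
    return np.array([group.mean(axis=0) for group in groups])

# Sample data from the question: time, node label, two value columns.
data = np.array([
    [1, 3, 10, 5],
    [1, 5, 15, 11],
    [1, 3, -6, 7],
    [2, 3, 8, 4],
    [2, 5, 3, 9],
    [2, 3, 1, 1],
], dtype=float)

result = group_means_loop(data)
print(result)
```

The loop still runs once per group, but each group is a slice of the sorted array rather than a full boolean mask over all rows, and the output comes out ordered by time and then by label.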
Another solution, possibly faster, uses numpy.bincount to group and average the data.
Here is an example:

import numpy as np

# Create a sample 2D NumPy array
arr = np.array([[1, 3], [2, 4], [1, 5], [3, 5], [2, 6]])

# Get unique values and their corresponding indices
unq, inv = np.unique(arr[:, 0], return_inverse=True)

# Compute the mean of each group: per-group sum divided by per-group count
sums = np.bincount(inv, weights=arr[:, 1], minlength=len(unq))
counts = np.bincount(inv, minlength=len(unq))
mean_arr = np.column_stack((unq, sums / counts))

print(mean_arr)

For arrays with several value columns, I think the process should be repeated for each column (without repeating the unique call).
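One way this could look for the question's layout, as a sketch under stated assumptions: column 0 is the time, column 1 the node label, and every further column is a value column. The helper name group_means_bincount is hypothetical, and calling np.unique with axis=0 on the two key columns is one possible way to get a single group id per (time, label) pair; the answer itself only shows the single-key case.

```python
import numpy as np

def group_means_bincount(data):
    """Average rows sharing the same (time, label) pair with np.bincount (hypothetical helper)."""
    # One np.unique call on the two key columns; rows of `keys` come back sorted
    # lexicographically, i.e. by time and then by label.
    keys, inv = np.unique(data[:, :2], axis=0, return_inverse=True)
    inv = np.ravel(inv)  # defensive: keep the inverse index one-dimensional

    # Number of rows in each group.
    counts = np.bincount(inv, minlength=len(keys))

    # Repeat the bincount for each value column, reusing the same inverse index.
    values = data[:, 2:]
    means = np.column_stack([
        np.bincount(inv, weights=values[:, j], minlength=len(keys)) / counts
        for j in range(values.shape[1])
    ])
    return np.column_stack((keys, means))

# Sample data from the question: time, node label, two value columns.
data = np.array([
    [1, 3, 10, 5],
    [1, 5, 15, 11],
    [1, 3, -6, 7],
    [2, 3, 8, 4],
    [2, 5, 3, 9],
    [2, 3, 1, 1],
], dtype=float)

result = group_means_bincount(data)
print(result)
```

The only remaining Python-level loop runs over the value columns, not over rows or groups, so its cost should be small when there are few value columns.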

Answered 2023-03-21
