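I have a small piece of code that I need to modify, and I cannot pin down why np.mean() works in a particular case while np.min() does not, when a Pandas column consists of nested lists. Maybe someone here can clarify?
The following code works perfectly: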
import pandas as pd
import numpy as np
def transformation(custom_df):
    dic = dict(zip(custom_df['customers'], custom_df['values']))
    custom_df['values'] = np.where(custom_df['values'].isna() & (custom_df['valid_neighbors'] >= 1),
                                   custom_df['neighbors'].apply(
                                       lambda row: np.mean([dic[v] for v in row if dic.get(v)])),
                                   custom_df['values'])
    return custom_df
customers = [1, 2, 3, 4, 5, 6]
values = [np.nan, np.nan, 10, np.nan, 11, 12]
neighbors = [[6], [3], [], [3, 5], [6], [5]]
vn = [1, 1, 0, 2, 1, 1]
df2 = pd.DataFrame({'customers': customers, 'values': values, 'neighbors': neighbors, 'valid_neighbors': vn})
   customers  values neighbors  valid_neighbors
0          1     NaN       [6]                1
1          2     NaN       [3]                1
2          3    10.0        []                0
3          4     NaN    [3, 5]                2
4          5    11.0       [6]                1
5          6    12.0       [5]                1
df2 = transformation(df2)
The result is:
   customers  values neighbors  valid_neighbors
0          1    12.0       [6]                1
1          2    10.0       [3]                1
2          3    10.0        []                0
3          4    10.5    [3, 5]                2
4          5    11.0       [6]                1
5          6    12.0       [5]                1
However, if I change np.mean() to np.min() in the transformation() function, it raises a ValueError, which makes me wonder why this does not happen when calling np.mean():
ValueError: zero-size array to reduction operation minimum which has no identity
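The difference can be reproduced without pandas. The short sketch below applies both functions to an empty list, which is what the lambda receives for customer 3, whose neighbors entry is []. Note that apply() evaluates the lambda for every row before np.where picks which results to keep, so the empty list is processed even though that row is never filled. The variable names here are only illustrative.

```python
import warnings

import numpy as np

empty = []

# np.mean of an empty sequence does not raise: it emits a RuntimeWarning
# ("Mean of empty slice") and returns NaN.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    mean_of_empty = np.mean(empty)
print(mean_of_empty)

# np.min has no identity value to return for an empty input, so it raises.
try:
    np.min(empty)
    min_raised = False
except ValueError as exc:
    min_raised = True
    print(exc)
```

So np.mean() only appears to work: it quietly produces NaN for the empty row, and np.where then discards that NaN because the row does not meet the fill condition.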
I would like to know which condition I am failing to meet and what I can do to get the expected result, which would be:
   customers  values neighbors  valid_neighbors
0          1    12.0       [6]                1
1          2    10.0       [3]                1
2          3    10.0        []                0
3          4    10.0    [3, 5]                2
4          5    11.0       [6]                1
5          6    12.0       [5]                1
3 answers
mftmpeh8 #1
Use the following code to get the result:
Output (mean):
You can change mean to min:
Output (min):
This produces the desired result for the values column.
mspsb9vt #2
Your neighbors column contains an empty list. np.min raises an error for an empty list, whereas np.mean does not: it returns NaN with a RuntimeWarning.
bxpogfeg #3
It is better to update the transformation function so that it handles the empty arrays in the neighbors column. Here is a workaround that might work.
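One way such a workaround could look, as a minimal sketch rather than a definitive implementation: the helper neighbor_min and the names lookup, candidates and needs_fill are hypothetical and introduced only for illustration. It returns NaN explicitly when a row has no usable neighbors, so np.min is never called on an empty list. Skipping neighbors whose own value is NaN is an assumption here; the original filter (if dic.get(v)) behaves differently.

```python
import numpy as np
import pandas as pd


def neighbor_min(neighbors, lookup):
    """Return the smallest known neighbor value, or NaN when there is none."""
    # Keep only neighbors present in the lookup with a non-NaN value
    # (assumption: NaN neighbors should be ignored).
    known = [lookup[n] for n in neighbors if n in lookup and not pd.isna(lookup[n])]
    # np.min raises on an empty sequence, so fall back to NaN explicitly.
    return np.min(known) if known else np.nan


def transformation(custom_df):
    lookup = dict(zip(custom_df['customers'], custom_df['values']))
    # apply() runs on every row, including rows with an empty neighbor list.
    candidates = custom_df['neighbors'].apply(lambda row: neighbor_min(row, lookup))
    needs_fill = custom_df['values'].isna() & (custom_df['valid_neighbors'] >= 1)
    custom_df['values'] = np.where(needs_fill, candidates, custom_df['values'])
    return custom_df


df2 = pd.DataFrame({
    'customers': [1, 2, 3, 4, 5, 6],
    'values': [np.nan, np.nan, 10, np.nan, 11, 12],
    'neighbors': [[6], [3], [], [3, 5], [6], [5]],
    'valid_neighbors': [1, 1, 0, 2, 1, 1],
})
df2 = transformation(df2)
print(df2['values'].tolist())
```

With the sample data from the question this gives 12.0, 10.0, 10.0, 10.0, 11.0, 12.0 in the values column, which matches the expected result.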