Pandas: group by two columns and fill in values [duplicate]

Posted by oalqel3c on 2022-12-16 in Other

This question already has answers here:

Pandas groupby fillna with first valid value (window functions) (2 answers)
Closed yesterday.
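I want to first group by the product and Code columns, then fill the missing values in the supplier column with the first non-null value in each group.
I asked ChatGPT, but it could not provide a working solution.
The code is as follows: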

# load the Pandas library
import pandas as pd

# create a dataframe with sample data
test = pd.DataFrame({'product': ['apple', 'apple', 'apple','apple','banana', 'banana', 'orange', 'orange','orange'],
                     'Code':[1,2,1,1,3,3,4,5,4],
                   'supplier': [None, None,None, 'Acme Inc.', 'Cotsco ', None, None, 'Target', None],
                   'quantity': [99,58,100, 200, 150, 50, 300, 20,400]})

# group the dataframe by the 'product' and 'Code' columns
test_grouped = test.groupby(['product', 'Code'])

# get the first non-null value in the 'supplier' column for each group
suppliers = test_grouped['supplier'].first()

# fill missing values in the 'supplier' column for each group using the first non-null value
test = test_grouped.apply(lambda x: x.assign(supplier=x['supplier'].fillna(suppliers[x.name])))

# print the updated dataframe
print(test)

The code raises this error:

ValueError: Must specify a fill 'value' or 'method'.
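
A likely cause, sketched here on a reduced copy of the sample data (the names `df` and `lookup` are introduced only for this illustration): the (apple, 2) group contains no supplier at all, so the per-group lookup returns a missing value, and that missing value is then handed to `fillna` as the fill value. Whether `fillna` raises, and with what message, may depend on the pandas version, so the sketch catches the exception instead of assuming it.

```python
import pandas as pd

# Reduced copy of the question's data: only the apple rows.
df = pd.DataFrame({
    'product': ['apple', 'apple', 'apple', 'apple'],
    'Code': [1, 2, 1, 1],
    'supplier': [None, None, None, 'Acme Inc.'],
})

# One value per (product, Code) group: the first non-null supplier, if any.
suppliers = df.groupby(['product', 'Code'])['supplier'].first()

# The (apple, 2) group has no supplier, so its lookup is a missing value.
lookup = suppliers[('apple', 2)]
print(pd.isna(lookup))

# Passing a missing value as the fill value is what the original lambda
# ends up doing for this group.
try:
    df['supplier'].fillna(lookup)
except (ValueError, TypeError) as exc:
    print(type(exc).__name__, exc)
else:
    print('fillna accepted the missing fill value without raising')
```

Groups that do contain a supplier, such as (apple, 1), are not affected; the problem only appears for groups that are entirely empty.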

Answer #1 (wvyml7n5)

Use GroupBy.transform to get a Series the same size as the original DataFrame, then pass it to Series.fillna:

test_grouped = test.groupby(['product', 'Code'])
test['supplier'] = test['supplier'].fillna(test_grouped['supplier'].transform('first'))
print(test)
  product  Code   supplier  quantity
0   apple     1  Acme Inc.        99
1   apple     2       None        58
2   apple     1  Acme Inc.       100
3   apple     1  Acme Inc.       200
4  banana     3    Cotsco        150
5  banana     3    Cotsco         50
6  orange     4       None       300
7  orange     5     Target        20
8  orange     4       None       400
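
To see why `transform('first')` fits here where `first()` does not, the following sketch compares the two on the same sample data. The variable names (`df`, `per_group`, `broadcast`, `filled`) are chosen for this illustration and are not part of the answer above.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'product': ['apple', 'apple', 'apple', 'apple', 'banana', 'banana',
                'orange', 'orange', 'orange'],
    'Code': [1, 2, 1, 1, 3, 3, 4, 5, 4],
    'supplier': [None, None, None, 'Acme Inc.', 'Cotsco ', None,
                 None, 'Target', None],
    'quantity': [99, 58, 100, 200, 150, 50, 300, 20, 400],
})

grouped = df.groupby(['product', 'Code'])['supplier']

# first(): one row per group, indexed by the (product, Code) keys.
per_group = grouped.first()

# transform('first'): the same per-group value, repeated for every row
# and aligned to the original index.
broadcast = grouped.transform('first')

# Because the indexes match, fillna can align the two Series row by row.
filled = df['supplier'].fillna(broadcast)

print(len(per_group), len(broadcast))
print(filled)
```

Groups with no supplier in any row, here (apple, 2) and (orange, 4), stay missing, which matches the output shown in the answer.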
