Pandas - Lambda inside apply to return a row

Posted by yquaqz18 on 2024-01-10 in Python


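When I use a lambda function inside apply on a Pandas DataFrame, I expect to get the whole row back, but it looks like I am getting a "single element" instead.
Here is the code: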

import pandas as pd

# Data sample
reviews_2 = pd.DataFrame({
    'price': {0: None, 1: 15.0, 2: 14.0, 3: 13.0}, 
    'country': {0: 'Italy', 1: 'Portugal', 2: 'US', 3: 'US'}, 
    'points': {0: 87, 1: 87, 2: 87, 3: 87}
})

print(reviews_2)

mean_price_2 = reviews_2.price.mean() # the value used for centering

def remean_points(row):
    row.price = row.price - mean_price_2
    return row

centered_price_2 = reviews_2.apply(remean_points, axis='columns') # returns a DataFrame

print(centered_price_2)

This apply returns a DataFrame. That is the output I expect!
So I tried the same thing with a lambda function:

reviews_2 = pd.DataFrame({
    'price': {0: None, 1: 15.0, 2: 14.0, 3: 13.0}, 
    'country': {0: 'Italy', 1: 'Portugal', 2: 'US', 3: 'US'}, 
    'points': {0: 87, 1: 87, 2: 87, 3: 87}
})
print(reviews_2)

mean_price_2 = reviews_2.price.mean()

centered_price_2 = reviews_2.apply(lambda p: p.price - mean_price_2, axis='columns') # returns a Series!

print(centered_price_2)

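But now apply returns a Series!
I know that apply tries to infer the result type.
I expected to get a row back, but it looks like it returns a "single element".
So my question is:
Shouldn't p in the lambda function be a row?
Interestingly, if I do centered_price_2 = reviews_2.apply(lambda p: p, axis='columns'), I get a DataFrame.
So: how can I use a lambda with apply and control the output type?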


Source: https://stackoverflow.com/questions/65496775/pandas-lambda-inside-apply-to-return-a-row

2 Answers

wn9m85ua1#

It is not entirely clear what output you expect, so I hope this is what you are looking for.
newcol will hold price minus the mean price.

>>> reviews_2['newcol'] = reviews_2['price'].apply(lambda x: x - reviews_2.price.mean())
>>> reviews_2
   price   country  points  newcol
0    NaN     Italy      87     NaN
1   15.0  Portugal      87     1.0
2   14.0        US      87     0.0
3   13.0        US      87    -1.0
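
For a simple centering like this, apply may not be needed at all, since subtracting a scalar from a column is vectorized in pandas. A minimal sketch, assuming the same sample data as the question (the column name `centered` is only illustrative and does not come from the original post):

```python
import pandas as pd

reviews_2 = pd.DataFrame({
    'price': {0: None, 1: 15.0, 2: 14.0, 3: 13.0},
    'country': {0: 'Italy', 1: 'Portugal', 2: 'US', 3: 'US'},
    'points': {0: 87, 1: 87, 2: 87, 3: 87},
})

# mean() skips missing values by default, so only 15, 14 and 13 are averaged
mean_price_2 = reviews_2['price'].mean()

# Series minus scalar is computed element-wise; the missing price stays NaN
reviews_2['centered'] = reviews_2['price'] - mean_price_2

print(reviews_2)
```

This also computes the mean once, rather than once per element as happens when `reviews_2.price.mean()` is called inside the lambda.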


ippsafx72#

I asked this question in 2020. Now, in 2024, looking back at my open question, I understand Pandas a little better (just a little)!
So...
My "mistake" is here:

mean_price_2 = reviews_2.price.mean()

centered_price_2 = reviews_2.apply(lambda p: p.price - mean_price_2, axis='columns') # returns a Series!

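Let me explain:
1. As I said before, apply tries to infer the result type from what the function returns.
2. mean_price_2 = reviews_2.price.mean() is a single number (a scalar), not a Series.
3. With axis='columns', p is one row of the DataFrame, passed to the lambda as a Series, so p.price is a single value.
4. Therefore p.price - mean_price_2 returns a scalar for each row, and centered_price_2 = reviews_2.apply(lambda p: p.price - mean_price_2, axis='columns') collects those scalars into a Series.

In 2020 I wrongly assumed that lambda p: ... would always produce a DataFrame because p is a row. What apply returns depends on what the expression in the lambda evaluates to: one row (a Series) per call gives a DataFrame, one scalar per call gives a Series.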
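
To check what apply actually passes to the lambda and what it hands back, a small experiment can help. This is only a sketch on a reduced copy of the sample data; the names `df`, `arg_types`, `scalar_per_row` and `row_per_row` are illustrative and not from the original post:

```python
import pandas as pd

df = pd.DataFrame({
    'price': [15.0, 14.0, 13.0],
    'country': ['Portugal', 'US', 'US'],
    'points': [87, 87, 87],
})
mean_price = df['price'].mean()  # a single number

# Record the type name of the argument the lambda receives for each row
arg_types = df.apply(lambda p: type(p).__name__, axis='columns')

# The lambda returns one scalar per row, so apply collects them into a Series
scalar_per_row = df.apply(lambda p: p['price'] - mean_price, axis='columns')

# The lambda returns one Series per row, so apply stacks them into a DataFrame
row_per_row = df.apply(lambda p: p, axis='columns')

print(arg_types.unique())
print(type(scalar_per_row).__name__)
print(type(row_per_row).__name__)
```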
One way to *fix* my code would be:

reviews_2 = pd.DataFrame({
    'price': {0: None, 1: 15.0, 2: 14.0, 3: 13.0}, 
    'country': {0: 'Italy', 1: 'Portugal', 2: 'US', 3: 'US'}, 
    'points': {0: 87, 1: 87, 2: 87, 3: 87}
})

print(reviews_2)

mean_price_2 = reviews_2.price.mean()

# note the next two lines
centered_price_2 = reviews_2.copy() # copy the DataFrame so reviews_2 itself is left unchanged
centered_price_2['price'] = reviews_2.apply(lambda p: p.price - mean_price_2, axis='columns') # Only change the desired column!

print(centered_price_2)
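
If the goal is a single lambda that gives back whole rows, as the named function in the question does, one possible approach is to have the lambda build a new row. The sketch below assumes the same sample data; rebuilding the row from `p.to_dict()` is my own choice for illustration, not something taken from the original post:

```python
import pandas as pd

reviews_2 = pd.DataFrame({
    'price': {0: None, 1: 15.0, 2: 14.0, 3: 13.0},
    'country': {0: 'Italy', 1: 'Portugal', 2: 'US', 3: 'US'},
    'points': {0: 87, 1: 87, 2: 87, 3: 87},
})
mean_price_2 = reviews_2['price'].mean()

# Each call returns a new Series holding every original value of the row,
# with only 'price' replaced; returning a Series per row yields a DataFrame.
centered_price_2 = reviews_2.apply(
    lambda p: pd.Series({**p.to_dict(), 'price': p['price'] - mean_price_2}),
    axis='columns',
)

print(centered_price_2)
```

Because the lambda creates a new Series instead of modifying `p`, the original `reviews_2` is not touched.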

Happy 2024!
