How do I write a function in Python pandas that appends rows to a DataFrame in a loop?

Posted by icomxhvb on 2023-01-19 in Python

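I was given a dataset and am writing a function. My goal is simple. I have an Airbnb dataset with many columns. I run a for loop over a list of neighbourhood groups (which I created) and try to extract the rows related to each element and append them to an empty DataFrame.
Example: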

import pandas as pd
import numpy as np

dict1 = {'id' : [2539,2595,3647,3831,12937,18198,258838,258876,267535,385824],'name':['Clean & quiet apt home by the park','Skylit Midtown Castle','THE VILLAGE OF HARLEM....NEW YORK !','Cozy Entire Floor of Brownstone','1 Stop fr. Manhattan! Private Suite,Landmark Block','Little King of Queens','Oceanview,close to Manhattan','Affordable rooms,all transportation','Home Away From Home-Room in Bronx','New York City- Riverdale Modern two bedrooms unit'],'price':[149,225,150,89,130,70,250,50,50,120],'neighbourhood_group':['Brooklyn','Manhattan','Manhattan','Brooklyn','Queens','Queens','Staten Island','Staten Island','Bronx','Bronx']}

df = pd.DataFrame(dict1)
df

I wrote the following function:

nbd_grp = ['Bronx','Queens','Staten Islands','Brooklyn','Manhattan']

# Creating a function to find the cheapest place in neighbourhood group

dfdf = pd.DataFrame(columns = ['id','name','price','neighbourhood_group'])

def cheapest_place(neighbourhood_group):
  for elem in nbd_grp:
    data =  df.loc[df['neighbourhood_group']==elem]
    cheapest = data.loc[data['price']==min(data['price'])]
    dfdf = cheapest.copy()
cheapest_place(nbd_grp)

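My expected output is:
| id | name | price | neighbourhood_group |
| ------ | ------ | ------ | ------ |
| 267535 | Home Away From Home-Room in Bronx | 50 | Bronx |
| 18198 | Little King of Queens | 70 | Queens |
| 258876 | Affordable rooms,all transportation | 50 | Staten Island |
| 3831 | Cozy Entire Floor of Brownstone | 89 | Brooklyn |
| 3647 | THE VILLAGE OF HARLEM....NEW YORK ! | 150 | Manhattan |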

Source: https://stackoverflow.com/questions/75148411/how-to-write-a-function-in-python-pandas-to-append-the-rows-in-dataframe-in-a-lo

1 answer

9lowa7mx1#

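My advice: whenever you are working in a database or a DataFrame and find yourself thinking "I need a loop", you should reconsider.
In a DataFrame you are in a world of set-based logic, and there is most likely a better set-based way to solve the problem. In your case, you can groupby() your neighbourhood_group and take the min() of the price column, then merge or join the result back to your original DataFrame to get your id and name columns.
That looks something like this: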

# Minimum price per neighbourhood_group, merged back onto df to recover id and name
df_min_price = df.groupby('neighbourhood_group')['price'].min().reset_index().merge(df, on=['neighbourhood_group','price'])

+-----+---------------------+-------+--------+-------------------------------------+
| idx | neighbourhood_group | price |   id   |                name                 |
+-----+---------------------+-------+--------+-------------------------------------+
|   0 | Bronx               |    50 | 267535 | Home Away From Home-Room in Bronx   |
|   1 | Brooklyn            |    89 |   3831 | Cozy Entire Floor of Brownstone     |
|   2 | Manhattan           |   150 |   3647 | THE VILLAGE OF HARLEM....NEW YORK ! |
|   3 | Queens              |    70 |  18198 | Little King of Queens               |
|   4 | Staten Island       |    50 | 258876 | Affordable rooms,all transportation |
+-----+---------------------+-------+--------+-------------------------------------+
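If a loop is still wanted, here is a minimal sketch of how the function from the question could be repaired. It is not part of the original answer. It rests on a few assumptions: in the question's code, `dfdf = cheapest.copy()` only rebinds a local name inside the function, so the outer `dfdf` never changes and nothing is returned; and `'Staten Islands'` in `nbd_grp` does not match the `'Staten Island'` value in the data. The sketch collects each group's cheapest rows in a list and combines them once with `pd.concat`. The parameter names `frame` and `groups` are illustrative, and the `name` column is left out for brevity.

```python
import pandas as pd

# Same sample data as in the question, trimmed to the columns that matter here.
df = pd.DataFrame({
    'id': [2539, 2595, 3647, 3831, 12937, 18198, 258838, 258876, 267535, 385824],
    'price': [149, 225, 150, 89, 130, 70, 250, 50, 50, 120],
    'neighbourhood_group': ['Brooklyn', 'Manhattan', 'Manhattan', 'Brooklyn', 'Queens',
                            'Queens', 'Staten Island', 'Staten Island', 'Bronx', 'Bronx'],
})

# Note 'Staten Island' (singular), so that it matches the values in the data.
nbd_grp = ['Bronx', 'Queens', 'Staten Island', 'Brooklyn', 'Manhattan']


def cheapest_place(frame, groups):
    """Return the cheapest listing(s) for each group, in the order given."""
    pieces = []
    for elem in groups:
        data = frame.loc[frame['neighbourhood_group'] == elem]
        if data.empty:
            continue  # skip group names that do not appear in the data
        # Keep every row tied for the minimum price in this group.
        pieces.append(data.loc[data['price'] == data['price'].min()])
    if not pieces:
        return frame.iloc[0:0]  # empty frame with the same columns
    # Build the result once instead of growing a DataFrame row by row.
    return pd.concat(pieces, ignore_index=True)


result = cheapest_place(df, nbd_grp)
print(result)
```

Collecting the pieces in a list and concatenating once avoids copying the accumulated DataFrame on every iteration. The groupby/merge version above remains the shorter option when the order of the groups does not matter.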

Answered 2023-01-19
