Iterate Two Pandas DataFrames and Keep Rows Only When the Date Is Newer

Posted by x33g5p2x on 2023-05-27 in Other

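What is the fastest way to iterate over two DataFrames on a matching ID column and build a new dict from the rows of DF1 whose date is later than the corresponding date in DF2?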

df1
ID  date
1   1/1/2020
2   1/1/2021
3   1/1/2020

df2
ID  date
1   1/1/2020
2   1/1/2020
3   1/1/2020

new DF1
ID  date
2   1/1/2021

for index1,row1 in df1.iterrows():
    for index2,row2 in df2.iterrows():
        if row1['ID'] == row2['ID'] and row1['date'] > row2['date']:
            newdict = row1.to_dict()

dzjeubhm1#

Another possible solution:

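# Uncomment if the dates are still strings; string dates compare lexicographically
# df1['date'] = pd.to_datetime(df1['date'])
# df2['date'] = pd.to_datetime(df2['date'])

# Index both frames by ID so the comparison aligns rows by ID
df1.set_index('ID', inplace=True)
df2.set_index('ID', inplace=True)

# Keep the df1 rows whose date is later than the matching df2 date
df1[df1['date'].gt(df2['date'])].reset_index()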

Output:

ID       date
0   2 2021-01-01
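
For reference, the index-aligned comparison can be sketched as a self-contained script. This is only a minimal sketch: it rebuilds the sample data from the question, assumes the dates use the `month/day/year` format, and uses the illustrative names `a`, `b`, and `newer`, which are not part of the original answer. It also avoids `inplace=True` so the original frames stay unchanged.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"ID": [1, 2, 3], "date": ["1/1/2020", "1/1/2021", "1/1/2020"]})
df2 = pd.DataFrame({"ID": [1, 2, 3], "date": ["1/1/2020", "1/1/2020", "1/1/2020"]})

# Parse the strings so dates compare chronologically (assumed format: month/day/year)
df1["date"] = pd.to_datetime(df1["date"], format="%m/%d/%Y")
df2["date"] = pd.to_datetime(df2["date"], format="%m/%d/%Y")

# Index both frames by ID; Series.gt then compares rows that share the same ID
a = df1.set_index("ID")
b = df2.set_index("ID")

# Boolean mask: True where the df1 date is strictly later than the df2 date
newer = a[a["date"].gt(b["date"])].reset_index()
print(newer)
```

This sketch assumes each ID appears exactly once in both frames. With duplicate or missing IDs, the merge-based approach is likely the safer choice.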

gywdnpxw2#

You can use a classic merge and then filter the rows:

# uncomment if the dates are not parsed yet
# df1["date"] = pd.to_datetime(df1["date"])
# df2["date"] = pd.to_datetime(df2["date"])

out = (
    # df1's date keeps the name "date"; df2's date becomes "date_"
    df1.merge(df2, on="ID", suffixes=("", "_"))
        .query("date > date_")[df1.columns]
        # optional; "%#m/%#d/%Y" works on Windows only, use "%-m/%-d/%Y" on Linux/macOS
        .assign(date=lambda x: x["date"].dt.strftime("%#m/%#d/%Y"))
        # .to_dict("list")  # uncomment if you need a dictionary
)

Output:

print(out)

   ID      date
1   2  1/1/2021
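
Since the question asks for a dict, the merge-and-filter idea can also be sketched end to end. This is a hedged sketch rather than the answerer's exact code: the suffix `_df2` and the names `merged`, `newer`, and `records` are illustrative choices, and the `month/day/year` date format is an assumption based on the sample data.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({"ID": [1, 2, 3], "date": ["1/1/2020", "1/1/2021", "1/1/2020"]})
df2 = pd.DataFrame({"ID": [1, 2, 3], "date": ["1/1/2020", "1/1/2020", "1/1/2020"]})

# Parse the strings so dates compare chronologically (assumed format: month/day/year)
df1["date"] = pd.to_datetime(df1["date"], format="%m/%d/%Y")
df2["date"] = pd.to_datetime(df2["date"], format="%m/%d/%Y")

# Inner merge on ID: df1's column stays "date", df2's column becomes "date_df2"
merged = df1.merge(df2, on="ID", suffixes=("", "_df2"))

# Keep only the df1 columns for rows where df1 is strictly newer than df2
newer = merged.loc[merged["date"] > merged["date_df2"], ["ID", "date"]]

# One dict per surviving row, e.g. for further processing outside pandas
records = newer.to_dict("records")
print(records)
```

The `"records"` orientation gives one dict per row, which is closest to the `row1.to_dict()` call in the question's loop; `"list"` gives one list per column instead.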
