python-3.x: Dropping DataFrame rows based on date and flag conditions in pandas

bogh5gae  posted on 2023-01-18  in Python

I have a DataFrame:

import pandas as pd

df = pd.DataFrame([["A","13-02-2022","B","FALSE"],["A","13-02-2022","C","FALSE"],["A","14-02-2022","D","FALSE"],
                   ["A","14-02-2022","E","FALSE"],["A","16-02-2022","A","TRUE"],["A","16-02-2022","F","FALSE"],
                   ["A","17-02-2022","G","FALSE"],["A","17-02-2022","H","FALSE"],["A","18-02-2022","I","FALSE"],
                   ["A","18-02-2022","J","FALSE"]],columns=["id1","date","id2","flag"])

id1   date     id2  flag
A   13-02-2022  B   FALSE
A   13-02-2022  C   FALSE
A   14-02-2022  D   FALSE
A   14-02-2022  E   FALSE
A   16-02-2022  A   TRUE
A   16-02-2022  F   FALSE
A   17-02-2022  G   FALSE
A   17-02-2022  H   FALSE
A   18-02-2022  I   FALSE
A   18-02-2022  J   FALSE

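I want to drop all rows for the date on which flag is TRUE, as well as all rows for the previous business day and the next business day.
For example, here the flag is TRUE on 16 February, so all rows for the previous business day (14 February), the next business day (17 February), and 16 February itself should be dropped. If TRUE falls on the last day, say 28 February, where there is no next business day, then only the rows for the TRUE flag day and the previous business day should be dropped.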

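Expected output:

id1   date     id2  flag
A   13-02-2022  B   FALSE
A   13-02-2022  C   FALSE
A   18-02-2022  I   FALSE
A   18-02-2022  J   FALSE

How can I do this?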

4zcjmb1e #1

You can use boolean indexing:

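# ensure boolean and datetime
df['flag'] = df['flag'].eq('TRUE')
df['date'] = pd.to_datetime(df['date'], dayfirst=True)

bday = pd.offsets.BusinessDay(1)

# dates of the flagged rows
dates = df.loc[df['flag'], 'date']
# next and previous business day of each flagged date
drop = pd.concat([dates+bday, dates-bday])

# keep rows that are neither flagged nor dated on one of those business days
out = df[~(df['date'].isin(drop) | df['flag'])]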

Output:

  id1       date id2   flag
0   A 2022-02-13   B  False
1   A 2022-02-13   C  False
2   A 2022-02-14   D  False
3   A 2022-02-14   E  False
5   A 2022-02-16   F  False
8   A 2022-02-18   I  False
9   A 2022-02-18   J  False
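
To see why this output still contains the 14-02-2022 rows, it may help to check what the business-day offsets evaluate to for the flagged date. A minimal sketch, assuming only the 16-02-2022 date from the question (the names `flag_day`, `prev_bday`, and `next_bday` are illustrative):

```python
import pandas as pd

# The flagged date from the question; 16 February 2022 was a Wednesday.
flag_day = pd.Timestamp("2022-02-16")
bday = pd.offsets.BusinessDay(1)

# Calendar-based business days, independent of which dates occur in the data.
prev_bday = flag_day - bday
next_bday = flag_day + bday

print(prev_bday.date(), next_bday.date())
```

The previous business day comes out as 15-02-2022, which does not occur in the data, so the 14-02-2022 rows are kept. Row F on 16-02-2022 is also kept, because the filter excludes only the rows whose own flag is True rather than every row on the flagged date.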

8gsdolmq #2

You can try building a filter of the dates to drop and selecting everything that is not in it:

df['date'] = pd.to_datetime(df['date'], format="%d-%m-%Y")

dates = df[df.flag == 'TRUE']['date']
to_drop = pd.concat([dates, dates + pd.offsets.BusinessDay(1), dates - pd.offsets.BusinessDay(1)])
df_out = df[~df['date'].isin(to_drop)]
df_out
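
The question's example treats 14-02-2022 as the day before 16-02-2022, which suggests that "previous/next business day" may mean the neighbouring dates that actually appear in the data rather than calendar business days. A minimal sketch under that assumption, using the question's data; the names `days`, `flagged`, and `drop_mask` are illustrative, and the sketch does not group by `id1`:

```python
import pandas as pd

df = pd.DataFrame(
    [["A", "13-02-2022", "B", "FALSE"], ["A", "13-02-2022", "C", "FALSE"],
     ["A", "14-02-2022", "D", "FALSE"], ["A", "14-02-2022", "E", "FALSE"],
     ["A", "16-02-2022", "A", "TRUE"], ["A", "16-02-2022", "F", "FALSE"],
     ["A", "17-02-2022", "G", "FALSE"], ["A", "17-02-2022", "H", "FALSE"],
     ["A", "18-02-2022", "I", "FALSE"], ["A", "18-02-2022", "J", "FALSE"]],
    columns=["id1", "date", "id2", "flag"],
)
df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y")

# Sorted unique dates that actually occur in the data.
days = df["date"].drop_duplicates().sort_values().reset_index(drop=True)

# True for every date that has at least one row with flag == "TRUE".
flagged = days.isin(df.loc[df["flag"] == "TRUE", "date"])

# Drop a date if it is flagged, if the date before it is flagged (shift(1)),
# or if the date after it is flagged (shift(-1)). At either end of the range
# the missing neighbour is filled with False, so nothing extra is dropped.
drop_mask = (
    flagged
    | flagged.shift(1, fill_value=False)
    | flagged.shift(-1, fill_value=False)
)

df_out = df[~df["date"].isin(days[drop_mask])]
print(df_out)
```

With the sample data this leaves only the rows dated 13-02-2022 and 18-02-2022. If the flagged date were the last one in the data, only that date and the one before it would be dropped.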
