Pandas: flag rows that match conditions

sc4hvdpw  posted on 2023-04-04  in  Other

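I have a simple dataset, and I want to flag the rows that meet all of the following conditions:

  • both the "Close" and "End" columns are earlier than now
  • and the "Stage" column is in [One, Two, Three]
  • and the "Project Number" column is not empty

I came up with the lines below, but they do not work.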

import pandas as pd
import datetime
import numpy as np
from io import StringIO

csvfile = StringIO("""
ID  Stage   Close   Project Number  End"
A899    One 26/08/2019  KL1468  30/08/2019"
A572    Two 31/12/2020  KL1493  17/12/2019"
A778    Three   26/08/2019      16/08/2019"
A704    Four    31/12/2020  KL1036  01/12/2019"
A650    One 31/12/2020  KL1522  23/12/2019"
A830    Two 31/08/2021  KL1535  03/08/2021"
A669    Three   18/08/2021  KL1536  03/08/2021"
A892    Four    31/08/2021  KL1534  03/08/2021"
A789    One 31/05/2021  KL1537  04/08/2021"
A821    Two 31/12/2020  KL1578  03/11/2019"
A992    Three   29/07/2019  KL1609  26/06/2019"
A550    Four    31/12/2020  KL1243  30/11/2019"
A707    One 31/12/2020  KL1523  29/11/2019"
A740                
A917    Three   31/07/2021  KL1072  29/07/2021"
A627    Four    30/06/2021  KL1577  15/06/2021"

""")

df = pd.read_csv(csvfile, sep = '\t', engine='python')

def condition_1(s):
    if (df['Project Number'].any() and s['Expiry_1'] < datetime.datetime.now() and s['Close_1'] < datetime.datetime.now() and np.where(df['Stage'].isin(['One','Two','Three']))):
        return "Overdue"
    else:
        return ''

df['Expiry_1'] = pd.to_datetime(df['End'].str[3:5] + '/' + df['End'].str[:2] + '/' + df['End'].str[-4:])
df['Close_1'] = pd.to_datetime(df['Close'].str[3:5] + '/' + df['Close'].str[:2] + '/' + df['Close'].str[-4:])
df["Overdue Project"] = df.apply(condition_1, axis=1)

df.to_excel("c:\\Projects\\output.xlsx", index = False)

What is going wrong, and what is the correct way to do this?

xzlaal3s  #1

First, convert the columns to datetime with to_datetime, using the format parameter:

df['Expiry_1'] = pd.to_datetime(df['End'], format='%d/%m/%Y')
df['Close_1'] = pd.to_datetime(df['Close'], format='%d/%m/%Y')
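
As a quick check of what the explicit format does, here is a minimal sketch. The three sample values are hypothetical stand-ins for the question's "End" column, including one blank entry like row A740.

```python
import pandas as pd

# Hypothetical sample mirroring the question's "End" column, with one blank row
end = pd.Series(["30/08/2019", "03/11/2019", None])

# The explicit format reads the strings as day/month/year
parsed = pd.to_datetime(end, format="%d/%m/%Y")

# Missing values become NaT instead of raising an error
print(parsed)
```

Comparisons against NaT evaluate to False, so a row with blank dates should not end up flagged by a later "earlier than now" test.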

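Then test whether both datetime columns are earlier than now with DataFrame.lt (use DataFrame.gt for the opposite comparison), and require both results to be True with DataFrame.all(axis=1); this is the first mask, m1. Build the second mask with Series.isin on Stage and the third with Series.notna on Project Number. Finally, chain the three masks with & (element-wise AND) and pass the result to numpy.where.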
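
The steps described above can be sketched as follows. This is a reconstruction from the description, not necessarily the answerer's exact code: the four-row DataFrame is a hypothetical sample that reuses the question's column names, and the variable names now, m1, m2 and m3 are assumptions. It also hints at why the original function misbehaves: df['Project Number'].any() and np.where(...) look at the whole column on every call, not at the current row, so both are always truthy.

```python
import numpy as np
import pandas as pd

# Small hypothetical sample using the question's column names
df = pd.DataFrame({
    "ID": ["A899", "A778", "A704", "A740"],
    "Stage": ["One", "Three", "Four", None],
    "Close": ["26/08/2019", "26/08/2019", "31/12/2020", None],
    "Project Number": ["KL1468", None, "KL1036", None],
    "End": ["30/08/2019", "16/08/2019", "01/12/2019", None],
})

# Parse the day/month/year strings; blanks become NaT
df["Expiry_1"] = pd.to_datetime(df["End"], format="%d/%m/%Y")
df["Close_1"] = pd.to_datetime(df["Close"], format="%d/%m/%Y")

now = pd.Timestamp.now()

# m1: both dates are earlier than now (NaT compares as False)
m1 = df[["Expiry_1", "Close_1"]].lt(now).all(axis=1)
# m2: Stage is one of the allowed values
m2 = df["Stage"].isin(["One", "Two", "Three"])
# m3: Project Number is present
m3 = df["Project Number"].notna()

# Combine the row-wise masks and label the matching rows
df["Overdue Project"] = np.where(m1 & m2 & m3, "Overdue", "")
print(df[["ID", "Overdue Project"]])
```

With this sample only A899 is labelled "Overdue": A778 has no project number, A704 is in stage Four, and A740 is blank throughout. Because each mask is a boolean Series aligned to the rows, no apply call is needed.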
