pandas: put a number in a class column, splitting the DataFrame by date

Posted by qgelzfjb on 2023-05-12 in Other

I have a DataFrame with a model column and a date column, as shown below:

import pandas as pd

df = pd.DataFrame({'model':['A','B','C','D', 'E','F','G','I','J','K'],
           'date':['2022-10-28  12:10:28 AM','2022-12-07  12:12:07 AM','2022-12-07  12:12:07 AM','2022-12-07  12:12:07 AM',
                   '2022-12-08  12:12:08 AM','2022-12-10  12:12:10 AM','2023-02-22  12:02:22 AM','2023-02-22  12:02:22 AM',
                   '2023-02-24  12:02:24 AM','2023-03-04  12:03:04 AM']})

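I want to separate the 1st through the 15th of each month from the 16th through the 31st (or 30th), and put a number identifying each of these half-month periods in a class column.

Is this possible?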

Answer #1 (m0rkklqb)

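You can use pd.cut to bin the dates into half-month intervals, then pd.factorize to number the bins that actually occur: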

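# Parse the date strings first; the steps below need real datetimes
df['date'] = pd.to_datetime(df['date'])

# Find begin and end dates that enclose your dates
start = df['date'].min().normalize() - pd.offsets.MonthBegin(1)
end = df['date'].max().normalize() + pd.offsets.MonthBegin(1)

# Bin edges: the 1st and the 16th of every month in range
bins = pd.date_range(start, end, freq='MS')
bins = sorted(bins.tolist() + list(bins + pd.DateOffset(days=15)))

# right=False makes each bin [1st, 16th) or [16th, next 1st);
# factorize renumbers the bins that actually occur as 0, 1, 2, ...
df['class'] = pd.factorize(pd.cut(df['date'], bins=bins, right=False, labels=False))[0]
print(df)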

# Output
  model                date  class
0     A 2022-10-28 00:10:28      0
1     B 2022-12-07 00:12:07      1
2     C 2022-12-07 00:12:07      1
3     D 2022-12-07 00:12:07      1
4     E 2022-12-08 00:12:08      1
5     F 2022-12-10 00:12:10      1
6     G 2023-02-22 00:02:22      2
7     I 2023-02-22 00:02:22      2
8     J 2023-02-24 00:02:24      2
9     K 2023-03-04 00:03:04      3
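
As a possible alternative to building bin edges, the same numbering can be derived arithmetically from the year, month, and day of each date. The sketch below is not part of the answer above; the helper name half_month_class is hypothetical, and the sample dates are rewritten as plain 24-hour timestamps on the assumption that they match the parsed values shown in the output.

```python
import pandas as pd

# Sample data mirroring the question, written as plain ISO timestamps (assumption)
df = pd.DataFrame({
    'model': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'I', 'J', 'K'],
    'date': pd.to_datetime([
        '2022-10-28 00:10:28', '2022-12-07 00:12:07', '2022-12-07 00:12:07',
        '2022-12-07 00:12:07', '2022-12-08 00:12:08', '2022-12-10 00:12:10',
        '2023-02-22 00:02:22', '2023-02-22 00:02:22', '2023-02-24 00:02:24',
        '2023-03-04 00:03:04',
    ]),
})


def half_month_class(dates: pd.Series) -> pd.Series:
    """Number each (year, month, half) group that occurs, in chronological order.

    Hypothetical helper for illustration only.
    """
    # 0 for days 1-15, 1 for day 16 through the end of the month
    half = (dates.dt.day > 15).astype(int)
    # Increasing key with two slots per calendar month
    key = (dates.dt.year * 12 + dates.dt.month) * 2 + half
    # sort=True numbers the groups chronologically even if rows are unsorted
    codes, _ = pd.factorize(key, sort=True)
    return pd.Series(codes, index=dates.index, name='class')


df['class'] = half_month_class(df['date'])
print(df['class'].tolist())  # [0, 1, 1, 1, 1, 1, 2, 2, 2, 3]
```

Because the key depends only on each row's own date, this variant needs no start or end bounds, and passing sort=True keeps the class numbers in date order even when the rows are not sorted.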
