Assigning values based on two thresholds in pandas

zc0qhyus  posted on 2022-12-09 in Other

I have a pandas DataFrame named df_x with a column named logvalue. I want to create a new column, violatedInstances, based on these log values.
If Min <= logvalue <= Max, assign 0 (not violated). If logvalue > Max or logvalue < Min, assign 1 (violated).

import pandas as pd

# Create the DataFrame (logvalue is stored as strings)
df_x = pd.DataFrame({'logvalue': ['20', '20.5', '18.5', '2', '10'],
                     'ID': ['1', '2', '3', '4', '5']})

Max = 20
Min = 15

The output should look like this:
| logvalue | ID | violatedInstances |
| ------------ | ------------ | ------------ |
| 20 | 1 | 0 |
| 20.5 | 2 | 1 |
| 18.5 | 3 | 0 |
| 2 | 4 | 1 |
| 10 | 5 | 1 |
Sorry for asking this simple question. I tried several methods but nothing worked. How can I do this in pandas?

h22fl7wq #1

First, convert logvalue to float so that the comparisons work numerically:
df_x['logvalue'] = df_x['logvalue'].astype('float')
Then you can use numpy like this:

import numpy as np
df_x['violatedInstances'] = np.where(((df_x['logvalue'] > Max) | (df_x['logvalue'] < Min)), 1, 0)

This gives violatedInstances values of 0, 1, 0, 1, 1, matching the expected output.
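
For reference, the steps in this answer can be put together as a minimal, self-contained sketch. It assumes the thresholds are named Max and Min; the question's snippet spells the lower bound as min, which would shadow the Python built-in and would not match the Min used in the comparison.

```python
import numpy as np
import pandas as pd

# Sample data from the question; logvalue is stored as strings
df_x = pd.DataFrame({'logvalue': ['20', '20.5', '18.5', '2', '10'],
                     'ID': ['1', '2', '3', '4', '5']})

Max = 20
Min = 15  # assumed spelling; the question writes `min`

# Convert to float so the comparisons are numeric rather than string-based
df_x['logvalue'] = df_x['logvalue'].astype('float')

# 1 where the value falls outside [Min, Max], 0 otherwise
df_x['violatedInstances'] = np.where(
    (df_x['logvalue'] > Max) | (df_x['logvalue'] < Min), 1, 0)

print(df_x['violatedInstances'].tolist())  # [0, 1, 0, 1, 1]
```

Both boundary values count as not violated here, because the comparisons are strict (> and <).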

xxhby3vn #2

Your logvalue column holds strings, so you have to convert it to float first:

df_x['violatedInstances'] = df_x['logvalue'].astype(float).apply(lambda x: 1 if (x > Max or x < Min) else 0)

wj8zmpe1 #3

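# Convert the string column to numbers once, then test each bound
logvalue = pd.to_numeric(df_x['logvalue'])
cond1 = logvalue.gt(20)  # above Max
cond2 = logvalue.lt(15)  # below Min
# assign returns a new DataFrame; df_x itself is left unchanged
df_x.assign(violatedInstances=(cond1 | cond2).astype('int'))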

Result:

  logvalue ID  violatedInstances
0       20  1                  0
1     20.5  2                  1
2     18.5  3                  0
3        2  4                  1
4       10  5                  1
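
As a possible variation that none of the answers above use, the two comparisons can be folded into a single range check with Series.between, which is inclusive on both ends by default. The sketch below is an illustration under that assumption; the names values, in_range, and result are hypothetical and chosen only for this example.

```python
import pandas as pd

# Sample data from the question; logvalue is stored as strings
df_x = pd.DataFrame({'logvalue': ['20', '20.5', '18.5', '2', '10'],
                     'ID': ['1', '2', '3', '4', '5']})

Max = 20
Min = 15

# Parse the strings as numbers without modifying df_x
values = pd.to_numeric(df_x['logvalue'])

# True where Min <= value <= Max (both bounds included)
in_range = values.between(Min, Max)

# Invert the mask: out-of-range rows become 1, in-range rows become 0
result = df_x.assign(violatedInstances=(~in_range).astype(int))

print(result)
```

One difference worth keeping in mind: between returns False for missing values, so a NaN in logvalue would be flagged as violated here, whereas the two explicit comparisons would leave it at 0.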
