pandas: how to set a column to 1 when another column contains a certain word (Python)

ocebsuys  posted on 2023-02-17  in  Python

df looks like this:
| description and keybenefits (14) | brand_cooltouch (1711) | brand_easylogic (1712) |
| ------------ | ------------ | ------------ |
| Lorem Ipsum cooltouch Lorem Ipsum | | |
| Lorem Ipsum easylogic Lorem Ipsum | | |
| Lorem Ipsum Lorem Ipsum | | |
What I want:

  • When column description and keybenefits (14) contains the value 'cooltouch', column brand_cooltouch (1711) needs to be set to 1 (int).
  • When column description and keybenefits (14) contains the value 'easylogic', column brand_easylogic (1712) needs to be set to 1 (int).

Output that I want:
| description and keybenefits (14) | brand_cooltouch (1711) | brand_easylogic (1712) |
| ------------ | ------------ | ------------ |
| Lorem Ipsum cooltouch Lorem Ipsum | 1 | |
| Lorem Ipsum easylogic Lorem Ipsum | | 1 |
| Lorem Ipsum Lorem Ipsum | | |
Any help is very much appreciated.

8zzbczxx  #1

You can use pandas.Series.str.contains.
For the string cooltouch, do the following:

df['brand_cooltouch (1711)'] = df['description and keybenefits (14)'].str.contains('cooltouch', case=False).astype(int)

[Out]:

    description and keybenefits (14)  brand_cooltouch (1711)  brand_easylogic (1712)
0  Lorem Ipsum cooltouch Lorem Ipsum                       1                     None
1  Lorem Ipsum easylogic Lorem Ipsum                       0                     None
2            Lorem Ipsum Lorem Ipsum                       0                     None

For the string easylogic, do the following:

df['brand_easylogic (1712)'] = df['description and keybenefits (14)'].str.contains('easylogic', case=False).astype(int)

[Out]:

    description and keybenefits (14)  brand_cooltouch (1711)  brand_easylogic (1712)
0  Lorem Ipsum cooltouch Lorem Ipsum                       1                     0
1  Lorem Ipsum easylogic Lorem Ipsum                       0                     1
2            Lorem Ipsum Lorem Ipsum                       0                     0
Note:
  • case=False makes the match case-insensitive.
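
The two assignments above follow the same pattern, so they can be written as one loop. The following is a minimal sketch, not part of the original answer: the `brand_columns` mapping and the extra row with a missing description are assumptions added for illustration. It also passes `regex=False` so the keyword is matched as a literal substring, and `na=False` so a missing description counts as "no match" rather than producing NaN, which would make `astype(int)` fail.

```python
import pandas as pd

df = pd.DataFrame({
    "description and keybenefits (14)": [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        "Lorem Ipsum easylogic Lorem Ipsum",
        "Lorem Ipsum Lorem Ipsum",
        None,  # hypothetical missing description, added for illustration
    ]
})

# Hypothetical keyword -> target column mapping, using the question's column names.
brand_columns = {
    "cooltouch": "brand_cooltouch (1711)",
    "easylogic": "brand_easylogic (1712)",
}

text = df["description and keybenefits (14)"]
for keyword, column in brand_columns.items():
    # regex=False: treat the keyword as a plain substring, not a pattern.
    # na=False: rows with a missing description are treated as non-matching.
    matches = text.str.contains(keyword, case=False, regex=False, na=False)
    df[column] = matches.astype(int)

print(df)
```

Adding another brand then only requires one more entry in the mapping.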

a8jjtwal  #2

You can use np.where. I suggest filling every cell that does not meet the condition with NaN or 0. Here is a solution that uses np.nan:

import numpy as np

df["brand_cooltouch (1711)"] = np.where(df["description and keybenefits (14)"].str.contains("cooltouch"), 1, np.nan)
df["brand_easylogic (1712)"] = np.where(df["description and keybenefits (14)"].str.contains("easylogic"), 1, np.nan)
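
One consequence of choosing np.nan as the fill value is that the resulting column is floating point (1.0 and NaN), because NaN cannot be stored in a plain integer array. The sketch below, which rebuilds the question's sample data as an assumption, compares both fill values; the variable names `with_nan` and `with_zero` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "description and keybenefits (14)": [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        "Lorem Ipsum easylogic Lorem Ipsum",
        "Lorem Ipsum Lorem Ipsum",
    ]
})

mask = df["description and keybenefits (14)"].str.contains("cooltouch")

# Filling with NaN forces a float result: matching rows become 1.0.
with_nan = np.where(mask, 1, np.nan)

# Filling with 0 keeps an integer result, as the question asks for 1 (int).
with_zero = np.where(mask, 1, 0)

df["brand_cooltouch (1711)"] = with_zero
print(with_nan.dtype, with_zero.dtype)
```

If integer values are required, filling with 0 is the simpler of the two options.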

v7pvogib  #3

Use Series.str.contains:

df['brand_cooltouch (1711)'] = df['description and keybenefits (14)'].str.contains("cooltouch").astype(int)

Output:

    description and keybenefits (14)  brand_cooltouch (1711)  brand_easylogic (1712)
0  Lorem Ipsum cooltouch Lorem Ipsum                       1                     NaN
1  Lorem Ipsum easylogic Lorem Ipsum                       0                     NaN
2            Lorem Ipsum Lorem Ipsum                       0                     NaN

If you don't want the result column to contain 1 and 0, you can also do the following:

df.loc[df['description and keybenefits (14)'].str.contains("cooltouch"), ['brand_cooltouch (1711)']] = '1'
df.loc[~df['description and keybenefits (14)'].str.contains("cooltouch"), ['brand_cooltouch (1711)']] = ''

Output:

    description and keybenefits (14) brand_cooltouch (1711)  brand_easylogic (1712)
0  Lorem Ipsum cooltouch Lorem Ipsum                      1                     NaN
1  Lorem Ipsum easylogic Lorem Ipsum                                            NaN
2            Lorem Ipsum Lorem Ipsum                                            NaN
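
The same df.loc pattern can also assign the integer 1 instead of the string '1', and leave non-matching rows untouched, which is closer to the output shown in the question. This is a sketch under the assumption that the two brand columns already exist and are empty (filled with None here); it is a variation on the answer above rather than the answerer's own code.

```python
import pandas as pd

df = pd.DataFrame({
    "description and keybenefits (14)": [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        "Lorem Ipsum easylogic Lorem Ipsum",
        "Lorem Ipsum Lorem Ipsum",
    ],
    # Assumed starting state: both brand columns exist but are empty.
    "brand_cooltouch (1711)": [None, None, None],
    "brand_easylogic (1712)": [None, None, None],
})

text = df["description and keybenefits (14)"]

# Only rows where the mask is True are written; all other rows keep their value.
df.loc[text.str.contains("cooltouch"), "brand_cooltouch (1711)"] = 1
df.loc[text.str.contains("easylogic"), "brand_easylogic (1712)"] = 1

print(df)
```

Because only matching rows are assigned, any values already present in the other rows are preserved.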
