Pandas boolean mask for columns that contain a string

Posted by js4nwp54 on 2023-09-29 in Other


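Python newbie here, trying to create a boolean mask to subset a dataset. I would appreciate any guidance on how to make this mask work and on what I am doing wrong. Thank you.
My df looks like this: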

import pandas as pd

df= pd.DataFrame({'A': ['A-dog', "B-dog","C-cat" , "E-snake", "F-hamser"],
                  'B': ['F-dog', "B-parrot","C-snake" , "E-cat", "F-bird"],
                  'C': [1, 2, 3, 4, 5],
                  'D': [22,23,24,25,26],
                  'E': ['A-snake', "B-dog","C-snake" , "E-snake", "F-snake"],
                  'Flag': [0,0,0,0,0]})

df

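I want to evaluate columns A, B and E, strip a trailing "dog" or "cat" from their cells (replace it with ""), and set the "Flag" column to 1 for every row where I made a replacement.
I want to create a boolean mask so that I can replace the strings and change "Flag" to 1, but my mask is not working.
This is what I have tried: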

cols=['A','B','E']
mask=df[cols].apply(lambda x: 'dog'  or 'cat' in x[-3:])
# x[-3:] to select the last three characters of the string.
# If the mask were working, I would change the flag variable in this way
df.loc(mask.any(axis=1),'Flag')=1

The df I would like to end up with looks like this:

res= pd.DataFrame({'A': ['A-', "B-","C-" , "E-snake", "F-hamser"],
                  'B': ['F-', "B-parrot","C-snake" , "E-", "F-bird"],
                  'C': [1, 2, 3, 4, 5],
                  'D': [22,23,24,25,26],
                  'E': ['A-snake', "B-","C-snake" , "E-snake", "F-snake"],
                  'Flag': [1,1,1,1,0]})

res


Source: https://stackoverflow.com/questions/77100884/pandas-boolean-mask-for-columns-that-contain-string

2 Answers


xoefb8l8 #1

Code

Use str.endswith to build a Series for the Flag column.

s = df.select_dtypes('object').apply(lambda x: x.str.endswith(('dog', 'cat'))).any(axis=1).astype('int')

s

0    1
1    1
2    1
3    1
4    0
dtype: int32

Then use replace with regex=True and assign the new Flag column:

df.replace({'dog$':'', 'cat$':''}, regex=True).assign(Flag=s)

Output:

          A         B  C   D        E  Flag
0        A-        F-  1  22  A-snake     1
1        B-  B-parrot  2  23       B-     1
2        C-   C-snake  3  24  C-snake     1
3   E-snake        E-  4  25  E-snake     1
4  F-hamser    F-bird  5  26  F-snake     0
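
Note that select_dtypes('object') and df.replace act on every string column, while the question names only A, B and E. In this example those are the same columns, but if the frame had other string columns that should be left alone, the same idea could be restricted to a column list. The following is a minimal sketch under that assumption, not part of the original answer; it merges the two patterns into one regex, (dog|cat)$, and assumes a pandas version in which str.endswith accepts a tuple, as both answers here do.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["A-dog", "B-dog", "C-cat", "E-snake", "F-hamser"],
    "B": ["F-dog", "B-parrot", "C-snake", "E-cat", "F-bird"],
    "C": [1, 2, 3, 4, 5],
    "D": [22, 23, 24, 25, 26],
    "E": ["A-snake", "B-dog", "C-snake", "E-snake", "F-snake"],
    "Flag": [0, 0, 0, 0, 0],
})

cols = ["A", "B", "E"]  # only the columns named in the question

# True where a cell in the chosen columns ends with "dog" or "cat"
mask = df[cols].apply(lambda s: s.str.endswith(("dog", "cat")))

# Strip the trailing suffix in those columns only; other columns are untouched
df[cols] = df[cols].replace(r"(dog|cat)$", "", regex=True)

# Flag every row in which at least one cell matched
df.loc[mask.any(axis=1), "Flag"] = 1

print(df)
```

The mask is computed before the replacement, because once the suffixes are stripped there is nothing left to match.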


kmbjn2e3 #2

You can create the mask with .str.endswith (this function also accepts a tuple of values):

cols = ["A", "B", "E"]
mask = df[cols].apply(lambda x: x.str.endswith(("dog", "cat")))

df["Flag"] = mask.any(axis=1).astype(int)
df[mask] = df[mask].apply(lambda x: x.str[:-3] if x.notna().any() else x)
print(df)

Prints:

A         B  C   D        E  Flag
0        A-        F-  1  22  A-snake     1
1        B-  B-parrot  2  23       B-     1
2        C-   C-snake  3  24  C-snake     1
3   E-snake        E-  4  25  E-snake     1
4  F-hamser    F-bird  5  26  F-snake     0
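
As for what went wrong in the attempt from the question, three things appear to be at play. The expression 'dog' or 'cat' in x[-3:] is parsed as 'dog' or ('cat' in x[-3:]), and since a non-empty string is truthy it always evaluates to 'dog'. Inside apply, x is a whole column (a Series), so x[-3:] selects the last three rows rather than the last three characters. Finally, .loc is indexed with square brackets, so df.loc(...) = 1 as written is a syntax error. The sketch below illustrates the first two points on a small made-up Series; the variable names are only for illustration.

```python
import pandas as pd

s = pd.Series(["A-dog", "B-parrot", "C-cat", "E-snake"])

# `or` returns its first truthy operand, so the right-hand side is never
# evaluated and the result is simply the string "dog".
always_dog = "dog" or ("cat" in "snake")

# Slicing a Series with [-3:] selects its last three rows, not characters.
last_rows = s[-3:]

# The .str accessor slices each string element-wise instead.
last_chars = s.str[-3:]

# A real boolean mask: True where the last three characters are dog or cat.
ends_ok = last_chars.isin(["dog", "cat"])

print(always_dog)
print(last_rows.tolist())
print(last_chars.tolist())
print(ends_ok.tolist())
```

With a proper mask in hand, the flag assignment from the question works once it uses brackets: df.loc[mask.any(axis=1), 'Flag'] = 1.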
