python 根据B列中是否存在A列值来设置值

xa9qqrwz 于 2022-12-10 发布在 Python

关注(0)|答案(2)|浏览(143)

我有一个包含多个列的 Dataframe ，我正在设置一个lambda函数，根据一系列逻辑来设置另一个列。
Dataframe 如下所示：
| 供应商|所属部门|方案1|备选办法2|
| - -|- -|- -|- -|
| 阿克梅|阿克梅|星期一|星期二|
| 阿克梅|阿克梅|星期三||
| 阿克梅|人力资源|星期一|星期三|
| 阿克梅|人力资源|星期二|星期三|
| 阿克梅|人力资源|星期二|星期四|
等等。
我想添加一个列与文本保持某一行或不。

full_quote_df['Keep Row'] = full_quote_df.apply(
                lambda item:
                   'Keep Row' if item.Supplier == item.Department and pd.notna(item['Option 1']) else \ #If supplier and department match, and the have an option, keep
                   'Do Not Keep Row' if item.Supplier == item.Department and pd.isna(item['Option 1'])  else \ #If they leave it blank, exclude it
                   'Keep Row' if item.Supplier != item.Department and item['Option 2'].isin(list(item['Option 1'].values()))  else \ #If the supplier does not match the department, but the second option exists in the first option, keep row
                   'Do Not Keep Row' if item.Supplier != item.Department and item['Option 2'].notin(item['Option 1'])  else \#If the supplier does not match the department, but the second option does not exist in the first option, remove row
                   None,
                axis=1
            )

因此，我的最终 Dataframe 应该如下所示：
| 供应商|所属部门|方案1|备选办法2|保留行|
| - -|- -|- -|- -|- -|
| 阿克梅|阿克梅|星期一|星期二|保留|
| 阿克梅|阿克梅|星期三||保留|
| 阿克梅|人力资源|星期一|星期三|保留|
| 阿克梅|人力资源|星期二|星期三|保留|
| 阿克梅|人力资源|星期二|星期四|不保留|
使用上面的函数，它给了我一个错误'str'对象没有属性'isin'。我不想把选项2列变成一个列表，因为我需要相对于整个列来评估特定的行，但是我不知道下一步该怎么做。

python

来源：https://stackoverflow.com/questions/74748520/set-value-based-on-if-column-a-value-exists-in-column-b

2条答案

按热度按时间

wi3ka0sx1#

可以将numpy.where与pandas.Series.isin一起使用：

m = (df["Option 2"].isna()) | (df["Option 2"].isin(df["Option 1"]))
    
df["Keep Row"] = np.where(m, "Keep", "Do Not Keep")

#输出：

print(df)
  Supplier Department   Option 1   Option 2     Keep Row
0     Acme       Acme     Monday    Tuesday         Keep
1     Acme       Acme  Wednesday        NaN         Keep
2     Acme         HR     Monday  Wednesday         Keep
3     Acme         HR    Tuesday  Wednesday         Keep
4     Acme         HR    Tuesday   Thursday  Do Not Keep

赞(0）回复(0）举报 2022-12-10

dtcbnfnu2#

如果您转而考虑将列（系列）作为一个整体而不是单个元素进行操作，那么使用pandas会有更好的效果。
您可以构建保存中间结果的Series，这样就不会每次都重新计算它们，而且它还可以使逻辑更易于阅读。

supplier_dept_match = full_quote_df['Supplier'] == full_quote_df['Department']
opt1_is_na = full_quote_df['Option 1'].isna()

# Note: the following logic is a bit unclear from your question.
# This approach will look for the option 2 values in the entire option 1 column.
# If you want to look for a certain option 1 subset, then you should clarify how this should behave.
opt2_in_any_opt1 = full_quote_df['Option 2'].isin(full_quote_df['Option 1'])

# Then, you can build up your logic all at once with boolean expressions:
keep = (supplier_dept_match & ~opt1_is_na) | (~supplier_dept_match & opt2_in_any_opt1)
full_quote_df['Keep Row'] = keep.map({True: "Keep", False: "Do Not Keep"})

希望这能有所帮助！

赞(0）回复(0）举报 2022-12-10

我来回答

python 根据B列中是否存在A列值来设置值

2条答案

#输出：

相关问题

热门标签

最新问答