pandas: adding a DataFrame column based on another column, with multiple values mapping to the same new value

Posted by 7dl7o3gd on 2023-06-04 in Other


I have a DataFrame like this:

df1 = pd.DataFrame({'col1' : ['cat', 'cat', 'dog', 'green', 'blue']})

I need a new column that gives the category, like this:

dfoutput = pd.DataFrame({'col1' : ['cat', 'cat', 'dog', 'green', 'blue'],
                         'col2' : ['animal', 'animal', 'animal', 'color', 'color']})

I know I can do it inefficiently with .loc:

df1.loc[df1['col1'] == 'cat','col2'] = 'animal'
df1.loc[df1['col1'] == 'dog','col2'] = 'animal'

How can I combine cat and dog into animal in one step? This does not work:

df1.loc[df1['col1'] == 'cat' | df1['col1'] == 'dog','col2'] = 'animal'
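For reference, the attempt above fails because of operator precedence: in Python `|` binds more tightly than `==`, so the two comparisons are not evaluated first. A minimal sketch of two working forms, using the same df1 as in the question, is to wrap each comparison in parentheses or to use `isin`:

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['cat', 'cat', 'dog', 'green', 'blue']})

# Option 1: parenthesise each comparison so that | combines two boolean masks
df1.loc[(df1['col1'] == 'cat') | (df1['col1'] == 'dog'), 'col2'] = 'animal'

# Option 2: isin tests membership in a list of values in one call
df1.loc[df1['col1'].isin(['green', 'blue']), 'col2'] = 'color'

print(df1)
```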

pandas

Source: https://stackoverflow.com/questions/54031812/adding-a-df-column-based-on-other-column-with-multiple-values-map-to-the-same-ne

3 Answers


kx7yvsdv1#

Build a dict, then use map:

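# Map each value in col1 to its category (labels match the output requested in the question)
d = {'dog': 'animal', 'cat': 'animal', 'green': 'color', 'blue': 'color'}
df1['col2'] = df1['col1'].map(d)
df1
    col1    col2
0    cat  animal
1    cat  animal
2    dog  animal
3  green   color
4   blue   color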
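One detail worth knowing about this approach: any value that has no key in the dict comes back from `map` as NaN. The sketch below illustrates this with an extra row, 'table', which is a hypothetical value not present in the question, and fills the gap with an assumed placeholder label 'unknown':

```python
import pandas as pd

# 'table' is a hypothetical extra value with no entry in the dict
df1 = pd.DataFrame({'col1': ['cat', 'cat', 'dog', 'green', 'blue', 'table']})

d = {'dog': 'animal', 'cat': 'animal', 'green': 'color', 'blue': 'color'}

# Values missing from the dict become NaN after map
df1['col2'] = df1['col1'].map(d)

# Optionally replace the NaN with a placeholder label ('unknown' is an assumed name)
df1['col2'] = df1['col2'].fillna('unknown')

print(df1)
```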


plicqrtu2#

Since several items can belong to one category, I suggest starting with a dictionary that maps categories to items:

cat_item = {'animal': ['cat', 'dog'], 'color': ['green', 'blue']}

You may find this easier to maintain. Then use a dictionary comprehension to invert the dictionary, followed by pd.Series.map:

item_cat = {w: k for k, v in cat_item.items() for w in v}

df1['col2'] = df1['col1'].map(item_cat)

print(df1)

    col1    col2
0    cat  animal
1    cat  animal
2    dog  animal
3  green   color
4   blue   color

You could also use pd.Series.replace, but it is generally less efficient.
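Apart from speed, the two methods differ in how they treat values that are not in the dictionary. A small sketch, assuming a hypothetical unmapped value 'table', shows that `map` returns NaN for it while `replace` leaves the original value in place:

```python
import pandas as pd

# 'table' is a hypothetical value that belongs to no category
s = pd.Series(['cat', 'dog', 'green', 'table'])

cat_item = {'animal': ['cat', 'dog'], 'color': ['green', 'blue']}
item_cat = {w: k for k, v in cat_item.items() for w in v}

mapped = s.map(item_cat)        # unmapped values become NaN
replaced = s.replace(item_cat)  # unmapped values are kept unchanged

print(pd.DataFrame({'original': s, 'map': mapped, 'replace': replaced}))
```

Which behaviour is preferable depends on whether unmapped values should be flagged as missing or passed through.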


dba5bblo3#

You can also try np.select, as follows:

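import numpy as np

# One boolean mask per category; str.contains does a regex search, so 'cat|dog' matches either word
options = [df1.col1.str.contains('cat|dog'),
           df1.col1.str.contains('green|blue')]

settings = ['animal', 'color']

# Rows matching no mask get the default; a string default ('other' is an assumed placeholder) matches the string choices
df1['setting'] = np.select(options, settings, default='other')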

I have found this approach to be fast even on very large DataFrames.
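Note that `str.contains` matches substrings, so a value such as 'catfish' would also be labelled as an animal. If exact matches are wanted, a variant of the same idea could build the conditions with `isin` instead. This is only a sketch: the 'catfish' row, the 'col2' column name, and the 'other' default label are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# 'catfish' is a hypothetical value used to show exact matching
df1 = pd.DataFrame({'col1': ['cat', 'cat', 'dog', 'green', 'blue', 'catfish']})

# isin matches whole values only, unlike the substring search of str.contains
conditions = [df1['col1'].isin(['cat', 'dog']),
              df1['col1'].isin(['green', 'blue'])]
choices = ['animal', 'color']

# Rows that satisfy no condition receive the default label ('other' is an assumed name)
df1['col2'] = np.select(conditions, choices, default='other')

print(df1)
```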

