Pandas: find whether a string in a column contains a word from a list, and replace the whole string with that word

Posted by v09wglhw on 2023-03-11 in Other

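My problem:
I have a list of colors
color_list = ['red', 'black', 'white', 'pink', 'blue']
and a DataFrame with a column of complex strings.
I need to find whether each string contains a color from my list, and then:
1. replace the whole string with the color from the list;
2. add a column that is True if a color was found and replaced, and False if not.
Thank you!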

color_list = ['red', 'black', 'white', "a'pink", 'blue']
print(color_list)

import pandas as pd
data = ['on red', 'aup red gone', 'black with bi', "ao a' pink", 'dgfh blu', 'a black pen']

df = pd.DataFrame(data, columns=['colors'])

# print dataframe.

print('my data frame is: ')
print(df)
print(30*"#")
print('i need the answer be but i dont know how put regex for strings from list: ')

data_2 = [['red','True'], ['red','True'], ['black','True'], ["a' pink",'True'] ,['dgfh blu','False'],['black','True']]

df_2 = pd.DataFrame(data_2, columns=['colors', 'is it'])
print(df_2)



['red', 'black', 'white', "a'pink", 'blue']
my data frame is: 
          colors
0         on red
1   aup red gone
2  black with bi
3     ao a' pink
4       dgfh blu
5    a black pen
##############################
i need the answer be but i dont know how put regex for strings from list: 
     colors  is it
0       red   True
1       red   True
2     black   True
3   a' pink   True
4  dgfh blu  False
5     black   True

uqjltbpv #1

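You can write a function that checks whether each value in the DataFrame contains any of the strings in the list and, if so, overwrites the colors column with the matching word. Note that the code below removes the space in a' pink so that it matches the list entry a'pink, which means the result shows a'pink rather than a' pink. I'm not sure whether that normalization applies more broadly to the structure of your data.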

import pandas as pd

color_list = ['red', 'black', 'white', "a'pink", 'blue']
pattern = '|'.join(color_list)

data = ['on red', 'aup red gone', 'black with bi', "ao a' pink", 'dgfh blu', 'a black pen']

df = pd.DataFrame(data, columns=['colors'])

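def func(row):
    # Look for the first color in color_list contained in the string.
    # "' " is collapsed to "'" so that "a' pink" matches the entry "a'pink".
    for word in color_list:
        if word in row['colors'].replace("' ", "'"):
            row['is it'] = True
            row['colors'] = word
            break
    else:
        # for/else: this branch runs only if the loop ended without a break.
        row['is it'] = False
    return row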

df = df.apply(func, axis = 1)

print(df)

This gives:

     colors  is it
0       red   True
1       red   True
2     black   True
3    a'pink   True
4  dgfh blu  False
5     black   True
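
The same result can probably be obtained without a row-wise apply. The following is a minimal sketch, assuming that collapsing "' " to "'" is acceptable (as in the function above) and that the leftmost color in each string is the one wanted, whereas the loop above prefers the earliest entry in color_list. The names pattern, normalized, and found are chosen here for illustration.

```python
import re
import pandas as pd

color_list = ['red', 'black', 'white', "a'pink", 'blue']
data = ['on red', 'aup red gone', 'black with bi', "ao a' pink", 'dgfh blu', 'a black pen']
df = pd.DataFrame(data, columns=['colors'])

# Escape each color so it is matched literally, then join into one capture group.
pattern = '(' + '|'.join(re.escape(c) for c in color_list) + ')'

# Same normalization as the loop version: drop the space after an apostrophe.
normalized = df['colors'].str.replace("' ", "'", regex=False)

# Leftmost matching color per row; NaN where no color occurs.
found = normalized.str.extract(pattern, expand=False)

df['is it'] = found.notna()
# Keep the original string wherever nothing matched.
df['colors'] = found.fillna(df['colors'])
print(df)
```

Escaping with re.escape matters once the list contains characters that are special in a regex, such as a dot or a parenthesis.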

cs7cruho #2

You should be able to incorporate a dictionary, D = {}, that maps each target string to its flag.
To illustrate:

>>>
>>> import re
>>>
>>> D = {}
>>> D['red'] = 'True'
>>> D['black'] = 'True'
>>> D["a' pink"] = 'True'
>>> D['dgfh blu'] = 'False'
>>>
>>> column = '''
...          on red
...    aup red gone
...   black with bi
...      ao a' pink
...        dgfh blu
...     a black pen
... '''
>>>
>>> def repl(m):
...   return m.group(1)+"  "+D[m.group(1)]
...
>>> column = re.sub(r"(?m)^.*?(red|black|a'[ ]pink|dgfh[ ]blu)\b.*?$",repl,column)
>>>
>>> print(column)

red  True
red  True
black  True
a' pink  True
dgfh blu  False
black  True

>>>
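
To apply the dictionary idea to a DataFrame column rather than to one multi-line string, something along these lines may work. This is only a sketch: it assumes the dictionary keys are the exact substrings to look for, it stores real booleans instead of the strings 'True' and 'False', and the names pattern and found are illustrative.

```python
import re
import pandas as pd

# Target substring -> flag, mirroring the dictionary above but with booleans.
D = {'red': True, 'black': True, "a' pink": True, 'dgfh blu': False}

df = pd.DataFrame({'colors': ['on red', 'aup red gone', 'black with bi',
                              "ao a' pink", 'dgfh blu', 'a black pen']})

# Build the alternation from the dictionary keys, escaping regex metacharacters,
# and require a word boundary after the match as in the re.sub pattern above.
pattern = '(' + '|'.join(re.escape(key) for key in D) + r')\b'

# Leftmost key found in each row (NaN if none of the keys occurs).
found = df['colors'].str.extract(pattern, expand=False)

df['is it'] = found.map(D)   # look up the flag for the matched key
df['colors'] = found
print(df)
```

A row that contains none of the keys would end up with NaN in both columns, so every string that should be flagged False has to be listed in D, just as 'dgfh blu' is here.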
