python-3.x Count the elements of a list that appear in each row of a column and add the count as a new column (Pandas)

Posted by bxgwgixi on 2022-11-26 in Python

I have a pandas DataFrame that looks like this:

MEMBERSHIP
[2022_K_, EWREW_NK]
[333_NFK_,2022_K_, EWREW_NK, 000]

I have a list of keys:

list_k = ["_K_","_NK_","_NKF_","_KF_"]

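I want to add a new column that counts, for each row, how many elements of that row's list contain any of the keys. The desired output is: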

MEMBERSHIP                        | COUNT
[2022_K_, EWREW_NK]               | 2
[333_NFK_,2022_K_, EWREW_NK, 000] | 3

Can you help me?


w46czmvw1#

IIUC, you can use the pandas .str accessor together with a regular expression:

import pandas as pd
df = pd.DataFrame({'MEMBERSHIP':[['2022_K_', 'EWREW_NK'],
                                ['333_NFK_','2022_K_', 'EWREW_NK', '000']]})

list_k = ["_K_","_NK","_NFK_","_KF_"] # I changed this list a little ("_NK_" -> "_NK", "_NKF_" -> "_NFK_") so the keys match the sample data
reg = '|'.join(list_k)
df['count'] = df['MEMBERSHIP'].explode().str.contains(reg).groupby(level=0).sum()
print(df)

Output:

                           MEMBERSHIP  count
0                 [2022_K_, EWREW_NK]      2
1  [333_NFK_, 2022_K_, EWREW_NK, 000]      3
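
The keys here contain only letters and underscores, so joining them with '|' is safe. If the keys could contain regex metacharacters such as "." or "+", or if some lists could be empty, a slightly more defensive variant might look like the sketch below. It reuses the adjusted list_k from this answer; the re.escape and na=False additions are my own assumptions about what the data may need, not part of the original answer.

```python
import re
import pandas as pd

df = pd.DataFrame({'MEMBERSHIP': [['2022_K_', 'EWREW_NK'],
                                  ['333_NFK_', '2022_K_', 'EWREW_NK', '000']]})
list_k = ["_K_", "_NK", "_NFK_", "_KF_"]  # adjusted list from this answer

# Escape every key so that it is matched literally inside the pattern
reg = '|'.join(re.escape(k) for k in list_k)

# One row per list element; na=False treats missing values as "no match"
matches = df['MEMBERSHIP'].explode().str.contains(reg, na=False)

# Sum the boolean matches back up per original row
df['count'] = matches.groupby(level=0).sum()
print(df)
```

With the question's original list_k = ["_K_","_NK_","_NKF_","_KF_"], only '2022_K_' would match in each row, so both counts would be 1 rather than 2 and 3.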

huwehgph2#

You can use apply with a helper function:

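def check(x):
    total = 0
    for i in x:
        if not isinstance(i, str):  # skip values that are not strings
            continue
        for j in list_k:
            if j in i:  # every key found in the element adds 1
                total += 1
    return total

# check can be passed to apply directly; no lambda wrapper is needed
df['count'] = df['MEMBERSHIP'].apply(check)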
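
Note that the loop above adds 1 for every key found, so an element that contains two different keys is counted twice. If each element should be counted at most once, a sketch using any() could look like this. The function name count_matching is made up for illustration, and the adjusted list_k from answer 1 is assumed.

```python
import pandas as pd

df = pd.DataFrame({'MEMBERSHIP': [['2022_K_', 'EWREW_NK'],
                                  ['333_NFK_', '2022_K_', 'EWREW_NK', '000']]})
list_k = ["_K_", "_NK", "_NFK_", "_KF_"]  # adjusted list from answer 1 (assumption)

def count_matching(items):
    """Count the string elements that contain at least one key."""
    return sum(
        isinstance(item, str) and any(key in item for key in list_k)
        for item in items
    )

df['count'] = df['MEMBERSHIP'].apply(count_matching)
print(df)
```

For the sample data both versions give the same counts, because no element contains more than one key.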

zwghvu4y3#

I came up with this silly code:

df['Count'] = 0
for idx, i in df['MEMBERSHIP_SPLIT'].items():
    count_element = 0

    for sub in i:
        for e in list_k:
            if e in sub:
                count_element += 1
    # write with .loc instead of chained indexing (df['Count'][row] = ...)
    df.loc[idx, 'Count'] = count_element
