pandas: Extracting from an array of strings the strings that contain a substring (Python)

Posted by c9x0cxw0 on 2023-02-06 in Python


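A question about Python (3.9.5) and pandas:
Suppose I have an array of strings x and want to extract every element that contains a given substring, e.g. feb05. Is there a Pythonic way to do this in one line, possibly using a pandas function?
An example of what I mean: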

x = ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]
must_contain = "feb05"
desired_output = ["2023_feb05", "2024_feb05"]

I could run a loop,

import numpy as np
import pandas as pd

desired_output = []
# Boolean mask marking which elements of x contain the substring
indices_bool = np.zeros(len(x), dtype=bool)
for idx, test in enumerate(x):
    if must_contain in test:
        desired_output.append(test)
        indices_bool[idx] = True

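but I would like to do it in a more Pythonic way.
In my application, x is a column of a pandas DataFrame, so answers using pandas functions are also welcome. The goal is to filter all rows whose x field contains must_contain (e.g. x = df["names"]).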

pandas

Source: https://stackoverflow.com/questions/75354820/extracting-from-an-array-of-strings-strings-that-contain-a-substring-in-them-p

2 Answers


yh2wf1be1#

Since you are working with pandas, you can use str.contains to get a boolean condition:

import pandas as pd
df = pd.DataFrame({'x': ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]})
must_contain = "feb05"

df.x.str.contains(must_contain)
#0    False
#1    False
#2    False
#3     True
#4     True
#Name: x, dtype: bool

Then filter by that condition:

df[df.x.str.contains(must_contain)]
#            x
#3  2023_feb05
#4  2024_feb05
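
One caveat worth keeping in mind: str.contains treats the pattern as a regular expression by default, and missing entries produce missing values in the mask. The following is a minimal sketch, assuming the substring should be matched literally and that the column may contain missing values. The column name names follows the question; the sample rows, including the None entry, are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the column name "names" comes from the question,
# the None entry is added here only to illustrate missing values.
df = pd.DataFrame(
    {"names": ["2023_jan05", "2023_feb04", None, "2023_feb05", "2024_feb05"]}
)
must_contain = "feb05"

# regex=False matches the substring literally (no regex metacharacters);
# na=False treats missing values as "no match" so the mask is purely boolean.
mask = df["names"].str.contains(must_contain, regex=False, na=False)

filtered = df[mask]                 # rows whose "names" contains the substring
result = filtered["names"].tolist() # matching strings as a plain list
indices_bool = mask.to_numpy()      # boolean array, like the question's loop
print(result)
```

Passing regex=False matters mostly when the substring can contain characters such as "." or "+", which would otherwise be interpreted as regex syntax.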

Answered 2023-02-06

lyfkaqu12#

Without pandas:

list(filter(lambda y: must_contain in y, x))

["2023_feb05", "2024_feb05"]
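
The same filtering can also be written as a list comprehension, which many readers find easier to scan than filter with a lambda. This is a minimal sketch using x and must_contain from the question; indices_bool here is a plain list of booleans standing in for the NumPy array built by the question's loop.

```python
x = ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]
must_contain = "feb05"

# Keep only the strings that contain the substring.
desired_output = [s for s in x if must_contain in s]

# One boolean flag per element, mirroring indices_bool from the question's loop.
indices_bool = [must_contain in s for s in x]

print(desired_output)
print(indices_bool)
```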

With pandas:

import pandas as pd

series = pd.Series(["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"])
must_contain = "feb05"
series[series.str.contains(must_contain)].to_list()

["2023_feb05", "2024_feb05"]

Answered 2023-02-06
