pandas: Retrieve cell string values in a column between two unknown indexes based on substring location

Posted by ccrfmcuu on 2023-02-02 in Other


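I need to find the first position where the word "their" appears in the Words table. I am trying to write code that concatenates all strings in the "text" column from that position up to and including the first text that contains the substring "666" or "999" (in this example, the combination of their, stoma22, fe156, sligh334 and pain666, so the desired subtrings_output = 'theirstoma22fe156sligh334pain666'). I tried: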

their_loc = np.where(words['text'].str.contains(r'their', na =True))[0][0]
666_999_loc = np.where(words['text'].str.contains(r'666', na =True))[0][0]
subtrings_output = Words['text'].loc[Words.index[their_loc:666_999_loc]]

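As you can see, I am not sure how to extend the condition for 666_999_loc so that it matches the substring 666 or 999, and slicing the index between the two variables also raises an error.
Words table: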
| page_number | text | font |
| --- | --- | --- |
| 1 | they | 0 |
| 1 | eat | 0 |
| 1 | apples | 0 |
| 2 | and | 0 |
| 2 | then | 1 |
| 2 | their | 0 |
| 2 | stoma22 | 0 |
| 2 | fe156 | 1 |
| 2 | sligh334 | 0 |
| 2 | pain666 | 1 |
| 2 | given | 0 |
| 2 | the | 1 |
| 3 | fruit | 0 |

pandas

Source: https://stackoverflow.com/questions/75279012/retrieve-cell-string-values-in-a-column-between-two-unknown-indexes-based-on-sub

2 Answers


kknvjkwl1#

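You just need to add 1 to the end of the slice so that the matching row is included, and use the | operator to add an OR condition to the np.where call that computes the 666-or-999 location.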

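import numpy as np

text_col = words['text']

# Position of the first row whose text contains 'their'
their_loc = np.where(text_col.str.contains(r'their', na=True))[0][0]

# Position of the first row whose text contains '666' or '999'
contains_666_or_999_loc = np.where(text_col.str.contains('666', na=True) |
                                   text_col.str.contains('999', na=True))[0][0]

# Add 1 so the slice includes the matching end row
subtrings_output = ''.join(text_col.loc[words.index[their_loc:contains_666_or_999_loc + 1]])

print(subtrings_output)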

Output:

theirstoma22fe156sligh334pain666
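
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample DataFrame is rebuilt from the "text" column of the question's table, and the helper name `join_between` is hypothetical, not part of pandas. Unlike the snippet above, it uses `na=False` so missing values are not counted as matches, searches for the end marker only from the start row onward, and returns `None` when either pattern is absent. Those choices are assumptions about what is wanted, not something the question states.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (an assumption about the real data).
words = pd.DataFrame({
    "text": ["they", "eat", "apples", "and", "then", "their", "stoma22",
             "fe156", "sligh334", "pain666", "given", "the", "fruit"]
})


def join_between(text_col, start_pat, end_pat, sep=""):
    """Join values from the first start_pat match through the first later end_pat match.

    Hypothetical helper: returns None if either pattern is not found.
    """
    # Positions (not labels) of rows matching the start pattern
    start_hits = np.flatnonzero(text_col.str.contains(start_pat, na=False).to_numpy())
    if start_hits.size == 0:
        return None
    start = int(start_hits[0])

    # Look for the end pattern only at or after the start row
    tail = text_col.iloc[start:]
    end_hits = np.flatnonzero(tail.str.contains(end_pat, na=False).to_numpy())
    if end_hits.size == 0:
        return None
    end = start + int(end_hits[0])

    # + 1 because positional slices exclude the stop position
    return sep.join(text_col.iloc[start:end + 1])


# "666|999" is a regular expression that matches either substring
result = join_between(words["text"], "their", "666|999")
print(result)
```

Slicing with `iloc` keeps everything positional, so the sketch does not depend on the DataFrame having a default integer index.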

2023-02-02

zbq4xfa02#

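IIUC, use pandas.Series.idxmax together with "".join().
Series.idxmax(axis=0, skipna=True, *args, **kwargs)

Return the row label of the maximum value. If multiple values equal the maximum, the first row label with that value is returned.

So, assuming (Words) is your DataFrame, try the following: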

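# idxmax on a boolean mask returns the index label of the first True
their_loc = Words["text"].str.contains("their").idxmax()

# The regex alternation matches either "666" or "999"
_666_999_loc = Words["text"].str.contains("666|999").idxmax()

# Label-based .loc slicing includes both endpoints, so no + 1 is needed
subtrings_output = "".join(Words["text"].loc[their_loc:_666_999_loc])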

Output:

print(subtrings_output)
#theirstoma22fe156sligh334pain666

#their stoma22 fe156 sligh334 pain666 # <- with " ".join()
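
One caveat with the idxmax approach is worth a quick sketch. On a boolean mask, idxmax returns the label of the first True, but when nothing matches every value is False and it still returns the first label, which can silently produce a wrong slice. The small Series and the helper name `first_match_label` below are made up for illustration.

```python
import pandas as pd

# Illustrative data with a non-default index to show that idxmax returns labels
text = pd.Series(["they", "their", "pain666", "fruit"], index=["a", "b", "c", "d"])

mask_hit = text.str.contains("666|999")   # True only at label "c"
mask_miss = text.str.contains("xyz")      # False everywhere

first_hit = mask_hit.idxmax()    # label of the first True
first_miss = mask_miss.idxmax()  # all values tie at False, so the first label comes back


def first_match_label(mask):
    """Hypothetical guard: label of the first True, or None if there is no match."""
    return mask.idxmax() if mask.any() else None


print(first_hit, first_miss, first_match_label(mask_miss))
```

Checking `mask.any()` first is one way to tell "matched at the first row" apart from "did not match at all".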

2023-02-02
