Regular expression help in Python

Posted by vbopmzt1 on 2023-03-28 in Python


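I have a specific use case: identify sentences that end with a colon (:) and begin right after a period (.).
There is one more condition: if the period is followed by a digit, the search should continue back to the previous period.
Here is an example.
Input: but I have been in a loop. two and fro: he has been facing some issues. I am more of a morning person (10 to 9.30):
This is the regex I am currently using in Python: [\w\s\(\)(\,\-]+(?<!\d)(?<![0-9]\)):(?<!\d)(?<![0-9]\))|(\w)+\s(\w)+\s(?<!\d)(?<![0-9]\)):|(?<!\d)(\w)+\s(((\w)\s)*(\w))\s(?<!\d)(?<![0-9]\)):
It only matches two and fro: but I want to match two statements:
1. two and fro:
2. I am more of a morning person (10 to 9.30):
The way I see the problem: I start by finding a colon, then walk backwards until I find a period, then check whether that period is followed by a digit. If it is, I need to keep walking back until I find a period that is not followed by a digit.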
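The backward walk described above can be sketched without any regex. This is a minimal illustration, not code from the thread: the function name `sentences_ending_in_colon` is hypothetical, and it assumes that a sentence with no qualifying period before its colon (for example at the very start of the text) should be skipped.

```python
def sentences_ending_in_colon(text):
    """Collect spans that start after a period and end at the next colon."""
    results = []
    search_from = 0
    while True:
        colon = text.find(":", search_from)
        if colon == -1:
            break
        # Walk backwards from the colon, looking for a period that is
        # not immediately followed by a digit (so the one in "9.30" is skipped).
        start = None
        i = colon - 1
        while i >= search_from:
            if text[i] == "." and not text[i + 1].isdigit():
                start = i + 1
                break
            i -= 1
        if start is not None:
            # Drop the whitespace between the period and the sentence.
            results.append(text[start:colon + 1].strip())
        # Never walk back past a colon that was already handled.
        search_from = colon + 1
    return results


sample = ("but I have been in a loop. two and fro: he has been facing some "
          "issues. I am more of a morning person (10 to 9.30):")
print(sentences_ending_in_colon(sample))
```

On the sample input this yields the two sentences asked for, which makes it a handy reference when checking a candidate regex.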

python

Source: https://stackoverflow.com/questions/75852540/regex-occurrence-help-in-python

3 Answers


k4emjkb1 #1

Here is my attempt:

(?<=\. ).*?:

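Demo: regex101
It matches whatever follows a period and a space, up to and including the next colon (:).

(?<=\. ) is a positive lookbehind (?<=) for a literal period \. followed by a space
.*? matches any character (.) non-greedily (*?), that is, as few as possible
: ends the match at a colon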
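To try this pattern from Python, a minimal sketch could look like the following. The variable names are illustrative and the input is the sample sentence from the question. Note that the period in 9.30 is not followed by a space, so the lookbehind never fires there.

```python
import re

text = ("but I have been in a loop. two and fro: he has been facing some "
        "issues. I am more of a morning person (10 to 9.30):")

# Lookbehind for ". ", then lazily consume up to the first colon.
pattern = re.compile(r"(?<=\. ).*?:")
matches = pattern.findall(text)
print(matches)
```

Because the lookbehind requires a literal period followed by exactly one space, a sentence at the very start of the text, or one preceded by a period and a line break, would not be picked up by this version.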

Answered on 2023-03-28

svdrlsy4 #2

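Use re.findall().
The pattern uses the positive lookbehind (?<=\. ) so that a match can only begin at a position preceded by a period and a space.
\w.*?: matches one word character followed by as few characters as possible, up to and including the first colon.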

import re

string = "but I have been in a loop. two and fro: he has been facing some issues. I am more of a morning person (10 to 9.30):"

pattern = re.compile(r'(?<=\. )\w.*?:')
target = re.findall(pattern, string)
print(target)

['two and fro:', 'I am more of a morning person (10 to 9.30):']

You can also use split() to get the same output:

data = string.split(". ")
target = [f'{i.split(":")[0]}:' for i in data if ":" in i]
print(target)

Answered on 2023-03-28

f8rj6qna #3

Here is a simple regex that does what you are asking for.

>>> import re
>>> text = '''but I have been in a loop. two and fro: he has been facing some issues. I am more of a morning person (10 to 9.30):'''
>>> re.findall(r'\.(?=\D)\s*[^:]*:', text)
['. two and fro:', '. I am more of a morning person (10 to 9.30):']

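The lookahead in \.(?=\D) says the period must be immediately followed by a non-digit.
If you want to leave the previous sentence's period out of the match, you can turn it into a lookbehind.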

(?<=\.(?=\D))[^:]*:

Python's re does not allow a variable-width expression such as \s* inside a lookbehind, but the third-party regex library does.

import regex
regex.findall(r'(?<=\.(?=\D)\s*)[^:]*:', text)
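If installing a third-party package is not an option, one possible workaround with the standard re module is to consume the period and whitespace outside a lookbehind and capture only the sentence. The sketch below is an assumption-laden illustration rather than part of the original answer: it first shows that re rejects the variable-width lookbehind, then uses a capture group instead.

```python
import re

text = ("but I have been in a loop. two and fro: he has been facing some "
        "issues. I am more of a morning person (10 to 9.30):")

# The standard-library re module refuses a lookbehind containing \s*.
try:
    re.compile(r"(?<=\.(?=\D)\s*)[^:]*:")
    lookbehind_error = None
except re.error as exc:
    lookbehind_error = exc

# Workaround: match the period and whitespace normally, capture the rest.
# With a single group, findall() returns only the captured text.
captured = re.findall(r"\.(?=\D)\s*([^:]*:)", text)
print(captured)
```

The trade-off is that the period and whitespace become part of the overall match, so this only works where it is acceptable for them to be consumed.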

Perhaps you actually also want to match a sentence at the beginning of the text?

>>> regex.findall(r'(?<=\A|\.(?=\D)\s*)[^\s:.](?:[^:.]|\.\d)*:', 'help: ' + text)
['help:', 'two and fro:', 'I am more of a morning person (10 to 9.30):']

This version has been refactored to be stricter about what is allowed in the actual match.

Answered on 2023-03-28
