Regex pattern: couples' names

Posted by f4t66c6m on 2023-10-22 in Other

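I am looking for a regex pattern that matches something like "Jane and John Smith" (first names followed by a shared surname) in the middle of a text. It should not match "Jane Smith and John Smith".
I then want to replace every "Jane and John Smith" that is found with "Jane Smith and John Smith". By a name I mean any word that matches [A-Z][a-z]+.
This is what I wrote: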

r'([A-Z][a-z]+)\s+and\s+([A-Z][a-z]+)\s+([A-Z][a-z]+)'

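But this also matches "Jane Smith and John Smith", and I don't know how to exclude the case where there are two words before the "and". I am using Python. Any help would be appreciated.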
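To make the problem concrete, here is a minimal sketch that runs the pattern from the question against both inputs. The sample sentences are made up for illustration; only the pattern comes from the question.

```python
import re

# The pattern from the question, unchanged.
pattern = re.compile(r'([A-Z][a-z]+)\s+and\s+([A-Z][a-z]+)\s+([A-Z][a-z]+)')

# Illustrative sample sentences (not from the original post).
wanted = pattern.search("I met Jane and John Smith today")
unwanted = pattern.search("I met Jane Smith and John Smith today")

print(wanted.group(0))    # Jane and John Smith
print(unwanted.group(0))  # Smith and John Smith
```

In the second case the engine simply starts the match at "Smith", which is why the full-name form is also picked up.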

regex

Source: https://stackoverflow.com/questions/77207553/regex-pattern-couples-names

2 answers

abithluo #1

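As @trincot suggested, you should first work out all the shapes a name can take.
That said, going by the example you provided, I tried to come up with a regex for the case "[a single first name] and [a single first name] [a single surname]".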

Suggested solution

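I think you are looking for a lookbehind, but it turns out that lookbehind does not accept the + or * operators (Python's re module requires lookbehind patterns to have a fixed width). By using either a negative lookbehind for a capitalized word that is not preceded by a period, or a positive lookbehind for a capitalized word that is preceded by a period and a space, I think the problem is solved.
In other words:

It matches ". But Jane and John Smith".
It does not match ". But." or "Jane Smith and John Smith".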

((?<=\.\s[A-Za-z])|(?<![A-Za-z]))[a-z]+\s([A-Z]\w+)\sand\s([A-Z][a-z]+)\s([A-Z][a-z]+)

Does this solve your problem? Please let me know.
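As a rough sketch of how the pattern above could be used for the replacement in Python: the sample sentences and the `expand` helper are assumptions made for illustration, not part of the original answer. Because the match also covers the lowercase text before the first name, the helper splices the surname in after group 2 rather than rebuilding the string from the groups.

```python
import re

# Pattern copied from the answer above. Group 1 is the (empty) lookbehind
# alternation; groups 2-4 are first name, second first name, and surname.
PATTERN = re.compile(
    r'((?<=\.\s[A-Za-z])|(?<![A-Za-z]))[a-z]+\s'
    r'([A-Z]\w+)\sand\s([A-Z][a-z]+)\s([A-Z][a-z]+)'
)

def expand(match):
    # Hypothetical helper: insert the surname right after the first name,
    # leaving the rest of the matched text untouched.
    surname = match.group(4)
    cut = match.end(2) - match.start()
    text = match.group(0)
    return f"{text[:cut]} {surname}{text[cut:]}"

# Illustrative inputs (not from the original post).
samples = [
    "Hello. But Jane and John Smith came.",
    "I met Jane Smith and John Smith today.",
    "Jane and John Smith",
]
results = [PATTERN.sub(expand, s) for s in samples]
for r in results:
    print(r)
```

Only the first sample is rewritten. The third stays unchanged because the pattern needs lowercase text before the first name, so a couple at the very start of the string is not matched.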

Disclaimer

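I have given you an illustrative answer to an illustrative question (that is, for the simple "[a first name] and [a first name] [a surname]" pattern). Please follow @trincot's comment to get good results in the general case.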

Answered 2023-10-22

91zkwejq #2

import re

s = "Jane and John Smith something else  Mary Schmidt and Bill Schmidt"

# First find the four-word strings: ['Jane and John Smith ', 'Schmidt and Bill Schmidt']
matches = re.findall(r'[A-Z]\w+\s+and\s+[A-Z]\w+\s+[A-Z]\w+\s*', s)

# Then substitute only if the first word and the last word do not match
for match in matches:
    words = match.split()
    first, last = words[0], words[-1]
    if first == last:
        continue
    # Add the surname after the first name, but only where it precedes "and";
    # re.escape keeps the word from being read as regex syntax.
    s = re.sub(rf'\b({re.escape(first)})\b(?=\s+and\b)', rf'\1 {last}', s)

print(s)
# Jane Smith and John Smith something else  Mary Schmidt and Bill Schmidt
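The same idea can also be sketched as a single `re.sub` pass with a callback. This is only one possible variant, not the answerer's code: `COUPLE` and `expand_couples` are hypothetical names, and the pattern uses the question's [A-Z][a-z]+ definition of a name instead of `[A-Z]\w+`.

```python
import re

# Group 1: word before "and"; group 2: the rest of the phrase; group 3: surname.
COUPLE = re.compile(r'\b([A-Z][a-z]+)(\s+and\s+[A-Z][a-z]+\s+([A-Z][a-z]+))\b')

def expand_couples(text):
    """Hypothetical helper: rewrite 'Jane and John Smith' as 'Jane Smith and John Smith'."""
    def repl(m):
        first, rest, surname = m.group(1), m.group(2), m.group(3)
        # Same heuristic as the loop above: if the word before "and" already
        # equals the surname, assume both full names are spelled out.
        if first == surname:
            return m.group(0)
        return f"{first} {surname}{rest}"
    return COUPLE.sub(repl, text)

s = "Jane and John Smith something else  Mary Schmidt and Bill Schmidt"
print(expand_couples(s))
# Jane Smith and John Smith something else  Mary Schmidt and Bill Schmidt
```

It shares the limitation of the loop version: two people with different surnames, as in "Jane Smith and John Doe", would still be rewritten incorrectly, because only the equality of the first and last word is checked.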

Answered 2023-10-22
