regex: regular expression where some of the search strings contain curly braces

I have a spreadsheet with a "Find" column and a "Replace" column. Note that some of the strings are substrings of other strings.

Find                    Replace
{Example1}              {50M00_Dewirer_South\Example1}
{Example1\Alarm}        {50M00_Dewirer_South\Example1\Alarm}
{Example1\AlarmHigh}    {50M00_Dewirer_South\Example1\AlarmHigh}
{Example1\AlarmLow}     {50M00_Dewirer_South\Example1\AlarmLow}
Example2                50M00_Dewirer_South\Example2
Example2foo             50M00_Dewirer_South\Example2foo
Example2foobar          50M00_Dewirer_South\Example2foobar
ATag                    Device_Shortcut\DirectReference
Another_Tag             Winder\Local:50:I.Data.0
Another\Tag             Winder\Local:12:O.Data.1

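I need to search a directory of files for each find term and replace every occurrence with its associated replacement. The files I am searching may also contain capitalization errors: {Example1} might appear as {example1}, {ExAmPle1}, {exAMplE1}, or any other combination of upper- and lowercase characters. I am trying to use a regular expression because my brute-force attempt at searching all the files was too slow.
I have managed to put together a regular expression that works for strings that do not contain curly braces {}. However, if a string contains curly braces, my search function fails to find anything in the files being searched.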

pattern = re.compile(
    r'\b(?:%s)\b' % '|'.join([re.escape(term) for term in replace_dict]),
    re.IGNORECASE
)

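How should I construct the regular expression so that curly braces are treated as part of the search term? Also, if a search term does not contain curly braces, will the new regular expression still work, or will I have to fall back to my current pattern?
Edit: I should probably broaden this question, since I have not yet discovered all of the special characters I may need to search for. Is it possible to build a regular expression that can contain any combination of special characters?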

The following answers the current version of the question.

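You were right to use re.escape() in your initial attempt. For clarity, I have converted the list comprehension below into a for loop. The find patterns are in the list find_patterns, the replacements in replacements, and the sample input strings in sample_inputs: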

import re

# Raw strings (r"...") keep every backslash literal and avoid the
# invalid-escape warnings that "\A", "\E", "\L", etc. trigger in normal strings.
find_patterns = [
    r"{Example1}",
    r"{Example1\Alarm}",
    r"{Example1\AlarmHigh}",
    r"{Example1\AlarmLow}",
    r"Example2",
    r"Example2foo",
    r"Example2foobar",
    r"ATag",
    r"Another_Tag",
    r"Another\Tag"
]
replacements = [
    r"{50M00_Dewirer_South\Example1}",
    r"{50M00_Dewirer_South\Example1\Alarm}",
    r"{50M00_Dewirer_South\Example1\AlarmHigh}",
    r"{50M00_Dewirer_South\Example1\AlarmLow}",
    r"50M00_Dewirer_South\Example2",
    r"50M00_Dewirer_South\Example2foo",
    r"50M00_Dewirer_South\Example2foobar",
    r"Device_Shortcut\DirectReference",
    r"Winder\Local:50:I.Data.0",
    r"Winder\Local:12:O.Data.1"
]
sample_inputs = [
    r"blAh(blah}{&^%$#@!blah{Example1}more_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blah{Example1\Alarm}more_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blah{Example1\AlarmHigh}more_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blah{Example1\AlarmLow}more_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blahExample2more_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blahExample2foomore_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blahExample2foobarmore_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blahATagmore_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blahAnother_Tagmore_stuff&$^%{",
    r"blAh(blah}{&^%$#@!blahAnother\Tagmore_stuff&$^%{"
]
new_strings = []
for find_pattern, replacement, sample_input in zip(find_patterns, replacements, sample_inputs):
    # Escape the find pattern so that {, }, \ and . are matched literally.
    # Pass the replacement as a function so it is inserted verbatim; calling
    # re.escape() on it would leave stray backslashes (\{, \}, \.) in the result.
    new_string = re.sub(re.escape(find_pattern), lambda m: replacement, sample_input, flags=re.IGNORECASE)
    new_strings.append(new_string)
    print(f"{sample_input}\n{new_string}\n")

Output:

blAh(blah}{&^%$#@!blah{Example1}more_stuff&$^%{
blAh(blah}{&^%$#@!blah{50M00_Dewirer_South\Example1}more_stuff&$^%{

blAh(blah}{&^%$#@!blah{Example1\Alarm}more_stuff&$^%{
blAh(blah}{&^%$#@!blah{50M00_Dewirer_South\Example1\Alarm}more_stuff&$^%{

blAh(blah}{&^%$#@!blah{Example1\AlarmHigh}more_stuff&$^%{
blAh(blah}{&^%$#@!blah{50M00_Dewirer_South\Example1\AlarmHigh}more_stuff&$^%{

blAh(blah}{&^%$#@!blah{Example1\AlarmLow}more_stuff&$^%{
blAh(blah}{&^%$#@!blah{50M00_Dewirer_South\Example1\AlarmLow}more_stuff&$^%{

blAh(blah}{&^%$#@!blahExample2more_stuff&$^%{
blAh(blah}{&^%$#@!blah50M00_Dewirer_South\Example2more_stuff&$^%{

blAh(blah}{&^%$#@!blahExample2foomore_stuff&$^%{
blAh(blah}{&^%$#@!blah50M00_Dewirer_South\Example2foomore_stuff&$^%{

blAh(blah}{&^%$#@!blahExample2foobarmore_stuff&$^%{
blAh(blah}{&^%$#@!blah50M00_Dewirer_South\Example2foobarmore_stuff&$^%{

blAh(blah}{&^%$#@!blahATagmore_stuff&$^%{
blAh(blah}{&^%$#@!blahDevice_Shortcut\DirectReferencemore_stuff&$^%{

blAh(blah}{&^%$#@!blahAnother_Tagmore_stuff&$^%{
blAh(blah}{&^%$#@!blahWinder\Local:50:I.Data.0more_stuff&$^%{

blAh(blah}{&^%$#@!blahAnother\Tagmore_stuff&$^%{
blAh(blah}{&^%$#@!blahWinder\Local:12:O.Data.1more_stuff&$^%{

Only the find pattern needs re.escape(). The replacement must not go through re.escape(): the replacement template treats only backslashes specially, so escaping it would leave literal \{, \} and \. in the result. Supplying the replacement through a function inserts it exactly as written. The input string needs no escaping at all.
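As for why the pattern in the question finds nothing for the braced terms: the likely culprit is the \b anchors rather than re.escape(). \b only matches between a word character and a non-word character, so \b\{Example1\}\b fails whenever the braces are next to a space, punctuation, or the start or end of a line. Below is a minimal sketch of a single-pass version of the question's combined pattern that avoids this. It is an illustration under stated assumptions, not a drop-in solution: replace_dict is a small subset of the spreadsheet, and guard() and replace_all() are hypothetical helper names. It assumes plain terms should still match only as whole words (as the original \b suggested) and that no term is empty.

```python
import re

# Illustrative subset of the spreadsheet (find term -> replacement).
replace_dict = {
    r"{Example1}": r"{50M00_Dewirer_South\Example1}",
    r"{Example1\Alarm}": r"{50M00_Dewirer_South\Example1\Alarm}",
    r"Example2": r"50M00_Dewirer_South\Example2",
    r"Example2foo": r"50M00_Dewirer_South\Example2foo",
    r"Another_Tag": r"Winder\Local:50:I.Data.0",
}

# The pattern from the question, reduced to one braced term, to show the failure.
broken = re.compile(r"\b(?:%s)\b" % re.escape("{Example1}"), re.IGNORECASE)

# Case-insensitive lookup table: lowercased find term -> replacement.
lookup = {term.lower(): repl for term, repl in replace_dict.items()}

WORD_CHAR = re.compile(r"\w", re.ASCII)


def guard(term):
    """Escape a term; add a word-edge check only on sides that are word characters."""
    left = r"(?<!\w)" if WORD_CHAR.match(term[0]) else ""
    right = r"(?!\w)" if WORD_CHAR.match(term[-1]) else ""
    return left + re.escape(term) + right


# Longest terms first, so a longer term wins over any shorter term it starts with.
terms = sorted(replace_dict, key=len, reverse=True)
pattern = re.compile(
    "|".join(guard(term) for term in terms),
    re.IGNORECASE | re.ASCII,  # ASCII keeps case folding in step with str.lower()
)


def replace_all(text):
    """Replace every find term in one pass; replacements are inserted verbatim."""
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


text = r"x = {EXAMPLE1\alarm}; y = {example1}; z = example2FOO + Example2 + Another_Tag_2"
print(broken.search("x = {example1};"))  # None: \b cannot match next to the braces here
print(replace_all(text))
```

Because the text is scanned once, a replacement that itself contains a find term (for example Example2 inside 50M00_Dewirer_South\Example2foo) is never replaced a second time, which a loop of separate re.sub() calls cannot guarantee. In this sample, Another_Tag_2 is left alone because of the whole-word guard; drop guard() and use re.escape() directly if terms should also match inside longer words.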
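Edit: below is my original answer to the original question.
You are overthinking it: you do not need to worry about parentheses, braces, or other special characters. All you need to do is find r"Example\d", where \d is a digit and case is ignored, and replace it with r"50M00_Dewirer_South\\\g<0>", where \\ is an escaped \ character and \g<0> is the entire matched substring. In Python, where strings is the list of "find" strings: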

import re

strings = ["{Example1}",
"{Example1\\Alarm}",
"{Example1\\AlarmHigh}",
"{Example1\\AlarmLow}",
"Example2",
"Example2foo",
"Example2foobar"]

pattern = re.compile(r"Example\d", flags=re.IGNORECASE)
replaced_strings = [re.sub(pattern, r"50M00_Dewirer_South\\\g<0>", item) for item in strings]

Output:

['{50M00_Dewirer_South\\Example1}', 
 '{50M00_Dewirer_South\\Example1\\Alarm}',
 '{50M00_Dewirer_South\\Example1\\AlarmHigh}',
 '{50M00_Dewirer_South\\Example1\\AlarmLow}',
 '50M00_Dewirer_South\\Example2',
 '50M00_Dewirer_South\\Example2foo',
 '50M00_Dewirer_South\\Example2foobar']

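These are just the Python representations (repr) of the strings. Each \\ shown above is a single \ in the actual string, and that is what ends up in the file when the string is written out.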
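A quick way to convince yourself of this is to compare repr() with what is actually written to disk. This is a small sketch only; the temporary file name out.txt is arbitrary.

```python
import os
import tempfile

s = "50M00_Dewirer_South\\Example2"  # the literal \\ is ONE backslash in the string

print(repr(s))  # the representation shows the backslash doubled
print(s)        # printing shows the real content, with a single backslash

# Write the string to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(s)
    with open(path, encoding="utf-8") as f:
        written = f.read()

print(written.count("\\"))  # number of backslashes that reached the file
```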
