Regex: replace a run of repeated characters with the character and its count

vjrehmav  posted on 2023-05-19  in  Other

I have XXXXXXX99 and I want to replace it with X(7)99, because there are 7 X characters followed by 99.

>>> import re
>>> s = "XXXXXXX99"
>>> re.sub(r"(X+)", "X(the count)", s)
'X(the count)99'

These two attempts:

>>> re.sub(r"(X+)", "X(" + len(\1) + ")", s)
>>> re.sub(r"(X+)", "X(" + len(\\1) + ")", s)

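Both give: SyntaxError: unexpected character after line continuation character
In general, the string can be more complex, for example XXXX99_999999XX99.99. I am only interested in runs of more than 5 repeated characters, so this example should become XXXX99_9(6)XX99.99.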


3lxsmp7m #1

This is a job that itertools.groupby handles better:

from itertools import groupby

s = "XXXXXXX99"
r = "".join(f"{c}({len(g)})" if len(g)>2 else c*len(g) 
            for c,(*g,) in groupby(s))

print(r) # X(7)99
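
For the general case in the question, where only runs longer than 5 should be collapsed, the same groupby idea can be wrapped in a function with an adjustable threshold. This is only a sketch: the name compress_runs and the min_run parameter are made up for illustration and do not come from the answer above.

```python
from itertools import groupby

def compress_runs(text, min_run=6):
    """Collapse each run of at least min_run identical characters into c(n)."""
    parts = []
    for char, group in groupby(text):
        n = sum(1 for _ in group)  # length of this run of identical characters
        # Long runs become "c(n)"; shorter runs are copied through unchanged.
        parts.append(f"{char}({n})" if n >= min_run else char * n)
    return "".join(parts)

print(compress_runs("XXXX99_999999XX99.99"))  # XXXX99_9(6)XX99.99
print(compress_runs("XXXXXXX99"))             # X(7)99
```

Counting the group with sum avoids building a list for every run, which may matter for long inputs.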

hsgswve4 #2

You can pass a lambda function that calls len as the replacement:

>>> import re
>>> p = 'XXXX99_999999XX99.99'
>>> print(re.sub(r'(.)\1{4,}', lambda m: f'{m.group(1)}({len(m.group())})', p))
XXXX99_9(6)XX99.99

>>> s = "XXXXXXX99"
>>> print(re.sub(r'(.)\1{4,}', lambda m: f'{m.group(1)}({len(m.group())})', s))
X(7)99


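RegEx details:

  • (.): matches any single character (except a newline) and captures it in group 1
  • \1{4,}: matches 4 or more further repetitions of the captured character, so the whole match is a run of 5 or more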
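
As for the SyntaxError in the question: \1 only has meaning inside a pattern or a replacement string, so len(\1) is not valid Python. Passing a function as the replacement works because re.sub calls it with the match object for every match. A minimal sketch, assuming a hypothetical helper collapse_runs with a min_run parameter (neither comes from the answer above) that builds the quantifier from the threshold:

```python
import re

def collapse_runs(text, min_run=6):
    """Replace each run of at least min_run identical characters with c(n)."""
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    # (.) captures one character; \1{k,} then requires at least k more copies,
    # so k = min_run - 1 makes the whole match at least min_run long.
    pattern = re.compile(r"(.)\1{%d,}" % (min_run - 1), flags=re.DOTALL)
    # The replacement callable receives a re.Match object for each run found.
    return pattern.sub(lambda m: f"{m.group(1)}({len(m.group(0))})", text)

print(collapse_runs("XXXX99_999999XX99.99"))  # XXXX99_9(6)XX99.99
print(collapse_runs("XXXXXXX99"))             # X(7)99
```

With min_run=6 the pattern becomes (.)\1{5,}, which matches only runs longer than 5 as the question asks. The \1{4,} used above also collapses runs of exactly 5.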

oalqel3c #3

You could try something like this:

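def replace_with_count(s, char):
    # Counts every occurrence of char in s, not just one consecutive run,
    # so the result is only right when all of them form a single leading run.
    count = s.count(char)
    return f'{char}({count}){s.replace(char, "")}'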

s = 'XXXXXXX99'
print(replace_with_count(s, 'X'))

Let me know if it works.


guicsvcw #4

code_1 = "XXXXXXX99"

code_2 = "AAAABB99"

def simplify(code:str)->str:
    num_part = []
    str_part = []
    for char in code[::-1]:
        if char.isdigit():
            num_part.append(char)
        else:
            str_part.append(char)
    prefix = ""
    for i in sorted(set(str_part), key = str_part.index, reverse=True):
        prefix += f"{i}({str_part.count(i)})"
    return prefix + "".join(num_part)

simplified_code = simplify(code_1)

print(simplified_code)  # X(7)99
