Python: using regex to merge lines that start with a quotation mark

qncylg1j  posted on 2022-11-21  in  Python

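I want to merge two lines that are separated by a single newline \n. Sometimes the next line starts with a quotation mark. I tried the following code to merge them, using \" to match the quotation mark: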

comb_nextline = re.sub(r'(?<=[^\.][A-Za-z,-])\n[ ]*(?=[a-zA-Z0-9\(\"])', ' ', txt)

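But it does not work for lines that start with a quotation mark. Is there a way to merge lines that start with a quotation mark? Thanks!
My text looks like this: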

import re
 
txt= '''
The first process, called wafer bumping, involves a reflow solder process to form the solder balls on all of the input/output
(I/O) pads on the wafer. Because of the extremely small geometries involved, in some instances this process is best accomplished in a hydrogen atmosphere. RTC offers a high temperature furnace for this application, equipped with the hydrogen package, providing a re-flow process in a 100 hydrogen atmosphere. For a second process, called 
"chip joining", RTC offers both a near infrared or forced convection oven.
'''

comb_nextline = re.sub(r'(?<=[^\.][A-Za-z,-])\n[ ]*(?=[a-zA-Z0-9\(\"])', ' ', txt)
print(comb_nextline)

I would like to get this:

txt = 
'''
The first process, called wafer bumping, involves a reflow solder process to form the solder balls on all of the input/output (I/O) pads on the wafer. Because of the extremely small geometries involved, in some instances this process is best accomplished in a hydrogen atmosphere. RTC offers a high temperature furnace for this application, equipped with the hydrogen package, providing a re-flow process in a 100 hydrogen atmosphere. For a second process, called "chip joining", RTC offers both a near infrared or forced convection oven.
'''

xytpbqjk1#

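The quotation mark in the lookahead is not the problem. In the sample text, the line before "chip joining" ends with a trailing space, so the lookbehind, which requires a letter, comma, or hyphen immediately before \n, fails. You can match optional spaces before the newline as well: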

(?<=[^.][A-Za-z,-]) *\n *(?=[a-zA-Z0-9(\"])
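To see the effect of the trailing space, here is a minimal sketch that runs the pattern from the question and the pattern above on a shortened stand-in for the sample text. The shortened string and the variable names `original`, `fixed`, `merged_original`, and `merged_fixed` are illustrative assumptions, not part of the original post.

```python
import re

# Shortened stand-in for the question's text (an assumption for brevity).
# Note the trailing space after "called", just before the second newline.
txt = 'on all of the input/output\n(I/O) pads. For a second process, called \n"chip joining", RTC offers'

# Pattern from the question: the lookbehind must sit directly before \n.
original = r'(?<=[^\.][A-Za-z,-])\n[ ]*(?=[a-zA-Z0-9\(\"])'
# Pattern from this answer: optional spaces are allowed before the newline too.
fixed = r'(?<=[^.][A-Za-z,-]) *\n *(?=[a-zA-Z0-9(\"])'

merged_original = re.sub(original, ' ', txt)
merged_fixed = re.sub(fixed, ' ', txt)

# repr() makes any remaining newline visible as \n.
print(repr(merged_original))
print(repr(merged_fixed))
```

With the question's pattern the newline before "chip joining" survives, because the character directly before it is a space rather than a letter. With the adjusted pattern, both newlines are replaced by a single space.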

Or use the negated character class [^\S\n] to match any whitespace character except a newline:

(?<=[^.][A-Za-z,-])[^\S\n]*\n[^\S\n]*(?=[a-zA-Z0-9(\"])
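The difference between the two patterns only shows up when the whitespace around the line break is not a plain space. The sketch below uses a hypothetical sample in which the first line ends with a tab. The sample string and the names `spaces_only`, `any_horizontal`, `result_spaces`, and `result_any` are assumptions made for illustration.

```python
import re

# Hypothetical sample: the first line ends with a tab,
# and the second line starts with two spaces.
txt = 'a second process, called\t\n  "chip joining", RTC offers'

# Variant that allows only literal spaces around the newline.
spaces_only = r'(?<=[^.][A-Za-z,-]) *\n *(?=[a-zA-Z0-9(\"])'
# Variant that allows any whitespace except the newline itself.
any_horizontal = r'(?<=[^.][A-Za-z,-])[^\S\n]*\n[^\S\n]*(?=[a-zA-Z0-9(\"])'

result_spaces = re.sub(spaces_only, ' ', txt)
result_any = re.sub(any_horizontal, ' ', txt)

print(repr(result_spaces))  # unchanged: the tab blocks the match
print(repr(result_any))     # lines merged with a single space
```

Because [^\S\n] also matches a carriage return, this variant should cope with Windows-style \r\n line endings as well, whereas the spaces-only variant would not.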


import re

txt = '''
The first process, called wafer bumping, involves a reflow solder process to form the solder balls on all of the input/output
(I/O) pads on the wafer. Because of the extremely small geometries involved, in some instances this process is best accomplished in a hydrogen atmosphere. RTC offers a high temperature furnace for this application, equipped with the hydrogen package, providing a re-flow process in a 100 hydrogen atmosphere. For a second process, called 
"chip joining", RTC offers both a near infrared or forced convection oven.
'''

comb_nextline = re.sub(r'(?<=[^.][A-Za-z,-]) *\n *(?=[a-zA-Z0-9(\"])', ' ', txt)
print(comb_nextline)

Output

The first process, called wafer bumping, involves a reflow solder process to form the solder balls on all of the input/output (I/O) pads on the wafer. Because of the extremely small geometries involved, in some instances this process is best accomplished in a hydrogen atmosphere. RTC offers a high temperature furnace for this application, equipped with the hydrogen package, providing a re-flow process in a 100 hydrogen atmosphere. For a second process, called "chip joining", RTC offers both a near infrared or forced convection oven.
