我的input.txt如下
\n\n \n \n\n \n\n\r\n\r\n \r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n \r\n \r\n \r\n\r\n\r\n \r\n\r\n\r\n\r\n \r\n\r\n\r\n\r\n\r\n\r\n\r\n hello boys\r\n boys\r\n boys\r\n\r\n Download PDF\r\n PDF\r\n X\r\n Close Window\r\n\r\n\r\n\r\n
This boys text has undergone conversion so that it is mobile and web-friendly. This may have created formatting or alignment issues. Please refer to the PDF copy for a print-friendly version.
\r\n\r\n\r\n\r\n\r\n\r\n\r\n \r\n
BOYS CLUB AUSTRALIA
\r\n
26 July 2019
\r\n
hello boys
\r\n
\r\nhello boys
\r\n
--------------------------------------------------------------------------------------------------------------------------------------
\r\n
Introduction
\r\n\r\n
1. \r\n
This letter to let you know that your application has been successful with our school
\r\n
我正在尝试删除不必要的模式,如“\n\n”、“\r\r”、“\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n”。分析时,我希望删除所有特殊模式,并只使用文本和数字。
我已经试过了。
with open (data, "r", encoding='utf-8') as myfile:
for line in myfile:
line.rstrip()
line = re.sub(r'\r\n', '', line)
with open("out.txt", "a", encoding='utf-8') as output:
output.write(line)
但即使是'\r\n'也不会从输出档中移除。谢谢。
3条答案
按热度按时间k4ymrczo1#
您可以使用
replace()
方法,以空字串取代\r\n
子字串。o7jaxewo2#
在Python中,要删除文本文件中的特殊字符,如换行符(\n),可以使用str类的replace()方法,该方法允许用其他字符或字符串替换特定的字符或字符串。
x7yiwoj43#
您可以使用replace,也可以使用
str.isprintable
来过滤掉不可打印的字符:因此,输出为: