How can I efficiently read and delete a specific line from a large file with a custom newline character using Python (Python 3.9 preferred)?

ergxz8rk posted on 2023-01-27 in Python

Similar to this question, but slightly more complex
I have a very large txt file that looks like this:
AAAAAAAAAAAAAA.BBBBBBBBBBBBBB.CCCCCCCCCCCCCC.DDDDDDDDDDDDDD.EEEEEEEEEEEEEE.FFFFFFFFFFFFFF.GGGGGGGGGGGGGG.HHHHHHHHHHHHHH.IIIIIIIIIIIIII.JJJJJJJJJJJJJJ.KKKKKKKKKKKKKK.
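Every newline is a ".", the file ends with a newline, and each line is exactly 14 characters long. GollyJer's answer to the mentioned question is great, but I have a few additional requirements:
1. I want to be able to enter a specific line number and get that line back.
2. I then want the line that was read to be deleted from the file.
I can't load the real text file into RAM, because it is over 600GB.
I don't know where to start modifying the code to accomplish this. Is it possible? How would I go about it? Thanks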

aelbi1ox  #1

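I might explore the walrus operator to clean this up, and I honestly don't know whether this will be "fast enough". The idea is to read up to the record you want, read and print the record to be deleted, then read the rest: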

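line_to_delete = 2
record_size = 15  # 14 characters plus the "." terminator

with open("in.txt", "rt") as file_in, open("out.txt", "wt") as file_out:
    # Copy every record that comes before the one to delete.
    file_out.write(file_in.read(record_size * (line_to_delete - 1)))
    # Read and print the record being deleted; it is never written out.
    print(file_in.read(record_size))
    # Copy the remainder (this loads the rest of the file into memory at once).
    file_out.write(file_in.read())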

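I think this could use a lot of memory, since the final read() loads the rest of the file at once, so you can stream it instead, one record at a time, by doing something like this: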

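line_to_delete = 2
record_size = 15  # 14 characters plus the "." terminator

with open("in.txt", "rt") as file_in, open("out.txt", "wt") as file_out:
    current_line = 1
    while True:
        # Read exactly one record; read() returns "" at end of file.
        line = file_in.read(record_size)
        if not line:
            break

        if current_line == line_to_delete:
            print(line)  # the deleted record is printed, not copied
        else:
            file_out.write(line)

        current_line += 1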

This prints BBBBBBBBBBBBBB. and produces a file like this:

AAAAAAAAAAAAAA.CCCCCCCCCCCCCC.DDDDDDDDDDDDDD.EEEEEEEEEEEEEE.FFFFFFFFFFFFFF.GGGGGGGGGGGGGG.HHHHHHHHHHHHHH.IIIIIIIIIIIIII.JJJJJJJJJJJJJJ.KKKKKKKKKKKKKK.
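
Because every record is exactly 15 bytes (14 characters plus the "."), looking up a line does not need a scan at all: seeking to 15 * (n - 1) lands directly on record n. Here is a minimal sketch of that idea, not something I have timed on a 600GB file. It assumes the file is pure single-byte ASCII, so byte offsets equal character offsets, and the names read_record, pop_record, RECORD_SIZE and chunk_size are my own, chosen for illustration. Deletion still writes a second file, as above, but copies in large binary chunks instead of 15 characters at a time.

```python
import os
import shutil

RECORD_SIZE = 15  # assumption from the question: 14 data bytes + "." terminator


def read_record(path, line_number):
    """Return the 1-based record `line_number` by seeking straight to it."""
    if line_number < 1:
        raise IndexError("line numbers start at 1")
    with open(path, "rb") as f:
        f.seek(RECORD_SIZE * (line_number - 1))
        record = f.read(RECORD_SIZE)
    if len(record) != RECORD_SIZE:
        raise IndexError(f"line {line_number} is past the end of the file")
    return record.decode("ascii")


def pop_record(src_path, dst_path, line_number, chunk_size=16 * 1024 * 1024):
    """Copy src to dst without record `line_number`; return the removed record."""
    offset = RECORD_SIZE * (line_number - 1)
    # Validate up front so a bad line number never leaves a partial output file.
    if line_number < 1 or offset + RECORD_SIZE > os.path.getsize(src_path):
        raise IndexError(f"line {line_number} is out of range")

    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Copy everything before the record in bounded chunks.
        remaining = offset
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                break
            dst.write(chunk)
            remaining -= len(chunk)

        # Read the record to delete; it is returned instead of being copied.
        record = src.read(RECORD_SIZE)

        # Copy the rest of the file, again chunk_size bytes at a time.
        shutil.copyfileobj(src, dst, chunk_size)

    return record.decode("ascii")
```

With the sample file, pop_record("in.txt", "out.txt", 2) returns BBBBBBBBBBBBBB. and out.txt holds the remaining records. Keep in mind that this needs enough free disk space for a second copy of the file, and that read_record on its own touches only 15 bytes no matter how large the file is.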
