python-3.x: Iterating through a text file between start and stop words to assemble a DataFrame

Posted by 4uqofj5v on 2023-06-07 in Python

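I have a text file report that I need to iterate through, capturing a row number and all of the errors and warnings associated with that row. Using the example below, I need to capture the RowID (integer only), and then capture all of the errors and warnings found below the Name line, ignoring the rest of the text file.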

***********************************************************
Date/Time:    6/5/2023
FileName:    somefile.txt
Report:        Standard Report
***********************************************************
Success:    1234
Failures:    1234

RowID:    100
Name:    Smith, John
    This person did not meet the criteria because of x,y,z.
RowID:    101
Name:    Smith, Susie
    This is a warning.
    This is an error.
    Criteria was not met.
RowID:    103
Name:    Jones, Bob
    This person had invalid characters in the email field.

I have tried different variations of the following.

search_string = "RowID:"
next_search_string = "Name:"

with open('report.txt') as y:
    for line in y:
        if line.startswith(search_string):
            print(line.split(':')[1].strip())
        if line.startswith(next_search_string):
            print(next(y))
            while not (next(y)).startswith(search_string):
                print(next(y))
        if (next(y)).startswith(search_string):
            pass

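My desired output is:
100, This person did not meet the criteria because of x,y,z.
101, This is a warning. This is an error. Criteria was not met.
103, This person had invalid characters in the email field.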

Answer #1 (jum4pzuy)

The code below may help:

search_string = "RowID:"
next_search_string = "Name:"

with open('report.txt') as file:
    row_id = None
    errors_warnings = []

    for line in file:
        if line.startswith(search_string):
            if row_id is not None:
                # Print the previous row ID with its errors/warnings
                print(f"{row_id}, {' '.join(errors_warnings)}")
                errors_warnings = []  # Reset the list for the next row

            # Keep only the integer that follows "RowID:"
            row_id = line.split(':', 1)[1].strip()
        elif line.startswith(next_search_string):
            # Skip the "Name" line itself; the messages follow it
            continue
        elif row_id is not None and line.strip():
            # Any other non-blank line after a RowID is an error/warning.
            # Lines before the first RowID (the report header) are ignored.
            errors_warnings.append(line.strip())

    # Print the last captured row ID and errors/warnings
    if row_id is not None:
        print(f"{row_id}, {' '.join(errors_warnings)}")

The output is then as desired:

100, This person did not meet the criteria because of x,y,z.
101, This is a warning. This is an error. Criteria was not met.
103, This person had invalid characters in the email field.
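
Since the title asks for a DataFrame rather than printed lines, the same idea can be sketched as a function that returns one record per RowID. This is only a minimal sketch: the function name `parse_report` and the column names `RowID` and `Messages` are assumptions made for illustration, and the sample text is copied from the question and read from memory instead of from `report.txt`.

```python
import io


def parse_report(lines):
    """Collect one record per RowID with its error/warning messages."""
    records = []
    current = None  # record being filled, or None before the first RowID
    for raw in lines:
        line = raw.strip()
        if line.startswith("RowID:"):
            # A new RowID opens a new record; int() tolerates the padding.
            current = {"RowID": int(line.split(":", 1)[1]), "Messages": []}
            records.append(current)
        elif line.startswith("Name:") or not line or current is None:
            # Skip Name lines, blank lines, and the report header.
            continue
        else:
            current["Messages"].append(line)
    # Join each record's messages into a single string.
    return [
        {"RowID": r["RowID"], "Messages": " ".join(r["Messages"])}
        for r in records
    ]


# Sample input copied from the question (hypothetical stand-in for report.txt).
sample = """\
Success:    1234
Failures:    1234

RowID:    100
Name:    Smith, John
    This person did not meet the criteria because of x,y,z.
RowID:    101
Name:    Smith, Susie
    This is a warning.
    This is an error.
    Criteria was not met.
RowID:    103
Name:    Jones, Bob
    This person had invalid characters in the email field.
"""

records = parse_report(io.StringIO(sample))
for record in records:
    print(f"{record['RowID']}, {record['Messages']}")
```

With a real file, `parse_report(open_file)` can be called inside a `with open('report.txt') as open_file:` block. If pandas is installed, passing the resulting list to `pd.DataFrame(records)` should give a two-column DataFrame with one row per RowID.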
