Recursively traverse XML and print the output to Excel/CSV

kh212irz  posted on 2021-09-08  in Java

My XML looks like this:

<?xml version="1.0" encoding="utf-8"?>
<DEFTABLE xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Folder.xsd">
    <FOLDERA DC="123" VR="A1" PT="UN">
        <TASK TASKID="1" APLN="StuffA" DATE="20211117" Name="StuffA1" >
            <COMMONA NAME="1233" TYPE="E" />
            <COMMONB NAME="ABCD" />
        </TASK>
        <TASK TASKID="2" APLN="StuffA" DATE="20211117" Name="StuffA2" >
        </TASK>
    </FOLDERA>
    <FOLDERB DC="123" VR="A1" PT="UN" ATTIA="UN" ATTIB="UN">
        <TASK TASKID="3" APLN="StuffB" DATE="20211117" Name="StuffA1" >
            <COMMONA NAME="1233" TYPE="E" />
            <COMMONB NAME="ABCD" />
        </TASK>
        <TASK TASKID="4" APLN="StuffC" DATE="20211117" Name="StuffA2" >
        </TASK>
    </FOLDERB>
</DEFTABLE>

I am reading it with ElementTree:

import xml.etree.ElementTree as ET

tree = ET.parse("./Test.xml")
root = tree.getroot()

for child in root:
    print(child.tag, child.attrib)
    for x in root.iter('TASK'):
        print(x.tag, x.attrib)

The problem is that it prints every TASK under the root each time:

FOLDERA {'DC': '123', 'VR': 'A1', 'PT': 'UN'}
TASK {'TASKID': '1', 'APLN': 'StuffA', 'DATE': '20211117', 'Name': 'StuffA1'}
TASK {'TASKID': '2', 'APLN': 'StuffA', 'DATE': '20211117', 'Name': 'StuffA2'}
TASK {'TASKID': '3', 'APLN': 'StuffB', 'DATE': '20211117', 'Name': 'StuffA1'}
TASK {'TASKID': '4', 'APLN': 'StuffC', 'DATE': '20211117', 'Name': 'StuffA2'}
FOLDERB {'DC': '123', 'VR': 'A1', 'PT': 'UN', 'ATTIA': 'UN', 'ATTIB': 'UN'}
TASK {'TASKID': '1', 'APLN': 'StuffA', 'DATE': '20211117', 'Name': 'StuffA1'}
TASK {'TASKID': '2', 'APLN': 'StuffA', 'DATE': '20211117', 'Name': 'StuffA2'}
TASK {'TASKID': '3', 'APLN': 'StuffB', 'DATE': '20211117', 'Name': 'StuffA1'}
TASK {'TASKID': '4', 'APLN': 'StuffC', 'DATE': '20211117', 'Name': 'StuffA2'}

I would like to get output like this:

FOLDERA {'DC': '123', 'VR': 'A1', 'PT': 'UN'}, TASK {'TASKID': '1', 'APLN': 'StuffA', 'DATE': '20211117', 'Name': 'StuffA1'}
FOLDERA {'DC': '123', 'VR': 'A1', 'PT': 'UN'}, TASK {'TASKID': '2', 'APLN': 'StuffA', 'DATE': '20211117', 'Name': 'StuffA2'}
FOLDERB {'DC': '123', 'VR': 'A1', 'PT': 'UN', 'ATTIA': 'UN', 'ATTIB': 'UN'}, TASK {'TASKID': '3', 'APLN': 'StuffB', 'DATE': '20211117', 'Name': 'StuffA1'}
FOLDERB {'DC': '123', 'VR': 'A1', 'PT': 'UN', 'ATTIA': 'UN', 'ATTIB': 'UN'}, TASK {'TASKID': '4', 'APLN': 'StuffC', 'DATE': '20211117', 'Name': 'StuffA2'}

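That is, each attribute should become a column in a table.
Any input on how to traverse it correctly? Note that the number of attributes can vary for both tasks and folders.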

8hhllhi2  #1

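If you only care about outputting the two levels, folders and tasks, you don't need a recursive approach. To get the tasks under each folder, simply call the iter method on each child node instead of on the root node: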

for child in root:
    for task in child.iter('TASK'):
        print(*(f'{node.tag} {node.attrib}' for node in (child, task)), sep=', ')
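Since the question also asks for each attribute to become a column in a CSV, the same two-level loop can feed `csv.DictWriter`. The following is only a sketch under some assumptions that are not in the original post: the helper names `folder_task_rows` and `write_csv`, the `FOLDER.` / `TASK.` column prefixes (used so that folder and task attributes cannot collide), and the shortened inline sample XML are all made up for illustration. Because the number of attributes can vary, the header is built as the union of all keys seen, and missing values are left empty.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened stand-in for Test.xml (hypothetical sample, not the full file).
SAMPLE = """<DEFTABLE>
    <FOLDERA DC="123" VR="A1">
        <TASK TASKID="1" APLN="StuffA" />
    </FOLDERA>
    <FOLDERB DC="123" ATTIA="UN">
        <TASK TASKID="3" APLN="StuffB" />
    </FOLDERB>
</DEFTABLE>"""


def folder_task_rows(root):
    """Yield one dict per (folder, task) pair with prefixed attribute names."""
    for folder in root:
        for task in folder.iter('TASK'):
            row = {'FOLDER': folder.tag}
            # Prefix the keys so folder and task attributes cannot collide.
            row.update({f'FOLDER.{k}': v for k, v in folder.attrib.items()})
            row.update({f'TASK.{k}': v for k, v in task.attrib.items()})
            yield row


def write_csv(rows, out):
    """Write rows to the file-like object `out`; columns are the union of all keys."""
    rows = list(rows)
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills in attributes that a given folder or task does not have.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(rows)


root = ET.fromstring(SAMPLE)
buffer = io.StringIO()
write_csv(folder_task_rows(root), buffer)
print(buffer.getvalue())
```

To write a real file instead of printing, one option is to parse with `ET.parse("./Test.xml").getroot()` and pass a file opened with `open(path, 'w', newline='')` as `out`. Excel can open the resulting CSV directly.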
