python-3.x: How do I retrieve values from deep nesting in an XML file?

Posted by xbp102n0 on 2023-06-07 in Python

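I'm having trouble getting the <Name>Administrators</Name> value from inside the <q4:Member> </q4:Member> nesting. I have a file generated with the command ssh PC03 "gpresult /scope computer /x:\\VBoxSvr\Exchange\PC03.xml", which stores the GPO settings from the ADDC.
For parts of the file such as <q4:Account> </q4:Account>, I used pandas and everything works fine: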

import pandas as pd
GPO_file_Path = r'.\PC03.xml'  # raw string, so the backslash is not read as an escape
columns = ['Name', 'SettingNumber', 'SettingBoolean']
xpath_Account = './/q:Account'
namespaces_Account = {'q' : 'http://www.microsoft.com/GroupPolicy/Settings/Security'}
df_GPO_Account = pd.read_xml(GPO_file_Path, xpath=xpath_Account, namespaces=namespaces_Account)[columns]
GPO_Account_list = df_GPO_Account.values.tolist()
GPO_Account_dic = {}
for _ in GPO_Account_list:
    if str(_[1]) != 'nan':
        GPO_Account_dic[_[0]] = int(_[1])
    else:
        GPO_Account_dic[_[0]] = _[2]
print(GPO_Account_dic)

Output for the account settings:

{'MinimumPasswordLength': 14, 'PasswordComplexity': 'true'}

But when I try the same approach on <q4:UserRightsAssignment></q4:UserRightsAssignment>, I get this:
Input:

columns = ['Name', 'Member']
xpath_UserRightsAssignment = './/q:UserRightsAssignment'
namespaces_UserRightsAssignment = {'q' : 'http://www.microsoft.com/GroupPolicy/Settings/Security'}
df_GPO_Account = pd.read_xml(GPO_file_Path, xpath=xpath_UserRightsAssignment, namespaces=namespaces_UserRightsAssignment)[columns]
print(df_GPO_Account)

Output:

                      Name  Member
0  SeCreateGlobalPrivilege     NaN

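I want to get into the nested Member elements (there can be one or several) so that I can take every <Name></Name> from them and save everything to a list: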

{'SeCreateGlobalPrivilege': {'Name':'Administratorzy', 'Name':'USŁUGA'}}

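When I started working with this file I used xml.etree.ElementTree, and with it I got where I wanted to be, but the code became too complicated and too long. Is there a way to go deeper with pandas, or a quicker way to get there with xml.etree.ElementTree?
The xml.etree... code: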

import xml.etree.ElementTree as ET
GPO_file_Path = r'.\PC03.xml'  # raw string, so the backslash is not read as an escape
SettingName = ""
SettingValue = ""
GPO_tree = ET.parse(GPO_file_Path)
GPO_root = GPO_tree.getroot()
GPO_dic = {}

for rootChild in GPO_root:
    ComputerResults = '{http://www.microsoft.com/GroupPolicy/Rsop}ComputerResults'
    if rootChild.tag == ComputerResults:
        for ComputerResultsChild in rootChild:
            ExtensionData = '{http://www.microsoft.com/GroupPolicy/Rsop}ExtensionData'
            if ComputerResultsChild.tag == ExtensionData:
                for ExtensionDataChild in ComputerResultsChild:
                    Extension = '{http://www.microsoft.com/GroupPolicy/Settings}Extension'
                    if ExtensionDataChild.tag == Extension:
                        for ExtensionChild in ExtensionDataChild:
                            UserRightsAssignment = '{http://www.microsoft.com/GroupPolicy/Settings/Security}UserRightsAssignment'
                            if ExtensionChild.tag == UserRightsAssignment:
                                MemberNameValue =''
                                for UserRightsAssignmentChild in ExtensionChild:
                                    Name = '{http://www.microsoft.com/GroupPolicy/Settings/Security}Name'
                                    Member = '{http://www.microsoft.com/GroupPolicy/Settings/Security}Member'
                                    if UserRightsAssignmentChild.tag == Name:
                                        SettingName = UserRightsAssignmentChild.text
                                    if UserRightsAssignmentChild.tag == Member:
                                        for MemberChild in UserRightsAssignmentChild:
                                            MemberName = '{http://www.microsoft.com/GroupPolicy/Types}Name'
                                            if MemberChild.tag == MemberName:
                                                MemberName = MemberChild.text
                                                MemberNameValue += MemberName + ', '
                                SettingValue = MemberNameValue
                                if SettingName != "":
                                    GPO_dic[SettingName] = SettingValue
                                else:
                                    print(f'Error. Key not set')

print(GPO_dic)

Output:

{'SeCreateGlobalPrivilege': 'Administratorzy, USŁUGA, USŁUGA LOKALNA, USŁUGA SIECIOWA, '}

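What I expect:
Keyed on <q4:Name>SeCreateGlobalPrivilege</q4:Name>, the result would be a dictionary containing the data: {'SeCreateGlobalPrivilege': {'Name':'Administratorzy', 'Name':'USŁUGA', 'Name':'USŁUGA LOKALNA' , 'Name':'USŁUGA SIECIOWA'}}. I'm not expecting a ready-made solution, just a hint on how to approach the problem.
An example from my file: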

<?xml version="1.0" encoding="utf-16"?>
<Rsop xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.microsoft.com/GroupPolicy/Rsop">
    <ComputerResults>
        <ExtensionData>
            <Extension xmlns:q4="http://www.microsoft.com/GroupPolicy/Settings/Security" xsi:type="q4:SecuritySettings" xmlns="http://www.microsoft.com/GroupPolicy/Settings">
                <q4:Account>
                    <GPO xmlns="http://www.microsoft.com/GroupPolicy/Settings/Base">
                        <Identifier xmlns="http://www.microsoft.com/GroupPolicy/Types">{DE1708C7-FC4D-491C-942D-72CC5693DDC5}</Identifier>
                        <Domain xmlns="http://www.microsoft.com/GroupPolicy/Types">universum.local</Domain>
                    </GPO>
                    <Precedence xmlns="http://www.microsoft.com/GroupPolicy/Settings/Base">1</Precedence>
                    <q4:Name>MinimumPasswordLength</q4:Name>
                    <q4:SettingNumber>14</q4:SettingNumber>
                    <q4:Type>Password</q4:Type>
                </q4:Account>
                <q4:UserRightsAssignment>
                    <GPO xmlns="http://www.microsoft.com/GroupPolicy/Settings/Base">
                        <Identifier xmlns="http://www.microsoft.com/GroupPolicy/Types">{E7712944-FF52-4E72-AE83-7EC5C2D8A959}</Identifier>
                        <Domain xmlns="http://www.microsoft.com/GroupPolicy/Types">universum.local</Domain>
                    </GPO>
                    <Precedence xmlns="http://www.microsoft.com/GroupPolicy/Settings/Base">1</Precedence>
                    <q4:Name>SeCreateGlobalPrivilege</q4:Name>
                    <q4:Member>
                        <Name xmlns="http://www.microsoft.com/GroupPolicy/Types">Administratorzy</Name>
                    </q4:Member>
                    <q4:Member>
                        <Name xmlns="http://www.microsoft.com/GroupPolicy/Types">USŁUGA</Name>
                    </q4:Member>
                    <q4:Member>
                        <Name xmlns="http://www.microsoft.com/GroupPolicy/Types">USŁUGA LOKALNA</Name>
                    </q4:Member>
                    <q4:Member>
                        <Name xmlns="http://www.microsoft.com/GroupPolicy/Types">USŁUGA SIECIOWA</Name>
                    </q4:Member>
                </q4:UserRightsAssignment>
            </Extension>
            <Name xmlns="http://www.microsoft.com/GroupPolicy/Settings">Security</Name>
        </ExtensionData>
    </ComputerResults>
</Rsop>
mgdq6dx1 (answer #1)

xmltodict may be of interest:

import xmltodict
from   pathlib import Path

namespaces = dict.fromkeys([
   'http://www.microsoft.com/GroupPolicy/Rsop', 
   'http://www.microsoft.com/GroupPolicy/Settings', 
   'http://www.microsoft.com/GroupPolicy/Settings/Base',
   'http://www.microsoft.com/GroupPolicy/Settings/Security',
   'http://www.microsoft.com/GroupPolicy/Types', 
])

xmltodict.parse(
   Path('xml/microsoft.xml').read_text(), 
   process_namespaces=True,
   namespaces=namespaces,
   xml_attribs=False
)['Rsop']['ComputerResults']['ExtensionData']['Extension']

Output:

{'Account': {'GPO': {'Identifier': '{DE1708C7-FC4D-491C-942D-72CC5693DDC5}',
   'Domain': 'universum.local'},
  'Precedence': '1',
  'Name': 'MinimumPasswordLength',
  'SettingNumber': '14',
  'Type': 'Password'},
 'UserRightsAssignment': {'GPO': {'Identifier': '{E7712944-FF52-4E72-AE83-7EC5C2D8A959}',
   'Domain': 'universum.local'},
  'Precedence': '1',
  'Name': 'SeCreateGlobalPrivilege',
  'Member': [{'Name': 'Administratorzy'},
   {'Name': 'USŁUGA'},
   {'Name': 'USŁUGA LOKALNA'},
   {'Name': 'USŁUGA SIECIOWA'}]}}

If you want to build a pandas DataFrame, you can use pd.json_normalize.
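
As a rough sketch of that last step: assuming the parsed result looks like the dictionary printed above (retyped by hand here as extension and trimmed to the fields that matter), pd.json_normalize can expand the Member list via record_path and carry the privilege name along via meta. The names extension, members_df and rights are made up for this illustration; record_prefix is needed because both levels have a key called Name.

```python
import pandas as pd

# Hand-copied, trimmed version of the xmltodict output shown above (illustrative).
extension = {
    "UserRightsAssignment": {
        "Precedence": "1",
        "Name": "SeCreateGlobalPrivilege",
        "Member": [
            {"Name": "Administratorzy"},
            {"Name": "USŁUGA"},
            {"Name": "USŁUGA LOKALNA"},
            {"Name": "USŁUGA SIECIOWA"},
        ],
    }
}

# One row per Member; the prefix keeps the member's Name apart from the
# privilege's Name, which would otherwise collide.
members_df = pd.json_normalize(
    extension["UserRightsAssignment"],
    record_path="Member",
    meta=["Name"],
    record_prefix="Member.",
)
print(members_df)

# Collapse back to {privilege name: [member names]}.
rights = members_df.groupby("Name")["Member.Name"].agg(list).to_dict()
print(rights)
```

Note that xmltodict returns a plain dict rather than a list when an element occurs only once, so a privilege with a single Member would need to be wrapped in a list before this call.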

siv3szwd (answer #2)

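I'm not entirely sure what you're trying to do (you mention DataFrames, lists, dictionaries, etc.), so I'll just show you how to extract the relevant data from your sample XML, and you can take it from there.
First, for this example I'll use lxml instead of ElementTree, because its XPath support is better.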

from lxml import etree

# parse your file with lxml (path taken from the question)
GPO_root = etree.parse(r'.\PC03.xml').getroot()

You may have to fiddle with the encoding, but it should work.
Then:

#take care of the namespaces:
ns = {"q4": "http://www.microsoft.com/GroupPolicy/Settings/Security"}
#search for the relevant info:
members = GPO_root.xpath('//q4:Name[text()="SeCreateGlobalPrivilege"]/following-sibling::q4:Member//*',namespaces=ns)
for member in members:
    print(member.text)

The output for the sample XML should be

Administratorzy
USŁUGA
USŁUGA LOKALNA
USŁUGA SIECIOWA

Now you can add each element to whatever structure you need.
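
For comparison, here is a minimal sketch of the same extraction using only the standard library's xml.etree.ElementTree, which accepts a prefix-to-URI mapping in iterfind and findtext, so the nested loops from the question are not needed. It collects the member names into a list per privilege, since a dict cannot hold the key 'Name' more than once. XML_SAMPLE is a trimmed copy of the question's file, and the prefixes sec and types and the function collect_rights are names invented for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from the question (illustrative only).
XML_SAMPLE = """\
<Rsop xmlns="http://www.microsoft.com/GroupPolicy/Rsop">
  <ComputerResults>
    <ExtensionData>
      <Extension xmlns:q4="http://www.microsoft.com/GroupPolicy/Settings/Security"
                 xmlns="http://www.microsoft.com/GroupPolicy/Settings">
        <q4:UserRightsAssignment>
          <q4:Name>SeCreateGlobalPrivilege</q4:Name>
          <q4:Member>
            <Name xmlns="http://www.microsoft.com/GroupPolicy/Types">Administratorzy</Name>
          </q4:Member>
          <q4:Member>
            <Name xmlns="http://www.microsoft.com/GroupPolicy/Types">USŁUGA</Name>
          </q4:Member>
        </q4:UserRightsAssignment>
      </Extension>
    </ExtensionData>
  </ComputerResults>
</Rsop>
"""

# Prefixes are local to this script; only the URIs must match the file.
NS = {
    "sec": "http://www.microsoft.com/GroupPolicy/Settings/Security",
    "types": "http://www.microsoft.com/GroupPolicy/Types",
}


def collect_rights(root):
    """Map each UserRightsAssignment name to the list of its member names."""
    rights = {}
    for assignment in root.iterfind(".//sec:UserRightsAssignment", NS):
        right = assignment.findtext("sec:Name", namespaces=NS)
        members = [m.text for m in assignment.iterfind("sec:Member/types:Name", NS)]
        rights[right] = members
    return rights


root = ET.fromstring(XML_SAMPLE)
rights = collect_rights(root)
print(rights)
```

For the real file, ET.parse(GPO_file_Path).getroot() would take the place of ET.fromstring(XML_SAMPLE).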
