Given the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>1</id>
<title>Example XML</title>
<published>2021-12-15T00:00:00Z</published>
<updated>2022-01-06T12:44:47Z</updated>
<content type="application/xml">
<articleDoc xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" schemaVersion="1.8" xml:lang="en">
<articleDocHead>
<itemInfo/>
</articleDocHead>
</articleDoc>
</content>
</entry>
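How can I get the value of the xml:lang attribute on the entry/content/articleDoc element? I checked the Python documentation, but unfortunately it does not cover attributes that carry a namespace. The only solution I found is to write the namespace URI by hand in front of the attribute name when using it as a dictionary key, and that seems wrong. I am using Python 3.9.9.
Here is my code: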
import xml.etree.ElementTree as tree  # xml.etree.cElementTree was removed in Python 3.9
xml = """<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>1</id>
<title>Example XML</title>
<published>2021-12-15T00:00:00Z</published>
<updated>2022-01-06T12:44:47Z</updated>
<content type="application/xml">
<articleDoc xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" schemaVersion="1.8" xml:lang="en">
<articleDocHead>
<itemInfo/>
</articleDocHead>
</articleDoc>
</content>
</entry>"""
ns = {'nitf': 'http://iptc.org/std/NITF/2006-10-18/',
'w3': 'http://www.w3.org/2005/Atom',
'xml': 'http://www.w3.org/XML/1998/namespace'}
root = tree.fromstring(xml)
id = root.find("w3:id", ns).text # works
print(id)
type_attribute = root.find("w3:content", ns).attrib['type'] # works
print(type_attribute)
#language = root.find("w3:content/articleDoc/articleDocHeader[xml:lang']", ns) # doesn't work
language = root.find("w3:content/articleDoc", ns).attrib['{http://www.w3.org/XML/1998/namespace}lang'] # works, but seems wrong
print(language)
Any help is appreciated. Thanks a lot!
1 Answer
Here is a quick guide to navigating an XML file with lxml.etree.
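A minimal sketch of such a lookup, applied to the document from the question. It uses only calls that lxml.etree shares with the standard library's ElementTree (fromstring, find, get), and falls back to the standard library if lxml is not installed. The names ATOM_NS, XML_NS and qname are hypothetical helpers introduced for illustration. Both libraries store a namespaced attribute under its {namespace-uri}local-name key, so reading xml:lang that way is the normal lookup rather than a workaround; a named constant just keeps the URI out of the call site.

```python
# Prefer lxml when it is installed; every call used below also exists in the
# standard library's ElementTree, so the sketch runs either way.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Bytes input: lxml rejects a str that carries an encoding declaration.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
<id>1</id>
<content type="application/xml">
<articleDoc xmlns="" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" schemaVersion="1.8" xml:lang="en">
<articleDocHead>
<itemInfo/>
</articleDocHead>
</articleDoc>
</content>
</entry>"""

# Hypothetical constant names; the URIs themselves come from the document
# and from the XML namespaces specification.
ATOM_NS = "http://www.w3.org/2005/Atom"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # fixed URI bound to the xml: prefix


def qname(namespace, local_name):
    """Build a namespace-qualified name in the {uri}local form."""
    return "{%s}%s" % (namespace, local_name)


root = etree.fromstring(XML)
ns = {"atom": ATOM_NS}

# articleDoc resets the default namespace (xmlns=""), so it takes no prefix.
article_doc = root.find("atom:content/articleDoc", ns)

language = article_doc.get(qname(XML_NS, "lang"))
schema_version = article_doc.get("schemaVersion")  # unprefixed attributes have no namespace
print(language, schema_version)
```

Using get() instead of indexing attrib returns None when the attribute is missing, which is easier to handle than a KeyError if some documents omit xml:lang.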