How to ingest a whole XML database into Elasticsearch?

Posted by bzzcjhmw on 2021-06-13 in ElasticSearch


Suppose I have 20 XML files that together make up the whole database. Is it possible to ingest all 20 XML files into Elasticsearch? If so, what are the options?

elasticsearch

Source: https://stackoverflow.com/questions/65119413/how-to-ingest-whole-xml-database-to-elastic-search

1 answer


krugob8w1#

For Python 3, I suggest using xmltodict.
Run pip install xmltodict elasticsearch. I assume the XML files contain records like this:

<records>
    <record>...</record>
    ...
    <record>...</record>
</records>

So each file has to be split into its individual records.
Create a script named "load.py" with the following content:
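To make "split into records" concrete, here is a minimal sketch that uses the standard library's xml.etree.ElementTree instead of xmltodict, so it runs without extra packages. The child elements `id` and `title` and the helper name `split_records` are made up for illustration; real records will have whatever fields the files contain, and nested elements would need more handling than this flat mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the <records>/<record> layout assumed above.
SAMPLE = """<records>
    <record><id>1</id><title>first</title></record>
    <record><id>2</id><title>second</title></record>
</records>"""

def split_records(xml_text):
    """Return one flat dict per <record> element (child tag -> child text)."""
    root = ET.fromstring(xml_text)  # root is the <records> element
    return [
        {child.tag: child.text for child in record}
        for record in root.findall("record")  # direct <record> children only
    ]

records = split_records(SAMPLE)
print(records)
```

Each dict in the resulting list corresponds to one document that would be indexed.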

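import sys
import xmltodict
import json
from elasticsearch import Elasticsearch

INDEX = "xmlfiles"
TYPE = "record"  # mapping types are deprecated in Elasticsearch 7 and removed in 8; drop "_type" there

def xml_to_actions(xmlcontent):
    # xmltodict nests the repeated <record> elements under records -> record
    for record in xmlcontent["records"]["record"]:
        # each document is sent as an action line followed by its source line
        yield ('{ "index" : { "_index" : "%s", "_type" : "%s" }}' % (INDEX, TYPE))
        yield (json.dumps(record, default=int))

e = Elasticsearch()  # no args, connect to localhost:9200
if not e.indices.exists(index=INDEX):
    raise RuntimeError('index does not exist, use `curl -X PUT "localhost:9200/%s"` and try again' % INDEX)

for f in sys.argv[1:]:  # skip argv[0], which is the script name itself
    with open(f, "rb") as fin:  # binary mode: the XML parser reads bytes
        # force_list keeps "record" a list even when a file holds a single record
        r = e.bulk(body=xml_to_actions(xmltodict.parse(fin, force_list=("record",))))  # returns a dict
        print(f, not r["errors"])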

Run it with: python load.py xml1.xml xml2.xml ... xml20.xml
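For reference, the body of a bulk request is newline-delimited JSON: an action line followed by a source line for each document. The sketch below builds that body from plain dicts so the format can be inspected without a running cluster. It also shows, as an alternative and not as the method used above, the action shape accepted by `elasticsearch.helpers.bulk`. The function names and sample records are hypothetical, and `index_records` assumes a reachable cluster and an installed client, so it is defined here but not called.

```python
import json

INDEX = "xmlfiles"  # same index name as in load.py

def to_bulk_lines(records, index=INDEX):
    """Yield the alternating action/source lines of a bulk request body."""
    for record in records:
        yield json.dumps({"index": {"_index": index}})  # action line
        yield json.dumps(record)                         # source line

def to_helper_actions(records, index=INDEX):
    """Yield one action dict per record, in the shape helpers.bulk accepts."""
    for record in records:
        yield {"_index": index, "_source": record}

def index_records(client, records, index=INDEX):
    """Send records through the bulk helper (needs a reachable cluster)."""
    # Imported lazily so the rest of this sketch runs without the client installed.
    from elasticsearch.helpers import bulk
    return bulk(client, to_helper_actions(records, index))

# Hypothetical records, e.g. as produced by parsing one XML file.
sample = [{"id": "1", "title": "first"}, {"id": "2", "title": "second"}]
body = "\n".join(to_bulk_lines(sample)) + "\n"  # a bulk body must end with a newline
print(body)
```

The helper variant can be convenient for larger files because it splits the actions into chunks, whereas a single bulk call sends everything in one request.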

Answered 2021-06-14
