Following up on my earlier question: how do I add only new or changed documents to Elasticsearch?
For the same scenario, I am now using python-logstash-async to insert a JSON object/dictionary into the logstash index.
Sample code:
import logging
from logstash_async.handler import AsynchronousLogstashHandler

host = 'localhost'  # Logstash host (placeholder)
port = 5959         # Logstash TCP input port (placeholder)

test_logger = logging.getLogger('python-logstash-logger')
test_logger.setLevel(logging.INFO)
async_handler = AsynchronousLogstashHandler(host, port, database_path=None)
test_logger.addHandler(async_handler)

extra = {
    'record_id': '102',
    'status': 'started'
}
test_logger.info('python-logstash: records', extra=extra)
The answer at https://stackoverflow.com/a/63588420/4091956 (which applies to an implementation that does not use logstash) worked for my earlier question. How can I make the record_id passed in the extra parameter act as the document id, so that duplicate records are not inserted? Right now it appears as a sub-field of extra inside the _source field:
{
  "_index": "logstash-2020.08.25",
  "_type": "_doc",
  "_id": "xy121",
  "_version": 1,
  "_score": null,
  "_source": {
    "program": "main.py",
    "@timestamp": "2020-08-25T20:11:01.065Z",
    "@version": "1",
    "logsource": "",
    "pid": 13864,
    "host": "",
    "level": "INFO",
    "port": 61210,
    "type": "python-logstash",
    "extra": {
      "record_id": "88943",
      "status": "started"
    },
    "message": "python-logstash: records"
  },
  "fields": {
    "@timestamp": [
      "2020-08-25T20:11:01.065Z"
    ]
  },
  "sort": [
    1598386261065
  ]
}
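For reference, the approach in the linked answer, without logstash, boils down to indexing with an explicit document id via the Elasticsearch Python client, so re-sending the same record overwrites instead of duplicating. A minimal sketch (the cluster URL and index name are placeholders; keyword style follows elasticsearch-py 8.x):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder cluster URL

record = {'record_id': '102', 'status': 'started'}

# Reusing record_id as _id makes the call idempotent: indexing the
# same id again replaces the document instead of creating a new one.
es.index(index="records", id=record['record_id'], document=record)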
The goal here is to make sure that duplicate records are not added through logstash.
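One way to get there (a configuration sketch, assuming a standard elasticsearch output in the Logstash pipeline; hosts and the index pattern are placeholders) is to point document_id at the nested field that python-logstash produces. Since record_id arrives under extra in _source, the field reference is [extra][record_id]:

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
    document_id => "%{[extra][record_id]}"  # reuse record_id as the document _id
    action => "update"                      # update the document if the id exists,
    doc_as_upsert => true                   # otherwise insert it
  }
}

With a fixed document_id, re-sending the same record_id updates the existing document rather than creating a duplicate. Note that with a date-based index pattern the id is only unique within each daily index, so a record recurring on a later day would still be duplicated unless the index name is fixed.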