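I am using Elasticsearch v7.3 and running a 5-node cluster. I want to delete documents from the cluster based on a date range query. Millions of documents have to be deleted according to this condition, so I am looking for an efficient way to do it. Currently I am running the query below in Kibana: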
POST test/_delete_by_query?scroll_size=10000&conflicts=proceed
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "co_receivedAt": {
              "lte": "2020-07-18T18:30:00Z"
            }
          }
        }
      ]
    }
  }
}
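However, this query seems to delete documents very slowly and sometimes terminates abruptly. While monitoring the delete task, I got this response: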
{
  "nodes" : {
    "ijZYh9CITH6ETFB-IfAEAg" : {
      "name" : "ip-1-0-4-113",
      "transport_address" : "1.0.4.113:9300",
      "host" : "1.0.4.113",
      "ip" : "1.0.4.113:9300",
      "roles" : [
        "ingest",
        "master",
        "data"
      ],
      "attributes" : {
        "ml.machine_memory" : "33245122560",
        "rack" : "rack-01",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20"
      },
      "tasks" : {
        "ijZYh9CITH6ETFB-IfAEAg:288887138" : {
          "node" : "ijZYh9CITH6ETFB-IfAEAg",
          "id" : 288887138,
          "type" : "transport",
          "action" : "indices:data/write/delete/byquery",
          "status" : {
            "total" : 10017699,
            "updated" : 0,
            "created" : 0,
            "deleted" : 0,
            "batches" : 1,
            "version_conflicts" : 0,
            "noops" : 0,
            "retries" : {
              "bulk" : 11,
              "search" : 0
            },
            "throttled_millis" : 0,
            "requests_per_second" : -1.0,
            "throttled_until_millis" : 0
          },
          "description" : "delete-by-query [test]",
          "start_time_in_millis" : 1605004115156,
          "running_time_in_nanos" : 22672616111,
          "cancellable" : true,
          "headers" : { }
        }
      }
    }
  }
}
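For readers who want to watch the same numbers over time, the status fields in the response above can be condensed into a few values. The following is a minimal Python sketch, not part of the original question: the function name `summarize_task_status` is hypothetical, and it only reads fields that appear in the response shown above.

```python
def summarize_task_status(task):
    """Condense one delete-by-query task entry (hypothetical helper).

    `task` is one value from the "tasks" object of the response above.
    """
    status = task["status"]
    total = status["total"]
    deleted = status["deleted"]
    return {
        # Share of matching documents deleted so far, in percent.
        "percent_done": 100.0 * deleted / total if total else 0.0,
        # Number of bulk actions that had to be retried.
        "bulk_retries": status["retries"]["bulk"],
        # Running time converted from nanoseconds to seconds.
        "running_seconds": task["running_time_in_nanos"] / 1e9,
    }


# Values copied from the response in the question.
sample_task = {
    "status": {
        "total": 10017699,
        "deleted": 0,
        "retries": {"bulk": 11, "search": 0},
    },
    "running_time_in_nanos": 22672616111,
}
summary = summarize_task_status(sample_task)
print(summary)
```

With the values from the question, this shows that after roughly 22 seconds nothing had been deleted yet, while 11 bulk actions had already been retried.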
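After a while the task no longer exists; it has been terminated. Sometimes it keeps running, but it often fails, and I have noticed that my bulk retry count is high. What is an efficient way to delete my documents? About 200 million documents need to be deleted.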
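One direction that might be worth trying, offered as a sketch under assumptions and not as a verified fix: run the delete as a background task with a smaller batch size, slicing, and throttling. The Python helper below only builds the request path and body. The name `build_delete_by_query` is hypothetical, and the values `scroll_size=1000` and `requests_per_second=500` are illustrative assumptions that would need tuning for the cluster. The query parameters themselves (`conflicts`, `scroll_size`, `slices`, `requests_per_second`, `wait_for_completion`) are documented for the `_delete_by_query` API.

```python
from urllib.parse import urlencode


def build_delete_by_query(index, field, cutoff,
                          scroll_size=1000, requests_per_second=500,
                          slices="auto"):
    """Build the path and body of a throttled, sliced delete-by-query.

    Hypothetical helper; the default values are assumptions, not
    recommendations taken from the question.
    """
    params = {
        "conflicts": "proceed",
        # Smaller batches than the 10000 used in the question.
        "scroll_size": scroll_size,
        # Let Elasticsearch split the work into parallel slices.
        "slices": slices,
        # Throttle the task instead of running unthrottled (-1).
        "requests_per_second": requests_per_second,
        # Return a task id immediately instead of waiting for the result.
        "wait_for_completion": "false",
    }
    path = f"/{index}/_delete_by_query?{urlencode(params)}"
    body = {"query": {"range": {field: {"lte": cutoff}}}}
    return path, body


path, body = build_delete_by_query("test", "co_receivedAt", "2020-07-18T18:30:00Z")
print("POST", path)
print(body)
```

The printed path and body can be sent as a POST request from the Kibana console. With `wait_for_completion=false` the response contains a task id, which can then be polled with `GET _tasks/<task_id>`. If bulk retries stay high, the throttle of a running task can be changed with `POST _delete_by_query/<task_id>/_rethrottle?requests_per_second=<n>`.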