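I expose an API to my clients in which I use Elasticsearch to fetch data for a given time range. The result set is over 1 million records. Now I have to provide another feature: clients pass an offset and a limit, and the API returns `limit` records starting at `offset`.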
My ES query is built like this: {"from":10000,"size":2001,"timeout":"60s","query":{"bool":{"must":[{"terms":{"tollId":["59850"],"boost":1.0}},{"range":{"updatedAt":{"from":"2020-08-15T00:00:00.000Z","to":null,"include_lower":true,"include_upper":true,"boost":1.0}}},{"range":{"updatedAt":{"from":null,"to":"2020-12-15T22:08:21.000Z","include_lower":true,"include_upper":true,"boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}},"sort":[{"updatedAt":{"order":"desc"}}]}
When I run this against Elasticsearch, I get:
"failed_shards": [
{
"shard": 0,
"index": "companydatabase",
"node": "vQU6NjSVRK6dKNLsWkfqEw",
"reason": {
"type": "query_phase_execution_exception",
"reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [12001]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
}
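For reference, the arithmetic behind this error can be sketched as follows. This is only an illustration of the `from + size` check described in the message, not Elasticsearch's actual code; the function name `fits_result_window` is hypothetical, and 10000 is the limit reported in the error above.

```python
# Limit reported in the error message above (index.max_result_window).
DEFAULT_MAX_RESULT_WINDOW = 10000


def fits_result_window(offset, limit, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return True if a from/size request stays within the result window.

    Mirrors the rule in the error message: from + size must be
    less than or equal to index.max_result_window.
    """
    return offset + limit <= max_result_window


# The failing request: from=10000, size=2001, so from + size = 12001.
print(fits_result_window(10000, 2001))  # False
# A request that ends exactly at the window boundary is still allowed.
print(fits_result_window(7999, 2001))   # True
```

A check like this could run in the API layer before the query is sent, so that clients get a clear validation error instead of a shard failure.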
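The suggested solution is to use the scroll API to fetch the records, but I can't use the scroll API when I have to return records from a given offset up to a given limit.
Am I missing something? Is there a way to solve this, or do I have to fetch all the records (documents) every time and filter the results myself?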
1 Answer
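You just need to update your index setting max_result_window to a higher value; the default is 10000. For example, a request whose from + size is at most 10000 will work fine, but anything beyond that requires raising max_result_window for that index. That said, using the scroll API is a more efficient way to retrieve large result sets than raising this limit.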
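As a rough sketch of that settings change, the snippet below builds a `PUT /<index>/_settings` request with Python's standard library. The host `http://localhost:9200` and the new value 20000 are assumptions chosen for illustration; the index name `companydatabase` comes from the error output in the question. The helper name `build_max_result_window_request` is hypothetical.

```python
import json
import urllib.request


def build_max_result_window_request(base_url, index, window):
    """Build (but do not send) a PUT request that sets index.max_result_window."""
    body = json.dumps({"index": {"max_result_window": window}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed host and value; adjust both for your cluster.
req = build_max_result_window_request(
    "http://localhost:9200", "companydatabase", 20000
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# Sending is left commented out because it needs a reachable cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Keep in mind that deep from/size pages cost memory and time on every shard, so the window should only be raised as far as the largest offset + limit you actually need to serve.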