I have the following query:
{'bool': {
    'must': [
        {'terms': {'state.keyword': ['Alaska', 'Alabama']}}
    ],
    'should': [
        {'match': {'abstract': 'Spill and Overfill Prevention 18 AAC 78.045'}},
        {'match': {'title': 'Spill and Overfill Prevention 18 AAC 78.045'}},
        {'constant_score': {
            'filter': {
                'match': {'title': 'Spill and Overfill Prevention 18 AAC 78.045'}
            }
        }}
    ]
}}
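I need the score to be computed from the `title` match only.
To do this I tried using `constant_score`:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-constant-score-query.html
However, this did not have the desired effect. It simply increases the score of every hit by exactly 1.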
Here is the result:
{'took': 21, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 6, 'relation': 'eq'}, 'max_score': 4.754379, 'hits': [{'_index': 'articles', '_type': '_doc', '_id': '483703', '_score': 4.754379, '_source':
Here is the explain result:
{'_index': 'articles', '_type': '_doc', '_id': '483703', 'matched': True, 'explanation': {'value': 6.6602507, 'description': 'sum of:', 'details': [{'value': 0.150
05009, 'description': 'weight(legal_language:and in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.15005009, 'description': 'score(freq=14.0), compu
ted as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.074107975, 'description': 'idf, computed as log(1 +
(N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 6, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N
, total number of documents with field', 'details': []}]}, {'value': 0.92034066, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from
:', 'details': [{'value': 14.0, 'description': 'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation para
meter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}, {'value': 504.0, 'description': 'dl, length of field (a
pproximate)', 'details': []}, {'value': 497.5, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 0.3779109, 'description': 'weight(l
egal_language:18 in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.3779109, 'description': 'score(freq=3.0), computed as boost * idf * tf from:', 'd
etails': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.24116206, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:',
'details': [{'value': 5, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with fi
eld', 'details': []}]}, {'value': 0.7122915, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 3.0, 'desc
ription': 'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.7
5, 'description': 'b, length normalization parameter', 'details': []}, {'value': 504.0, 'description': 'dl, length of field (approximate)', 'details': []}, {'value
': 497.5, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 0.3779109, 'description': 'weight(legal_language:aac in 2) [PerFieldSimi
larity], result of:', 'details': [{'value': 0.3779109, 'description': 'score(freq=3.0), computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'descriptio
n': 'boost', 'details': []}, {'value': 0.24116206, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 5, 'descriptio
n': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}]}, {'value': 0.
7122915, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 3.0, 'description': 'freq, occurrences of term
within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normali
zation parameter', 'details': []}, {'value': 504.0, 'description': 'dl, length of field (approximate)', 'details': []}, {'value': 497.5, 'description': 'avgdl, ave
rage length of field', 'details': []}]}]}]}, {'value': 1.0089812, 'description': 'weight(title:spill in 2) [PerFieldSimilarity], result of:', 'details': [{'value':
1.0089812, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 1.02
96195, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 2, 'description': 'n, number of documents containing term'
, 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf, computed as fr
eq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details': []}, {'value'
: 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}, {'value'
: 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value':
0.072622515, 'description': 'weight(title:and in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.072622515, 'description': 'score(freq=1.0), computed
as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.074107975, 'description': 'idf, computed as log(1 + (N
- n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 6, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, t
otal number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:',
'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation paramete
r', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}, {'value': 7.0, 'description': 'dl, length of field', 'detai
ls': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 1.0089812, 'description': 'weight(title:overfill in
2) [PerFieldSimilarity], result of:', 'details': [{'value': 1.0089812, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{'value':
2.2, 'description': 'boost', 'details': []}, {'value': 1.0296195, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value'
: 2, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}
]}, {'value': 0.44543427, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description': 'freq, oc
currences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': '
b, length normalization parameter', 'details': []}, {'value': 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'description': 'avgdl
, average length of field', 'details': []}]}]}]}, {'value': 1.0089812, 'description': 'weight(title:prevention in 2) [PerFieldSimilarity], result of:', 'details':
[{'value': 1.0089812, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'va
lue': 1.0296195, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 2, 'description': 'n, number of documents contai
ning term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf, comp
uted as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details': []}
, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}
, {'value': 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}]}]},
{'value': 0.072622515, 'description': 'weight(title:18 in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.072622515, 'description': 'score(freq=1.0),
computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.074107975, 'description': 'idf, computed as lo
g(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 6, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'descriptio
n': 'N, total number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)
) from:', 'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation
parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details': []}, {'value': 7.0, 'description': 'dl, length of field
', 'details': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 0.072622515, 'description': 'weight(title:
aac in 2) [PerFieldSimilarity], result of:', 'details': [{'value': 0.072622515, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{
'value': 2.2, 'description': 'boost', 'details': []}, {'value': 0.074107975, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details':
[{'value': 6, 'description': 'n, number of documents containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'det
ails': []}]}, {'value': 0.44543427, 'description': 'tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description':
'freq, occurrences of term within document', 'details': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'descr
iption': 'b, length normalization parameter', 'details': []}, {'value': 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'descriptio
n': 'avgdl, average length of field', 'details': []}]}]}]}, {'value': 1.5095675, 'description': 'weight(title:78.045 in 2) [PerFieldSimilarity], result of:', 'deta
ils': [{'value': 1.5095675, 'description': 'score(freq=1.0), computed as boost * idf * tf from:', 'details': [{'value': 2.2, 'description': 'boost', 'details': []}
, {'value': 1.5404451, 'description': 'idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:', 'details': [{'value': 1, 'description': 'n, number of documents
containing term', 'details': []}, {'value': 6, 'description': 'N, total number of documents with field', 'details': []}]}, {'value': 0.44543427, 'description': 'tf
, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:', 'details': [{'value': 1.0, 'description': 'freq, occurrences of term within document', 'details
': []}, {'value': 1.2, 'description': 'k1, term saturation parameter', 'details': []}, {'value': 0.75, 'description': 'b, length normalization parameter', 'details
': []}, {'value': 7.0, 'description': 'dl, length of field', 'details': []}, {'value': 6.6666665, 'description': 'avgdl, average length of field', 'details': []}]}
]}]}, {'value': 1.0, 'description': 'ConstantScore(title.keyword:Spill and Overfill Prevention 18 AAC 78.045)', 'details': []}]}}
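For reference, the top-level `details` values in the explain output above can be added up by hand. The short check below copies those eleven values (three `legal_language` terms, seven `title` terms, and the `ConstantScore` clause) and confirms that they sum to the reported 6.6602507. The variable names are only for illustration. It suggests that the `constant_score` clause contributes a flat 1.0 on top of the other clauses rather than replacing them.

```python
import math

# Values copied from the explain output above (top-level "sum of:" details).
legal_language_terms = [0.15005009, 0.3779109, 0.3779109]  # and, 18, aac
title_terms = [
    1.0089812,    # spill
    0.072622515,  # and
    1.0089812,    # overfill
    1.0089812,    # prevention
    0.072622515,  # 18
    0.072622515,  # aac
    1.5095675,    # 78.045
]
constant_score_clause = 1.0  # ConstantScore(title.keyword:...)

# A bool query adds up the scores of its matching clauses.
total = sum(legal_language_terms) + sum(title_terms) + constant_score_clause
title_only = sum(title_terms)

# The total reproduces the explanation's top-level value.
assert math.isclose(total, 6.6602507, abs_tol=1e-5)
print(total, title_only)
```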
With `script_score`:
```
{'query': {
    'function_score': {
        'query': {
            'bool': {
                'should': [
                    {'match': {'legal_language': 'inspections and testing 691'}},
                    {'match': {'title': 'inspections and testing 691'}}
                ]
            }
        },
        'script_score': {
            'script': {'source': "doc['title'].value"}
        }
    }
}}
```
Mapping:
{
"articles" : {
"mappings" : {
"properties" : {
"abstract" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"categories" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"cfr40_part280" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"citation" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"effective_date" : {
"type" : "date"
},
"id" : {
"type" : "long"
},
"legal_language" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"local_regulation" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"reference_images" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"state" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"tags" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"unique_id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
Traceback:
Traceback (most recent call last):
  File "D:\work_projects\dewey_project\webapp\articles\services\elasticsearch_service.py", line 103, in retrieve_articles
    result = current_app.elasticsearch.search(
  File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\client\utils.py", line 84, in wrapped
    return func(*args, params=params,**kwargs)
  File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\client\__init__.py", line 1547, in search
    return self.transport.perform_request(
  File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\transport.py", line 351, in perform_request
    status, headers_response, data = connection.perform_request(
  File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\connection\http_urllib3.py", line 261, in perform_request
    self._raise_error(response.status, raw_data)
  File "d:\work_projects\dewey_project\venv\lib\site-packages\elasticsearch\connection\base.py", line 181, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'runtime error')
2 Answers
zvokhttg1#
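It is not entirely clear what you are trying to achieve, but it looks like you want the tf/idf score of each document to be based only on matches in the `title` field, while still adding other constraints to the query. If so, use the `filter` clause of the `bool` query. Filter clauses do not modify the score; they only restrict the results to documents that match them. This returns slightly different results from your original query, because it requires a match on the `abstract` field for the query `Spill and Overfill Prevention 18 AAC 78.045`. If you want to preserve the behaviour of the original query, move that condition into the `should` block as a constant score query and then subtract 1 from the resulting score, since a `constant_score` clause adds its boost (1.0 by default) to every document it matches.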
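The first suggestion could look roughly like the sketch below. It assumes the field names from the question's mapping (`state.keyword`, `abstract`, `title`); the helper name `build_title_scored_query` is hypothetical. Placing the `title` match in `must` is one possible choice: it makes that match required and leaves it as the only clause that contributes to `_score`.

```python
def build_title_scored_query(text, states):
    """Build a bool query whose score comes only from the title match.

    Hypothetical helper; field names are taken from the question's mapping.
    """
    return {
        "query": {
            "bool": {
                # Filter clauses restrict the hits but add nothing to _score.
                "filter": [
                    {"terms": {"state.keyword": states}},
                    {"match": {"abstract": text}},
                ],
                # The only scoring clause: relevance of the title match.
                "must": [
                    {"match": {"title": text}},
                ],
            }
        }
    }


body = build_title_scored_query(
    "Spill and Overfill Prevention 18 AAC 78.045", ["Alaska", "Alabama"]
)
# With the 7.x elasticsearch-py client seen in the traceback, this could be
# sent as: es.search(index="articles", body=body)
```

If documents without a `title` match should stay in the results, the `match` on `title` could go into `should` instead; those documents would then come back with a score of 0.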
e0bqpujr2#
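If you need control over the scoring process, there is the `function_score` query, which lets you customize or replace the `_score` of the original query. See the documentation for the `function_score` query.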
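As a rough illustration of the idea, the sketch below wraps the question's second query in `function_score` and replaces the relevance score with a fixed weight for documents whose `title` matches. The helper name and the weight value are hypothetical. Note also that a `script_score` script has to return a number, and `text` fields such as `title` are not readable through `doc[...]` by default, which is probably why `doc['title'].value` in the question ends in a runtime error.

```python
def build_function_score_query(text, title_weight=2.0):
    """Wrap a bool query in function_score (hypothetical helper).

    title_weight is an illustrative value, not taken from the question.
    """
    inner_query = {
        "bool": {
            "should": [
                {"match": {"legal_language": text}},
                {"match": {"title": text}},
            ]
        }
    }
    return {
        "query": {
            "function_score": {
                "query": inner_query,
                # Apply a fixed weight to documents whose title matches.
                "functions": [
                    {
                        "filter": {"match": {"title": text}},
                        "weight": title_weight,
                    }
                ],
                # "replace" makes the function result the final _score and
                # discards the inner query's relevance score.
                "boost_mode": "replace",
            }
        }
    }


body = build_function_score_query("inspections and testing 691")
```

Other `boost_mode` values such as `"sum"` or `"multiply"` combine the function result with the inner query's score instead of replacing it.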