Elasticsearch query when a field contains ~

Posted by 3hvapo4f on 2021-06-14 in ElasticSearch

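I have a bunch of documents like the one below. I want to filter for documents whose projectKey starts with ~. I have read some articles saying that ~ is an operator in Elasticsearch queries, so it cannot be used for filtering. Can someone help me build a search query for the /branch/_search API?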

{
  "_index": "branch",
  "_type": "_doc",
  "_id": "GAz-inQBJWWbwa_v-l9e",
  "_version": 1,
  "_score": null,
  "_source": {
    "branchID": "refs/heads/feature/12345",
    "displayID": "feature/12345",
    "date": "2020-09-14T05:03:20.137Z",
    "projectKey": "~user",
    "repoKey": "deploy",
    "isDefaultBranch": false,
    "eventStatus": "CREATED",
    "user": "user"
  },
  "fields": {
    "date": [
      "2020-09-14T05:03:20.137Z"
    ]
  },
  "highlight": {
    "projectKey": [
      "~@kibana-highlighted-field@user@/kibana-highlighted-field@"
    ],
    "projectKey.keyword": [
      "@kibana-highlighted-field@~user@/kibana-highlighted-field@"
    ],
    "user": [
      "@kibana-highlighted-field@user@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1600059800137
  ]
}

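Update:
I used Prerana's answer and switched to a prefix query, as shown below.
I still have a problem when I combine prefix and range: I get the error below. What am I missing?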

GET /branch/_search
{
  "query": {
    "prefix": {
      "projectKey": "~"
    },
    "range": {
      "date": {
        "gte": "2020-09-14",
        "lte": "2020-09-14"
      }
    }
  }
}

{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[prefix] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
        "line": 6,
        "col": 5
      }
    ],
    "type": "parsing_exception",
    "reason": "[prefix] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
    "line": 6,
    "col": 5
  },
  "status": 400
}

zrfyljdw #1

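@hansley's answer would work, but it requires you to create a custom analyzer. Also, as you mentioned, you only want the documents that start with ~, whereas his results include every document that contains ~. So I am providing my own answer, which needs very little configuration and works as required.
The index mapping is the default: just index the documents below and ES will create a default mapping with a .keyword sub-field for every text field.
Index sample documents: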

{
    "title" : "content1 ~"
}

{
    "title" : "~ staring with"
}

{
    "title" : "in between ~ with"
}

The search query should return only the second of the sample documents:

{
  "query": {
    "prefix" : { "title.keyword" : "~" }
  }
}

And the search result:

"hits": [
            {
                "_index": "pre",
                "_type": "_doc",
                "_id": "2",
                "_score": 1.0,
                "_source": {
                    "title": "~ staring with"
                }
            }
        ]
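As a rough mental model of why the .keyword sub-field matters here, the sketch below imitates the two cases in plain Python. It is not Elasticsearch code: the helper names are made up for illustration, the ids "1" and "3" are assumed (only "2" appears in the result above), and the regular expression is only an approximation of what the standard analyzer does to a text field.

```python
import re

# Sample documents from this answer; ids "1" and "3" are assumed.
docs = {
    "1": "content1 ~",
    "2": "~ staring with",
    "3": "in between ~ with",
}


def prefix_on_keyword(docs, prefix):
    """A keyword field keeps the whole value as one term,
    so a prefix query acts like a starts-with check on it."""
    return [doc_id for doc_id, value in docs.items() if value.startswith(prefix)]


def prefix_on_text(docs, prefix):
    """Approximation of an analyzed text field: the value is split into
    lowercase word tokens and punctuation such as ~ is dropped."""
    hits = []
    for doc_id, value in docs.items():
        tokens = re.findall(r"\w+", value.lower())
        if any(token.startswith(prefix) for token in tokens):
            hits.append(doc_id)
    return hits


keyword_hits = prefix_on_keyword(docs, "~")
text_hits = prefix_on_text(docs, "~")
print(keyword_hits)  # only the document whose value starts with ~
print(text_hits)     # empty: no token contains ~ after analysis
```

In this model the analyzed field never sees the ~ at all, which matches the highlight in the question, where only "user" is highlighted on projectKey but "~user" is highlighted on projectKey.keyword.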

For more information, see the prefix query documentation.
Update 1:
Index mapping:

{
  "mappings": {
    "properties": {
      "date": {
        "type": "date" 
      }
    }
  }
}

Index data:

{
    "date": "2015-02-01",
    "title" : "in between ~ with"
}
{
    "date": "2015-01-01",
    "title": "content1 ~"
}
{
    "date": "2015-02-01",
     "title" : "~ staring with"
}
{
    "date": "2015-02-01",
    "title" : "~ in between with"
}

Search query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "prefix": {
                        "title.keyword": "~"
                    }
                },
                {
                    "range": {
                        "date": {
                            "lte": "2015-02-05",
                            "gte": "2015-01-11"
                        }
                    }
                }
            ]
        }
    }
}

Search result:

"hits": [
      {
        "_index": "stof_63924930",
        "_type": "_doc",
        "_id": "2",
        "_score": 2.0,
        "_source": {
          "date": "2015-02-01",
          "title": "~ staring with"
        }
      },
      {
        "_index": "stof_63924930",
        "_type": "_doc",
        "_id": "4",
        "_score": 2.0,
        "_source": {
          "date": "2015-02-01",
          "title": "~ in between with"
        }
      }
    ]
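Applied to the fields from the question, the same request body could be built like the sketch below. The function name is hypothetical, and it assumes projectKey has the default .keyword sub-field (the highlight section in the question suggests it does). The point is that prefix and range cannot sit side by side directly under "query", which is what the parsing_exception complains about; they have to be wrapped in a bool query.

```python
import json


def build_prefix_range_query(prefix_field, prefix, date_field, gte, lte):
    """Return a search body where both the prefix clause and the
    range clause must match (hypothetical helper, for illustration)."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"prefix": {prefix_field: prefix}},
                    {"range": {date_field: {"gte": gte, "lte": lte}}},
                ]
            }
        }
    }


# Field names and dates are taken from the question.
body = build_prefix_range_query(
    "projectKey.keyword", "~", "date", "2020-09-14", "2020-09-14"
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET /branch/_search.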

hzbexzde #2

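If I understand your question correctly, I suggest creating a custom analyzer so that you can search for the special character ~.
For my test, I replaced ~ with __SPECIAL__:
I created an analyzer with a custom char_filter and added a multi-field that uses it to the projectKey field. The new multi-field is named special_characters.
Here is the mapping: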

PUT wildcard-index
{
  "settings": {
    "analysis": {
      "char_filter": {
        "special-characters-replacement": {
          "type": "mapping",
          "mappings": [
            "~ => __SPECIAL__"
          ]
        }
      },
      "analyzer": {
        "special-characters-analyzer": {
          "tokenizer": "standard",
          "char_filter": [
            "special-characters-replacement"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "projectKey": {
        "type": "text",
        "fields": {
          "special_characters": {
            "type": "text",
            "analyzer": "special-characters-analyzer"
          }
        }
      }
    }
  }
}

Then I ingested the following documents into the index:
"projectKey": "content1 ~"
"projectKey": "This ~ is a content"
"projectKey": "~ cars on the road"
"projectKey": "o~ngram"
Then, the query is:

GET wildcard-index/_search
{
  "query": {
    "match": {
      "projectKey.special_characters": "~"
    }
  }
}

The response is:

"hits" : [
  {
    "_index" : "wildcard-index",
    "_type" : "_doc",
    "_id" : "h1hKmHQBowpsxTkFD9IR",
    "_score" : 0.43250346,
    "_source" : {
      "projectKey" : "content1 ~"
    }
  },
  {
    "_index" : "wildcard-index",
    "_type" : "_doc",
    "_id" : "iFhKmHQBowpsxTkFFNL5",
    "_score" : 0.3034693,
    "_source" : {
      "projectKey" : "This ~ is a content"
    }
  },
  {
    "_index" : "wildcard-index",
    "_type" : "_doc",
    "_id" : "-lhKmHQBowpsxTkFG9Kg",
    "_score" : 0.3034693,
    "_source" : {
      "projectKey" : "~ cars on the road"
    }
  }
]

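Let me know if you have any questions; I will be happy to help.
Note: this approach does not work if there is no whitespace around the ~. You can see from the response that the fourth document is not returned.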
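To illustrate why the fourth document is missed, here is a small Python imitation of the analysis chain. It is only a sketch: the function names are hypothetical, and the \w+ regular expression is an assumed stand-in for the standard tokenizer (the real one follows Unicode word-boundary rules), chosen because it also keeps letters and underscores together in one token.

```python
import re


def char_filter(text):
    """Mimic the mapping char_filter: ~ => __SPECIAL__."""
    return text.replace("~", "__SPECIAL__")


def analyze(text):
    """Apply the char filter, then split into word tokens.
    \\w+ is an approximation of the standard tokenizer."""
    return re.findall(r"\w+", char_filter(text))


def match(docs, query):
    """Return the documents sharing at least one token with the query."""
    query_tokens = set(analyze(query))
    return [doc for doc in docs if query_tokens & set(analyze(doc))]


docs = [
    "content1 ~",
    "This ~ is a content",
    "~ cars on the road",
    "o~ngram",
]

matched = match(docs, "~")
print(matched)
print(analyze("o~ngram"))  # a single token with __SPECIAL__ glued inside
```

In this model "o~ngram" becomes one long token, so the query token __SPECIAL__ never equals it. The sketch also shows that the match is on "contains ~ as a separate word", not on "starts with ~".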
