Creating a custom analyzer with the asciifolding filter in Spring Data Elasticsearch

Posted by vbopmzt1 on 2021-06-14 in ElasticSearch

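After saving an object with the name çözüm, I want to retrieve that same object whether I search for cozum or çözüm. I searched for a solution and found the asciifolding filter recommended. How can I implement this with Spring Data Elasticsearch?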

@Document(indexName = "erp")
public class Company {

    @Id
    private String id;

    private String name;

    private String description;

    @Field(type = FieldType.Nested, includeInParent = true)
    private List<Employee> employees;

    // getters, setters
}

ukxgm1gy1#

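You will need to create an analyzer that uses the asciifolding filter (see the Elasticsearch documentation) and add it to the index settings of your index.
You can then reference it in the analyzer property of the @Field annotation on your name property.
Edit: a complete example
First, the file with the index settings. I named it erp-company.json and saved it under src/main/resources: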

{
  "analysis": {
    "analyzer": {
      "custom_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "char_filter": [
          "html_strip"
        ],
        "filter": [
          "lowercase",
          "asciifolding"
        ]
      }
    }
  }
}
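
To make the effect of this analyzer chain easier to picture, here is a minimal sketch in plain Java that approximates what the lowercase and asciifolding filters do to each token. It is not Elasticsearch code: the class AsciiFoldingSketch and its methods fold and analyze are hypothetical names, the folding is approximated with java.text.Normalizer, the tokenizer is a simple split, and the html_strip char filter is not modeled.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT Elasticsearch code: approximates the effect of the
// "lowercase" and "asciifolding" token filters configured above.
class AsciiFoldingSketch {

    // Lowercase, decompose accented letters (NFD), then drop the combining marks.
    static String fold(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Rough stand-in for the standard tokenizer: split on anything that is
    // not a letter, a combining mark, or a digit, then fold each token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{M}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(fold(part));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Renée et François")); // [renee, et, francois]
        // The stored name and the accent-free query fold to the same token.
        System.out.println(fold("çözüm").equals(fold("cozum"))); // true
    }
}
```

Because the same analyzer runs at index time and at search time, the stored value and the query end up as the same token, which is why cozum can match çözüm. The real asciifolding filter uses a mapping table and covers characters that this approximation leaves unchanged, such as the Turkish dotless ı or the German ß.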

Then you need to reference this file and the analyzer in your entity class, here named Company:

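import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;
import org.springframework.data.elasticsearch.annotations.Setting;

@Document(indexName = "erp")
// Load the index settings (including custom_analyzer) from the classpath.
@Setting(settingPath = "/erp-company.json")
public class Company {

    @Id
    private String id;

    // Analyze this text field with the analyzer defined in erp-company.json.
    @Field(type = FieldType.Text, analyzer = "custom_analyzer")
    private String name;

    @Field(type = FieldType.Text, analyzer = "custom_analyzer")
    private String description;

    // getters, setters
}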

The CompanyController uses a CompanyRepository (not shown here) and calls its save and searchByName methods:

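import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/company")
public class CompanyController {

    private final CompanyRepository repository;

    public CompanyController(CompanyRepository repository) {
        this.repository = repository;
    }

    // Store the company sent in the request body.
    @PostMapping
    public Company put(@RequestBody Company company) {
        return repository.save(company);
    }

    // Search companies by name; the query goes through the field's analyzer.
    @GetMapping("/{name}")
    public SearchHits<Company> get(@PathVariable String name) {
        return repository.searchByName(name);
    }
}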

Save some data containing diacritics (using httpie):

http POST :8080/company id=1 name="Renée et François"

Search without the diacritics:

http  GET :8080/company/francois

HTTP/1.1 200
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Connection: keep-alive
Content-Type: application/json
Date: Wed, 09 Sep 2020 17:56:16 GMT
Expires: 0
Keep-Alive: timeout=60
Pragma: no-cache
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block

{
    "aggregations": null,
    "empty": false,
    "maxScore": 0.2876821,
    "scrollId": null,
    "searchHits": [
        {
            "content": {
                "description": null,
                "id": "1",
                "name": "Renée et François"
            },
            "highlightFields": {},
            "id": "1",
            "index": "erp",
            "innerHits": {},
            "nestedMetaData": null,
            "score": 0.2876821,
            "sortValues": []
        }
    ],
    "totalHits": 1,
    "totalHitsRelation": "EQUAL_TO"
}

The index information returned by Elasticsearch for the index:

{
    "erp": {
        "aliases": {},
        "mappings": {
            "properties": {
                "_class": {
                    "fields": {
                        "keyword": {
                            "ignore_above": 256,
                            "type": "keyword"
                        }
                    },
                    "type": "text"
                },
                "description": {
                    "analyzer": "custom_analyzer",
                    "type": "text"
                },
                "id": {
                    "fields": {
                        "keyword": {
                            "ignore_above": 256,
                            "type": "keyword"
                        }
                    },
                    "type": "text"
                },
                "name": {
                    "analyzer": "custom_analyzer",
                    "type": "text"
                }
            }
        },
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "custom_analyzer": {
                            "char_filter": [
                                "html_strip"
                            ],
                            "filter": [
                                "lowercase",
                                "asciifolding"
                            ],
                            "tokenizer": "standard",
                            "type": "custom"
                        }
                    }
                },
                "creation_date": "1599673911503",
                "number_of_replicas": "1",
                "number_of_shards": "1",
                "provided_name": "erp",
                "uuid": "lRwcKcPUQxKKGuNJ6G30uA",
                "version": {
                    "created": "7090099"
                }
            }
        }
    }
}
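
If you want to check what custom_analyzer actually produces for a given text, one option is Elasticsearch's _analyze API on the erp index. The following is only a sketch using the JDK's built-in HTTP client. It assumes Elasticsearch is reachable at http://localhost:9200 without authentication, and the class AnalyzeRequestSketch and the method buildAnalyzeRequest are hypothetical names.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper that asks Elasticsearch how custom_analyzer tokenizes a text.
class AnalyzeRequestSketch {

    // Assumption: a local, unsecured Elasticsearch node.
    static final String ES_URL = "http://localhost:9200";

    // Builds a POST request to the _analyze API of the "erp" index.
    static HttpRequest buildAnalyzeRequest(String text) {
        // Minimal escaping for this sketch; real code should use a JSON library.
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        String body = "{\"analyzer\":\"custom_analyzer\",\"text\":\"" + escaped + "\"}";
        return HttpRequest.newBuilder(URI.create(ES_URL + "/erp/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpRequest request = buildAnalyzeRequest("Renée et François");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            // The response body is JSON listing the tokens the analyzer emitted.
            System.out.println(response.body());
        } catch (IOException e) {
            System.err.println("Elasticsearch not reachable: " + e.getMessage());
        }
    }
}
```

With the settings shown above in place, the returned tokens should be lowercased and free of diacritics.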
