MySQL全文搜索未返回所有匹配结果

xzlaal3s  于 2023-06-21  发布在  Mysql
关注(0)|答案(3)|浏览(151)
    • bounty还有1小时到期此问题的答案有资格获得+50声望奖励。MrCujo希望引起更多关注**这个问题。

我使用MySQL 8,我有一个名为records的表,我为它添加了以下索引,以便在其上执行文本搜索:

  1. CREATE FULLTEXT INDEX all_records_idx ON records (`title`, `label`, `description`, `catalog_number`,`barcode`);

现在,我得到了以下数据(复制为JSON):

  1. [
  2. {
  3. "barcode": "79027066021",
  4. "title": "White Blood Cells",
  5. "slug": "White-blood-cells",
  6. "image": null,
  7. "description": "rev",
  8. "artist_id": 3,
  9. "genre_id": 1,
  10. "num_of_records": 1,
  11. "original_release_year": 2001,
  12. "original_release_date": null,
  13. "reissue_release_date": null,
  14. "label": null,
  15. "sleeve_side_color": "ff1500",
  16. "vinyl_color": "ff1500",
  17. "size": 12,
  18. "barcode_in_record": null,
  19. "catalog_number": null,
  20. "owned_copies": 1,
  21. "active": 1,
  22. "created_at": "2023-06-11 21:32:17.000",
  23. "updated_at": null,
  24. "deleted_at": null
  25. },
  26. {
  27. "barcode": "889854036119",
  28. "title": "Nilsson Schmilsson",
  29. "slug": "nilsson-schmilsson",
  30. "image": null,
  31. "description": null,
  32. "artist_id": 6,
  33. "genre_id": 1,
  34. "num_of_records": 1,
  35. "original_release_year": 1971,
  36. "original_release_date": null,
  37. "reissue_release_date": null,
  38. "label": null,
  39. "sleeve_side_color": "490d57",
  40. "vinyl_color": "000000",
  41. "size": 12,
  42. "barcode_in_record": null,
  43. "catalog_number": null,
  44. "owned_copies": 1,
  45. "active": 1,
  46. "created_at": "2023-06-11 21:32:17.000",
  47. "updated_at": null,
  48. "deleted_at": null
  49. },
  50. {
  51. "barcode": "C10777774644617",
  52. "title": "Abbey Road",
  53. "slug": "abbey-road",
  54. "image": null,
  55. "description": null,
  56. "artist_id": 2,
  57. "genre_id": 1,
  58. "num_of_records": 1,
  59. "original_release_year": 1969,
  60. "original_release_date": null,
  61. "reissue_release_date": null,
  62. "label": null,
  63. "sleeve_side_color": "46518f",
  64. "vinyl_color": "000000",
  65. "size": 12,
  66. "barcode_in_record": null,
  67. "catalog_number": null,
  68. "owned_copies": 1,
  69. "active": 1,
  70. "created_at": "2023-06-11 21:32:17.000",
  71. "updated_at": null,
  72. "deleted_at": null
  73. },
  74. {
  75. "barcode": "C10777774644618",
  76. "title": "Revolver",
  77. "slug": "revolver",
  78. "image": null,
  79. "description": null,
  80. "artist_id": 2,
  81. "genre_id": 1,
  82. "num_of_records": 1,
  83. "original_release_year": 1966,
  84. "original_release_date": null,
  85. "reissue_release_date": null,
  86. "label": null,
  87. "sleeve_side_color": "46518f",
  88. "vinyl_color": "000000",
  89. "size": 12,
  90. "barcode_in_record": null,
  91. "catalog_number": null,
  92. "owned_copies": 1,
  93. "active": 1,
  94. "created_at": "2023-06-11 21:32:17.000",
  95. "updated_at": null,
  96. "deleted_at": null
  97. },
  98. {
  99. "barcode": "C10777774644619",
  100. "title": "arevolat",
  101. "slug": "aurevolat",
  102. "image": null,
  103. "description": "this is a test",
  104. "artist_id": 2,
  105. "genre_id": 1,
  106. "num_of_records": 1,
  107. "original_release_year": 1999,
  108. "original_release_date": null,
  109. "reissue_release_date": null,
  110. "label": null,
  111. "sleeve_side_color": null,
  112. "vinyl_color": null,
  113. "size": 12,
  114. "barcode_in_record": null,
  115. "catalog_number": null,
  116. "owned_copies": 1,
  117. "active": 1,
  118. "created_at": null,
  119. "updated_at": null,
  120. "deleted_at": null
  121. }
  122. ]

因此,表中目前有5条记录。为了测试搜索功能,我运行了以下查询:

  1. SELECT * FROM records WHERE MATCH(`title`, `label`, `description`, `catalog_number`,`barcode`)
  2. AGAINST ('*rev*' IN BOOLEAN MODE)

我希望能得到三个记录回来:

  • 条形码79027066021为1,应与描述“description”匹配:"牧师"
  • 条形码C10777774644618为1,应与标题“title”匹配:左轮手枪
  • 条形码C10777774644619为1,应与标题“title”匹配:"arevolat"。

但我只能拿回前两张唱片

  • 条形码79027066021为1,应与描述“description”匹配:"牧师"
  • 条形码C10777774644618为1,应与标题“title”匹配:左轮手枪

这一个不存在:

  • 条形码C10777774644619为1,应与标题“title”匹配:"arevolat"。

即使我使用*作为搜索的一部分。
DDL:

  1. CREATE TABLE `records` (
  2. `barcode` varchar(48) NOT NULL,
  3. `title` varchar(160) NOT NULL,
  4. `slug` varchar(250) NOT NULL,
  5. `image` varchar(191) DEFAULT NULL,
  6. `artist_id` bigint unsigned DEFAULT NULL,
  7. `genre_id` bigint unsigned DEFAULT NULL,
  8. `num_of_records` bigint NOT NULL,
  9. `original_release_date` date DEFAULT NULL,
  10. `reissue_release_date` date DEFAULT NULL,
  11. `label` longtext,
  12. `sleeve_side_color` varchar(191) DEFAULT NULL,
  13. `vinyl_color` longtext,
  14. `size` bigint DEFAULT NULL,
  15. `barcode_in_record` tinyint(1) DEFAULT NULL,
  16. `catalog_number` longtext,
  17. `owned_copies` bigint DEFAULT '1',
  18. `active` tinyint(1) DEFAULT '1',
  19. `created_at` datetime(3) DEFAULT NULL,
  20. `updated_at` datetime(3) DEFAULT NULL,
  21. `deleted_at` datetime(3) DEFAULT NULL,
  22. `original_release_year` bigint NOT NULL,
  23. `description` text,
  24. PRIMARY KEY (`barcode`),
  25. UNIQUE KEY `barcode` (`barcode`),
  26. UNIQUE KEY `slug` (`slug`),
  27. KEY `fk_records_artist` (`artist_id`),
  28. KEY `fk_records_genre` (`genre_id`),
  29. FULLTEXT KEY `all_records_idx` (`title`,`label`,`description`,`catalog_number`,`barcode`),
  30. CONSTRAINT `fk_records_artist` FOREIGN KEY (`artist_id`) REFERENCES `artists` (`id`),
  31. CONSTRAINT `fk_records_genre` FOREIGN KEY (`genre_id`) REFERENCES `genres` (`id`)
  32. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

知道发生什么事了吗
谢谢

zfycwa2u

zfycwa2u1#

“%”、“*”和其他标点符号对于AGAINSTLIKERLIKE具有特定且不同的含义。

  1. MATCH(title, ...) AGAINST('*rev*') -- ignores the leading punctuation mark
  2. title LIKE '%rev%' -- finds the desired 3 rows, but only in `title`; and slow
  3. title RLIKE 'rev' -- finds the desired 3 rows, but only in `title`; and slow
  4. title LIKE '%rev%' OR label LIKE '%rev%' OR ... -- even slower

在适当的情况下,FULLTEXT几乎总是比LIKERLIKE快;通常是“更快”。我建议您更改要求和/或期望。正如您从上面看到的,没有既简单又快速的语法。

mec1mxoz

mec1mxoz2#

在MySQL中,通配符是%而不是*
试试AGAINST ('%rev%' IN BOOLEAN MODE)
有时,*充当修饰符,以匹配“它前面的事物的零个或多个示例”。
参见https://dev.mysql.com/doc/refman/8.0/en/pattern-matching.html
但是,上面的模式匹配适用于扩展的正则表达式(例如使用RLIKE时)。
对于布尔全文搜索,*是截断运算符。表达式末尾的 * 表示rev必须出现在一个或多个单词的开头。这解释了结果中缺少记录的原因。
我怀疑第一个*被忽略了。
更多信息:https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html

pxiryf3j

pxiryf3j3#

12.9.5全文限制

“%”字符不是全文搜索所支持的通配符。
https://dev.mysql.com/doc/refman/5.7/en/fulltext-restrictions.html
星号 *
星号用作截断(或通配符)运算符。与其他运算符不同,它被附加到要受影响的单词后面。如果单词以 * 运算符前面的单词开始,则匹配。[...]
通配符被认为是一个前缀,必须出现在一个或多个单词的开头。

以下示例演示了一些使用布尔全文运算符的搜索字符串:[...]

'apple*'
查找包含“apple”、“apples”、“applesauce”或“applet”等单词的行。
https://dev.mysql.com/doc/refman/5.7/en/fulltext-boolean.html

相关问题