为什么MySQL在连接中不考虑(JSON)函数索引,而是考虑生成列上的索引?

qxsslcnc  于 2024-01-05  发布在  Mysql
关注(0)|答案(1)|浏览(143)

我在MySQL(8.0.35版)中使用了很多JSON列。通常,我在JSON属性上创建函数索引以加快查询速度。根据MySQL文档,* 函数索引被实现为隐藏的虚拟生成列 *。然而,它们的行为似乎与连接中生成列的索引不同。我将使用一个以某种方式构造的例子来解释我的观点。
示例:给定两个表productpurchasepurchase具有引用product的JSON属性$.productUuid

  1. CREATE TABLE IF NOT EXISTS product (
  2. id BINARY(16) NOT NULL,
  3. payload JSON NOT NULL,
  4. CONSTRAINT pk_product PRIMARY KEY (id)
  5. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
  6. CREATE TABLE IF NOT EXISTS purchase (
  7. id BINARY(16) NOT NULL,
  8. payload JSON NOT NULL,
  9. CONSTRAINT pk_product PRIMARY KEY (id),
  10. INDEX `i_purchase_product` (
  11. ( CAST(payload->>'$.productUuid' AS CHAR(36)) COLLATE utf8mb4_bin )
  12. )
  13. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
  14. INSERT INTO product (id, payload)
  15. VALUES
  16. ( UUID_TO_BIN(UUID(), TRUE), '{ "name": "random drink" }' ),
  17. ( UUID_TO_BIN(UUID(), TRUE), '{ "name": "random dish" }' ),
  18. ( UUID_TO_BIN(UUID(), TRUE), '{ "name": "random tool" }' )
  19. ;
  20. INSERT INTO purchase (id, payload)
  21. SELECT
  22. UUID_TO_BIN(UUID(), TRUE), JSON_SET(payload, '$.productUuid', BIN_TO_UUID(id))
  23. FROM product
  24. ;

字符串
对于以下查询

  1. SELECT * FROM product a INNER JOIN purchase b ON BIN_TO_UUID(a.id) = b.payload->>'$.productUuid';


MySQL生成以下计划:

  1. +----+-------------+-------+---------------+------+--------------------------------------------+
  2. | id | select_type | table | possible_keys | key | Extra |
  3. +----+-------------+-------+---------------+------+--------------------------------------------+
  4. | 1 | SIMPLE | a | NULL | NULL | NULL |
  5. | 1 | SIMPLE | b | NULL | NULL | Using where; Using join buffer (hash join) |
  6. +----+-------------+-------+---------------+------+--------------------------------------------+


该计划表明,函数索引甚至没有被考虑。如果我创建一个具有生成列和常规索引的表,情况就不同了。

  1. CREATE TABLE IF NOT EXISTS purchase (
  2. id BINARY(16) NOT NULL,
  3. payload JSON NOT NULL,
  4. product_uuid VARCHAR(36) GENERATED ALWAYS AS (payload->>'$.productUuid') STORED NOT NULL,
  5. CONSTRAINT pk_purchase PRIMARY KEY (id),
  6. INDEX `i_purchase_product` (product_uuid)
  7. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;


现在,MySQL生成的计划显示该索引已被考虑:

  1. +----+-------------+-------+--------------------+--------------------+-----------------------+
  2. | id | select_type | table | possible_keys | key | Extra |
  3. +----+-------------+-------+--------------------+--------------------+-----------------------+
  4. | 1 | SIMPLE | a | NULL | NULL | NULL |
  5. | 1 | SIMPLE | b | i_purchase_product | i_purchase_product | Using index condition |
  6. +----+-------------+-------+--------------------+--------------------+-----------------------+


对于这种行为,是否有一个解释,最好是有记录的解释?

pbossiut

pbossiut1#

这是一个已知的bug:https://bugs.mysql.com/bug.php?id=98937
这甚至不是关于JSON的使用。任何函数索引都不能在JOIN中使用,即使相同的索引可以在另一个条件中使用。
我修改了你的例子,使用一个任意的字符串函数,而不是使用JSON函数:

  1. CREATE TABLE IF NOT EXISTS product (
  2. id BINARY(16) NOT NULL,
  3. abc CHAR(3) NOT NULL,
  4. CONSTRAINT pk_product PRIMARY KEY (id)
  5. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
  6. CREATE TABLE IF NOT EXISTS purchase (
  7. id BINARY(16) NOT NULL,
  8. abc CHAR(3) NOT NULL,
  9. CONSTRAINT pk_product PRIMARY KEY (id),
  10. INDEX `i_abc` ((REVERSE(abc)))
  11. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
  12. INSERT INTO product (id, abc)
  13. VALUES
  14. ( UUID_TO_BIN(UUID(), TRUE), 'abc' ),
  15. ( UUID_TO_BIN(UUID(), TRUE), 'abc' ),
  16. ( UUID_TO_BIN(UUID(), TRUE), 'abc' );
  17. INSERT INTO purchase (id, abc)
  18. SELECT
  19. UUID_TO_BIN(UUID(), TRUE), REVERSE(abc)
  20. FROM product;

字符串
使用函数索引的表达式进行搜索的效果与预期一样,它使用索引:

  1. EXPLAIN
  2. SELECT * FROM purchase b WHERE REVERSE(b.abc) = 'abc';
  3. +----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
  4. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  5. +----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
  6. | 1 | SIMPLE | b | NULL | ref | i_abc | i_abc | 15 | const | 3 | 100.00 | NULL |
  7. +----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+


但是使用函数索引的表达式连接并不使用索引:

  1. EXPLAIN
  2. SELECT * FROM product a
  3. INNER JOIN purchase b
  4. ON REVERSE(b.abc) = a.abc;
  5. +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
  6. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  7. +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
  8. | 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | NULL |
  9. | 1 | SIMPLE | b | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where; Using join buffer (hash join) |
  10. +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+


这个bug是在2020年3月报告的(在我们写这篇文章的时候,已经过去了三年多)。唯一的解决办法是他们同意“提交一个文档bug来澄清当前的限制”。但是后来他们链接到了一个内部bug(我看不到)。
到目前为止,我还没有发现他们记录了这种限制。

展开查看全部

相关问题