MySQL: slow query with SELECT/ORDER BY on one table and WHERE on another, with LIMIT on the results

Posted by 64jmpszr on 2021-08-13 in Java

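I am trying to query the first N rows from a couple of tables. The WHERE clause references columns in one table, while the ORDER BY clause references columns in another. It looks like MySQL picks the table involved in the WHERE clause for the first pass of filtering (which filters very little), whereas the ORDER BY is what determines which rows are returned once the LIMIT is applied. If I force MySQL to use a covering index for the ORDER BY, the query returns the required rows immediately. Unfortunately, I cannot pass index hints to MySQL through JPA, and rewriting everything as native queries would be a lot of work. Here is an example: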

CREATE TABLE person (
    id INTEGER PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(255),
    last_name VARCHAR(255)
) engine=InnoDB;

CREATE TABLE membership (
    id INTEGER PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL
) engine=InnoDB;

CREATE TABLE employee (
    id INTEGER PRIMARY KEY AUTO_INCREMENT,
    membership_id INTEGER NOT NULL,
    type VARCHAR(15),
    enabled BIT NOT NULL,
    person_id INTEGER NOT NULL REFERENCES person ( id ),
    CONSTRAINT fk_employee_membership_id FOREIGN KEY ( membership_id ) REFERENCES membership ( id ),
    CONSTRAINT fk_employee_person_id FOREIGN KEY ( person_id ) REFERENCES person ( id )
) engine=InnoDB;

CREATE UNIQUE INDEX uk_employee_person_id ON employee ( person_id );

CREATE INDEX idx_person_first_name_last_name ON person ( first_name, last_name );

I wrote a script that prints a bunch of INSERT statements to populate the tables with 200,000 rows:


#!/bin/bash

echo "INSERT INTO membership ( id, name ) VALUES ( 1, 'Default Membership' );"
for seq in {1..200000}; do
    echo "INSERT INTO person ( id, first_name, last_name ) VALUES ( $seq, 'firstName$seq', 'lastName$seq' );"
    echo "INSERT INTO employee ( id, membership_id, type, enabled, person_id ) VALUES ( $seq, 1, 'INDIVIDUAL', 1, $seq );"
done

My first attempt:

SELECT e.*
FROM person p INNER JOIN employee e ON p.id = e.person_id
WHERE e.membership_id = 1 AND type = 'INDIVIDUAL' AND enabled = 1
ORDER BY p.first_name ASC, p.last_name ASC, p.id ASC
LIMIT 100;
-- 100 rows in set (1.43 sec)

And the EXPLAIN output:

+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------+---------+--------------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys                                   | key                       | key_len | ref                | rows  | filtered | Extra                                        |
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------+---------+--------------------+-------+----------+----------------------------------------------+
|  1 | SIMPLE      | e     | NULL       | ref    | uk_employee_person_id,fk_employee_membership_id | fk_employee_membership_id | 4       | const              | 99814 |     5.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | p     | NULL       | eq_ref | PRIMARY                                         | PRIMARY                   | 4       | qsuite.e.person_id |     1 |   100.00 | NULL                                         |
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------+---------+--------------------+-------+----------+----------------------------------------------+

Now I force MySQL to use the (first_name, last_name) index on person:

SELECT e.*
FROM person p USE INDEX ( idx_person_first_name_last_name )
    INNER JOIN employee e ON p.id = e.person_id
WHERE e.membership_id = 1 AND type = 'INDIVIDUAL' AND enabled = 1
ORDER BY p.first_name ASC, p.last_name ASC, p.id ASC
LIMIT 100;
-- 100 rows in set (0.00 sec)

It returns immediately. And the EXPLAIN output:

+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys                                   | key                             | key_len | ref         | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------------+---------+-------------+------+----------+-------------+
|  1 | SIMPLE      | p     | NULL       | index  | NULL                                            | idx_person_first_name_last_name | 2046    | NULL        |  100 |   100.00 | Using index |
|  1 | SIMPLE      | e     | NULL       | eq_ref | uk_employee_person_id,fk_employee_membership_id | uk_employee_person_id           | 4       | qsuite.p.id |    1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+-------------------------------------------------+---------------------------------+---------+-------------+------+----------+-------------+

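Note that the WHERE clause in the example does not actually filter out any rows. This is largely representative of the data I have and of most of the queries against this table. Is there a way to coax MySQL into using this index, or a not-too-disruptive way to restructure things to improve performance?
Thanks.
Edit: I dropped the original covering index and added one on each table: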

CREATE INDEX idx_person_id_first_name_last_name ON person ( id, first_name, last_name );
CREATE INDEX idx_employee_etc ON employee ( membership_id, type, enabled, person_id );

This seems to speed things up a little, but MySQL still insists on going through the employee table first:

+----+-------------+-------+------------+--------+--------------------------------------------+------------------+---------+--------------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys                              | key              | key_len | ref                | rows  | filtered | Extra                                        |
+----+-------------+-------+------------+--------+--------------------------------------------+------------------+---------+--------------------+-------+----------+----------------------------------------------+
|  1 | SIMPLE      | e     | NULL       | ref    | uk_employee_person_id,idx_employee_etc     | idx_employee_etc | 68      | const,const,const  | 97311 |   100.00 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | p     | NULL       | eq_ref | PRIMARY,idx_person_id_first_name_last_name | PRIMARY          | 4       | qsuite.e.person_id |     1 |   100.00 | NULL                                         |
+----+-------------+-------+------------+--------+--------------------------------------------+------------------+---------+--------------------+-------+----------+----------------------------------------------+

h9a6wy2h (answer #1)

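I would keep your second index on the person table, (id, first_name, last_name), and drop the other one unless you really do query with someone's first name as the primary criterion.
For the employee table, a proper index on (membership_id, type, enabled, person_id) will help fetch all the qualifying records. Having the person's name and id in the index keeps the engine from going to the raw data pages to pull those columns for the final sort and LIMIT.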

SELECT 
        e.*
    FROM 
        employee e 
            INNER JOIN person p 
                ON e.person_id = p.id
    WHERE 
            e.membership_id = 1 
        AND e.type = 'INDIVIDUAL' 
        AND e.enabled = 1
    ORDER BY 
        p.first_name ASC, 
        p.last_name ASC, 
        p.id ASC
    LIMIT 
        100;
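
A minimal sketch of the index setup this answer describes, reusing the index names from the question's edit. It assumes the original idx_person_first_name_last_name index still exists and that the two new indexes have not been created yet; if the question's edit has already been applied, only the EXPLAIN at the end is needed.

```sql
-- Drop the name-first index (skip if it was already removed).
DROP INDEX idx_person_first_name_last_name ON person;

-- Index on person that carries id plus both name columns.
CREATE INDEX idx_person_id_first_name_last_name
    ON person ( id, first_name, last_name );

-- Index on employee covering the WHERE columns plus the join column.
CREATE INDEX idx_employee_etc
    ON employee ( membership_id, type, enabled, person_id );

-- Re-check which table the optimizer reads first.
EXPLAIN
SELECT e.*
FROM employee e
    INNER JOIN person p ON e.person_id = p.id
WHERE e.membership_id = 1
  AND e.type = 'INDIVIDUAL'
  AND e.enabled = 1
ORDER BY p.first_name ASC, p.last_name ASC, p.id ASC
LIMIT 100;
```

The last EXPLAIN in the question shows the plan obtained with these same two indexes, so the output is worth comparing against it.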

8zzbczxx (answer #2)

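Storing the first and last name redundantly in the employee table is an option, but it has drawbacks. You will have to manage the redundancy. To guarantee consistency, you can make those columns part of the foreign key. ON UPDATE CASCADE will take some of the work off your hands, but you will still need to rewrite your INSERT statements or use triggers. With first_name and last_name being part of the employee table, you will be able to create an optimal index for your query. The table would look like this: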

CREATE TABLE employee (
    id INTEGER PRIMARY KEY AUTO_INCREMENT,
    membership_id INTEGER NOT NULL,
    type VARCHAR(15),
    enabled BIT NOT NULL,
    person_id INTEGER NOT NULL REFERENCES person ( id ),
    -- redundant copies of person.first_name and person.last_name
    first_name VARCHAR(255),
    last_name VARCHAR(255),
    CONSTRAINT fk_employee_membership_id FOREIGN KEY ( membership_id ) REFERENCES membership ( id ),
    -- requires an index on person that starts with ( id, first_name, last_name )
    CONSTRAINT fk_employee_person FOREIGN KEY ( person_id, first_name, last_name ) 
                                  REFERENCES person ( id, first_name, last_name )
                                  ON UPDATE CASCADE,
    INDEX (membership_id, type, enabled, first_name, last_name, person_id)
) engine=InnoDB;
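
As a rough illustration of the trigger route mentioned above, the following sketch copies the names from person whenever an employee row is inserted, so existing INSERT statements would not need to supply them. It assumes employee has first_name and last_name columns as described; the trigger name trg_employee_copy_person_name is made up for this example.

```sql
-- Hypothetical trigger: fill the redundant name columns from person on insert.
CREATE TRIGGER trg_employee_copy_person_name
BEFORE INSERT ON employee
FOR EACH ROW
SET NEW.first_name = (SELECT p.first_name FROM person p WHERE p.id = NEW.person_id),
    NEW.last_name  = (SELECT p.last_name  FROM person p WHERE p.id = NEW.person_id);
```

Because the trigger body is a single SET statement, no DELIMITER change is needed in the mysql client. Later name changes in person are not handled by this trigger; that part is what ON UPDATE CASCADE on the composite foreign key is meant to cover.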

The query would change to:

SELECT e.*
FROM employee e
WHERE e.membership_id = 1 AND e.type = 'INDIVIDUAL' AND e.enabled = 1
ORDER BY e.first_name ASC, e.last_name ASC, e.person_id ASC
LIMIT 100;

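However, I would avoid such a change if possible. There may be other ways to get an index used for the ORDER BY. I would first try moving the WHERE conditions into a correlated EXISTS subquery: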

SELECT e.*
FROM person p INNER JOIN employee e ON p.id = e.person_id
WHERE EXISTS (
  SELECT *
  FROM employee e1
  WHERE e1.person_id = p.id
    AND e1.membership_id = 1
    AND e1.type = 'INDIVIDUAL'
    AND e1.enabled = 1
)
ORDER BY p.first_name ASC, p.last_name ASC, p.id ASC
LIMIT 100;

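Now, to evaluate the subquery the engine needs p.id, so it has to start reading from the person table (you will see that in the execution plan). I would expect it to be smart enough to read it from the index. Note that in InnoDB the primary key is always part of any secondary key, so the idx_person_first_name_last_name index is actually on (first_name, last_name, id).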
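
One way to check this is to prefix the rewritten query with EXPLAIN. This is only a sketch for verification; it assumes the idx_person_first_name_last_name index from the original schema is still in place.

```sql
-- Same EXISTS query as above, prefixed with EXPLAIN to inspect the join order.
EXPLAIN
SELECT e.*
FROM person p INNER JOIN employee e ON p.id = e.person_id
WHERE EXISTS (
  SELECT *
  FROM employee e1
  WHERE e1.person_id = p.id
    AND e1.membership_id = 1
    AND e1.type = 'INDIVIDUAL'
    AND e1.enabled = 1
)
ORDER BY p.first_name ASC, p.last_name ASC, p.id ASC
LIMIT 100;
```

If the plan lists p first with key = idx_person_first_name_last_name and no "Using filesort" in the Extra column, the rows are being read in index order. Whether the optimizer actually chooses that plan can depend on the MySQL version and table statistics, so it is worth verifying rather than assuming.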
