我有两个名为ny_clean(3454602个条目)和picku_ids_temp_table(2739268个条目)的表,它们都有一个id char(11)列,它是主键,上面有一个btree索引(mysql 5.7)。
picku\u 0\u ids\u temp\u表中的“id”列是ny\u clean的子集,我想得到一个结果,它是ny\u clean,没有picku\u 0\u ids\u temp\u表中的id值。
方案1:
EXPLAIN
SELECT *
FROM pickup_0_ids_temp_table as t
JOIN ny_clean as n
ON n.id != t.id;
+----+-------------+----------+------------+-------+---------------+-------------------+---------+------+---------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+-------------------+---------+------+---------+----------+-----------------------------------------------------------------+
| 1 | SIMPLE | t | NULL | index | NULL | PRIMARY | 11 | NULL | 2734512 | 100.00 | Using index |
| 1 | SIMPLE | ny_clean | NULL | index | NULL | btree_pk_ny_clean | 11 | NULL | 3445904 | 90.00 | Using where; Using index; Using join buffer (Block Nested Loop) |
+----+-------------+----------+------------+-------+---------------+-------------------+---------+------+---------+----------+-----------------------------------------------------------------+
方案2:
EXPLAIN
SELECT *
FROM ny_clean as n
WHERE n.id NOT IN (
SELECT id
FROM pickup_0_ids_temp_table);
+----+--------------------+-------------------------+------------+-----------------+------------------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------------------------+------------+-----------------+------------------------+---------+---------+------+---------+----------+-------------+
| 1 | PRIMARY | n | NULL | ALL | NULL | NULL | NULL | NULL | 3445904 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | pickup_0_ids_temp_table | NULL | unique_subquery | PRIMARY,btree_pickup_0 | PRIMARY | 11 | func | 1 | 100.00 | Using index |
+----+--------------------+-------------------------+------------+-----------------+------------------------+---------+---------+------+---------+----------+-------------+
然后在这个较大的查询中使用其中一个选项
EXPLAIN
INSERT INTO y
SELECT id, pickup_longitude, pickup_latitude
FROM x
JOIN
(OPTION 1 OR 2) as z
ON z.id = x.id;
当我在较大的查询中使用选项1时,它运行了两天,但没有完成。另一方面,选项2在不到30分钟内完成了任务
我的问题是:为什么?遵循mysql文档(https://dev.mysql.com/doc/refman/5.7/en/subquery-materialization.html)我怀疑这是由于子查询的物化造成的,但我该如何检查呢?
我对解释输出的解释错了吗?因为从它来看,我希望选项1更快,因为它在两个表上都使用索引
或者它与更大的查询有关?
提前谢谢
1条答案
按热度按时间hfyxw5xn1#
你的选择1没有做你认为会做的事。
如果你有两张table
1,2
1,3
2,1
2,3
3,1
3,2