如何从date列中的每个给定行值中获取最后一行?

ubby3x7f  于 2021-06-18  发布在  Mysql
关注(0)|答案(2)|浏览(357)

这个问题在这里已经有答案了

检索每个组中的最后一条记录-mysql(29个答案)
两年前关门了。
我有下表。

+--------------------+--------------+-------+
Date                 | SymbolNumber | Value
+--------------------+--------------+-------+
 2018-08-31 15:00:00 | 123          | data
 2018-09-31 15:00:00 | 456          | data
 2018-09-31 15:00:00 | 123          | data
 2018-09-31 15:00:00 | 555          | data
 2018-10-31 15:00:00 | 555          | data
 2018-10-31 15:00:00 | 231          | data
 2018-10-31 15:00:00 | 123          | data
 2018-11-31 15:00:00 | 123          | data
 2018-11-31 15:00:00 | 555          | data
 2018-12-31 15:00:00 | 123          | data
 2018-12-31 15:00:00 | 555          | data

我需要一个查询,可以选择查询中所述的每个符号编号的最后一行。

SELECT
    *
FROM
    MyTable
WHERE
    symbolNumber IN (123, 555)
AND
  **lastOfRow ordered by latest-date**

预期结果:

2018-12-31 15:00:00 | 123 | data
 2018-12-31 15:00:00 | 555 | data

我该怎么做?

x759pob2

x759pob21#

首先,您将需要一个查询,该查询将获取每个事件的最新日期 symbolNumber . 第二,你可以 inner join 此表(使用 date )去拿剩下的柱子。这样地:

SELECT
    t.*
FROM
   <table_name> AS t
INNER JOIN
    (SELECT
        symbolNumber,
        MAX(date) AS maxDate
    FROM
        <table_name>
    GROUP BY
       symbolNumber) AS latest_date ON latest_date.symbolNumber = t.symbolNumber AND latest_date.maxDate = t.date

前面的查询将获取每个现有数据库的最新数据 symbolNumber 在table上。如果你想限制 symbolNumbers: 123 and 555 ,您将需要进行下一次修改:

SELECT
    t.*
FROM
   <table_name> AS t
INNER JOIN
    (SELECT
        symbolNumber,
        MAX(date) AS maxDate
    FROM
        <table_name>
    WHERE
        symbolNumber IN (123, 555)
    GROUP BY
       symbolNumber) AS latest_date ON latest_date.symbolNumber = t.symbolNumber AND latest_date.maxDate = t.date
nbysray5

nbysray52#

我们可以在上做“自左连接” symbolNumber ,并匹配同一组中具有较高 Date 右边的值。
我们最终只考虑那些无法找到更高日期的行(这意味着当前行属于组中的最高日期)。
下面是一个避免子查询和利用 Left Join :

SELECT t1.* 
FROM MyTable AS t1 
LEFT JOIN MyTable AS t2 
  ON t2.symbolNumber = t1.symbolNumber AND 
     t2.Date > t1.Date -- Joining to a row in same group with higher date
WHERE t1.symbolNumber IN (123, 555) AND 
      t2.symbolNumber IS NULL  -- Higher date not found; so this is highest row

编辑:
比较基准研究 Left Join 方法v/s派生表(子查询)
@草莓在5.6.21中运行了一个小基准测试。这是他发现的。。。

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,dense_user INT NOT NULL
,sparse_user INT NOT NULL
);

INSERT INTO my_table (dense_user,sparse_user) 
SELECT RAND()*100,RAND()*100000;

INSERT INTO my_table (dense_user,sparse_user)
SELECT RAND()*100,RAND()*100000 FROM my_table;
-- REPEAT THIS LINE A FEW TIMES !!!

SELECT COUNT(DISTINCT dense_user) dense
     , COUNT(DISTINCT sparse_user) sparse
     , COUNT(*) total 
  FROM my_table;
+-------+--------+---------+
| dense | sparse | total   |
+-------+--------+---------+
|   101 |  99999 | 1048576 |
+-------+--------+---------+

ALTER TABLE my_table ADD INDEX(dense_user);

ALTER TABLE my_table ADD INDEX(sparse_user);

--dense_test
SELECT x.* 
  FROM my_table x 
  LEFT 
  JOIN my_table y 
    ON y.dense_user = x.dense_user 
   AND y.id < x.id 
 WHERE y.id IS NULL 
 ORDER 
    BY dense_user 
 LIMIT 10;
+------+------------+-------------+
| id   | dense_user | sparse_user |
+------+------------+-------------+
| 1212 |          0 |        1950 |
|  153 |          1 |       23193 |
|  255 |          2 |       27472 |
|   28 |          3 |       86440 |
|   18 |          4 |       47886 |
|  291 |          5 |       76563 |
|   15 |          6 |       85049 |
|   16 |          7 |       78384 |
|  135 |          8 |       52304 |
|   62 |          9 |       40930 |
+------+------------+-------------+
10 rows in set (2.64 sec)

SELECT x.* 
  FROM my_table x 
  JOIN 
     ( SELECT dense_user, MIN(id) id FROM my_table GROUP BY dense_user ) y 
    ON y.dense_user = x.dense_user 
   AND y.id = x.id 
 ORDER 
    BY dense_user 
 LIMIT 10;
+------+------------+-------------+
| id   | dense_user | sparse_user |
+------+------------+-------------+
| 1212 |          0 |        1950 |
|  153 |          1 |       23193 |
|  255 |          2 |       27472 |
|   28 |          3 |       86440 |
|   18 |          4 |       47886 |
|  291 |          5 |       76563 |
|   15 |          6 |       85049 |
|   16 |          7 |       78384 |
|  135 |          8 |       52304 |
|   62 |          9 |       40930 |
+------+------------+-------------+
10 rows in set (0.05 sec)

Uncorrelated query is 50 times faster.

--sparse test
SELECT x.* 
  FROM my_table x 
  LEFT 
  JOIN my_table y 
    ON y.sparse_user = x.sparse_user 
   AND y.id < x.id 
 WHERE y.id IS NULL 
 ORDER 
    BY sparse_user 
 LIMIT 10;
+--------+------------+-------------+
| id     | dense_user | sparse_user |
+--------+------------+-------------+
| 165055 |         75 |           0 |
|  37598 |         63 |           1 |
| 170596 |         70 |           2 |
|  46142 |         87 |           3 |
|  33546 |         21 |           4 |
| 323114 |         87 |           5 |
|  86592 |         96 |           6 |
| 156711 |         36 |           7 |
|  17148 |         62 |           8 |
| 139965 |         71 |           9 |
+--------+------------+-------------+
10 rows in set (0.03 sec)

SELECT x.* 
  FROM my_table x 
  JOIN ( SELECT sparse_user, MIN(id) id FROM my_table GROUP BY sparse_user ) y 
    ON y.sparse_user = x.sparse_user 
   AND y.id = x.id 
 ORDER 
    BY sparse_user 
 LIMIT 10;
+--------+------------+-------------+
| id     | dense_user | sparse_user |
+--------+------------+-------------+
| 165055 |         75 |           0 |
|  37598 |         63 |           1 |
| 170596 |         70 |           2 |
|  46142 |         87 |           3 |
|  33546 |         21 |           4 |
| 323114 |         87 |           5 |
|  86592 |         96 |           6 |
| 156711 |         36 |           7 |
|  17148 |         62 |           8 |
| 139965 |         71 |           9 |
+--------+------------+-------------+
10 rows in set (4.73 sec)

Exclusion Join is 150 times faster

However, as you move further up the result set, the picture begins to change very dramatically...

SELECT x.* 
  FROM my_table x 
  JOIN ( SELECT sparse_user, MIN(id) id FROM my_table GROUP BY sparse_user ) y 
    ON y.sparse_user = x.sparse_user 
   AND y.id = x.id 
 ORDER 
    BY sparse_user 
 LIMIT 10000,10; 
+--------+------------+-------------+
| id     | dense_user | sparse_user |
+--------+------------+-------------+
|   9810 |         93 |       10000 |
| 162438 |          4 |       10001 |
| 467371 |         62 |       10002 |
|   8258 |         13 |       10003 |
| 297049 |         17 |       10004 |
|  68354 |         23 |       10005 |
| 192701 |         64 |       10006 |
| 176225 |         92 |       10007 |
| 156595 |         37 |       10008 |
| 318266 |          1 |       10009 |
+--------+------------+-------------+
10 rows in set (9.17 sec)

SELECT x.* 
  FROM my_table x 
  LEFT 
  JOIN my_table y 
    ON y.sparse_user = x.sparse_user 
   AND y.id < x.id 
 WHERE y.id IS NULL 
 ORDER 
    BY sparse_user 
 LIMIT 10000,10;
+--------+------------+-------------+
| id     | dense_user | sparse_user |
+--------+------------+-------------+
|   9810 |         93 |       10000 |
| 162438 |          4 |       10001 |
| 467371 |         62 |       10002 |
|   8258 |         13 |       10003 |
| 297049 |         17 |       10004 |
|  68354 |         23 |       10005 |
| 192701 |         64 |       10006 |
| 176225 |         92 |       10007 |
| 156595 |         37 |       10008 |
| 318266 |          1 |       10009 |
+--------+------------+-------------+
10 rows in set (32.19 sec) -- !!!

总之,排除连接(exclusion-join,即所谓的“草莓查询”)在某些有限的情况下可以(显著地)更快。一般来说,不相关的查询会更快。

相关问题