I am having an issue with what I thought would be pretty simple - I have a table, with sometimes several rows for the same entry corresponding to different dates. I would like to write a query to get only one row per entry, and always with the latest date. The latest date can be different for different entries.
For example, if the table looks like this :
+------+------+------------+---------+
| SKU | Flag | Date | Version |
+------+------+------------+---------+
| SKU1 | Up | 2020-02-01 | V2 |
| SKU2 | Up | 2020-01-01 | V2 |
| SKU2 | Down | 2020-04-01 | V2 |
| SKU3 | Up | 2020-04-20 | V2 |
| SKU3 | Down | 2020-03-01 | V2 |
| SKU4 | Down | 2020-01-01 | V1 |
+------+------+------------+---------+
I would like to return that :
+------+------+------------+---------+
| SKU | Flag | Date | Version |
+------+------+------------+---------+
| SKU1 | Up | 2020-02-01 | V2 |
| SKU2 | Down | 2020-04-01 | V2 |
| SKU3 | Up | 2020-04-20 | V2 | |
+------+------+------------+---------+
I tried several things but couldn't get what I wanted...the last thing I tried was something like this :
SELECT distinct sku,flag,version,max(date) AS latest_date
FROM table
GROUP BY sku,flag,version
HAVING version='V2'
However doing that I still get several entries for the same SKU with different dates... What am I doing wrong ? Also the date in the table is a VarChar type - I tried converting it to a date type (is it necessary?) replacing max(date) by max(convert(varchar,date)), but without success
Thank you in advance !
3条答案
按热度按时间dy1byipe1#
A simple method with good performance is a correlated subquery:
For performance, you want an index on
(sku, date)
.ccrfmcuu2#
One option is to use
WITH TIES
Example
2guxujil3#
You can use
row_number()
:Note that your expected result is missing a row for
'SKU4'
- I assume that's a typo.