用于为特定列值筛选重复行的查询

gorkyyrv  于 2021-08-13  发布在  Java
关注(0)|答案(3)|浏览(314)

我在下面的table上有values:-

  1. create or Replace table CourseLibrary (
  2. Parent_code varchar(250),
  3. Parent_Name varchar(250),
  4. Course_Version varchar(250),
  5. Course_Locale varchar(250),
  6. Timestamp varchar(250)
  7. );
  8. Insert into Courselibrary values
  9. ('UXPUEMBR007', 'Back Injury Prevention', '1.0.004', 'en_US', '3'),
  10. ('UXPUEMBR007', 'Back Injury Prevention2', '1.0.002', 'en_US', '4'),
  11. ('UXPUEMBR007', 'Back Injury Prevention1', '1.0.004', 'en_BR', '5'),
  12. ('UXPUEMBR007', 'Back Injury Prevention', '1.0.003', 'en_US', '2'),
  13. ('UXPUEMBR007', 'Back Injury Prevention', '1.0.004', 'en_US', '1'),
  14. ('UXPUEMBR008', 'House Safety', '1.0.006', 'en_US', '1'),
  15. ('UXPUEMBR008', 'House Safety', '1.0.005', 'en_US', '1'),
  16. ('UXPUEMBR008', 'House Safety', '1.0.006', 'en_US', '2');

需要查询以筛选以下结果集:

  1. ('UXPUEMBR007', 'Back Injury Prevention', '1.0.004', 'en_US', '3')
  2. ('UXPUEMBR008', 'House Safety', '1.0.006', 'en_US', '2')

我需要一个给定的父代码行。如果有多个父代码,那么我必须查找最高的course\u版本,如果甚至course\u版本都相同,那么我只能查找course\u locale='en\u us'。如果两个语言环境相同,那么最后我必须检查最高的时间戳值。
注意:没有主键
priority:-1st highest 课程版本,第二课程语言环境=“en\u us”,第三时间戳

abithluo

abithluo1#

解决方案1:

  1. WITH top_records AS (SELECT PARENT_CODE, PARENT_NAME, Course_Version, Course_Locale, Timestamp, RANK() OVER (PARTITION BY PARENT_CODE ORDER BY Course_Version DESC, Timestamp DESC) RN FROM Courselibrary WHERE Course_Locale = 'en_US') Select * from top_records where RN = 1;

解决方案2:

  1. Select PARENT_CODE, PARENT_NAME, Course_Version, Course_Locale, Timestamp from (SELECT PARENT_CODE, PARENT_NAME, Course_Version, Course_Locale, Timestamp, RANK() OVER (PARTITION BY PARENT_CODE ORDER BY Course_Version DESC, Timestamp DESC) RN FROM Courselibrary WHERE Course_Locale = 'en_US') Where RN = 1;
wrrgggsh

wrrgggsh2#

在mysql 8+中,您将使用 row_number() . 在早期版本中,可以使用相关子查询,但这假定每行上都有一个唯一的标识符。因此,将表定义为:

  1. create table CourseLibrary (
  2. CourseLibrary_Id int auto_increment primary key,
  3. Parent_code varchar(250),
  4. Parent_Name varchar(250),
  5. Course_Version varchar(250),
  6. Course_Locale varchar(250),
  7. Timestamp varchar(250)
  8. );

我还认为 timestamp 应存储为日期/时间值,而不是字符串。
然后你可以做:

  1. select cl.*
  2. from CourseLibrary cl
  3. where cl.CourseLibrary_Id = (select cl2.CourseLibrary_Id
  4. from CourseLibrary cl2
  5. where cl2.Parent_code = cl.Parent_code
  6. order by Course_Version desc,
  7. (course_locale = 'en_US') desc,
  8. timestamp desc
  9. limit 1
  10. );

你真的不需要 CourseLibrary_Id . 任何主键都可以。但是,您的表没有id,所以最好使用自动递增的id。

展开查看全部
lymnna71

lymnna713#

  1. select max(timestamp) as timestamp, parent_code, parent_name, course_version, course_locale from (select * from courselibrary where parent_code='UXPUEMBR007' and (select count(*) from courselibrary where parent_code='UXPUEMBR007') >= 1 and course_version=(select max(course_version) from courselibrary where parent_code='UXPUEMBR007') and course_locale='en_US') as resultset group by parent_code, parent_name, course_version, course_locale;

相关问题