oracle SQL:如何在列中找到不同的值并显示它们的计数?

2wnc66cl  于 2023-05-06  发布在  Oracle
关注(0)|答案(3)|浏览(202)

我有下面的Oracle SQL表,其中product_id 101和103分别在attr1attr3列中具有不同的值。
data

  1. | PRODUCT_ID | ATTR1 | ATTR2 | ATTR3 |
  2. |------------|-------|-------|-------|
  3. | 101 | a | x | z |
  4. | 101 | a | x | zzz |
  5. | 101 | aa | x | z |
  6. | 102 | b | y | z |
  7. | 102 | b | y | z |
  8. | 103 | c | z | z |
  9. | 103 | c | z | zz |

我想得到以下输出,其中列出了不同的值及其在列中的计数。
output

  1. | PRODUCT_ID | DESCR | VALUE_COUNT |
  2. |------------|------------------|--------------|
  3. | 101 | Issue with attr1 | a(2), aa(1) |
  4. | 101 | Issue with attr3 | z(2), zzz(1) |
  5. | 103 | Issue with attr3 | z(1), zz(1) |

我编写了一个查询来获取一列的结果,但是要为实际数据编写查询需要花费相当大的精力,因为我需要检查20多列的不同值。有什么建议可以让它更有效率吗?
query

  1. WITH data AS (
  2. SELECT 101 product_id, 'a' attr1, 'x' attr2, 'z' attr3 FROM dual UNION ALL
  3. SELECT 101 product_id, 'a' attr1, 'x' attr2, 'zzz' attr3 FROM dual UNION ALL
  4. SELECT 101 product_id, 'aa' attr1, 'x' attr2, 'z' attr3 FROM dual UNION ALL
  5. SELECT 102 product_id, 'b' attr1, 'y' attr2, 'z' attr3 FROM dual UNION ALL
  6. SELECT 102 product_id, 'b' attr1, 'y' attr2, 'z' attr3 FROM dual UNION ALL
  7. SELECT 103 product_id, 'c' attr1, 'z' attr2, 'z' attr3 FROM dual UNION ALL
  8. SELECT 103 product_id, 'c' attr1, 'z' attr2, 'zz' attr3 FROM dual
  9. ), d1 AS (
  10. SELECT product_id, 'Issue with attr1' descr
  11. FROM data
  12. GROUP BY product_id
  13. HAVING COUNT(DISTINCT attr1) > 1
  14. ), d2 AS (
  15. SELECT DISTINCT d1.product_id, d1.descr, data.attr1, COUNT(attr1) OVER (PARTITION BY attr1) cnt
  16. FROM d1
  17. INNER JOIN data
  18. ON d1.product_id = data.product_id
  19. )
  20. SELECT product_id, descr, LISTAGG(attr1 || '(' || cnt || ')', ', ') WITHIN GROUP (ORDER BY product_id) value_count
  21. FROM d2
  22. GROUP BY product_id, descr
  23. ;
dsekswqp

dsekswqp1#

您可以将所有属性反透视到单独的行中,计算每个属性和值的行数,并将其与每个product_id的行数进行比较。然后将错误聚合回listagg
这只需要向inpivot ... for ...添加更多列。

  1. with prep as (
  2. select
  3. sample.*
  4. /*Rowcount per product_id*/
  5. , count(1) over(partition by product_id) as rowcnt
  6. from sample
  7. )
  8. , unp as (
  9. select
  10. product_id,
  11. /*Classify the issue*/
  12. case
  13. /*Count per value*/
  14. when count(1) != rowcnt
  15. then 'Issue with ' || col
  16. end as issue,
  17. /*Count per value*/
  18. coalesce(val, 'NULL') || '(' || count(1) || ')' as cnt
  19. from prep
  20. unpivot include nulls (
  21. val for col in (attr1, attr2, attr3)
  22. )
  23. group by
  24. product_id,
  25. col,
  26. val,
  27. rowcnt
  28. )
  29. /*Aggregate all*/
  30. select
  31. product_id,
  32. issue,
  33. listagg(cnt, ', ') as value_count
  34. from unp
  35. where issue is not null
  36. group by
  37. product_id,
  38. issue
  39. order by 1, 2
产品编号问题VALUE_COUNT
一百零一ATTR 1问题a(2)、aa(1)
一百零一关于ATTR 3zz(2),zzz(1)
一百零三关于ATTR 3zz(1),zz(1)

fiddle

UPD:默认unpivot removes null values from output. To include them you need to add include nulls`。

如果向示例数据中添加一行null

  1. insert into sample (PRODUCT_ID, ATTR1, ATTR2, ATTR3)
  2. values (103, 'c', 'z', null)

上述(修改后)查询将返回:
| 产品编号|问题|VALUE_COUNT|
| --------------|--------------|--------------|
| 一百零一|ATTR 1问题|a(2)、aa(1)|
| 一百零一|关于ATTR 3|zz(2),zzz(1)|
| 一百零三|关于ATTR 3|(1)、zz(1)、NULL(1)|
fiddle

展开查看全部
rur96b6h

rur96b6h2#

尝试先GROUPing,然后围绕它Listagg。就像这样:

  1. WITH data AS (
  2. SELECT 101 product_id, 'a' attr1, 'x' attr2, 'z' attr3 FROM dual UNION ALL
  3. SELECT 101 product_id, 'a' attr1, 'x' attr2, 'zzz' attr3 FROM dual UNION ALL
  4. SELECT 101 product_id, 'aa' attr1, 'x' attr2, 'z' attr3 FROM dual UNION ALL
  5. SELECT 102 product_id, 'b' attr1, 'y' attr2, 'z' attr3 FROM dual UNION ALL
  6. SELECT 102 product_id, 'b' attr1, 'y' attr2, 'z' attr3 FROM dual UNION ALL
  7. SELECT 103 product_id, 'c' attr1, 'z' attr2, 'z' attr3 FROM dual UNION ALL
  8. SELECT 103 product_id, 'c' attr1, 'z' attr2, 'zz' attr3 FROM dual
  9. )
  10. SELECT product_id,'Issue with attr1' descr,LISTAGG(value_count,',') WITHIN GROUP (ORDER BY value_count) value_count
  11. FROM (SELECT product_id,
  12. attr1||'('||COUNT(*)||')' value_count
  13. FROM (SELECT product_id,
  14. attr1,
  15. COUNT(DISTINCT attr1) OVER (PARTITION BY product_id) cnt
  16. FROM data)
  17. WHERE cnt > 1
  18. GROUP BY product_id,
  19. attr1)
  20. GROUP BY product_id
  21. UNION ALL
  22. SELECT product_id,'Issue with attr3' descr,LISTAGG(value_count,',') WITHIN GROUP (ORDER BY value_count) value_count
  23. FROM (
  24. SELECT product_id,
  25. attr3||'('||COUNT(*)||')' value_count
  26. FROM (SELECT product_id,
  27. attr3,
  28. COUNT(DISTINCT attr3) OVER (PARTITION BY product_id) cnt
  29. FROM data)
  30. WHERE cnt > 1
  31. GROUP BY product_id,
  32. attr3)
  33. GROUP BY product_id
展开查看全部
xxls0lw8

xxls0lw83#

如果你没有太多的属性,你可以这样做:

  1. WITH data AS (
  2. SELECT 101 product_id, 'a' attr1, 'x' attr2, 'z' attr3 FROM dual UNION ALL
  3. SELECT 101 product_id, 'a' attr1, 'x' attr2, 'zz' attr3 FROM dual UNION ALL
  4. SELECT 101 product_id, 'aa' attr1, 'x' attr2, 'z' attr3 FROM dual UNION ALL
  5. SELECT 102 product_id, 'b' attr1, 'y' attr2, 'z' attr3 FROM dual UNION ALL
  6. SELECT 102 product_id, 'b' attr1, 'y' attr2, 'z' attr3 FROM dual UNION ALL
  7. SELECT 103 product_id, 'c' attr1, 'z' attr2, 'z' attr3 FROM dual UNION ALL
  8. SELECT 103 product_id, 'c' attr1, 'z' attr2, 'zz' attr3 FROM dual
  9. ), count_ as (SELECT product_id, COUNT(DISTINCT attr1) attr1
  10. , COUNT(DISTINCT attr2) attr2
  11. , COUNT(DISTINCT attr3) attr3
  12. FROM data
  13. GROUP BY product_id ),
  14. d2 AS (
  15. SELECT DISTINCT count_.product_id, 'Issue with attr1' descr , data.attr1, COUNT(data.attr1) OVER (PARTITION BY data.attr1) cnt
  16. FROM count_
  17. INNER JOIN data
  18. ON count_.product_id = data.product_id AND count_.attr1 > 1
  19. UNION ALL
  20. SELECT DISTINCT count_.product_id, 'Issue with attr2' descr , data.attr2, COUNT(data.attr2) OVER (PARTITION BY data.attr2) cnt
  21. FROM count_
  22. INNER JOIN data
  23. ON count_.product_id = data.product_id AND count_.attr2 > 1
  24. UNION ALL
  25. SELECT DISTINCT count_.product_id, 'Issue with attr3' descr , data.attr3, COUNT(data.attr3) OVER (PARTITION BY data.attr3) cnt
  26. FROM count_
  27. INNER JOIN data
  28. ON count_.product_id = data.product_id AND count_.attr3 > 1
  29. )
  30. SELECT product_id, descr, LISTAGG(attr1 || '(' || cnt || ')', ', ') WITHIN GROUP (ORDER BY product_id) value_count
  31. FROM d2
  32. GROUP BY product_id, descr
  33. ;
产品编号DESCRVALUE_COUNT
一百零一关于ATTR 1a(2)、aa(1)
一百零一关于ATTR 3z(3),zz(2)
一百零三关于ATTR 3z(3),zz(2)

fiddle

展开查看全部

相关问题