Intersecting ranges in Oracle SQL, while also keeping the non-intersecting ranges

sigwle7e, posted on 2023-04-20 in Oracle

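I can't solve the following problem:
I have two datasets containing ranges and values, for example:
Dataset 1
| From | To | Value |
| --------------|--------------|--------------|
| 0 | 20 | A |
| 20 | 50 | B |
| 50 | 150 | C |
| 150 | 180 | X |
Dataset 2
| From | To | Value |
| --------------|--------------|--------------|
| 10 | 30 | D |
| 70 | 100 | E |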
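I want to intersect these ranges using SQL, but also keep every range from Dataset 1.
So the result should be:
| From | To | Value |
| --------------|--------------|--------------|
| 0 | 10 | A |
| 10 | 20 | AD |
| 20 | 30 | BD |
| 30 | 50 | B |
| 50 | 70 | C |
| 70 | 100 | CE |
| 100 | 150 | C |
| 150 | 180 | X |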
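With SQL it is easy to find only the intersecting parts (10-20, 20-30, 70-100) and the completely non-intersecting part (150-180). What I am struggling with are the partially intersecting parts (0-10, 30-50, 50-70, 100-150).
I used this for the intersecting part: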

-- GREATEST/LEAST compare values within a row; Oracle's MAX/MIN are single-argument aggregates
SELECT GREATEST(t1.range_start, t2.range_start) AS intersect_start,
       LEAST(t1.range_end, t2.range_end) AS intersect_end
FROM ranges t1
JOIN ranges t2 ON t1.range_start < t2.range_end AND t1.range_end > t2.range_start

Can anyone show me a solution? Plain SQL would be ideal, but PL/SQL is fine too.


wooyq4lh #1

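You can unpivot the datasets and combine them, use the LEAD analytic function to find the next boundary after each one, and then use a recursive query to walk through the boundaries in order, adding a value to the aggregate at the start of its range and removing it at the end: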

WITH data (value, start_end, bound, dataset) AS (
  SELECT value, start_end, bound, 1
  FROM   dataset1
  UNPIVOT (bound FOR start_end IN ("FROM" AS 1, "TO" AS -1)) d
UNION ALL
  SELECT value, start_end, bound, 2
  FROM   dataset2
  UNPIVOT (bound FOR start_end IN ("FROM" AS 1, "TO" AS -1)) d
),
bounds (value, start_end, bound, next_bound, rn) AS (
  SELECT value,
         start_end,
         bound,
         LEAD(bound) OVER (ORDER BY bound, start_end, dataset),
         ROW_NUMBER() OVER (ORDER BY bound, start_end, dataset)
  FROM   data
),
groups (value, bound, next_bound, rn) AS (
  SELECT value, bound, next_bound, rn
  FROM   bounds
  WHERE  rn = 1
UNION ALL
  SELECT CASE b.start_end
         WHEN 1
         THEN g.value || b.value
         ELSE REPLACE(g.value, b.value)
         END,
         b.bound,
         b.next_bound,
         b.rn
  FROM   bounds b
         INNER JOIN groups g
         ON (g.rn + 1 = b.rn)
)
SELECT value,
       bound AS "FROM",
       next_bound AS "TO"
FROM   groups
WHERE  bound < next_bound
ORDER BY rn;

Which, for the sample data:

CREATE TABLE Dataset1 ("FROM", "TO", Value) AS
  SELECT   0,  20, 'A' FROM DUAL UNION ALL
  SELECT  20,  50, 'B' FROM DUAL UNION ALL
  SELECT  50, 150, 'C' FROM DUAL UNION ALL
  SELECT 150, 180, 'X' FROM DUAL;

CREATE TABLE Dataset2 ("FROM", "TO", Value) AS
  SELECT  10,  30, 'D' FROM DUAL UNION ALL
  SELECT  70, 100, 'E' FROM DUAL;

Outputs:
| VALUE | FROM | TO |
| --------------|--------------|--------------|
| A | 0 | 10 |
| AD | 10 | 20 |
| DB | 20 | 30 |
| B | 30 | 50 |
| C | 50 | 70 |
| CE | 70 | 100 |
| C | 100 | 150 |
| X | 150 | 180 |
fiddle
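
If the recursive query is hard to follow, the same boundary sweep can be sketched outside the database. This is only an illustration in Python, not part of the answer: the function name `sweep` and the `(start, end, value)` tuple layout are assumptions made for the example. Unlike the query, it also skips gaps where no range is active.

```python
def sweep(*datasets):
    """Walk all range boundaries in order, tracking which values are active."""
    events = []
    for ds_no, rows in enumerate(datasets, start=1):
        for start, end, value in rows:
            events.append((start, 1, ds_no, value))   # range opens
            events.append((end, -1, ds_no, value))    # range closes
    # Same ordering as the query: bound, then closes (-1) before opens (1), then dataset.
    events.sort()

    active = ""
    result = []
    # Pair every boundary with the next one, like LEAD(bound).
    for (bound, kind, _, value), (next_bound, *_) in zip(events, events[1:]):
        if kind == 1:
            active += value                      # value starts here
        else:
            active = active.replace(value, "")   # value ends here
        # Skip zero-width slices and gaps where nothing is active.
        if active and bound < next_bound:
            result.append((active, bound, next_bound))
    return result


dataset1 = [(0, 20, "A"), (20, 50, "B"), (50, 150, "C"), (150, 180, "X")]
dataset2 = [(10, 30, "D"), (70, 100, "E")]

for row in sweep(dataset1, dataset2):
    print(row)
```

As in the query, removing a value with a plain string replace only works while no value is a substring of another, which holds for the single-letter sample values.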


crcmnpdw #2

with un as (
  select * from ds1 union all select * from ds2),
up as (
  select distinct rng from un unpivot (rng for col in (r1, r2))),
ld as (
  select rng r1, lead(rng) over (order by rng) r2 from up)
select ld.r1, ld.r2, listagg(value) within group (order by value) list 
  from ld join un on un.r1 < ld.r2 and ld.r1 < un.r2
  group by ld.r1, ld.r2

dbfiddle
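Explanation:

  • un - the union of the two tables
  • up - un unpivoted into a single column holding every distinct range boundary
  • ld - the above, with the next boundary value alongside each one

Finally, ld is joined to un on overlap, and listagg() concatenates the values for each slice.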
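
The steps above can be mirrored in a few lines of Python to check the logic against the sample data. This is a rough sketch rather than a translation of the query: the function name `split_ranges` and the `(start, end, value)` tuple layout are assumptions made for the example.

```python
def split_ranges(*datasets):
    """Cut all ranges at every boundary and list the values covering each slice."""
    # un: all rows from all datasets
    rows = [row for rows_ in datasets for row in rows_]
    # up: every distinct boundary, sorted
    bounds = sorted({b for start, end, _ in rows for b in (start, end)})

    result = []
    # ld: each boundary paired with the next one
    for lo, hi in zip(bounds, bounds[1:]):
        # Overlap test from the join: un.r1 < ld.r2 and ld.r1 < un.r2
        values = sorted(v for start, end, v in rows if start < hi and lo < end)
        if values:  # an inner join drops slices that nothing covers
            result.append((lo, hi, "".join(values)))
    return result


dataset1 = [(0, 20, "A"), (20, 50, "B"), (50, 150, "C"), (150, 180, "X")]
dataset2 = [(10, 30, "D"), (70, 100, "E")]

for row in split_ranges(dataset1, dataset2):
    print(row)
```

Because the values are sorted before joining, the 20-30 slice comes out as BD, matching the expected result in the question.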


62lalag4 #3

You can try this query:

WITH intersecting AS (
       SELECT GREATEST(d1.StartRange, d2.StartRange) AS StartRange, 
              LEAST(d1.EndRange, d2.EndRange) AS EndRange,
              CONCAT(d1.Value, d2.Value) AS Value, 
              d1.StartRange AS StartRange1,
              d1.EndRange AS EndRange1
       FROM   Dataset1 d1
              INNER JOIN Dataset2 d2 
                ON d1.StartRange <= d2.EndRange AND d1.EndRange >= d2.StartRange
     ),
     partially_intersecting AS (
       SELECT d1.StartRange AS StartRange,
              COALESCE(i1.StartRange, d1.EndRange)  AS EndRange,
              d1.Value
       FROM   Dataset1 d1
              LEFT JOIN intersecting i1 ON d1.StartRange = i1.StartRange1
       WHERE  d1.StartRange <> COALESCE(i1.StartRange, d1.EndRange)
       UNION
       SELECT COALESCE(i2.EndRange, d1.StartRange)  AS StartRange,
              d1.EndRange AS EndRange,
              d1.Value
       FROM   Dataset1 d1       
              LEFT JOIN intersecting i2 ON d1.EndRange = i2.EndRange1
       WHERE  d1.EndRange <> COALESCE(i2.EndRange, d1.StartRange)
     ),
     output_data AS (
       SELECT StartRange, EndRange, Value FROM intersecting
       UNION ALL 
       SELECT StartRange, EndRange, Value FROM partially_intersecting
     )

SELECT * FROM output_data ORDER BY StartRange

See the demo here
