Oracle SQL join with a CASE statement on substrings of different lengths results in a cross join

yvfmudvl  posted on 2023-10-16  in  Oracle

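I am trying to join two tables so that the deepest (longest) matching substring is returned. So if my first table has 'AAA' and 'ABA', and my second table has 'A' and 'AB', the join should return 'A' and 'AB' respectively. Unfortunately, it returns three rows. I would like to do this without using multiple joins and a coalesce. Here is what I have.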

With t1 as (
  Select 'AAA' as V from dual
  Union all
  Select 'ABA' as V from dual
)

,t2 as (
  Select 'A' as m from dual
  Union all 
  Select 'AB' as m from dual
)

,r1 as (
  Select
    t1.v
    ,t2.m
  From t1
  Left join t2 on
   t2.m = Case
     When substr(t1.v,1,2) = t2.m
      then substr(t1.v,1,2)
     When substr(t1.v,1,1) = t2.m 
      then substr(t1.v, 1,1)
   End
)

Select * from r1

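Unfortunately, 'ABA' is duplicated in my results:
| V| M|
| --|--|
| AAA|A|
| ABA|A|
| ABA|AB|
I have tried several different approaches (swapping t1 and t2 in the ON clause and in the join order), but none of them gives me the result I want. Thanks for your help.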

goucqfw6  #1

In Oracle 12c or later, use a LATERAL join, then ORDER BY LENGTH(m) DESC together with FETCH FIRST ROW ONLY (or FETCH FIRST ROW WITH TIES) to get the longest match:

With t1 (v) as (
  Select 'AAA' from dual Union all
  Select 'ABA' from dual
),
t2 (m) as (
  Select 'A'  from dual Union all 
  Select 'AB' from dual
)
SELECT t1.v
     , t2.m
FROM   t1
       LEFT JOIN LATERAL (
         SELECT m
         FROM   t2
         WHERE  t1.v LIKE t2.m || '%'
         ORDER BY LENGTH(t2.m) DESC
         FETCH FIRST ROW ONLY
       ) t2
       ON 1 = 1

Which outputs:
| V| M|
| --|--|
| AAA|A|
| ABA|AB|
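
For comparison, here is a minimal sketch of the same longest-prefix pick without LATERAL, in case that syntax is not available. It is not part of the answer above; it assumes that values in t1 are unique (the GROUP BY would otherwise collapse duplicates) and that values in t2 contain no LIKE wildcards such as % or _. It keeps a single join and uses the KEEP (DENSE_RANK LAST ...) aggregate to take the m with the greatest length per v.

```sql
WITH t1 (v) AS (
  SELECT 'AAA' FROM dual UNION ALL
  SELECT 'ABA' FROM dual
),
t2 (m) AS (
  SELECT 'A'  FROM dual UNION ALL
  SELECT 'AB' FROM dual
)
SELECT t1.v
       -- among all prefixes matched for this v, keep the longest one
     , MAX(t2.m) KEEP (DENSE_RANK LAST ORDER BY LENGTH(t2.m)) AS m
FROM   t1
       -- every t2.m that is a prefix of t1.v joins; unmatched v rows stay with m = NULL
       LEFT JOIN t2
         ON t1.v LIKE t2.m || '%'
GROUP  BY t1.v;
```

On the sample data this should pair AAA with A and ABA with AB, the same as the LATERAL version.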

mqxuamgl  #2

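This is just another approach. The sample data is rather sparse, so whether my suggestion makes sense depends on what you really have. Basically, you compute the *distance* between the two strings, sort the results by *similarity*, and return the best match.
Sample data: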

WITH
   t1
   AS
      (SELECT 'AAA' AS V FROM DUAL
       UNION ALL
       SELECT 'ABA' AS V FROM DUAL),
   t2
   AS
      (SELECT 'A' AS m FROM DUAL
       UNION ALL
       SELECT 'AB' AS m FROM DUAL),

The query starts here:

   temp
   AS
      (SELECT t1.v,
              t2.m,
              UTL_MATCH.jaro_winkler_similarity (t1.v, t2.m) sim,
              ROW_NUMBER ()
                 OVER (PARTITION BY t1.v
                       ORDER BY UTL_MATCH.jaro_winkler_similarity (t1.v, t2.m) DESC) rn
         FROM t1 CROSS JOIN t2)
SELECT v, m
  FROM temp
 WHERE rn = 1;

V   M
--- --
AAA A
ABA AB
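
Before relying on the ranking, it may help to look at the raw scores. The sketch below is an addition, not part of the answer: it lists every (v, m) pair with its Jaro-Winkler similarity, which UTL_MATCH reports as an integer from 0 to 100, so you can check which candidate wins for each v.

```sql
WITH t1 (v) AS (
  SELECT 'AAA' FROM dual UNION ALL
  SELECT 'ABA' FROM dual
),
t2 (m) AS (
  SELECT 'A'  FROM dual UNION ALL
  SELECT 'AB' FROM dual
)
SELECT t1.v
     , t2.m
       -- similarity score for this pair; higher means more alike
     , UTL_MATCH.jaro_winkler_similarity(t1.v, t2.m) AS sim
FROM   t1
       CROSS JOIN t2
ORDER  BY t1.v, sim DESC;
```

Note that similarity is not the same thing as "longest matching prefix", so on real data a value that is not a prefix could rank first, and ROW_NUMBER breaks ties between equal scores arbitrarily.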

7uhlpewt  #3

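If you want to modify your own query so that it works, this is one way to do it. You see the unwanted row because the join condition is evaluated row by row: each (t1, t2) pair is tested on its own, so 'ABA' matches both 'A' and 'AB'. To get rid of the row that matched on only a one-character substring, rank the results and keep only the highest rank: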

With t1 as (
  Select 'AAA' as V from dual
  Union all
  Select 'ABA' as V from dual
)

,t2 as (
  Select 'A' as m from dual
  Union all 
  Select 'AB' as m from dual
)

,r1 as (
  Select
    t1.v
   ,t2.m
   ,CASE
     WHEN SUBSTR(t1.v,1,2) = t2.m
      THEN 2
     WHEN SUBSTR(t1.v,1,1) = t2.m 
      THEN 1
   End as rn
  From t1
  left join t2 on
   t2.m = CASE
     WHEN SUBSTR(t1.v,1,2) = t2.m
      THEN SUBSTR(t1.v,1,2)
     WHEN SUBSTR(t1.v,1,1) = t2.m 
      THEN SUBSTR(t1.v, 1,1)
   End
), r2 as 
(Select v, m, rn, rank() over (partition by v order by rn desc) as rnk  from r1)
select v, m from r2 where rnk = 1;

V   M
--- --
AAA A
ABA AB
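
To see why the original join returns three rows, note that for each pair of rows the CASE expression yields t2.m itself whenever either substring equals t2.m, and NULL otherwise. As a sketch (an illustration, not taken from the answers), the original ON clause therefore behaves like a plain IN test:

```sql
WITH t1 (v) AS (
  SELECT 'AAA' FROM dual UNION ALL
  SELECT 'ABA' FROM dual
),
t2 (m) AS (
  SELECT 'A'  FROM dual UNION ALL
  SELECT 'AB' FROM dual
)
SELECT t1.v
     , t2.m
FROM   t1
       -- true for EVERY t2 row that equals the 2-char or the 1-char prefix,
       -- so the order of the WHEN branches cannot prefer the longer match
       LEFT JOIN t2
         ON t2.m IN (SUBSTR(t1.v, 1, 2), SUBSTR(t1.v, 1, 1));
```

On the sample data this returns the same three rows as the query in the question (AAA/A, ABA/A, ABA/AB), which is why a ranking or row-limiting step is needed afterwards.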
