oracle regex计算一个被逗号包围的字符串的多次出现次数

qojgxg4l  于 2021-08-13  发布在  Java
关注(0)|答案(3)|浏览(452)

这个问题和我先前的一个问题相似。我正在寻找一种方法来计算oracle(11g)sql数据库中列中以逗号分隔的值列表中的字符串。例如,假设我有以下数据:

SELECT ('SL,PK') as col1 FROM dual
    UNION ALL 
SELECT ('SL,CR,SL') as col1 FROM dual
    UNION ALL 
SELECT ('PK,SL') as col1 FROM dual
    UNION ALL 
SELECT ('SL,SL') as col1 FROM dual
    UNION ALL
SELECT ('SL') as col1 FROM dual
    UNION ALL
SELECT ('PK') as col1 FROM dual
    UNION ALL
SELECT ('PI,SL,PK') as col1 FROM dual
    UNION ALL 
SELECT ('PI,SL,SL,PK') as col1 FROM dual
    UNION ALL 
SELECT ('PI,SL,SL,SL,PK') as col1 FROM dual
    UNION ALL 
SELECT ('PI,SL,SL,SL,SL,PK') as col1 FROM dual
    UNION ALL 
SELECT ('PI,OSL,SL,PK') as col1 FROM dual
    UNION ALL 
SELECT ('PI,SL,SLR,PK') as col1 FROM dual

COL1
-----
SL,PK
SL,CR,SL
PK,SL
SL,SL
SL
PK
PI,SL,PK
PI,SL,SL,PK
PI,SL,SL,SL,PK
PI,SL,SL,SL,SL,PK
PI,OSL,SL,PK
PI,SL,SLR,PK

我希望严格计算子字符串“sl”的所有出现次数(即不包括“osl”、“slr”等)。理想结果如下:

COL1                COL2
-----               -----
SL,PK               1
SL,CR,SL            2
PK,SL               1
SL,SL               2
SL                  1
PK                  0
PI,SL,PK            1
PI,SL,SL,PK         2
PI,SL,SL,SL,PK      3
PI,SL,SL,SL,SL,PK   4
PI,OSL,SL,PK        1
PI,SL,SLR,PK        1

我可以用 length 以及 regexp_replace :

SELECT 
    col1,
    (length(col1) - NVL(length(regexp_replace(regexp_replace(col1,'(^|,)(SL)($|,)','\1' || '' || '\3',1,0,'imn'),'(^|,)(SL)($|,)','\1' || '' || '\3',1,0,'imn')),0))/length('SL') as col2
FROM (
    SELECT ('SL,PK') as col1 FROM dual
        UNION ALL 
    SELECT ('SL,CR,SL') as col1 FROM dual
        UNION ALL 
    SELECT ('PK,SL') as col1 FROM dual
        UNION ALL 
    SELECT ('SL,SL') as col1 FROM dual
        UNION ALL
    SELECT ('SL') as col1 FROM dual
        UNION ALL
    SELECT ('PK') as col1 FROM dual
        UNION ALL
    SELECT ('PI,SL,PK') as col1 FROM dual
        UNION ALL 
    SELECT ('PI,SL,SL,PK') as col1 FROM dual
        UNION ALL 
    SELECT ('PI,SL,SL,SL,PK') as col1 FROM dual
        UNION ALL 
    SELECT ('PI,SL,SL,SL,SL,PK') as col1 FROM dual
        UNION ALL 
    SELECT ('PI,OSL,SL,PK') as col1 FROM dual
        UNION ALL 
    SELECT ('PI,SL,SLR,PK') as col1 FROM dual
)

COL1                COL2
-----               -----
SL,PK               1
SL,CR,SL            2
PK,SL               1
SL,SL               2
SL                  1
PK                  0
PI,SL,PK            1
PI,SL,SL,PK         2
PI,SL,SL,SL,PK      3
PI,SL,SL,SL,SL,PK   4
PI,OSL,SL,PK        1
PI,SL,SLR,PK        1

但希望有一个更优雅的解决方案,也许是 regexp_count . 我已经在其他具有单词边界的regex实现中成功地实现了我的目标 \b 构造可用(带 \bSL\b ),但尚未找到针对oracle正则表达式的解决方案。

xurqigkl

xurqigkl1#

你可以用 regexp_count() 如果破解字符串:

select col1, regexp_count(replace(col1, ',', ',,'), '(^|\W)SL(\W|$)')

这将分隔符加倍,这样第一个匹配就不会消耗掉它——绕过了根本的问题,即oracle正则表达式不支持向前看。
这是一把小提琴。

0g0grzrc

0g0grzrc2#

这里有一个选择:

SQL> with temp as
  2    (select col1,
  3            regexp_substr(col1, '[^,]+', 1, column_value) val
  4     from test cross join
  5       table(cast(multiset(select level from dual
  6                           connect by level <= regexp_count(col1, ',') + 1
  7                          ) as sys.odcinumberlist))
  8    )
  9  select col1,
 10         sum(case when val = 'SL' then 1 else 0 end) col2
 11  From temp
 12  group by col1;

COL1                    COL2
----------------- ----------
PI,SL,SLR,PK               1
PK,SL                      1
PK                         0
SL,CR,SL                   2
PI,OSL,SL,PK               1
SL,SL                      2
PI,SL,SL,PK                2
PI,SL,SL,SL,PK             3
SL,PK                      1
SL                         1
PI,SL,PK                   1
PI,SL,SL,SL,SL,PK          4

12 rows selected.

SQL>

它有什么作用? temp cte将每列拆分为行(分隔符为逗号)
决赛 select 简单地计算 SL 每个都有一个 col1

w1e3prcc

w1e3prcc3#

你可以使用 XMLTABLE 要将字符串分开然后计数:

SELECT col1,
       (
         SELECT COUNT(*)
         FROM   XMLTABLE(
                  ('"' || REPLACE( col1, ',', '","' ) || '"')
                  COLUMNS
                    value CHAR(2) PATH '.'
                )
         WHERE  value = 'SL'
       ) AS col2
FROM   test_data

因此,对于您的测试数据:

CREATE TABLE test_data ( col1 ) AS
SELECT 'SL,PK'             FROM dual UNION ALL 
SELECT 'SL,CR,SL'          FROM dual UNION ALL 
SELECT 'PK,SL'             FROM dual UNION ALL 
SELECT 'SL,SL'             FROM dual UNION ALL
SELECT 'SL'                FROM dual UNION ALL
SELECT 'PK'                FROM dual UNION ALL
SELECT 'PI,SL,PK'          FROM dual UNION ALL 
SELECT 'PI,SL,SL,PK'       FROM dual UNION ALL 
SELECT 'PI,SL,SL,SL,PK'    FROM dual UNION ALL 
SELECT 'PI,SL,SL,SL,SL,PK' FROM dual UNION ALL 
SELECT 'PI,OSL,SL,PK'      FROM dual UNION ALL 
SELECT 'PI,SL,SLR,PK'      FROM dual

这将输出:

COL1              | COL2
:---------------- | ---:
SL,PK             |    1
SL,CR,SL          |    2
PK,SL             |    1
SL,SL             |    2
SL                |    1
PK                |    0
PI,SL,PK          |    1
PI,SL,SL,PK       |    2
PI,SL,SL,SL,PK    |    3
PI,SL,SL,SL,SL,PK |    4
PI,OSL,SL,PK      |    1
PI,SL,SLR,PK      |    2

db<>在这里摆弄

相关问题