Match a pattern under certain conditions and replace it with a string using regexp_replace in Oracle

Posted by kcwpcxri on 2021-08-13 in Java

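I need to identify specific strings in a text using the following criteria:

1. Any string with a space before and after it, or
2. Any string with a . (dot) as prefix and a space as suffix, or
3. Any string with a space as prefix and a , (comma) as suffix.

Once found, I need to replace it with another string, without replacing the prefix and suffix mentioned above. This needs to be done in PL/SQL code in Oracle (preferably using the regexp_replace function).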
Example:

Text: 'This, is a sample_text, which_needs_.to_be_replaced as per, the matching.criteria.defined above,'
Replacement string: 'replaced'
Output: 'This, replaced replaced replaced, which_needs_.replaced replaced replaced, replaced matching.criteria.replaced replaced'

I know this is a strange example, but the actual requirement is even stranger. Please guide me on how to do this.

Thanks in advance.

Answer #1, by 5lhxktic

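Rather than trying to write one large regular expression, split the string into rows, one per token. You can do this by adapting any of the usual CSV-to-rows techniques. For example: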

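with rws as (
  select 'This, is a sample_text, which_needs_.to_be_replaced as per, the matching.criteria.defined above,' str from dual
), vals as (
  -- one row per token; each token keeps its trailing comma, dot, or space
  -- (A-Za-z instead of A-z, which in ASCII order also spans [ \ ] ^ and `)
  select regexp_substr(str,'[A-Za-z_,]+(\.|\s)?', 1, level) str, level l
  from   rws
  -- keep generating rows while there is another run of non-delimiter characters
  connect by regexp_substr(str, '[^, .]+', 1, level) is not null
)
  select * from vals;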

STR               L    
This,                  1 
is                     2 
a                      3 
sample_text,           4 
which_needs_.          5 
to_be_replaced         6 
as                     7 
per,                   8 
the                    9 
matching.             10 
criteria.             11 
defined               12 
above,                13
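
The tokenizer above extracts tokens with one pattern and decides when to stop with a different one, so it only works when both patterns see the same number of tokens. As a minimal sketch of a sanity check (the column aliases are illustrative, and the extraction pattern is written with an explicit A-Za-z range), regexp_count can compare the two counts for a given input:

```sql
-- Sanity check (a sketch): both patterns should report the same token count.
with rws as (
  select 'This, is a sample_text, which_needs_.to_be_replaced as per, the matching.criteria.defined above,' str from dual
)
select regexp_count(str, '[A-Za-z_,]+(\.|\s)?') extracted_tokens,  -- pattern used to extract each token
       regexp_count(str, '[^, .]+')             counted_tokens     -- pattern used in connect by
from   rws;
```

If the two numbers differ for one of your real inputs (for example when the text contains digits or hyphens), rows will be missing or null, and the patterns need adjusting before you move on to the replacement step.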

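Now replace each token according to your rules. You're only working with one token at a time, so it's easy to see which tokens were replaced correctly. This makes the regex easier to write and debug: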

with rws as (
  select 'This, is a sample_text, which_needs_.to_be_replaced as per, the matching.criteria.defined above,' str from dual
), vals as (
  select regexp_substr(str,'[A-Za-z_,]+(\.|\s)?', 1, level) str, level l
  from   rws
  connect by regexp_substr(str, '[^, .]+', 1, level) is not null
)
  select case 
           -- the first token has no space or dot before it, so keep it
           when l = 1 then str
           -- a token followed by a dot matches none of the rules, so keep it
           when substr ( str, -1, 1 ) = '.' then
             str
           else 
           -- replace only the word; any trailing comma or space survives
           regexp_replace (
             str,
             '^[A-Za-z_]+',
             'replaced'
           )
       end replaced, l
  from   vals;

REPLACED        L    
This,                1 
replaced             2 
replaced             3 
replaced,            4 
which_needs_.        5 
replaced             6 
replaced             7 
replaced,            8 
replaced             9 
matching.           10 
criteria.           11 
replaced            12 
replaced,           13

Then use listagg to stitch the values back together into the final string:

with rws as (
  select 'This, is a sample_text, which_needs_.to_be_replaced as per, the matching.criteria.defined above,' str from dual
), vals as (
  select regexp_substr(str,'[A-Za-z_,]+(\.|\s)?', 1, level) str, level l
  from   rws
  connect by regexp_substr(str, '[^, .]+', 1, level) is not null
), replaces as (
  select case 
           when l = 1 then str
           when substr ( str, -1, 1 ) = '.' then
             str
           else 
           regexp_replace (
             str,
             '^[A-Za-z_]+',
             'replaced'
           )
       end replaced, l
  from   vals
)
  -- no delimiter needed: each token still carries its own trailing separator
  select listagg ( replaced ) 
           within group ( order by l ) s
  from   replaces;

S                                                                                                                          
This, replaced replaced replaced, which_needs_.replaced replaced replaced, replaced matching.criteria.replaced replaced,
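
Since the question asks for PL/SQL, the query can be wrapped in a function. The following is only a sketch under a few assumptions: the name replace_tokens and its parameters are hypothetical, the result is assumed to fit in 4000 bytes, and the replacement text is assumed to contain no backslashes (regexp_replace treats \1 to \9 in the replacement as backreferences). It also writes the character range as A-Za-z rather than A-z, which in ASCII order would additionally span a few punctuation characters.

```sql
-- Hypothetical helper: function and parameter names are illustrative only.
create or replace function replace_tokens (
  p_str         in varchar2,
  p_replacement in varchar2
) return varchar2
is
  l_result varchar2(4000);  -- assumption: the rebuilt string fits in 4000 bytes
begin
  with vals as (
    -- one row per token, each keeping its trailing comma, dot, or space
    select regexp_substr(p_str, '[A-Za-z_,]+(\.|\s)?', 1, level) str, level l
    from   dual
    connect by regexp_substr(p_str, '[^, .]+', 1, level) is not null
  )
  select listagg(
           case
             when l = 1 then str                      -- first token: nothing precedes it
             when substr(str, -1, 1) = '.' then str   -- followed by a dot: keep as-is
             else regexp_replace(str, '^[A-Za-z_]+', p_replacement)
           end
         ) within group (order by l)
  into   l_result
  from   vals;

  return l_result;
end replace_tokens;
/
```

It could then be called from SQL or PL/SQL, for example with select replace_tokens(some_column, 'replaced') from some_table, where some_column and some_table stand in for your own names.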

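Make sure you test thoroughly! In my experience, when you have complex rules like this, you'll keep finding more exceptions and refinements. So you may need to adjust the replacement rules in the case expression.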
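
One way to do that testing is to run several inputs through the same logic at once and inspect the results side by side. The sketch below assumes Oracle 12c or later (for cross apply); the tests rows and the second input string are made up for illustration. The second input deliberately ends in a bare word with no trailing comma or space, a case the stated rules do not cover, so its result deserves a close look.

```sql
-- A sketch of a multi-input test harness; the test strings are illustrative.
with tests as (
  select 1 id, 'This, is a sample_text, which_needs_.to_be_replaced as per, the matching.criteria.defined above,' str from dual
  union all
  select 2 id, 'one two, three.four five' str from dual
), vals as (
  -- tokenize every test string separately
  select t.id, v.l, v.tok
  from   tests t
  cross  apply (
    select level l,
           regexp_substr(t.str, '[A-Za-z_,]+(\.|\s)?', 1, level) tok
    from   dual
    connect by level <= regexp_count(t.str, '[^, .]+')
  ) v
)
select id,
       listagg(
         case
           when l = 1 then tok                      -- first token is never replaced
           when substr(tok, -1, 1) = '.' then tok   -- token followed by a dot is kept
           else regexp_replace(tok, '^[A-Za-z_]+', 'replaced')
         end
       ) within group (order by l) result
from   vals
group  by id
order  by id;
```

Adding a row to tests for every odd case you come across gives you a small regression suite to rerun whenever the case expression changes.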
