oracle sql regexp\u substr非捕获/可选组

7jmck4yq  于 2021-08-09  发布在  Java
关注(0)|答案(2)|浏览(496)

表达式:

  1. Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\].+?\.(?: Target definition = (\d+))?.*

正确生成以下匹配项:

  1. Group 1. 24-30 494801
  2. Group 2. 38-45 8280955
  3. Group 3. 52-59 8336297
  4. Group 4. 103-109 494767

对于输入字符串:

  1. Reassigning definition: 494801 from: [8280955] to: [8336297], advancing due dates. Target definition = 494767.

输入字符串的前3个匹配项:

  1. Reassigning definition: 494801 from: [8280955] to: [8336297], advancing due dates.

使用javascript、python、php和golang风格(参见https://regex101.com/r/br66wm/3),但不使用sql regexp substr:

  1. with
  2. input_string as
  3. (
  4. select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
  5. union all
  6. select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
  7. ),
  8. pattern_string as
  9. (
  10. select 'Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\].+?\.(?: Target definition = (\d+))?.*$' as pattern_string from dual
  11. )
  12. select
  13. regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 1) as group_1,
  14. regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 2) as group_2,
  15. regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 3) as group_3,
  16. regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 4) as group_4
  17. from
  18. input_string i, pattern_string p;

第四组总是 null . 我使用非捕获组有什么问题?基本上,以下句子在我的输入测试字符串中是可选的:

  1. Target definition = 494767.
c0vxltue

c0vxltue1#

这是一个有点太多的意见,所以我将写在这里。如果说不通,我就把它拿走。
如果您总是在这些字符串中寻找数字(与它们周围的内容无关),那么可以简化为

  1. SQL> with
  2. 2 input_string as
  3. 3 (
  4. 4 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
  5. 5 union all
  6. 6 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
  7. 7 )
  8. 8 select regexp_substr(test_string, '\d+', 1, 1) grp1,
  9. 9 regexp_substr(test_string, '\d+', 1, 2) grp2,
  10. 10 regexp_substr(test_string, '\d+', 1, 3) grp3,
  11. 11 regexp_substr(test_string, '\d+', 1, 4) grp4
  12. 12 from input_string;
  13. GRP1 GRP2 GRP3 GRP4
  14. ---------- ---------- ---------- ----------
  15. 494801 8280955 8336297 494767
  16. 494801 8280955 8336297
  17. SQL>

或者,没有固定组数的选项(不过,布局与您想要的不同):

  1. SQL> with
  2. 2 input_string as
  3. 3 (
  4. 4 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
  5. 5 union all
  6. 6 select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates.' as test_string from dual
  7. 7 )
  8. 8 select column_value grp_rn,
  9. 9 regexp_substr(test_string, '\d+', 1, column_value) grp
  10. 10 from input_String cross join
  11. 11 table(cast(multiset(select level from dual
  12. 12 connect by level <= regexp_count(test_string, '\d+')
  13. 13 ) as sys.odcinumberlist));
  14. GRP_RN GRP
  15. ------- ----------
  16. 1 494801
  17. 2 8280955
  18. 3 8336297
  19. 4 494767
  20. 1 494801
  21. 2 8280955
  22. 3 8336297
  23. 7 rows selected.
展开查看全部
bcs8qyzn

bcs8qyzn2#

因为基于posix的regex实现似乎不支持非捕获组和捕获的 regex_substr 不容易作为单独的列提供,我使用了以下内容,基本上对可选组使用不同的regex。

  1. with
  2. input_string as
  3. (
  4. select 'Reassigning definition: 494801 from: [8280955] to: [8336297], advancing dates. Target definition = 494767.' as test_string from dual
  5. union all
  6. select 'Reassigning definition: 494767 from: [8336297] to: [8369944], advancing dates.' as test_string from dual
  7. ),
  8. pattern_string as
  9. (
  10. select 'Reassigning definition: (\d+) from: \[(\d+)\] to: \[(\d+)\]' as pattern_string from dual
  11. )
  12. select
  13. regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 1) as group_1,
  14. regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 2) as group_2,
  15. regexp_substr(i.test_string, p.pattern_string, 1, 1, null, 3) as group_3,
  16. regexp_substr(i.test_string, 'Target definition = (\d+)', 1, 1, null, 1) as group_4
  17. from
  18. input_string i, pattern_string p;
展开查看全部

相关问题