regex: How to capture (0/32 LK), (0/21 x), and (3/17) in a single regular expression

ih99xse1, posted on 2022-11-18 in Other

I want to clean up some TNM entries. Here is an example:

df <- structure(list(TNM = c("pT3 N0 (0/13)", "pT3 N2b (21/45l)", "pT3 N0 (0/32 LK)"
)), class = "data.frame", row.names = c(NA, -3L))

               TNM
1    pT3 N0 (0/13)
2 pT3 N2b (21/45l)
3 pT3 N0 (0/32 LK)

This is what I have so far:

library(dplyr)
library(stringr)

df %>% 
  mutate(TNM = str_remove_all(TNM, '\\,|\\;|\\.'),
         TNM = str_replace_all(TNM, ' ', ''),
         TNM = str_replace_all(TNM, "x", "X")) %>% 
  mutate(N_count = str_extract(TNM, '\\(\\d+\\/\\d+\\)'))

            TNM N_count
1    pT3N0(0/13)  (0/13)
2 pT3N2b(21/45l)    <NA>
3  pT3N0(0/32LK)    <NA>

This works:

library(dplyr)
library(stringr)

df %>% 
  mutate(TNM = str_remove_all(TNM, '\\,|\\;|\\.'),
         TNM = str_replace_all(TNM, ' ', ''),
         TNM = str_replace_all(TNM, "x", "X")) %>% 
  mutate(N_count = str_extract(TNM, '\\(\\d+\\/\\d+\\)|\\(\\d+\\/\\d+\\w\\)|\\(\\d+\\/\\d+\\w+\\)'))

             TNM  N_count
1    pT3N0(0/13)   (0/13)
2 pT3N2b(21/45l) (21/45l)
3  pT3N0(0/32LK) (0/32LK)

Is there a way to shorten this regex: '\\(\\d+\\/\\d+\\)|\\(\\d+\\/\\d+\\w\\)|\\(\\d+\\/\\d+\\w+\\)'

iih3973s


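The three branches of your alternation match zero, exactly one, or one or more word characters after the second number.
You can shorten the pattern by dropping the alternation and letting the word characters repeat zero or more times with \\w*: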

\\(\\d+/\\d+\\w*\\)
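
As a quick check, here is a minimal sketch that applies the shortened pattern to the three sample values as they look after the cleanup step in the question (spaces removed). The variable names tnm, pattern, and n_count are only illustrative.

```r
library(stringr)

# Sample values after the question's cleanup step (all spaces removed).
tnm <- c("pT3N0(0/13)", "pT3N2b(21/45l)", "pT3N0(0/32LK)")

# \\w* allows zero or more word characters before the closing parenthesis,
# which covers all three branches of the original alternation.
pattern <- "\\(\\d+/\\d+\\w*\\)"

n_count <- str_extract(tnm, pattern)
print(n_count)
```

Since \\w* also matches the empty string, the plain (0/13) case needs no separate branch.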

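If you also want to match (0/32 LK) with its inner space, but not a bare trailing space as in (21/45 ), you can add an optional group that matches optional whitespace followed by one or more word characters: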

\\(\\d+/\\d+(?:\\s*\\w+)?\\)
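
The following sketch tries this pattern on the original, uncleaned strings, so the inner space in (0/32 LK) is still present. The fourth value, with only a trailing space inside the parentheses, is a made-up input based on the (21/45 ) case mentioned above; the names raw, pattern_ws, and res are illustrative.

```r
library(stringr)

# Original strings with spaces kept, plus one hypothetical entry
# that has only a trailing space before the closing parenthesis.
raw <- c("pT3 N0 (0/13)", "pT3 N2b (21/45l)", "pT3 N0 (0/32 LK)", "pT3 N1 (21/45 )")

# (?:\\s*\\w+)? is an optional non-capturing group: optional whitespace
# followed by at least one word character.
pattern_ws <- "\\(\\d+/\\d+(?:\\s*\\w+)?\\)"

res <- str_extract(raw, pattern_ws)
print(res)
```

The first three values should be extracted in full, while the last one yields NA because whitespace inside the group must be followed by at least one word character.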

