I have this character vector:
vec <- c("(0,13.2]", "(13.2,28.3]", "(28.3,39.3]", "(39.3,49.4]", "(49.4,59.4]",
"(59.4,69.3]", "(69.3,78.9]", "(78.9,87.8]", "(87.8,95.5]", "(95.5,100]")
I would like to change the entries to:
expected <- c("0 to 13.2", "13.2 to 28.3", "28.3 to 39.3", "39.3 to 49.4", "49.4 to 59.4",
"59.4 to 69.3", "69.3 to 78.9", "78.9 to 87.8", "87.8 to 95.5", "95.5 to 100")
What I did was:
library(magrittr)  # provides the %>% pipe

vec %>%
  strsplit(., ",") %>%
  lapply(., function(level_i) {
    from <- gsub("^\\(([0-9])+(\\.)*([0-9])*$", "\\1\\2\\3", level_i[1])
    to <- gsub("^([0-9])+(\\.)*([0-9])*]$", "\\1\\2\\3", level_i[2])
    paste0(from, " to ", to)
  }) %>%
  unlist()
# This gives:
# "0 to 3.2" "3.2 to 8.3" "8.3 to 9.3" "9.3 to 9.4" "9.4 to 9.4" "9.4 to 9.3" "9.3 to 8.9"
# "8.9 to 7.8" "7.8 to 5.5" "5.5 to 0"
My code only captures the last character of each group, i.e. "(0,13.2]" becomes "0 to 3.2" instead of "0 to 13.2". How can I capture all the characters of a group?
3 Answers
wkftcu5l #1
With gsub, you can use () to capture groups:
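The code for this answer did not survive on the page. What follows is a minimal sketch of what the sentence describes, assuming the pattern captured each bound with `.*` (the next sentence suggests it did). The difference from the question's attempt is that the quantifier sits inside the parentheses, so the group holds the whole match rather than only the last repeated character.

```r
vec <- c("(0,13.2]", "(13.2,28.3]", "(95.5,100]")

# Assumed reconstruction, not the answerer's verbatim code.
# The quantifier is inside the group: (.*) rather than (.)*
res_loose <- gsub("\\((.*),(.*)\\]", "\\1 to \\2", vec)
res_loose
```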
To capture the numbers precisely instead of using .*, you can do the following. This covers both integer and decimal formats:
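The stricter pattern is also missing from the page. A sketch under the assumption that it mirrors the raw-string version quoted further down, with the decimal point made optional via `\\.?`:

```r
vec <- c("(0,13.2]", "(13.2,28.3]", "(95.5,100]")

# Assumed reconstruction, not the answerer's verbatim code.
# \\d+ = integer part, \\.? = optional decimal point, \\d* = optional fraction
res_strict <- gsub("\\((\\d+\\.?\\d*),(\\d+\\.?\\d*)\\]", "\\1 to \\2", vec)
res_strict
```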
With all those backslashes, you can simplify the regex by using raw strings (available since R 4.0.0):
r"{\((\d+[\.]*\d*),(\d+[\.]*\d*)\]}"
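As an illustration of how a raw-string pattern might be plugged into gsub, here is a small sketch. It assumes R 4.0.0 or later, and it uses `\.?` in place of `[\.]*` so that at most one decimal point is accepted; that substitution is my choice, not part of the answer.

```r
vec <- c("(0,13.2]", "(13.2,28.3]", "(95.5,100]")

# Raw strings r"{...}" need no doubled backslashes (R >= 4.0.0).
# The replacement can be written as a raw string as well.
res_raw <- gsub(r"{\((\d+\.?\d*),(\d+\.?\d*)\]}", r"{\1 to \2}", vec)
res_raw
```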
smdnsysy #2
You can try read.table + trimws:
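The original snippet is missing here, so the following is only a guess at how read.table and trimws could be combined. It assumes the `whitespace` argument of trimws (R 3.6.0 or later) is used to strip the brackets, and it reads both columns as character so the numbers are not reformatted; `colClasses` is my addition.

```r
vec <- c("(0,13.2]", "(13.2,28.3]", "(95.5,100]")

# Strip the leading "(" and the trailing "]" from each entry.
stripped <- trimws(vec, whitespace = "[(\\]]")

# Split on the comma into two character columns.
bounds <- read.table(text = stripped, sep = ",", colClasses = "character")

# Paste the two columns together row by row.
res_table <- do.call(paste, c(bounds, sep = " to "))
res_table
```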
This gives the desired output.
Another trick is sub + trimws + chartr:
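This snippet is also missing from the page. One way the three functions might fit together, offered as an assumption rather than the answerer's code: chartr turns both brackets into spaces, trimws removes them, and sub swaps the single comma for " to ".

```r
vec <- c("(0,13.2]", "(13.2,28.3]", "(95.5,100]")

# chartr maps "(" and "]" to spaces, one character for one character.
spaced <- chartr("(]", "  ", vec)

# trimws drops the padding; sub replaces the first (and only) comma.
res_chartr <- sub(",", " to ", trimws(spaced), fixed = TRUE)
res_chartr
```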
zbdgwd5y #3
We can use gsub twice:
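The code that followed is not on the page. A plausible sketch of the two-pass idea, assuming the inner call removes the brackets and the outer call replaces the comma:

```r
vec <- c("(0,13.2]", "(13.2,28.3]", "(95.5,100]")

# Inner gsub: delete "(" and "]". Outer gsub: turn "," into " to ".
# The order of the two passes is an assumption.
res_twice <- gsub(",", " to ", gsub("\\(|\\]", "", vec), fixed = TRUE)
res_twice
```

Alternation (`\\(|\\]`) is used instead of a bracket expression because, in R's default regex engine, a backslash inside `[...]` is treated literally, so `[(\\]]` would not match a closing bracket on its own.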