Why doesn't forstringr::str_split_extract split on a period (.) the way it does on other characters?

y4ekin9u  posted on 2023-10-13  in Other

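Why doesn't forstringr::str_split_extract split on the period character the way it does on other characters?
Following the example in the str_split_extract manual, this code works as expected: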

library(forstringr)

code <- c("HS-IB-EDE", "OG-OYO-CAS-0121", "NY-ILR-NIG-036")

str_split_extract(code, "-", 1)

str_split_extract(code, "-", 4)

This returns:

> str_split_extract(code, "-", 1)
[1] "HS" "OG" "NY"
> str_split_extract(code, "-", 4)
[1] NA     "0121" "036"

However, if I change the separator from "-" to ".", it returns NA for every value:

code <- c("HS.IB.EDE", "OG.OYO.CAS.0121", "NY.ILR.NIG.036")

str_split_extract(code, ".", 1)

str_split_extract(code, ".", 4)

This gives:

> str_split_extract(code, ".", 1)
[1] NA NA NA
> str_split_extract(code, ".", 4)
[1] NA NA NA

Why?
Example code: https://search.r-project.org/CRAN/refmans/forstringr/html/str_split_extract.html

Answer #1 (r7s23pms)

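I found a similar issue reported for stringr:
https://github.com/tidyverse/stringr/issues/303
It turns out that the pattern string is treated as a regex by default, and "." is a valid regex, so the call works as designed; it just doesn't do what I expected. In a regex, an unescaped "." matches any single character, so every character in the string is treated as a delimiter and no non-empty pieces are left to extract. To split on a literal period, the dot has to be escaped in the pattern: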

code <- c("HS.IB.EDE", "OG.OYO.CAS.0121", "NY.ILR.NIG.036")

str_split_extract(code, "\\.", 1)

str_split_extract(code, "\\.", 4)

which returns:

> str_split_extract(code, "\\.", 1)
[1] "HS" "OG" "NY"
> str_split_extract(code, "\\.", 4)
[1] NA     "0121" "036"
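
The difference between a regex dot and a literal dot can also be seen without forstringr. The following is a minimal base-R sketch, not the package's implementation: it uses strsplit() to compare the three patterns, and nth_piece is a hypothetical helper introduced here only to mimic "take the n-th piece, NA if it is missing".

```r
# Base-R sketch (not forstringr): compare a regex split with a literal split.
code <- c("HS.IB.EDE", "OG.OYO.CAS.0121", "NY.ILR.NIG.036")

# An unescaped "." is a regex that matches any character, so every
# character is consumed as a delimiter and only empty strings remain.
regex_pieces <- strsplit(code, ".")

# Escaping the dot, or requesting a literal match with fixed = TRUE,
# splits on actual periods instead.
escaped_pieces <- strsplit(code, "\\.")
literal_pieces <- strsplit(code, ".", fixed = TRUE)

# Hypothetical helper: n-th piece of each element, NA when there is none.
nth_piece <- function(pieces, n) {
  vapply(
    pieces,
    function(p) if (length(p) >= n) p[n] else NA_character_,
    character(1)
  )
}

first_codes  <- nth_piece(literal_pieces, 1)
fourth_codes <- nth_piece(literal_pieces, 4)
```

Here regex_pieces contains nothing but empty strings, while escaped_pieces and literal_pieces are the same list of period-separated parts. Whether str_split_extract accepts a stringr::fixed() pattern is not confirmed by the thread, so the escaped "\\." form shown in the answer is the safe choice there.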
