Splitting a string into chunks using recursion in R

7gs2gvoe, posted on 2022-12-20 in Other

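I am trying to understand the concept of recursion. However, when I apply my recursive R function, it does not return the string split into the required number of chunks; it returns only two chunks. My goal is to split a long string into multiple smaller chunks of size n. I am sure there are other ways to do this, but I am trying to find a recursive solution. Any help is appreciated. Thanks in advance.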

# Sample dataset
x <- paste0(rep(letters, 10000), collapse = "")

split_group <- function(x, n = 10) {
    if (nchar(x) < n) {
        return(x)
    } else {
        beginning <- substring(x, 1, n)
        remaining <- substring(x, (n + 1), (n + 1) + (n - 1))
        c(beginning, split_group(remaining, n))
    }
}

split_group(x = x, n = 10)

# Returns:  "abcdefghij" "klmnopqrst" ""
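
A quick trace of the first two calls shows where the chunks go missing. The sketch below reuses the `x` and `n` from the code above; the names `first_remaining` and `second_remaining` are introduced here only for illustration.

```r
x <- paste0(rep(letters, 10000), collapse = "")
n <- 10

# First call: `remaining` is capped at n characters (positions 11..20)
# instead of holding everything after position n.
first_remaining <- substring(x, n + 1, (n + 1) + (n - 1))
nchar(first_remaining)

# Second call: the input now has exactly n characters, so `nchar(x) < n`
# is FALSE and the else branch runs. Asking for positions 11..20 of a
# 10-character string yields "", which then ends the recursion.
second_remaining <- substring(first_remaining, n + 1, (n + 1) + (n - 1))
nchar(second_remaining)
```

So the rest of the string is dropped on the first call, and the strict `<` comparison lets one extra call through that produces the trailing empty string.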

Answer #1 (sbtkgmzw)

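**1)** Use `<=` instead of `<`, and fix the computation of `remaining` so that it keeps everything after the first n characters.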

split_group <- function(x, n = 10) {
    if (nchar(x) <= n) x
    else {
        beginning <- substring(x, 1, n)
        remaining <- substring(x, n + 1)
        c(beginning, split_group(remaining, n))
    }
}

x <- substring(paste(letters, collapse = ""), 1, 24)

split_group(x, 2)
##  [1] "ab" "cd" "ef" "gh" "ij" "kl" "mn" "op" "qr" "st" "uv" "wx"

split_group(x, 5)
## [1] "abcde" "fghij" "klmno" "pqrst" "uvwx" 

split_group(x, 6)
## [1] "abcdef" "ghijkl" "mnopqr" "stuvwx"

split_group(x, 10)
## [1] "abcdefghij" "klmnopqrst" "uvwx"      

split_group(x, 23)
## [1] "abcdefghijklmnopqrstuvw" "x"                      

split_group(x, 24)
## [1] "abcdefghijklmnopqrstuvwx"

split_group(x, 25)
## [1] "abcdefghijklmnopqrstuvwx"
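
One thing to keep in mind with the version above is that it makes one recursive call per chunk, so the question's 260,000-character sample with n = 10 would need a call chain roughly 26,000 levels deep. Whether that fits depends on R's nesting and stack limits; this is an assumption about the runtime, not something tested in this answer. Below is a sketch of a recursive variant that splits the work in half at a chunk boundary, so the depth grows only with the logarithm of the number of chunks. The function name `split_group_halving` and the variables `k` and `cut` are made up for this illustration.

```r
split_group_halving <- function(x, n = 10) {
    len <- nchar(x)
    # Base case: the string already fits in a single chunk.
    if (len <= n) return(x)
    # Total number of chunks; at least 2 here because len > n.
    k <- ceiling(len / n)
    # Cut after the first half of the chunks, so the left part
    # consists only of full-size chunks.
    cut <- (k %/% 2) * n
    c(split_group_halving(substring(x, 1, cut), n),
      split_group_halving(substring(x, cut + 1), n))
}

x_long <- paste0(rep(letters, 10000), collapse = "")
chunks <- split_group_halving(x_long, 10)
length(chunks)
## [1] 26000
head(chunks, 3)
## [1] "abcdefghij" "klmnopqrst" "uvwxyzabcd"
```

Because the cut always falls on a multiple of n, only the last chunk can be shorter than n, the same as in the one-call-per-chunk version.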

**2)** And some approaches without recursion. The first is the shortest, but the second is the simplest and uses only base R.

library(gsubfn)
strapply(x, "(.{1,10})", simplify = c)
## [1] "abcdefghij" "klmnopqrst" "uvwx"      

ix <- seq(1, nchar(x), 10)
substring(x, ix, ix + 10 - 1)
## [1] "abcdefghij" "klmnopqrst" "uvwx"      

sapply(seq(1, nchar(x), 10), function(i) substring(x, i, i + 10 - 1))
## [1] "abcdefghij" "klmnopqrst" "uvwx"    

library(zoo)
s <- strsplit(x, "")[[1]]
rollapply(s, 10, by = 10, paste0, collapse = "", partial = TRUE, align = "left")
## [1] "abcdefghij" "klmnopqrst" "uvwx"
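
Since the recursive and non-recursive versions are meant to be interchangeable, a quick consistency check can confirm that they agree on the chunk sizes tried above. This sketch repeats `split_group` from 1) so that it runs on its own, and wraps the base R `substring` approach in a helper called `chunk_substring`, a name invented here.

```r
# Recursive version, copied from 1).
split_group <- function(x, n = 10) {
    if (nchar(x) <= n) x
    else {
        beginning <- substring(x, 1, n)
        remaining <- substring(x, n + 1)
        c(beginning, split_group(remaining, n))
    }
}

# Hypothetical helper around the `substring` approach.
# Assumes nchar(x) >= 1; seq() would fail on an empty string.
chunk_substring <- function(x, n = 10) {
    ix <- seq(1, nchar(x), by = n)
    substring(x, ix, ix + n - 1)
}

x <- substring(paste(letters, collapse = ""), 1, 24)
sizes <- c(2, 5, 6, 10, 23, 24, 25)

# TRUE for each size where both functions return the same vector.
agree <- vapply(sizes,
                function(n) identical(split_group(x, n), chunk_substring(x, n)),
                logical(1))
all(agree)
```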

Answer #2 (sr4lhrrt)

A base R option is:

x1 <- strsplit(x, "(?<=.{10})(?=.)", perl = TRUE)[[1]]
  • Output
> head(x1, 10)
 [1] "abcdefghij" "klmnopqrst" "uvwxyzabcd" "efghijklmn" "opqrstuvwx" "yzabcdefgh" "ijklmnopqr" "stuvwxyzab" "cdefghijkl" "mnopqrstuv"
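
The pattern describes a zero-width split point: `(?<=.{10})` requires ten characters behind the position, and `(?=.)` requires at least one character ahead, so no split is made at the very end of the string. As a small sketch, the same call can be tried on the 24-character string used in the first answer to see how a shorter final chunk is handled; the variable name `x24` is introduced here for illustration.

```r
x24 <- substring(paste(letters, collapse = ""), 1, 24)

# Same pattern as above, applied to a string whose length
# is not a multiple of 10.
chunks <- strsplit(x24, "(?<=.{10})(?=.)", perl = TRUE)[[1]]
chunks
nchar(chunks)
```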
