R: Can different elements of a factor share the same level?

ddrv8njm  posted on 2023-02-27 in Other

I have searched Google many times without finding what I need.
My sample data looks like this:

year = c(1991,1996,2001,2006,2011,2016,2021)

factor(year,levels = c(1991,1996,2001,2011,2016,2021))

The result is:

[1] 1991 1996 2001 <NA> 2011 2016 2021
Levels: 1991 1996 2001 2011 2016 2021

I want 2006 to be assigned the same level as 2001, so my desired result would be:

[1] 1991 1996 2001 2006 2011 2016 2021
Levels: 1991 1996 2001 2011 2016 2021

Is it possible to give 2006 the same level as 2001 without changing the original contents of the vector year?

koaltpgm  #1

If you dig into the source code of factor, I suspect the answer will become clear (I think the answer to your question is "no").

> factor
function (x = character(), levels, labels = levels, exclude = NA, 
    ordered = is.ordered(x), nmax = NA)
{
    if (is.null(x))
        x <- character()
    nx <- names(x)
    if (missing(levels)) {
        y <- unique(x, nmax = nmax)
        ind <- order(y)
        levels <- unique(as.character(y)[ind])
    }
    force(ordered)
    if (!is.character(x))
        x <- as.character(x)
    levels <- levels[is.na(match(levels, exclude))]
    f <- match(x, levels)
    if (!is.null(nx))
        names(f) <- nx
    if (missing(labels)) {
        levels(f) <- as.character(levels)
    }
    else {
        nlab <- length(labels)
        if (nlab == length(levels)) {
            nlevs <- unique(xlevs <- as.character(labels))
            at <- attributes(f)
            at$levels <- nlevs
            f <- match(xlevs, nlevs)[f]
            attributes(f) <- at
        }
        else if (nlab == 1L)
            levels(f) <- paste0(labels, seq_along(levels))
        else stop(gettextf("invalid 'labels'; length %d should be 1 or %d",
            nlab, length(levels)), domain = NA)
    }
    class(f) <- c(if (ordered) "ordered", "factor")
    f
}
<bytecode: 0x00000186f0fe3640>
<environment: namespace:base>

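As we can see, levels is either generated by unique(x, nmax = nmax) (when no levels argument is supplied) or taken from the supplied levels after filtering with levels[is.na(match(levels, exclude))]. Each element is then stored as the position returned by match(x, levels), and printing shows the level at that position. So, through the levels argument alone, one level cannot correspond to two different x values; an x value that is absent from levels becomes NA.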
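The matching step described above can be reproduced by hand. This is only a small sketch using the year vector from the question; the names lv, codes and f are illustrative and do not come from the original post.

```r
year <- c(1991, 1996, 2001, 2006, 2011, 2016, 2021)
lv   <- c(1991, 1996, 2001, 2011, 2016, 2021)

# factor() converts x to character and matches it against the levels;
# 2006 has no match among the levels, so its code is NA
codes <- match(as.character(year), as.character(lv))
codes

f <- factor(year, levels = lv)

# the factor stores only these integer codes plus the levels attribute,
# so the original value 2006 is not kept anywhere inside f
identical(as.integer(f), codes)
is.na(f[4])
```

Because only the codes and the levels are stored, there is no place in the factor itself where 2006 could survive as a displayed value.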

5f0d552i  #2

This is not possible. I'm not sure what your goal is, but you can do the following, which keeps the original values as the names.

year = c(1991,1996,2001,2006,2011,2016,2021,2006)

year2 <- factor(year,levels = c(1991,1996,2001,2006,2011,2016,2021), labels = c(1991,1996,2001,2001,2011,2016,2021))

names(year2) <- year

year2

1991 1996 2001 2006 2011 2016 2021 2006 
1991 1996 2001 2001 2011 2016 2021 2001 
Levels: 1991 1996 2001 2011 2016 2021

str(year2)
 Factor w/ 6 levels "1991","1996",..: 1 2 3 3 4 5 6 3
 - attr(*, "names")= chr [1:8] "1991" "1996" "2001" "2006" ...
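If the goal is grouping rather than display, another option might be to leave year untouched and keep the merged factor alongside it. The following is a minimal base R sketch, not the answerer's method; the names period and df are hypothetical, and it relies on the levels<- replacement merging levels that end up with the same name.

```r
year <- c(1991, 1996, 2001, 2006, 2011, 2016, 2021, 2006)

period <- factor(year)
# renaming the "2006" level to "2001" merges the two levels into one
levels(period)[levels(period) == "2006"] <- "2001"

# year keeps its original values; period carries the grouping
df <- data.frame(year = year, period = period)
df

# counts per merged level
table(df$period)
```

Here year still contains 2006, while period places those rows in the 2001 level, which is usually what grouping or modelling functions need.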
