dplyr: conditionally mutating a column's data type

o2rvlv0m, posted on 2023-01-18 in Other

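My goal is to conditionally change a column's data type using the tidyverse. Below is a reproducible example in which I want to convert the column cyl to a factor. The levels and labels arguments of factor(), however, depend on whether the user has supplied an object bin.order or left it as NULL. I know how to do this outside the tidyverse; I am looking for a more concise way to do it with tidyverse functions. My current attempt is: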

mtcars %>% 
  mutate(cyl = ifelse(is.null(bin.order), 
                      factor(x = cyl, levels = sort(unique(cyl)), labels = sort(unique(cyl))), 
                      factor(x = cyl, levels = bin.order, labels = bin.order)))

The desired result would look like this:

# if bin.order is null
mtcars %>% 
  mutate(cyl = factor(x = cyl, levels = sort(unique(cyl)), labels = sort(unique(cyl))))

# if bin.order is not null
bin.order = c(4, 6, 8)
mtcars %>% 
  mutate(cyl = factor(x = cyl, levels = bin.order, labels = bin.order))

cbjzeqam #1

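You can use the %||% operator (from rlang, re-exported by purrr), which returns its left-hand side if that is not NULL and its right-hand side otherwise. In other words, x %||% y is equivalent to if (is.null(x)) y else x.
For your case: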
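As a quick illustration of that equivalence, here is a minimal sketch; the fallback vector c(4, 6, 8) and the variable names are made up for the demonstration and are not part of the question.

```r
library(rlang)  # provides %||%; purrr re-exports the same operator

fallback <- c(4, 6, 8)  # hypothetical default, used only in this demo

# Left-hand side is NULL, so the right-hand side is returned
bin.order <- NULL
picked_when_null <- bin.order %||% fallback

# Left-hand side is not NULL, so it is returned unchanged
bin.order <- c(6, 4, 8)
picked_when_set <- bin.order %||% fallback

# The same choice written out with a plain if/else
picked_with_if <- if (is.null(bin.order)) fallback else bin.order
```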

library(dplyr)
library(purrr)

factor.bin.order <- function(x, bin.order = NULL) {
  factor(x, bin.order %||% sort(unique(x)))
}

mtcars2 <- mtcars %>% 
  mutate(
    cyl1 = factor.bin.order(cyl),
    cyl2 = factor.bin.order(cyl, c(6, 4, 8))
  )

levels(mtcars2$cyl1)
# "4" "6" "8"

levels(mtcars2$cyl2)
# "6" "4" "8"

Also note that you do not need to specify labels when they are the same as levels, since that is the default behavior.
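A small check of that default, using only base R. The level order c(6, 4, 8) is borrowed from the example above, and the variable names are illustrative.

```r
x  <- mtcars$cyl
lv <- c(6, 4, 8)

# labels spelled out explicitly, as in the question
with_labels <- factor(x, levels = lv, labels = lv)

# labels omitted; factor() falls back to labels = levels
without_labels <- factor(x, levels = lv)

# Compare the two results
same_result <- identical(with_labels, without_labels)
```

If same_result is TRUE, the labels argument can simply be dropped in this situation.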


mqxuamgl #2

A possible solution is to build a function:

fct_if <- function(x,bin.order = NULL){
  
  if(is.null(bin.order)){
    output <- factor(x = x, levels = sort(unique(x)), labels = sort(unique(x)))
  }else{
    output <- factor(x = x, levels = bin.order, labels = bin.order)
  }
  return(output)
}

mtcars %>% 
  mutate(cyl = fct_if(cyl)) 

mtcars %>% 
  mutate(cyl = fct_if(cyl,bin.order = c(4, 6, 8)))
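If a separate helper feels like too much, the same branching can probably be written inline. The sketch below assumes bin.order exists in the calling environment (set to NULL here) and uses a plain if/else rather than ifelse(). ifelse() is vectorised over its test, so with a length-one test such as is.null(bin.order) it returns a single value instead of a whole factor column.

```r
library(dplyr)

bin.order <- NULL  # assumption: could equally be a vector such as c(4, 6, 8)

# A plain if/else evaluates to one whole object, here the vector of levels
res <- mtcars %>%
  mutate(cyl = factor(
    cyl,
    levels = if (is.null(bin.order)) sort(unique(cyl)) else bin.order
  ))

levels(res$cyl)

# For contrast: ifelse() with a length-one test yields a length-one result
one_value <- ifelse(is.null(bin.order), factor(mtcars$cyl), factor(mtcars$cyl))
length(one_value)
```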
