Recoding the same variable across multiple data frames

8ehkhllq  posted on 2023-03-05  in Other

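I want a streamlined way to recode the same variable, in the same way, across multiple data frames. For example, I am currently recoding the age variable in the state datasets FL and GA, and I am coding each one separately. How can I condense this code?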

library(dplyr)

FL <- FL %>%
  mutate(
    # Create categories
    age_group = dplyr::case_when(
      age >= 18 & age <= 29 ~ "18-29",
      age >= 30 & age <= 39 ~ "30-39",
      age >= 40 & age <= 49 ~ "40-49",
      age >= 50 & age <= 64 ~ "50-64",
      age >= 65 ~ "65+"
    ),
    # Convert to factor
    age_group = factor(
      age_group,
      levels = c("18-29", "30-39", "40-49", "50-64", "65+")
    )
  )

GA <- GA %>%
  mutate(
    # Create categories
    age_group = dplyr::case_when(
      age >= 18 & age <= 29 ~ "18-29",
      age >= 30 & age <= 39 ~ "30-39",
      age >= 40 & age <= 49 ~ "40-49",
      age >= 50 & age <= 64 ~ "50-64",
      age >= 65 ~ "65+"
    ),
    # Convert to factor
    age_group = factor(
      age_group,
      levels = c("18-29", "30-39", "40-49", "50-64", "65+")
    )
  )
Answer #1 (ifsvaxew)

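We can pass the same function as the argument of a looping function.
First, put all your data frames in a list (there are several ways to do this; without a proper reproducible example it is hard to tell which is best).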

my_dfs <- list(FL, GA)
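
Two of the "several ways" mentioned above can be sketched as follows. This is only an illustration: the FL and GA data frames here are made-up stand-ins, and `my_dfs_by_name` is a hypothetical name. Naming the list elements is optional, but it makes it easier to tell which result belongs to which state later on.

```r
# Toy stand-ins for the real FL and GA data frames (hypothetical values).
FL <- data.frame(age = c(22, 35, 67))
GA <- data.frame(age = c(41, 58, 19))

# Option 1: name the list elements explicitly.
my_dfs <- list(FL = FL, GA = GA)

# Option 2: look the objects up by name; mget() returns a named list.
my_dfs_by_name <- mget(c("FL", "GA"))

# Both approaches give a list whose elements are named after the data frames.
names(my_dfs)
names(my_dfs_by_name)
```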

Then define your function:

my_function <- function(x) x %>%
  mutate(
    # Create categories
    age_group = dplyr::case_when(
      age >= 18 & age <= 29 ~ "18-29",
      age >= 30 & age <= 39 ~ "30-39",
      age >= 40 & age <= 49 ~ "40-49",
      age >= 50 & age <= 64 ~ "50-64",
      age >= 65 ~ "65+"
    ),
    # Convert to factor
    age_group = factor(
      age_group,
      levels = c("18-29", "30-39", "40-49", "50-64", "65+")
    )
  )

Finally, call it in a loop:

# Assign the result; lapply() returns a new list and does not modify FL or GA in place
my_dfs <- lapply(my_dfs, my_function)
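
For completeness, here is a self-contained sketch of the whole workflow that runs without dplyr. It is not the code from the answer: `recode_age` is a hypothetical helper that swaps `case_when()` for base R's `cut()`, and the data are made up. One difference to be aware of: `cut()` with these breaks puts a non-integer age such as 29.5 into "18-29", whereas the `case_when()` conditions above would leave it as `NA`. Ages below 18 become `NA` in both versions.

```r
# Toy stand-ins for the real data (hypothetical values).
FL <- data.frame(age = c(18, 29, 30, 49, 50))
GA <- data.frame(age = c(64, 65, 90, 17))

# Hypothetical helper: a base-R counterpart of my_function, using cut().
recode_age <- function(df) {
  df$age_group <- cut(
    df$age,
    breaks = c(18, 30, 40, 50, 65, Inf),
    labels = c("18-29", "30-39", "40-49", "50-64", "65+"),
    right = FALSE  # intervals are [18, 30), [30, 40), ..., [65, Inf)
  )
  df
}

# Named list, so each result stays attached to its state.
my_dfs <- list(FL = FL, GA = GA)

# lapply() returns a new list; assign it to keep the recoded copies.
my_dfs <- lapply(my_dfs, recode_age)

# Optional: overwrite the original FL and GA objects with the recoded versions.
invisible(list2env(my_dfs, envir = environment()))

table(GA$age_group, useNA = "ifany")
```

If you prefer to keep everything in the list, skip the `list2env()` step and access the results as `my_dfs$FL` and `my_dfs$GA`.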
