Reference columns dynamically inside mutate() with dplyr

Posted by l3zydbqr on 2023-07-31 in Other


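I'm trying to use summarise() to create columns based on dynamic column names. I found that I can easily create dynamic names with the glue syntax "{}" and :=, but I can't figure out how to reference those columns in a subsequent mutate() call.
From the hints I've read online, the common solutions are to use {{varname}}, or to use varname_enq <- enquo(varname) together with !!varname_enq. Unfortunately, neither approach works for me.
So far I've looked at other SO posts as well as the Programming with dplyr guide. I'd really appreciate any advice you can give!
Below is a small example that highlights the problem.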

This is the goal:

# A tibble: 3 × 3
  Species    species_sum cumulative_sum
  <fct>            <dbl>          <dbl>
1 setosa            250.           250.
2 versicolor        297.           547.
3 virginica         329.           876.

mycol <- "species_sum"
mycol_enquo <- enquo(mycol)
myothercol <- "cumulative_sum"

# this works, but the cumulative sum isn't dynamic
iris %>% 
  group_by(Species) %>% 
  summarise("{mycol}" := sum(Sepal.Length)) %>% 
  ungroup() %>%
  mutate(cumulative_sum = cumsum(species_sum))

# this works, but the cumsum function still uses a fixed variable name
iris %>% 
  group_by(Species) %>% 
  summarise("{mycol}" := sum(Sepal.Length)) %>% 
  ungroup() %>%
  mutate("{myothercol}" := cumsum(species_sum))

# doesn't work, the new column is all NA
iris %>% 
  group_by(Species) %>% 
  summarise("{mycol}" := sum(Sepal.Length)) %>% 
  ungroup() %>%
  mutate("{myothercol}" := cumsum( "{mycol}" ))
  
# doesn't work, the new column is all NA
iris %>% 
  group_by(Species) %>% 
  summarise("{mycol}" := sum(Sepal.Length)) %>% 
  ungroup() %>%
  mutate("{myothercol}" := cumsum( {{mycol}} ))

# doesn't work, the new column is all NA
iris %>% 
  group_by(Species) %>% 
  summarise("{mycol}" := sum(Sepal.Length)) %>% 
  ungroup() %>%
  mutate("{myothercol}" := cumsum( !!mycol_enquo ))

Source: https://stackoverflow.com/questions/76762950/reference-columns-dynamically-inside-mutate-with-dplyr

3 Answers


okxuctiv1#

library(tidyverse)

myFunction <- function(df, col1, col2) {
  df %>% 
    group_by(Species) %>% 
    summarise({{ col1 }} := sum(Sepal.Length)) %>% 
    ungroup() %>%
    mutate({{ col2 }} := cumsum({{ col1 }}))
}

iris %>% myFunction(species_sum, cumulative_sum)
# A tibble: 3 × 3
  Species    species_sum cumulative_sum
  <fct>            <dbl>          <dbl>
1 setosa            250.           250.
2 versicolor        297.           547.
3 virginica         329.           876.
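
A note on the answer above, offered as a sketch rather than a definitive account: {{ }} works there because col1 and col2 are function arguments passed as bare names, so the same argument can be used both as the new column's name (left of :=) and as a column reference (inside cumsum()). The same pattern should extend to the grouping and value columns. The function name cumulative_by_group and its argument names are hypothetical and chosen only for illustration.

```r
library(dplyr)

# Hypothetical helper: every column role is a function argument
# that the caller supplies as a bare (unquoted) name.
cumulative_by_group <- function(df, group_col, value_col, sum_col, cum_col) {
  df %>%
    group_by({{ group_col }}) %>%
    # {{ sum_col }} on the left of := names the new column
    summarise({{ sum_col }} := sum({{ value_col }})) %>%
    ungroup() %>%
    # {{ sum_col }} on the right refers to the column just created
    mutate({{ cum_col }} := cumsum({{ sum_col }}))
}

res <- cumulative_by_group(iris, Species, Sepal.Length, species_sum, cumulative_sum)
res
```

With the arguments shown, this should produce the same three-row table as the answer's output.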


Posted 2023-07-31

kokeuurv2#

Does this work for you?
To specify variables with a character vector, the .data pronoun is your friend.

iris %>% 
  group_by(Species) %>% 
  summarise(!!mycol := sum(Sepal.Length)) %>% 
  ungroup() %>%
  mutate(!!myothercol := cumsum(.data[[mycol]]))
# A tibble: 3 x 3
  Species    species_sum cumulative_sum
  <fct>            <dbl>          <dbl>
1 setosa            250.           250.
2 versicolor        297.           547.
3 virginica         329.           876.
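
A possible explanation of why the string-based attempts in the question return NA, given as a hedged sketch: on the right-hand side of :=, "{mycol}" is an ordinary string, so cumsum() receives the text rather than the column. Glue-style interpolation applies only to the name on the left of :=. By contrast, .data[[mycol]] looks up the column whose name is stored in mycol. The intermediate variable names sums and result are introduced here only for illustration.

```r
library(dplyr)

mycol <- "species_sum"
myothercol <- "cumulative_sum"

# Step 1: create the dynamically named summary column (glue syntax on the left)
sums <- iris %>%
  group_by(Species) %>%
  summarise("{mycol}" := sum(Sepal.Length)) %>%
  ungroup()

# Step 2: reference that column by its string name through the .data pronoun
result <- sums %>%
  mutate("{myothercol}" := cumsum(.data[[mycol]]))

result
```

Both column names now come from the character variables, so changing mycol or myothercol is enough to rename the columns throughout the pipeline.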

Alternatively, you can use across():

iris %>% 
  group_by(Species) %>% 
  summarise(across(Sepal.Length, sum, .names = mycol)) %>%
  ungroup() %>%
  mutate(across(all_of(mycol), cumsum, .names = myothercol))
# A tibble: 3 x 3
  Species    species_sum cumulative_sum
  <fct>            <dbl>          <dbl>
1 setosa            250.           250.
2 versicolor        297.           547.
3 virginica         329.           876.


Posted 2023-07-31

68bkxrlz3#

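Another option is to use sym() to convert the string to a symbol, which we then unquote with !!. Note that, in contrast to {{, we have to use " here, i.e. the column names are passed as quoted strings: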

library(dplyr)
library(rlang)

my_function <- function(df, mycol, myothercol) {
  df %>% 
    # the .by argument requires dplyr >= 1.1.0
    summarise(!!mycol := sum(Sepal.Length), .by = Species) %>% 
    mutate(!!myothercol := cumsum(!!sym(mycol)))
}

iris %>% 
  my_function("species_sum", "cumulative_sum")

     Species species_sum cumulative_sum
1     setosa       250.3          250.3
2 versicolor       296.8          547.1
3  virginica       329.4          876.5
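
To make the sym() and !! step in this answer more concrete, here is a small sketch (not part of the original answer) that inspects what gets injected. It uses expr() from rlang to show the expression after unquoting; the variable names col_sym, injected, and result are illustrative only.

```r
library(dplyr)
library(rlang)

mycol <- "species_sum"

# sym() turns the string into a symbol, i.e. a bare column name
col_sym <- sym(mycol)

# expr() returns the expression after !! has injected the symbol;
# here it is the call cumsum(species_sum)
injected <- expr(cumsum(!!col_sym))
injected

# The same injection inside a pipeline, outside of any function
result <- iris %>%
  group_by(Species) %>%
  summarise(!!mycol := sum(Sepal.Length)) %>%
  ungroup() %>%
  mutate(cumulative_sum = cumsum(!!col_sym))

result
```

Because the symbol is injected before mutate() evaluates the expression, dplyr sees cumsum(species_sum) exactly as if the column name had been typed by hand.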

Posted 2023-07-31
