R: supplying the data.table environment to `glue()` inside `:=`

Posted by qgelzfjb on 2023-11-14 in Other

I'm trying to figure out whether there is a good way to use glue() in j of a data.table:

library(data.table)
library(glue)
data(iris)
dt.iris <- data.table(iris)

dt.iris[, myText := glue('The species is {Species} with sepal length of {Sepal.Length}')] 
# Error in eval(parse(text = text, keep.source = FALSE), envir) : 
#   object 'Species' not found

It works if I pass .envir = .SD:

dt.iris[, myText := glue('The species is {Species} with sepal length of {Sepal.Length}', .envir = .SD)]
# works OK


But I'd like to find a way to avoid adding this every time. Perhaps something like:

glue1 <- function(...) glue(..., .envir = ???)

Answer 1 (8wigbo56)

Why not simply use sprintf?

> library(data.table)
> dt.iris[, myText := sprintf('The species is %s with sepal length of %.2g', 
+                               Species, Sepal.Length)]

Or paste, although it is rather slow.

> dt.iris[, myText := paste('The species is', Species, 'with sepal length of', Sepal.Length)] 
> dt.iris
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          5.1         3.5          1.4         0.2    setosa
  2:          4.9         3.0          1.4         0.2    setosa
  3:          4.7         3.2          1.3         0.2    setosa
  4:          4.6         3.1          1.5         0.2    setosa
  5:          5.0         3.6          1.4         0.2    setosa
 ---                                                            
146:          6.7         3.0          5.2         2.3 virginica
147:          6.3         2.5          5.0         1.9 virginica
148:          6.5         3.0          5.2         2.0 virginica
149:          6.2         3.4          5.4         2.3 virginica
150:          5.9         3.0          5.1         1.8 virginica
                                                myText
  1:    The species is setosa with sepal length of 5.1
  2:    The species is setosa with sepal length of 4.9
  3:    The species is setosa with sepal length of 4.7
  4:    The species is setosa with sepal length of 4.6
  5:      The species is setosa with sepal length of 5
 ---                                                  
146: The species is virginica with sepal length of 6.7
147: The species is virginica with sepal length of 6.3
148: The species is virginica with sepal length of 6.5
149: The species is virginica with sepal length of 6.2
150: The species is virginica with sepal length of 5.9
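
One caveat on the sprintf version above: %.2g keeps two significant digits, so it agrees with paste here only because every Sepal.Length in iris has at most two significant digits. A minimal base R sketch of the difference follows; the value 12.345 is a made-up example, not taken from iris, and using %s to mirror paste is a suggestion rather than part of the original answer.

```r
# Hypothetical value with more than two significant digits (not from iris).
x <- 12.345

s_g <- sprintf('%.2g', x)  # two significant digits, so precision is lost
s_s <- sprintf('%s', x)    # %s converts the number via as.character()
s_p <- paste(x)            # paste() also converts via as.character()

print(s_g)  # "12"
print(s_s)  # "12.345"
print(s_p)  # "12.345"
```

If the column may hold more precise values, %s (or a wider format such as %.3f) keeps the sprintf output in line with what paste and glue would produce.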

Benchmark

library(data.table)
dt.iris <- as.data.table(iris)
dt.iris.l <- dt.iris[sample.int(nrow(dt.iris), 1e6, replace=TRUE), ]
gluedt <- function(...) glue::glue(..., .envir = parent.frame(3)$x)
microbenchmark::microbenchmark(
  sprintf=dt.iris.l[, myText := sprintf('The species is %s with sepal length of %.2g', 
                              Species, Sepal.Length)],
  paste=dt.iris.l[, myText := paste('The species is', Species, 'with sepal length of', Sepal.Length)] ,
  gluedt=dt.iris.l[, myText := gluedt('The species is {Species} with sepal length of {Sepal.Length}')],
  times=3L,
  check='identical'
)

$ Rscript --vanilla foo.R
Unit: milliseconds
    expr      min        lq      mean    median        uq       max neval cld
 sprintf  748.210  755.7418  758.8391  763.2735  764.1537  765.0338     3 a  
   paste 1545.685 1547.1562 1549.3632 1548.6278 1551.2025 1553.7771     3  b 
  gluedt 1426.333 1437.6870 1443.4343 1449.0413 1451.9851 1454.9289     3   c

Data:
> dt.iris <- as.data.table(iris)

Answer 2 (8dtrkrch)

You can do:

gluedt <- function(...) glue::glue(..., .envir = parent.frame(3)$x)

Testing this, we get:

library(data.table)

data(iris)
dt.iris <- data.table(iris)
       
dt.iris[, myText := gluedt('The species is {Species} with sepal length of {Sepal.Length}')]

dt.iris
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#>   1:          5.1         3.5          1.4         0.2    setosa
#>   2:          4.9         3.0          1.4         0.2    setosa
#>   3:          4.7         3.2          1.3         0.2    setosa
#>   4:          4.6         3.1          1.5         0.2    setosa
#>   5:          5.0         3.6          1.4         0.2    setosa
#>  ---                                                            
#> 146:          6.7         3.0          5.2         2.3 virginica
#> 147:          6.3         2.5          5.0         1.9 virginica
#> 148:          6.5         3.0          5.2         2.0 virginica
#> 149:          6.2         3.4          5.4         2.3 virginica
#> 150:          5.9         3.0          5.1         1.8 virginica
#>                                                 myText
#>   1:    The species is setosa with sepal length of 5.1
#>   2:    The species is setosa with sepal length of 4.9
#>   3:    The species is setosa with sepal length of 4.7
#>   4:    The species is setosa with sepal length of 4.6
#>   5:      The species is setosa with sepal length of 5
#>  ---                                                  
#> 146: The species is virginica with sepal length of 6.7
#> 147: The species is virginica with sepal length of 6.3
#> 148: The species is virginica with sepal length of 6.5
#> 149: The species is virginica with sepal length of 6.2
#> 150: The species is virginica with sepal length of 5.9


Created on 2023-11-11 with reprex v2.0.2

Answer 3 (iyfamqjs)

My approach is simply to use glue_data:

dt.iris[Sepal.Width > 4, myText := glue_data(.SD, "The species is {Species} with sepal length of {Sepal.Length}")]

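I think the original call fails because glue receives everything as a single string, "The species is {Species} with sepal length of {Sepal.Length}", rather than taking the strings and the variables as separate arguments the way R functions usually do (e.g. paste or sprintf). The column names are therefore hidden inside the string, and data.table cannot see that j refers to them. Passing .SD explicitly makes the columns available.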
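
To illustrate the point about hidden column names, here is a small base R sketch. It uses all.vars() as a stand-in for the way j gets scanned for symbols; that is an assumption for illustration, not data.table's actual code, and the names j_glue, j_sprintf and j_sd are invented here.

```r
# Base R only: the expressions are quoted, never evaluated,
# so neither data.table nor glue needs to be loaded.
j_glue    <- quote(myText := glue('The species is {Species} with sepal length of {Sepal.Length}'))
j_sprintf <- quote(myText := sprintf('The species is %s with sepal length of %.2g',
                                     Species, Sepal.Length))
j_sd      <- quote(myText := glue('The species is {Species}', .envir = .SD))

# all.vars() lists the symbols used as variables in an expression.
vars_glue    <- all.vars(j_glue)     # column names are buried in the string
vars_sprintf <- all.vars(j_sprintf)  # column names appear as real symbols
vars_sd      <- all.vars(j_sd)       # .SD appears as a symbol

print(vars_glue)
print(vars_sprintf)
print(vars_sd)
```

In the glue call only myText shows up as a symbol, whereas the sprintf call exposes Species and Sepal.Length, and the .envir = .SD variant exposes .SD.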
Another approach is to use metaprogramming:

gluedt <- function(...) substitute(glue(..., .envir = .SD))
dt.iris[Sepal.Width > 4, myText := eval(gluedt("The species is {Species} with sepal length of {Sepal.Length}"))]

Answer 4 (cyej8jka)

Using transform.data.table (from data.table) gives an all-data.table solution. Alternatively, use mutate (from dplyr) or fmutate (from collapse) in place of [.data.table. If we supply a data.table as input, we still get a data.table as the result.

dt.iris |>
  transform(myText = glue('The species is {Species} with sepal length of {Sepal.Length}'))

library(dplyr)
dt.iris |>
  mutate(myText = glue('The species is {Species} with sepal length of {Sepal.Length}'))

library(collapse)
dt.iris |>
  fmutate(myText = glue('The species is {Species} with sepal length of {Sepal.Length}'))
