R: Using lapply() to modify multiple columns of a data.table when the column names are defined outside the data.table syntax

Posted by vuktfyat on 2023-05-11 in Other

I have a data.table dt with three columns:

library(data.table)
dt <- data.table(A=c("eggs","flour","butter","eggs","eggs"), B = c(8, 9, 7, 5, 1), C = c(5, 1, 0, 2, 3))
dt
        A B C
1:   eggs 8 5
2:  flour 9 1
3: butter 7 0
4:   eggs 5 2
5:   eggs 1 3

I want to multiply the values of B and C by 10 wherever A == "eggs", which can be done like this:

dt[A=="eggs", c("B","C") := lapply(.SD, function(x){x*10}), .SDcols = c("B","C")]
dt
        A  B  C
1:   eggs 80 50
2:  flour  9  1
3: butter  7  0
4:   eggs 50 20
5:   eggs 10 30

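I use lapply() because my actual data.table has many more columns than B and C whose values need to be multiplied. For that reason, I would like to make the code less cluttered by defining columns_to_multiply <- c("B","C") outside the data.table syntax. The documentation mentions that a vector of column names can be referenced from inside the syntax with the .. prefix, like this: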

columns_to_multiply <- c("B","C")
dt[A=="eggs", ..columns_to_multiply := lapply(.SD, function(x){x*10}), .SDcols = ..columns_to_multiply]

This gives me the following error:

Error in eval(colsub, parent.frame(), parent.frame()) : 
object '..columns_to_multiply' not found

If I use the .. prefix only in j:

dt[A=="eggs", ..columns_to_multiply := lapply(.SD, function(x){x*10}), .SDcols = columns_to_multiply]

I get the following error message:

Error in `[.data.table`(dt, A=="eggs", `:=`(..columns_to_multiply, lapply(.SD,  : 
Supplied 2 items to be assigned to 3 items of column '..columns_to_multiply'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.

Is there a clean and simple way to do what I want by correctly referencing my columns_to_multiply vector inside the data.table syntax?

Answer #1, by qyzbxkaa

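It is simpler than you think: wrap the variable in parentheses on the left-hand side of := and pass it unprefixed to .SDcols.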

library(data.table)
columns_to_multiply <- c("B","C")

dt[A=="eggs", (columns_to_multiply) := lapply(.SD, function(x){x*10}), 
              .SDcols = columns_to_multiply]

dt
#        A  B  C
#1:   eggs 80 50
#2:  flour  9  1
#3: butter  7  0
#4:   eggs 50 20
#5:   eggs 10 30
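
To illustrate why the parentheses matter, here is a minimal sketch that rebuilds the example data from the question. It assumes a reasonably recent data.table version, in which a bare symbol on the left of := is taken as a literal column name, the .. prefix is meant for selecting columns in j, and .SDcols is evaluated as ordinary R code (so no prefix is needed there). The names selected, literal and evaluated are only illustrative.

```r
library(data.table)

dt <- data.table(A = c("eggs", "flour", "butter", "eggs", "eggs"),
                 B = c(8, 9, 7, 5, 1),
                 C = c(5, 1, 0, 2, 3))
columns_to_multiply <- c("B", "C")

# The .. prefix is intended for selecting columns in j.
selected <- dt[, ..columns_to_multiply]
names(selected)

# A bare symbol on the left of := is used as a literal column name,
# so this adds a new column called "columns_to_multiply" to the copy.
literal <- copy(dt)
literal[, columns_to_multiply := 0]
names(literal)

# Parentheses make data.table evaluate the variable instead,
# so the columns named in the vector (B and C) are updated.
evaluated <- copy(dt)
evaluated[A == "eggs",
          (columns_to_multiply) := lapply(.SD, function(x) x * 10),
          .SDcols = columns_to_multiply]
evaluated
```

The copies are there only so that each variant starts from the same untouched data, since := modifies a data.table by reference.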

Since multiplication can be applied directly to a data.frame/data.table, you can simplify this further and drop lapply():

dt[A=="eggs", (columns_to_multiply) := .SD * 10, .SDcols = columns_to_multiply]
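
As a quick check that the two forms give the same result, the sketch below applies each of them to a fresh copy of the example data and compares the outcomes. The helper make_dt() and the names with_lapply and with_sd are hypothetical and exist only for this comparison. Note that the .SD * 10 shortcut relies on every column in .SDcols being numeric.

```r
library(data.table)

# Hypothetical helper: rebuilds the example table so each variant starts fresh.
make_dt <- function() {
  data.table(A = c("eggs", "flour", "butter", "eggs", "eggs"),
             B = c(8, 9, 7, 5, 1),
             C = c(5, 1, 0, 2, 3))
}
columns_to_multiply <- c("B", "C")

# Variant 1: apply the function column by column with lapply().
with_lapply <- make_dt()
with_lapply[A == "eggs",
            (columns_to_multiply) := lapply(.SD, function(x) x * 10),
            .SDcols = columns_to_multiply]

# Variant 2: multiply the whole subset .SD at once.
with_sd <- make_dt()
with_sd[A == "eggs",
        (columns_to_multiply) := .SD * 10,
        .SDcols = columns_to_multiply]

# Both tables should hold the same values.
all.equal(with_lapply, with_sd)
```

The lapply() form remains the more general choice when the per-column function is not a plain arithmetic operation.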
