Referencing data.table columns in functions in R

Posted by 9bfwbjaz 12 months ago, in Other

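In SAP, I upload text files of transactions that are identified by two fields: "Document No" and "Position". Several transactions in a file can share one unique ID. Transactions that share the same unique ID have a single document number and consecutive positions. Below is an example of the text file.

| UniqueID | Document No | Position |
| -- | -- | -- |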
| 200 | 1 | 1 |
| 200 | 1 | 2 |
| 300 | 2 | 1 |
| 300 | 2 | 2 |
| 400 | 3 | 1 |
| 400 | 3 | 2 |
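
I managed to write code that generates this Document No and Position sequence for a sample table (tblSAP). The code below works well. However, I have many tables that need to be converted into SAP text files, so I need to turn this code into a function.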

library(data.table)
# sample data table
tblSAP <- data.table(
  UniqueID = c(200,200,300,300,400,400)
  
)

# code to generate Document No and Position fields

## Sort entire tblSAP table
tblSAP <- setorder(tblSAP,cols=UniqueID)

## create a temp table of UniqueID 
dfUniqueID <- tblSAP[,.(UniqueID)]

## create "Document No" and "Position" count
DT1 <- dfUniqueID[,.N, by=.(UniqueID, "Document No"=rleid(UniqueID))]

## extend UniqueIDs with Positions
DT2 <- dfUniqueID[,seq_len(.N), by = UniqueID] |> 
  setnames(old = "V1",new = "Position")

## Join UniqueID, Document Number, and Positions in one table
DT3 <- DT1[DT2,on = .(UniqueID)][,-c("N")]

## Bind SAP Upload table with the columns of Document No and Position from DT3 table
tblSAP <- cbind(tblSAP,DT3[,-c("UniqueID")])

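When I converted the code into a function, the only change I made was to replace UniqueID with an input argument of the function, as shown below. However, referencing the tblSAP column (UniqueID) inside the function fails. I get the error "Error in setorderv(x, order, na.last): some columns are not in the data.table: IdFld".
Here is the function I wrote: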

# function to generate Document No and Position fields
SAPDocPos <- function(tblSAP,IdFld){
  
  ## Sort entire tblSAP table
  tblSAP <- setorder(tblSAP,cols=IdFld)
  
  ## create a temp table of IdFld 
  dfUniqueID <- tblSAP[,.(IdFld)]
  
  ## create "Document No" and "Position" count
  DT1 <- dfUniqueID[,.N, by=.(IdFld, "Document No"=rleid(IdFld))]
  
  ## extend IdFlds with Positions
  DT2 <- dfUniqueID[,seq_len(.N), by = IdFld] %>% 
    setnames(old = "V1",new = "Position")
  
  ## Join IdFlds, Document Number, and Positions in one table
  DT3 <- DT1[DT2,on = .(IdFld)][,-c("N")]
  
  ## Bind SAP Upload table with the columns of Document No and Position from DT3 table
  tblSAP <- cbind(tblSAP,DT3[,-c("UniqueID")])
  return(tblSAP)
}

SAPDocPos(tblSAP,UniqueID)

Any help resolving this would be greatly appreciated.

Source: https://stackoverflow.com/questions/77671257/referencing-data-table-columns-in-functions-in-r

1 Answer

sshcrbum1#

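A few changes:

- To use the *symbol* UniqueID, we need to convert it to a character string with deparse(substitute(.)). In general I tend to recommend against non-standard evaluation (which is what you are attempting here), especially when the only value it adds is letting the caller omit the quotes.
- setorder requires non-standard evaluation, but we can use setorderv, which takes a character vector.
- I use .SDcols= to subset by column instead of attempting .(UniqueID).
- For DT1, by= can also be a character vector, but that requires Document No to be defined beforehand, hence the slight update.
- Perhaps just a nuance: I moved from %>% to |> to reduce dependencies. This is not required if you already depend on dplyr or magrittr elsewhere.
- This is only a preference, but a literal return(.) statement is needed only when the function has to return a value earlier in the code, for example inside an if block. Since a function always returns the value of its last expression, we can omit it. (Calling return(tblSAP) explicitly is fine, but it adds an unnecessary function call to the stack.)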
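
The first two points can be tried in isolation. The following is only a minimal sketch: col_name and sort_by_col are hypothetical helper names made up for this illustration, and DT is a throwaway sample table.

```r
library(data.table)

# Hypothetical helper (illustration only): returns the argument exactly
# as the caller typed it, converted to a character string.
col_name <- function(col) deparse(substitute(col))

col_name(UniqueID)  # "UniqueID", even though no object named UniqueID exists

DT <- data.table(UniqueID = c(300, 200, 300, 200))

# Hypothetical helper (illustration only): sorts dt by a column given as a bare symbol.
sort_by_col <- function(dt, col) {
  col <- deparse(substitute(col))  # bare symbol -> "UniqueID"
  setorderv(dt, cols = col)        # setorderv() takes column names as strings
}

sort_by_col(DT, UniqueID)  # sorts DT by reference
DT$UniqueID                # 200 200 300 300
```

Because the symbol is captured before anything is evaluated, this only works when the caller types the column name directly; passing a variable that holds the name would produce the variable's own name instead.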

SAPDocPos <- function(tblSAP,IdFld){
  IdFld <- deparse(substitute(IdFld))
  
  ## Sort entire tblSAP table
  tblSAP <- setorderv(tblSAP,cols=IdFld)
  
  ## create a temp table of IdFld 
  dfUniqueID <- tblSAP[,.SD, .SDcols = IdFld]
  
  ## create "Document No" and "Position" count
  DT1 <- dfUniqueID[, `Document No` := rleid(get(IdFld))][, .N, by = c(IdFld, "Document No")]
  
  ## extend IdFlds with Positions
  DT2 <- dfUniqueID[, seq_len(.N), by = IdFld] |>
    setnames(old = "V1", new = "Position")
  
  ## Join IdFlds, Document Number, and Positions in one table
  DT3 <- DT1[DT2, on = c(IdFld)][,-c("N")]
  
  ## Bind SAP Upload table with the columns of Document No and Position from DT3 table
  tblSAP <- cbind(tblSAP, DT3[, -c(IdFld), with=FALSE])

  tblSAP
}

SAPDocPos(tblSAP, UniqueID)
#    UniqueID Document No Position
#       <num>       <int>    <int>
# 1:      200           1        1
# 2:      200           1        2
# 3:      300           2        1
# 4:      300           2        2
# 5:      400           3        1
# 6:      400           3        2
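
For comparison, here is a sketch of the quoted-name style recommended above, where the caller passes the column name as a string. It is not the function from this answer: add_doc_pos and id_col are hypothetical names, and it computes both columns directly with rleid() and rowid() rather than through the intermediate tables.

```r
library(data.table)

# Hypothetical variant (illustration only): id_col is a character string,
# so no non-standard evaluation is involved.
add_doc_pos <- function(dt, id_col) {
  out <- copy(dt)                # work on a copy; the caller's table stays as it was
  setorderv(out, cols = id_col)  # sort by the ID column
  # one document number per run of identical IDs
  set(out, j = "Document No", value = rleid(out[[id_col]]))
  # running position within each ID
  set(out, j = "Position", value = rowid(out[[id_col]]))
  out[]
}

tbl <- data.table(UniqueID = c(300, 200, 400, 200, 300, 400))
res <- add_doc_pos(tbl, "UniqueID")
res$Position  # 1 2 1 2 1 2
names(tbl)    # "UniqueID"
```

Since the name is an ordinary string, it can also come from a variable or a loop over several tables, which is awkward to do with a captured symbol.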

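If you want to go "all in" on data.table's by-reference (in-place) semantics, you can add a couple of set calls (a more direct data.table way of assigning values) to add the columns to the original frame. Note: this is consistent with data.table, but not with base R's copy-on-write semantics.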

SAPDocPos <- function(tblSAP,IdFld){
  IdFld <- deparse(substitute(IdFld))
  
  ## Sort entire tblSAP table
  tblSAP <- setorderv(tblSAP,cols=IdFld)
  
  ## create a temp table of IdFld 
  dfUniqueID <- tblSAP[,.SD, .SDcols = IdFld]
  
  ## create "Document No" and "Position" count
  DT1 <- dfUniqueID[, `Document No` := rleid(get(IdFld))][, .N, by = c(IdFld, "Document No")]
  
  ## extend IdFlds with Positions
  DT2 <- dfUniqueID[, seq_len(.N), by = IdFld] |>
    setnames(old = "V1", new = "Position")
  
  ## Join IdFlds, Document Number, and Positions in one table
  DT3 <- DT1[DT2, on = c(IdFld)][,-c("N")]
  
  set(tblSAP, i = NULL, j = "Document No", value = DT3[["Document No"]])
  set(tblSAP, i = NULL, j = "Position", value = DT3[["Position"]])

  tblSAP
}

tblSAP
#    UniqueID
#       <num>
# 1:      200
# 2:      200
# 3:      300
# 4:      300
# 5:      400
# 6:      400
SAPDocPos(tblSAP, UniqueID) # not re-assigning over tblSAP
tblSAP
#    UniqueID Document No Position
#       <num>       <int>    <int>
# 1:      200           1        1
# 2:      200           1        2
# 3:      300           2        1
# 4:      300           2        2
# 5:      400           3        1
# 6:      400           3        2
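
If the side effect on the caller's table is not wanted, copy() gives an independent table to work on. A small sketch of the difference, using throwaway objects (DT, alias and snapshot are names made up for this illustration):

```r
library(data.table)

DT <- data.table(UniqueID = c(200, 200, 300))

alias    <- DT        # plain assignment: both names point to the same table
snapshot <- copy(DT)  # copy(): an independent table

# set() adds the column in place, so every name bound to that table sees it
set(DT, j = "Position", value = rowid(DT$UniqueID))

names(alias)     # "UniqueID" "Position"
names(snapshot)  # "UniqueID"
```

Calling the function on copy(tblSAP), or copying inside the function, would keep the original table unchanged at the cost of duplicating the data.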
