Transforming an abstract syntax tree in R

Posted by hzbexzde on 2024-01-03 in Other

Given an arithmetic expression such as x + y*z, I want to convert it to add(x, multiply(y, z)).
I found a helpful function for this:

> getAST <- function(ee) purrr::map_if(as.list(ee), is.call, getAST)
> getAST(quote(x + y*z)) 
[[1]]
`+`

[[2]]
x

[[3]]
[[3]][[1]]
`*`

[[3]][[2]]
y

[[3]][[3]]
z

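You can use rapply(result, as.character, how = "list") to get character strings instead of symbols.
How can I get add(x, multiply(y, z)) from this AST? It gets more complicated when the expression contains parentheses: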

> getAST(quote((x + y) * z)) 
[[1]]
`*`

[[2]]
[[2]][[1]]
`(`

[[2]][[2]]
[[2]][[2]][[1]]
`+`

[[2]][[2]][[2]]
x

[[2]][[2]][[3]]
y


[[3]]
z


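An answer does not have to use the getAST function; it is just one possible approach.
Of course, the expressions in my real use case are longer.
Here is what I believe is a solution for the case without parentheses: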

getAST <- function(ee) purrr::map_if(as.list(ee), is.call, getAST)

ast <- rapply(getAST(quote(x + y*z)), as.character, how = "list")

convertAST <- function(ast) {
  # Map the operator (first element of the node) to a function name.
  # Note: switch() returns NULL for any operator not listed here, including "(".
  op <- switch(
    ast[[1]],
    "+" = "add",
    "-" = "subtract",
    "*" = "multiply",
    "/" = "divide"
  )
  left <- ast[[2]]
  right <- ast[[3]]
  # Leaves are character strings; nested calls are lists and are converted recursively.
  if(is.character(left) && is.character(right)) {
    return(sprintf("%s(%s, %s)", op, left, right))
  }
  if(is.character(left)) {
    return(sprintf("%s(%s, %s)", op, left, convertAST(right)))
  }
  if(is.character(right)) {
    return(sprintf("%s(%s, %s)", op, convertAST(left), right))
  }
  return(sprintf("%s(%s, %s)", op, convertAST(left), convertAST(right)))
}

convertAST(ast)
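
The parenthesised case fails above because `(` is itself a call in R's AST, and the switch has no entry for it. A minimal sketch of one way to cover it, assuming the goal is a string result and that parentheses can simply be unwrapped. The helper name to_prefix is hypothetical and not part of the original post; it walks the quoted expression directly rather than the list produced by getAST:

```r
# Hypothetical helper, not from the original post.
to_prefix <- function(e) {
  ops <- c("+" = "add", "-" = "subtract", "*" = "multiply", "/" = "divide")
  # Symbols and constants are leaves: return their text form.
  if (!is.call(e)) return(paste(deparse(e), collapse = ""))
  head <- e[[1]]
  # A parenthesised sub-expression is a call to `(`; unwrap it.
  if (identical(head, as.name("("))) return(to_prefix(e[[2]]))
  # Convert every argument recursively.
  args <- vapply(as.list(e)[-1], to_prefix, character(1))
  # Rename known operators; keep any other function name as written.
  fn <- paste(deparse(head), collapse = "")
  if (is.symbol(head) && as.character(head) %in% names(ops)) {
    fn <- ops[[as.character(head)]]
  }
  sprintf("%s(%s)", fn, paste(args, collapse = ", "))
}

to_prefix(quote(x + y * z))
#> [1] "add(x, multiply(y, z))"
to_prefix(quote((x + y) * z))
#> [1] "multiply(add(x, y), z)"
```

Because the nesting of the prefix calls already encodes the grouping, dropping the `(` nodes loses no information. Argument names (as in f(a = 1)) are not preserved by this sketch.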


Answer 1 (cngwdvgl)

We can use substitute like this:

subst <- function(e, sub = list(`+` = "add", 
                                `-` = "minus",
                                `/` = "divide",
                                `*` = "multiply")) {
  sub <- Map(as.name, sub)
  do.call("substitute", list(e, sub))
}

# test
e <- quote(x + (y + 1) * z)
res <- subst(e); res
## add(x, multiply((add(y, 1)), z))

# evaluate test against values
add <- `+`; multiply <- `*`; x <- 1; y <- 2; z <- 3
eval(res)
## [1] 10

If you want a string result:

deparse1(subst(e))
## [1] "add(x, multiply((add(y, 1)), z))"
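
The substituted result keeps the original parentheses as `(` calls, which is why (add(y, 1)) appears in the output. If they are unwanted, one possible follow-up step is to strip them recursively. This is a sketch under the assumption that the input is a plain arithmetic expression with no empty arguments (such as x[1, ]); drop_parens is a hypothetical name, not a base R function:

```r
# Hypothetical helper: removes every `(` call from a language object.
drop_parens <- function(e) {
  if (!is.call(e)) return(e)                       # symbols and constants
  if (identical(e[[1]], as.name("("))) {
    return(drop_parens(e[[2]]))                    # unwrap the parentheses
  }
  as.call(lapply(as.list(e), drop_parens))         # rebuild the call
}

e <- quote(x + (y + 1) * z)
sub <- list(`+` = as.name("add"), `*` = as.name("multiply"))
renamed <- do.call("substitute", list(e, sub))

deparse1(drop_parens(renamed))
#> [1] "add(x, multiply(add(y, 1), z))"
```

The renaming is done first so that the grouping is already expressed by the nested function calls before the parentheses are removed.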


Answer 2 (eoxn13cs)

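It may just be that I don't understand rapply very well, but whenever I try to use it, my code ends up more complicated than if I had written my own recursive function.
In this case, I put the recursive function inside a thin wrapper that allows the expression to be entered directly, without quote, if desired.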

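sub_call <- function(input, direct = TRUE,
                     subs = list(`+` = "add", `-` = "minus", 
                                 `/` = "divide", `*` = "multiply")) {
  # Recursive worker: renames the head of each call, then descends into its arguments
  scall <- function(x, subs) {
    if(is.call(x))
    {
      # Replace the function name if it is one of the operators in subs
      if(as.character(x[[1]]) %in% names(subs)) {
        x[[1]] <- str2lang(subs[[match(as.character(x[[1]]), names(subs))]])
      }
    }
    # Symbols, constants and calls without arguments need no further work
    if(length(x) == 1) return(x) 
    x[-1] <- lapply(x[-1], scall, subs = subs)
    x
  }

  # direct = TRUE: capture the unevaluated argument instead of evaluating it
  if(direct) return(scall(as.list(match.call())$input, subs))
  return(scall(input, subs))
}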

This allows the expression to be entered directly:

sub_call(x + y*z)
#> add(x, multiply(y, z))


or indirectly:

my_expr <- quote(x + y*z)

sub_call(my_expr, direct = FALSE)
#> add(x, multiply(y, z))


and it handles arbitrarily deep nesting, leaving parentheses intact:

sub_call(sin(((x + (1/3))^2)))
#> sin(((add(x, (divide(1, 3))))^2))
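
For comparison, here is a compact variant of the same recursive idea. It is only a sketch: rename_heads and rename_direct are hypothetical names, and it assumes arithmetic expressions without empty arguments. It renames a call's head only when that head is a plain symbol, and it captures the unquoted argument with substitute() rather than match.call():

```r
default_subs <- c("+" = "add", "-" = "minus", "/" = "divide", "*" = "multiply")

# Hypothetical helper: rename operator heads throughout a language object.
rename_heads <- function(x, subs = default_subs) {
  if (!is.call(x)) return(x)                       # leaf: symbol or constant
  head <- x[[1]]
  if (is.symbol(head) && as.character(head) %in% names(subs)) {
    head <- as.name(subs[[as.character(head)]])    # swap in the new name
  }
  args <- lapply(as.list(x)[-1], rename_heads, subs = subs)
  as.call(c(list(head), args))                     # rebuild the call
}

# Thin wrapper that accepts an unquoted expression.
rename_direct <- function(input, subs = default_subs) {
  rename_heads(substitute(input), subs)
}

res <- rename_direct(x + y * z)
deparse1(res)
#> [1] "add(x, multiply(y, z))"

# Evaluate against concrete values to check the structure is preserved.
eval(res, list(add = `+`, multiply = `*`, x = 1, y = 2, z = 3))
#> [1] 7
```

Checking that the head is a symbol before calling as.character() avoids a length-greater-than-one condition when the head is itself a call, as in pkg::fun(x).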


Answer 3 (gab6jxml)

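Using an external package, this can also be done with rrapply() (from the rrapply package), which, unlike base rapply(), also recurses through call objects and expression vectors: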

## examples 
lang <- quote(x + (y + 1) * z)
expr <- expression(x + (y + 1) * z, sin(((x + (1/3))^2)))

## replacement function
replace_symbol <- function(s) {
    switch(
      as.character(s),
      '+' = quote(add),
      '-' = quote(minus),
      '/' = quote(divide),
      '*' = quote(multiply),
      s
    )
}

lang1 <- rrapply::rrapply(lang, f = replace_symbol, how = "replace")
str(lang1)
#>  language add(x, multiply((add(y, 1)), z))

expr1 <- rrapply::rrapply(expr, f = replace_symbol, how = "replace")
str(expr1)
#>   expression(add(x, multiply((add(y, 1)), z)), sin(((add(x, (divide(1, 3))))^2)))
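
If installing a package is not an option, the expression-vector case can probably be handled in base R as well, by applying substitute() to each element. This is a sketch rather than an equivalent of rrapply(); the variable names are illustrative:

```r
expr <- expression(x + (y + 1) * z, sin(((x + (1/3))^2)))

# Named list of replacement symbols, as substitute() expects.
subs <- lapply(c("+" = "add", "-" = "minus", "/" = "divide", "*" = "multiply"),
               as.name)

# Substitute within each element, then rebuild the expression vector.
expr1 <- as.expression(
  lapply(expr, function(e) do.call("substitute", list(e, subs)))
)

unname(vapply(expr1, deparse1, character(1)))
#> [1] "add(x, multiply((add(y, 1)), z))"  "sin(((add(x, (divide(1, 3))))^2))"
```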
