Creating custom rules with the when function

iklwldmw  posted on 2021-05-18 in Spark

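I am trying to create custom rules with the "when" function so that I can eventually apply them to a column of a DataFrame. Many of these rules will be applied to different columns, and the idea is not to write the rules out again for each column but to store them in variables and chain them together. For example, I have the following: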

df
.withColumn("campoOut1",when(col("campo1") === "G" && col("campo2") === "00", "001"))
.withColumn("campoOut2",
    when(col("campo1") === "G" && col("campo2") === "00", "001").
    when(col("campo3") === "G" && col("campo4") =!= "00", "002"))

What I would like to achieve is the following:

val ruler1 = when(col("campo1") === "G" && col("campo2") === "00", "001")
val ruler2 = when(col("campo3") === "G" && col("campo4") =!= "00", "002")

 df.withColumn("campoOut1",ruler1)
   .withColumn("campoOut2",ruler1 + ruler2)

I have not succeeded, because the variables ruler1 and ruler2 are not of type "string". Do you know how to do this?
Thank you very much.

5t7ly7z5 #1

You can chain the rules recursively:

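import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.when

// Chains (condition, value) pairs into one when(...).when(...) column.
// At least one rule is required; rows matching no rule evaluate to null.
def chainRules(rules: (Column, String)*): Column = {
  require(rules.nonEmpty, "chainRules needs at least one rule")
  @scala.annotation.tailrec
  def go(remaining: Seq[(Column, String)], chained: Column): Column =
    if (remaining.isEmpty) chained
    else go(remaining.tail, chained.when(remaining.head._1, remaining.head._2))
  // Start the chain with the first rule, then append the rest in order.
  go(rules.tail, when(rules.head._1, rules.head._2))
}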

But you need to restructure your rules as (condition, value) pairs, like this:

val rule1 = (col("campo1") === "G" && col("campo2") === "00", "001")
val rule2 = (col("campo3") === "G" && col("campo4") =!= "00", "002")

You can then use it like this:

df.withColumn("campoOut1", chainRules(rule1))
   .withColumn("campoOut2", chainRules(rule1, rule2))
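
For what it's worth, the same chaining can probably also be written as a fold instead of explicit recursion. The following is only a sketch under some assumptions: `chainRulesFold`, the `default` parameter, the object name and the sample rows are all hypothetical and not part of the answer above. It uses `Column.when` and `Column.otherwise` from the Spark SQL API, and a local SparkSession purely for demonstration.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object ChainRulesSketch {
  // Hypothetical helper: folds (condition, value) pairs into a single
  // when(...).when(...) chain. Earlier rules win, as with chained when calls.
  // `default` is an assumption added for illustration; without it,
  // rows that match no rule evaluate to null.
  def chainRulesFold(rules: Seq[(Column, String)], default: Option[String] = None): Column = {
    require(rules.nonEmpty, "chainRulesFold needs at least one rule")
    val (firstCond, firstValue) = rules.head
    val chained = rules.tail.foldLeft(when(firstCond, firstValue)) {
      case (acc, (cond, value)) => acc.when(cond, value)
    }
    default.fold(chained)(d => chained.otherwise(d))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("chain-rules-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows, one for each rule and one matching neither.
    val df = Seq(
      ("G", "00", "X", "00"),
      ("A", "01", "G", "01"),
      ("A", "01", "A", "00")
    ).toDF("campo1", "campo2", "campo3", "campo4")

    val rule1 = (col("campo1") === "G" && col("campo2") === "00", "001")
    val rule2 = (col("campo3") === "G" && col("campo4") =!= "00", "002")

    df.withColumn("campoOut1", chainRulesFold(Seq(rule1)))
      .withColumn("campoOut2", chainRulesFold(Seq(rule1, rule2), default = Some("000")))
      .show()

    spark.stop()
  }
}
```

Keeping the rules in a Seq rather than varargs makes it a bit easier to build the list dynamically, but that is a matter of taste; the recursive version above behaves the same way for the rules themselves.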
