Converting a String to a Column type in Spark

bq3bfh9z · posted 2021-07-14 · in Java

Code:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.Column

def func(rawDF: DataFrame, primaryKey: Column, orderKey: Column): DataFrame = {
    // some process
    return newDf
}

I am trying to use the function above to create a new processed DataFrame from an existing raw DataFrame.
Code:

var processedDF  = func(rawDF,"col1","col2")

Error:

<console>:73: error: type mismatch;
found   : String("col1")
required: org.apache.spark.sql.Column
   var processedDF  = func(rawDF,"col1","col2")
                                     ^

Any suggestions on how to convert the function arguments from String to org.apache.spark.sql.Column?

ubof19bj1#

Either

import org.apache.spark.sql.functions.col

func(rawDF, col("col1"), col("col2"))

func(rawDF, rawDF("col1"), rawDF("col2"))

or provide the Column directly via $ (where spark is the SparkSession object)

import spark.implicits.StringToColumn

func(rawDF, $"col1", $"col2")

or via a Symbol
import spark.implicits.symbolToColumn

func(rawDF, 'col1, 'col2)
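
Beyond the call-site options above, a minimal sketch of an alternative not covered in the answer: a hypothetical wrapper, funcByName, that keeps the string-based call style by converting the names with col() and delegating to the func defined in the question.

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical wrapper: accepts plain column names, converts them to
// Column objects with col(), and delegates to the func defined above.
def funcByName(rawDF: DataFrame, primaryKey: String, orderKey: String): DataFrame =
  func(rawDF, col(primaryKey), col(orderKey))

// The original string-based call then works unchanged:
// var processedDF = funcByName(rawDF, "col1", "col2")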
