Code:
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.Column
def func(rawDF: DataFrame, primaryKey: Column, orderKey: Column): DataFrame = {
  // some process
  return newDf
}
I am trying to use the function above to create a new, processed DataFrame from an existing raw DataFrame.
Code:
var processedDF = func(rawDF,"col1","col2")
Error:
<console>:73: error: type mismatch;
found : String("col1")
required: org.apache.spark.sql.Column
var processedDF = func(rawDF,"col1","col2")
^
Any suggestions on how to change the type of the function arguments from String to org.apache.spark.sql.Column?
1 Answer
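Either convert the column names to Column values before the call, or provide a Column directly via $ (where spark is the SparkSession object), or use a Symbol: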
```scala
import spark.implicits.symbolToColumn
func(rawDF, 'col1, 'col2)
```
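For comparison, here is a minimal sketch of the other common ways to turn a column name into a Column, since the code for those options did not survive in the answer text above. It assumes a local SparkSession and a small sample DataFrame. The body of func and the funcByName helper are hypothetical and are not part of the original question.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ColumnArgsSketch {
  // Stand-in for the asker's func; the real body is unknown, so this
  // placeholder (an assumption) simply sorts by the two columns.
  def func(rawDF: DataFrame, primaryKey: Column, orderKey: Column): DataFrame =
    rawDF.orderBy(primaryKey, orderKey)

  // Hypothetical helper: accepts plain names and converts them with col(...),
  // so callers can keep passing strings.
  def funcByName(rawDF: DataFrame, primaryKey: String, orderKey: String): DataFrame =
    func(rawDF, col(primaryKey), col(orderKey))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("column-args-sketch")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"name" interpolator

    // Sample data, made up for illustration only.
    val rawDF = Seq((1, "b"), (2, "a")).toDF("col1", "col2")

    val viaCol    = func(rawDF, col("col1"), col("col2"))     // functions.col
    val viaApply  = func(rawDF, rawDF("col1"), rawDF("col2")) // DataFrame.apply
    val viaDollar = func(rawDF, $"col1", $"col2")             // $ interpolator
    val viaName   = funcByName(rawDF, "col1", "col2")         // string wrapper

    viaCol.show()
    spark.stop()
  }
}
```

All four calls pass the same two columns to func. Note that the 'col1 symbol literal syntax is deprecated as of Scala 2.13, so col("col1") or $"col1" may be the safer choice on newer Scala versions.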