I am new to Scala. What I am trying to understand is why this code gives me an RDD[Int] without offering the option to call toDF:
var input = spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9))
But when I import spark.sqlContext.implicits._, the toDF option becomes available:
import spark.sqlContext.implicits._
var input = spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9)).toDF
So I looked at the source code: implicits is an object inside the SQLContext class. What I do not understand is why the RDD example is able to call toDF after the import. Can someone help me understand this?
Update
The snippet below is from the SQLContext class:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
object implicits extends SQLImplicits with Serializable {
  protected override def _sqlContext: SQLContext = self
}
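The pattern in this snippet can be mimicked without any Spark dependency: an object nested inside a class whose members are implicit conversions, which only take effect once imported. In this sketch, MySession, Holder, and toLabel are illustrative stand-ins (for SQLContext, DatasetHolder, and toDF respectively), not Spark APIs:

```scala
import scala.language.implicitConversions

// Stands in for DatasetHolder; toLabel stands in for toDF.
class Holder(val n: Int) {
  def toLabel: String = s"value-$n"
}

// Stands in for SQLContext: the conversions live in a nested object,
// so they are inert until explicitly imported.
class MySession {
  object implicits {
    implicit def intToHolder(n: Int): Holder = new Holder(n)
  }
}

val session = new MySession

// Without this import, 2.toLabel would not compile.
import session.implicits._

// The compiler rewrites 2.toLabel as intToHolder(2).toLabel.
println(2.toLabel)
```

This mirrors why `import spark.sqlContext.implicits._` is needed before toDF appears: the conversion exists in the source all along, but it is only a candidate for implicit resolution after the import brings it into scope.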
1 Answer
toDF is an extension method; the import brings the necessary implicits into scope. For example, Int has no method foo, but if you define an extension method and import it implicitly, the compiler rewrites 1.foo() as new IntOps(1).foo().

Similarly, with import spark.sqlContext.implicits._ in scope, the compiler rewrites spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9)).toDF as rddToDatasetHolder(spark.sparkContext.parallelize...).toDF, i.e. DatasetHolder(_sqlContext.createDataset(spark.sparkContext.parallelize...)).toDF.
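A minimal sketch of the Int/foo example (IntOps, foo, and intToIntOps are illustrative names, not standard library members):

```scala
import scala.language.implicitConversions

// Int itself has no method foo, so 1.foo() alone would not compile.
class IntOps(val i: Int) {
  def foo(): String = s"foo($i)"
}

// With this conversion in scope, the compiler rewrites
// 1.foo() as intToIntOps(1).foo(), i.e. new IntOps(1).foo().
implicit def intToIntOps(i: Int): IntOps = new IntOps(i)

println(1.foo())
```

Comment out the implicit def and the call no longer compiles, which is exactly what happens to toDF without the implicits import.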
You can read more about implicits and extension methods in Scala here:
Understanding implicit in Scala
Where does Scala look for implicits?
Understand Scala Implicit classes
https://docs.scala-lang.org/overviews/core/implicit-classes.html
https://docs.scala-lang.org/scala3/book/ca-extension-methods.html
https://docs.scala-lang.org/scala3/reference/contextual/extension-methods.html
How extend a class is diff from implicit class?
Regarding spark.implicits._:
Importing spark.implicits._ in scala
What is imported with spark.implicits._?
import implicit conversions without instance of SparkSession
Workaround for importing spark implicits everywhere
Why is spark.implicits._ is embedded just before converting any rdd to ds and not as regular imports?