I have to convert the code below to Spark, but I don't understand what Seq is actually doing in this code.
val tempFactDF = unionTempDF.join(fact.select("x","y","d","f","s"), Seq("x","y","d","f")).dropDuplicates
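Here the join is performed on multiple columns, and Seq("x","y","d","f") is the list of column names to join on. It is roughly equivalent to the following inner join, except that the Seq form keeps only one copy of each join column in the result: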
val joiningTable = fact.select("x","y","d","f","s")
unionTempDF.join(joiningTable,
  unionTempDF("x") === joiningTable("x") &&
  unionTempDF("y") === joiningTable("y") &&
  unionTempDF("d") === joiningTable("d") &&
  unionTempDF("f") === joiningTable("f"))
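To see the difference between the two forms, here is a minimal sketch you could run locally. The sample rows, the extra column v, the object name SeqJoinSketch, and the helper compareJoins are all made up for illustration; only the column names x, y, d, f, s and the two join calls come from the question.

```scala
import org.apache.spark.sql.SparkSession

object SeqJoinSketch {
  // Hypothetical helper: returns the output column names of both join forms.
  def compareJoins(spark: SparkSession): (Array[String], Array[String]) = {
    import spark.implicits._

    // Made-up sample data standing in for unionTempDF and fact.
    val unionTempDF = Seq((1, 2, 3, 4, "a")).toDF("x", "y", "d", "f", "v")
    val fact = Seq((1, 2, 3, 4, "s1"), (9, 9, 9, 9, "s2")).toDF("x", "y", "d", "f", "s")
    val joiningTable = fact.select("x", "y", "d", "f", "s")

    // Join by column names: each join column appears once in the result.
    val byName = unionTempDF.join(joiningTable, Seq("x", "y", "d", "f"))

    // Join by expression: both sides' copies of x, y, d, f are kept.
    val byExpr = unionTempDF.join(joiningTable,
      unionTempDF("x") === joiningTable("x") &&
      unionTempDF("y") === joiningTable("y") &&
      unionTempDF("d") === joiningTable("d") &&
      unionTempDF("f") === joiningTable("f"))

    (byName.columns, byExpr.columns)
  }

  def main(args: Array[String]): Unit = {
    // Local session, for illustration only.
    val spark = SparkSession.builder().master("local[1]").appName("seq-join-sketch").getOrCreate()
    val (byName, byExpr) = compareJoins(spark)
    println(byName.mkString(", "))
    println(byExpr.mkString(", "))
    spark.stop()
  }
}
```

With the expression form you would typically have to drop or rename the duplicated columns afterwards, which is why the Seq form is usually the more convenient one when the column names match on both sides.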