Scala: convert DataFrame columns to a list of tuples

az31mfrm asked on 2022-11-09 in Scala

I have:

val DF1 = sparkSession.sql("select col1,col2,col3 from table");
val tupleList = DF1.select("col1","col2").rdd.map(r => (r(0),r(1))).collect()

tupleList.foreach(x=> x.productIterator.foreach(println))

But I am not getting all of the tuples in the output. Where is the problem? The data is:

col1    col2
AA      CCC
AA      BBB
DD      CCC
AB      BBB
Others  BBB
GG      ALL
EE      ALL
Others  ALL
ALL     BBB
NU      FFF
NU      Others
Others  Others
C       FFF

The output I get is: CCC AA BBB AA Others AA Others DD ALL Others ALL GG ALL ALL
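
For comparison, here is a minimal sketch (assuming the same DF1 as above and that col1 and col2 are string columns) that uses typed getters and prints one tuple per line, which makes it easier to check whether every row was collected:

// typed access instead of r(0), which returns Any
val tuplePairs: Array[(String, String)] =
  DF1.select("col1", "col2")
    .rdd
    .map(r => (r.getString(0), r.getString(1)))
    .collect()

// print one (col1, col2) pair per line
tuplePairs.foreach { case (c1, c2) => println(s"($c1, $c2)") }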


c3frrgcw1#

scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
scala> val df1 = hiveContext.sql("select id, name from class_db.students")
scala> df1.show()
+----+-------+
|  id|   name|
+----+-------+
|1001|   John|
|1002|Michael|
+----+-------+

scala> df1.select("id", "name").rdd.map(x => (x.get(0), x.get(1))).collect()
res3: Array[(Any, Any)] = Array((1001,John), (1002,Michael))
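
As a variant, a sketch assuming Spark 2.x (where SparkSession replaces HiveContext) and that id is an integer column: the Dataset encoder yields typed (Int, String) tuples instead of Array[(Any, Any)].

import spark.implicits._   // spark: SparkSession; provides the tuple encoder

// assumed column types: id as Int, name as String
val typed: Array[(Int, String)] =
  df1.select("id", "name").as[(Int, String)].collect()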

egdjgwm82#

To fix the invalid syntax error when using PySpark:

temp = df1.select('id','name').rdd.map(lambda x: (x[0],x[1])).collect()
