我按照给定的模式创建dataframe,然后我想通过对现有的dataframe重新排序来创建新的dataframe。sparkDataframe中的列可以重新排序吗?
object Demo extends Context {
def main(args: Array[String]): Unit = {
val emp = Seq((1,"Smith",-1,"2018","10","M",3000),
(2,"Rose",1,"2010","20","M",4000),
(3,"Williams",1,"2010","10","M",1000),
(4,"Jones",2,"2005","10","F",2000),
(5,"Brown",2,"2010","40","",-1),
(6,"Brown",2,"2010","50","",-1)
)
val empColumns = Seq("emp_id","name","superior_emp_id","year_joined",
"emp_dept_id","gender","salary")
import sparkSession.sqlContext.implicits._
val empDF = emp.toDF(empColumns: _*)
empDF.show(false)
}
}
Current DF:
+------+--------+---------------+-----------+-----------+------+------+
|emp_id|name |superior_emp_id|year_joined|emp_dept_id|gender|salary|
+------+--------+---------------+-----------+-----------+------+------+
|1 |Smith |-1 |2018 |10 |M |3000 |
|2 |Rose |1 |2010 |20 |M |4000 |
|3 |Williams|1 |2010 |10 |M |1000 |
|4 |Jones |2 |2005 |10 |F |2000 |
|5 |Brown |2 |2010 |40 | |-1 |
|6 |Brown |2 |2010 |50 | |-1 |
+------+--------+---------------+-----------+-----------+------+------+
I want output as this following df, where gender and salary column re-ordered
New DF:
+------+--------+------+------+---------------+-----------+-----------+
|emp_id|name |gender|salary|superior_emp_id|year_joined|emp_dept_id|
+------+--------+------+------+---------------+-----------+-----------+
|1 |Smith |M |3000 |-1 |2018 |10 |
|2 |Rose |M |4000 |1 |2010 |20 |
|3 |Williams|M |1000 |1 |2010 |10 |
|4 |Jones |F |2000 |2 |2005 |10 |
|5 |Brown | |-1 |2 |2010 |40 |
|6 |Brown | |-1 |2 |2010 |50 |
+------+--------+------+------+---------------+-----------+-----------+
2条答案
按热度按时间yws3nbqq1#
只是使用
select()
要重新排列列,请执行以下操作:它将根据您的订单显示在
select()
争论。ijxebb2r2#
这是一种方法