How do I create a new column based on the value of another column in Spark Scala?

njthzxwz  posted on 2021-05-16 in Spark

This is my first DataFrame:

val sample_df = Seq(("john","morning",1.5,0.0),("john","night",0.0,3.9),("bill","morning",0.4,0.0),("bill","night",0.0,2.3)).toDF("name","time_of_day","morning_min","night_min")


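I want to add a column named "row_min" that takes the value of the "morning_min" column when the row's "time_of_day" is "morning", and the value of the "night_min" column otherwise.
Here is what the resulting DataFrame should look like: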

val resulting_df = Seq(("john","morning",1.5,0.0,1.5),("john","night",0.0,3.9,3.9),("bill","morning",0.4,0.0,0.4),("bill","night",0.0,2.3,2.3)).toDF("name","time_of_day","morning_min","night_min","row_min")


Any help would be greatly appreciated. Have a nice day.

uxh89sit1#

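import org.apache.spark.sql.functions._
// The $"..." syntax needs spark.implicits._ in scope (already imported in spark-shell).

// Use === to build a Column comparison; == compares the Column objects themselves and yields a Boolean.
val resulting_df = sample_df.withColumn("row_min",
    when($"time_of_day" === lit("morning"), $"morning_min").when($"time_of_day" === lit("night"), $"night_min")
)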
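Since the question asks for "night_min" in every row that is not "morning", the fallback can also be written with `otherwise`. Chained `when` calls with no `otherwise` leave `row_min` null for any row that matches neither condition. Below is a minimal sketch, assuming a local SparkSession; the object name `RowMinExample` and the helper `addRowMin` are hypothetical names chosen for illustration and are not part of the original post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Hypothetical wrapper object, used only to make the sketch self-contained.
object RowMinExample {

  // Adds "row_min": morning_min for "morning" rows, night_min for all other rows.
  def addRowMin(df: DataFrame): DataFrame =
    df.withColumn(
      "row_min",
      when(col("time_of_day") === "morning", col("morning_min"))
        .otherwise(col("night_min"))
    )

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("row-min-example")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the question.
    val sample_df = Seq(
      ("john", "morning", 1.5, 0.0),
      ("john", "night", 0.0, 3.9),
      ("bill", "morning", 0.4, 0.0),
      ("bill", "night", 0.0, 2.3)
    ).toDF("name", "time_of_day", "morning_min", "night_min")

    addRowMin(sample_df).show()
    spark.stop()
  }
}
```

Using `col("...")` instead of `$"..."` keeps the helper independent of `spark.implicits._`, so it can live outside the scope where the session is created.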
