Calculating the difference in seconds between two columns with Spark SQL

Posted by 6xfqseft on 2021-05-29 in Spark

I am using Spark SQL in Scala to calculate the difference between two columns:

writingTime,time
 2020-06-25T13:29:34.415Z,2020-06-25T13:29:33.190Z

I expect the difference between these two columns to be 1 second.
I tried the following code:

import org.apache.spark.sql.functions._
val df = spark.read.format("csv").option("header","true").load("path")
val diff_secs_col = col("time").cast("long") - col("writingTime").cast("long")
val resultDf = df.withColumn("date_diff_seconds", datediff(col("writingTime"), col("time"))).show()

Any help would be appreciated.

Answer #1 (3vpjnl9f)

Try this:

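import spark.implicits._  // enables the $"col" syntax (already in scope in spark-shell)

// Parse both strings as timestamps, convert to epoch seconds, then subtract.
val resultDf = df.withColumn("date_diff_seconds",
  $"writingTime".cast("timestamp").cast("long") - $"time".cast("timestamp").cast("long"))
resultDf.show(false)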

/**
  * +------------------------+------------------------+-----------------+
  * |writingTime             |time                    |date_diff_seconds|
  * +------------------------+------------------------+-----------------+
  * |2020-06-25T13:29:34.415Z|2020-06-25T13:29:33.190Z|1                |
  * +------------------------+------------------------+-----------------+
  */
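
Two details may be worth spelling out. `datediff`, used in the question, returns a difference in whole days, so it cannot give a result in seconds. Casting a timestamp to `long` drops the fractional part, so each side is reduced to whole seconds before the subtraction (34 - 33 = 1 here), which is not always the same as rounding the true difference. If sub-second precision matters, one option is to cast to `double` instead. The following is a minimal sketch under that assumption; the object name `SecondsDiffSketch`, the helper `withDiffs`, and the two output column names are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object, not part of the original answer.
object SecondsDiffSketch {

  // Adds two columns holding writingTime - time:
  //   diff_whole_seconds      - each side cut down to whole epoch seconds first
  //   diff_fractional_seconds - epoch seconds as double, keeping the fraction
  def withDiffs(df: DataFrame): DataFrame = {
    val writing = col("writingTime").cast("timestamp")
    val time    = col("time").cast("timestamp")
    df
      .withColumn("diff_whole_seconds", writing.cast("long") - time.cast("long"))
      .withColumn("diff_fractional_seconds", writing.cast("double") - time.cast("double"))
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("seconds-diff-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same sample row as in the question, built in memory instead of read from CSV.
    val df = Seq(("2020-06-25T13:29:34.415Z", "2020-06-25T13:29:33.190Z"))
      .toDF("writingTime", "time")

    withDiffs(df).show(false)
    spark.stop()
  }
}
```

For the sample row, the whole-second column is 1 and the fractional column is approximately 1.225, subject to floating-point rounding.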

