spark scala

jdg4fx2g asked on 2021-05-19 in Spark

I have a timestamp column in a DataFrame (Scala) and want to get milliseconds from it. unix_timestamp truncates to seconds, and I can't just multiply unix_timestamp by 1000, because I'm looking for an exact millisecond conversion.

Input DataFrame:

    +---------+-----------------------+-----+-----------------------+
    |OrderName|DateTime               |Count|timestamp              |
    +---------+-----------------------+-----+-----------------------+
    |a        |2020-07-11 23:58:45.538|1    |2020-07-11 23:58:45.538|
    |a        |2020-07-12 00:00:07.307|2    |2020-07-12 00:00:07.307|
    |a        |2020-07-12 00:01:08.817|3    |2020-07-12 00:01:08.817|
    |a        |2020-07-12 00:02:15.675|1    |2020-07-12 00:02:15.675|
    |a        |2020-07-12 00:05:48.277|1    |2020-07-12 00:05:48.277|
    +---------+-----------------------+-----+-----------------------+

The second column is a string, and I used to_timestamp($"DateTime") to get the 4th column.

Example: 2020-07-11 23:58:45.538 -> 1594537125538
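For reference, the epoch value obtained from a wall-clock string like this depends on the zone offset assumed when parsing it. A standalone java.time sketch (no Spark needed; the -07:00 offset is only an illustration, though it happens to reproduce the example value above):

```scala
import java.time.{LocalDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter

object EpochMillisDemo {
  def main(args: Array[String]): Unit = {
    val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS")
    val ldt = LocalDateTime.parse("2020-07-11 23:58:45.538", fmt)

    // Interpreting the same wall-clock time at different offsets
    // shifts the resulting epoch value by exactly that offset.
    val utcMillis    = ldt.toInstant(ZoneOffset.UTC).toEpochMilli
    val minus7Millis = ldt.toInstant(ZoneOffset.ofHours(-7)).toEpochMilli

    println(utcMillis)    // prints 1594511925538
    println(minus7Millis) // prints 1594537125538
  }
}
```

So the expected output 1594537125538 corresponds to parsing the string at a -07:00 offset rather than at UTC.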
Answer 1 (wh6knrhe):


You can get this value with a UDF that parses the string into an Instant and then converts it to epoch milliseconds:

    import org.apache.spark.sql.functions._
    import java.time._
    import java.time.format.DateTimeFormatter
    // ...
    spark.udf.register("to_epoch_millis",
      (s: String) => LocalDateTime.parse(s, DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS"))
        .toInstant(ZoneOffset.UTC).toEpochMilli())

Then:

    df.selectExpr("to_epoch_millis(DateTime) as ts").show()

    +-------------+
    |           ts|
    +-------------+
    |1594511925538|
    |1594512007307|
    +-------------+

The above assumes DateTime is a UTC timestamp.
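If what's wanted is only the millisecond fraction of the second (the part unix_timestamp drops) rather than the full epoch-millis value, the parsed value exposes it directly via getNano. A small standalone sketch with plain java.time (the object name is mine, not from the answer); note this part is independent of the zone offset:

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object MillisOfSecondDemo {
  def main(args: Array[String]): Unit = {
    val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS")
    val ldt = LocalDateTime.parse("2020-07-11 23:58:45.538", fmt)

    // getNano returns nanosecond-of-second; divide down to milliseconds.
    val millisOfSecond = ldt.getNano / 1000000
    println(millisOfSecond) // prints 538
  }
}
```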

