Create a timestamp column in a Spark DataFrame without the milliseconds part

Posted by 3duebb1j on 2021-07-13 in Spark


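I am trying to create a column named load_time_stamp in a PySpark DataFrame. It should contain only today's date and time down to the second, with no milliseconds.
I wrote the code below for this, but the new column it creates is filled with null values instead of the timestamp values I expect.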

from pyspark.sql import functions as F

x.withColumn("load_time_stamp", F.to_timestamp(F.substring(F.current_timestamp(), 0, 19), "yyyy-MM-dd'T'HH:mm:ss")).show()

python apache-spark pyspark apache-spark-sql

Source: https://stackoverflow.com/questions/66315018/create-a-timestamp-column-in-spark-dataframe-without-milliseconds-part

2 Answers


vd8tlhqk #1

You can use date_format instead:

import pyspark.sql.functions as F

x.withColumn("load_time_stamp", F.date_format(F.current_timestamp(), "yyyy-MM-dd'T'HH:mm:ss"))

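Note that to_timestamp parses a string in the given format into a timestamp, while date_format renders a timestamp as a string in the given format. You do not need to call substring on the current timestamp, because date_format drops everything the pattern does not include.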

Posted on 2021-07-13

bq3bfh9z #2

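If you want a column with only second precision, you can use the from_unixtime function. It returns the value as a formatted string, so cast the result to timestamp if the column type must be timestamp.
Example: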

from pyspark.sql import functions as F

x = spark.createDataFrame([(1,)], ["id"])

x.withColumn(
    "load_time_stamp",
    F.from_unixtime(F.current_timestamp().cast("long"))
).show(truncate=False)

# +---+-------------------+
# |id |load_time_stamp    |
# +---+-------------------+
# |1  |2021-02-22 15:35:34|
# +---+-------------------+

Posted on 2021-07-13
