R: How do I convert a column of UTC timestamps to several different time zones?

Asked by edqdpe6u 12 months ago, posted in Other

I have a dataset whose dates are stored in the database as UTC, but each row actually belongs to a different time zone.

mydat <- data.frame(
  time_stamp=c("2022-08-01 05:00:00 UTC","2022-08-01 17:00:00 UTC","2022-08-02 22:30:00 UTC","2022-08-04 05:00:00 UTC","2022-08-05 02:00:00 UTC"),
  timezone=c("America/Chicago","America/New_York","America/Los_Angeles","America/Denver","America/New_York")
)

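I want to apply the time zone held in each row to the UTC timestamp in that row, across the whole column.
I looked at the with_tz function in the lubridate package, but I can't figure out how to reference the "timezone" column instead of hard-coding a single value.
If I try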

with_tz(mydat$time_stamp, tzone = mydat$timezone)


I get the following error

Error in as.POSIXlt.POSIXct(x, tz) : invalid 'tz' value


However, if I try

mydat$time_stamp2 <- with_tz(mydat$time_stamp,"America/New_York")


this creates the new column without any problem. How can I do the same thing while referencing the column values instead?

Answer 1 (gzszwxb4)

The following should do what you want:

Code

mydat <- data.frame(time_stamp=c("2022-08-01 05:00:00 UTC", 
                                 "2022-08-01 17:00:00 UTC",
                                 "2022-08-02 22:30:00 UTC",
                                 "2022-08-04 05:00:00 UTC", 
                                 "2022-08-05 02:00:00 UTC"),
                    timezone=c("America/Chicago", "America/New_York", 
                               "America/Los_Angeles", "America/Denver",
                               "America/New_York"))

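# Parse the character timestamps as UTC instants (POSIXct)
mydat$utc <- anytime::utctime(mydat$time_stamp, tz="UTC")
# Format each row in its own zone; strftime() takes a single tz per call
mydat$format <- ""
for (i in seq_len(nrow(mydat))) {
    mydat[i, "format"] <- strftime(mydat[i,"utc"], 
                                   "%Y-%m-%d %H:%M:%S",
                                   tz=mydat[i,"timezone"])
}
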

Output

> mydat
               time_stamp            timezone                 utc              format
1 2022-08-01 05:00:00 UTC     America/Chicago 2022-08-01 05:00:00 2022-08-01 00:00:00
2 2022-08-01 17:00:00 UTC    America/New_York 2022-08-01 17:00:00 2022-08-01 13:00:00
3 2022-08-02 22:30:00 UTC America/Los_Angeles 2022-08-02 22:30:00 2022-08-02 15:30:00
4 2022-08-04 05:00:00 UTC      America/Denver 2022-08-04 05:00:00 2022-08-03 23:00:00
5 2022-08-05 02:00:00 UTC    America/New_York 2022-08-05 02:00:00 2022-08-04 22:00:00
>

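Comment

We first parse your data as UTC; I once wrote a helper function for that in my anytime package (there are alternatives, but this is how I do it...). We then need to format from the given (numeric!!) UTC representation to the given time zone. We need a loop for this because the tz argument of strftime() is not vectorized.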
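
Because the tz argument accepts only one zone per call, one way to avoid a row-by-row loop is to loop over the distinct zones instead. The following is only a sketch in base R, not part of the answer above: the helper name format_in_zones is hypothetical, the input is assumed to be already parsed as POSIXct, and the zones vector is assumed to contain no NA values.

```r
# Hypothetical helper (not from the answer): format UTC instants in per-row
# time zones, calling format() once per distinct zone rather than once per row.
format_in_zones <- function(utc, zones, fmt = "%Y-%m-%d %H:%M:%S") {
  stopifnot(inherits(utc, "POSIXct"), length(utc) == length(zones))
  out <- character(length(utc))
  for (z in unique(zones)) {
    idx <- zones == z                                # rows that share this zone
    out[idx] <- format(utc[idx], format = fmt, tz = z)
  }
  out
}

# Example input: timestamps written without the trailing " UTC" suffix
utc <- as.POSIXct(c("2022-08-01 05:00:00",
                    "2022-08-01 17:00:00",
                    "2022-08-02 22:30:00"), tz = "UTC")
zones <- c("America/Chicago", "America/New_York", "America/Los_Angeles")

local_times <- format_in_zones(utc, zones)
print(local_times)
```

With many rows and only a handful of zones this makes far fewer calls than a per-row loop, and the result is still a character vector, as in the loop version.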

Answer 2 (py49o6xq)

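Dirk's answer is a good one that uses (mostly) base R tooling, if that is a requirement for you. I also wanted to add an answer that uses the clock package, which I developed, because it doesn't require working row-wise over your data frame. clock has a function called sys_time_info() that retrieves low-level information about a UTC time point in a specific time zone. It is one of the few functions where it makes sense to have a vectorized zone argument (which you need here), and it returns an offset from UTC that is useful here for converting to a "local" time.
As others have mentioned, you won't be able to construct a date-time vector that holds multiple time zones, but if you just need to see what the local time would have been in those zones, this can still be useful.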

library(clock)

mydat <- data.frame(
  time_stamp=c("2022-08-01 05:00:00 UTC","2022-08-01 17:00:00 UTC","2022-08-02 22:30:00 UTC","2022-08-04 05:00:00 UTC","2022-08-05 02:00:00 UTC"),
  timezone=c("America/Chicago","America/New_York","America/Los_Angeles","America/Denver","America/New_York")
)

# Parse into a "sys-time" type, which can be thought of as a UTC time point
mydat$time_stamp <- sys_time_parse(mydat$time_stamp, format = "%Y-%m-%d %H:%M:%S")
mydat
#>            time_stamp            timezone
#> 1 2022-08-01T05:00:00     America/Chicago
#> 2 2022-08-01T17:00:00    America/New_York
#> 3 2022-08-02T22:30:00 America/Los_Angeles
#> 4 2022-08-04T05:00:00      America/Denver
#> 5 2022-08-05T02:00:00    America/New_York

# "Low level" information about DST, the time zone abbreviation,
# and offset from UTC in that zone. This is one of the few functions where
# it makes sense to have a vectorized `zone` argument.
info <- sys_time_info(mydat$time_stamp, mydat$timezone)
info
#>                 begin                 end offset  dst abbreviation
#> 1 2022-03-13T08:00:00 2022-11-06T07:00:00 -18000 TRUE          CDT
#> 2 2022-03-13T07:00:00 2022-11-06T06:00:00 -14400 TRUE          EDT
#> 3 2022-03-13T10:00:00 2022-11-06T09:00:00 -25200 TRUE          PDT
#> 4 2022-03-13T09:00:00 2022-11-06T08:00:00 -21600 TRUE          MDT
#> 5 2022-03-13T07:00:00 2022-11-06T06:00:00 -14400 TRUE          EDT

# Add the offset to the sys-time and then convert to a character column
# (these times don't really represent sys-time anymore since they are now localized)
mydat$localized <- as.character(mydat$time_stamp + info$offset)
mydat
#>            time_stamp            timezone           localized
#> 1 2022-08-01T05:00:00     America/Chicago 2022-08-01T00:00:00
#> 2 2022-08-01T17:00:00    America/New_York 2022-08-01T13:00:00
#> 3 2022-08-02T22:30:00 America/Los_Angeles 2022-08-02T15:30:00
#> 4 2022-08-04T05:00:00      America/Denver 2022-08-03T23:00:00
#> 5 2022-08-05T02:00:00    America/New_York 2022-08-04T22:00:00
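
If real date-time objects are needed rather than character strings, one possible workaround is a list with one length-1 POSIXct per row, since a single POSIXct vector carries only one "tzone" attribute. This is a sketch in base R under that assumption, not something taken from either answer; the names utc, zones and local_list are illustrative.

```r
# Two UTC instants and the zone each one should be displayed in
utc <- as.POSIXct(c("2022-08-01 05:00:00", "2022-08-04 05:00:00"), tz = "UTC")
zones <- c("America/Chicago", "America/Denver")

# One length-1 POSIXct per row, each carrying its own "tzone" attribute.
# Changing "tzone" alters only how the instant is displayed, not the instant.
local_list <- lapply(seq_along(utc), function(i) {
  x <- utc[i]
  attr(x, "tzone") <- zones[i]
  x
})

# Each element prints in its own zone
for (x in local_list) print(format(x, usetz = TRUE))
```

The list can be stored as a list column in the data frame, at the cost of losing vectorized date-time arithmetic across rows.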

