Why are my time differences not coming out as expected in R?

Posted by pgccezyw on 2023-07-31 in Other

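I am using a dataset from the R package track2KBA that contains tracking data for a seabird species. I want to measure the time difference between consecutive relocations, grouped by bird.
But when I run my script, I don't get the differences I expect. For example, the first difference should be 6 seconds.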

   track_id date_gmt   time_gmt longitude latitude lon_colony lat_colony datetime            difference
      <int> <chr>      <chr>        <dbl>    <dbl>      <dbl>      <dbl> <dttm>              <drtn>    
 1    69303 2012-07-21 11:01:54     -5.73    -16.0      -5.73      -16.0 2012-07-21 11:01:54  NA secs  
 2    69302 2012-07-21 11:02:00     -5.73    -16.0      -5.73      -16.0 2012-07-21 11:02:00  NA secs  
 3    69303 2012-07-21 11:03:33     -5.73    -16.0      -5.73      -16.0 2012-07-21 11:03:33  99 secs  
 4    69302 2012-07-21 11:03:42     -5.73    -16.0      -5.73      -16.0 2012-07-21 11:03:42 102 secs  
 5    69303 2012-07-21 11:05:13     -5.73    -16.0      -5.73      -16.0 2012-07-21 11:05:13 100 secs  
 6    69302 2012-07-21 11:05:26     -5.73    -16.0      -5.73      -16.0 2012-07-21 11:05:26 104 secs

Here is my code:

library(track2KBA)
library(tidyverse)
library(lubridate)

boobies$datetime <-
 (paste(boobies$date_gmt, boobies$time_gmt))

boobies <- boobies %>%
  mutate(datetime = lubridate::ymd_hms(datetime)) %>%
  group_by(track_id) %>%
  arrange(datetime) %>%
  mutate(difference = datetime - lag(datetime))


And some sample data from the package:

boobies <- structure(list(track_id = c(69303L, 69302L, 69303L, 69302L, 69303L, 
69302L), date_gmt = c("2012-07-21", "2012-07-21", "2012-07-21", 
"2012-07-21", "2012-07-21", "2012-07-21"), time_gmt = c("11:01:54", 
"11:02:00", "11:03:33", "11:03:42", "11:05:13", "11:05:26"), 
    longitude = c(-5.72769, -5.72639, -5.72769, -5.72635, -5.72769, 
    -5.72639), latitude = c(-16.00749, -16.00713, -16.00749, 
    -16.00723, -16.00749, -16.0071), lon_colony = c(-5.73, -5.73, 
    -5.73, -5.73, -5.73, -5.73), lat_colony = c(-16.01, -16.01, 
    -16.01, -16.01, -16.01), datetime = c("2012-07-21 11:01:54", 
    "2012-07-21 11:02:00", "2012-07-21 11:03:33", "2012-07-21 11:03:42", 
    "2012-07-21 11:05:13", "2012-07-21 11:05:26")), row.names = c(NA, 6L), class = c("data.table", "data.frame"))

Answer 1 (gdrx4gfi)

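The issue lies in how your data are laid out rather than in the calculation: the rows of the two birds are interleaved. The result you are getting (two NAs at the top of the difference column) appears to be correct, because the first two rows are the first data points of the two track_ids (which I assume correspond to birds). A first point has no earlier point in its group to compare against, so both are NA.
Either way, here are two approaches: ungrouped and grouped.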
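To make the two leading NAs concrete, here is a minimal base R sketch that uses only the IDs and timestamps from the sample data in the question. The variable names `secs` and `gap_secs` are made up for this illustration, and the UTC time zone is an assumption based on the `_gmt` column names.

```r
# IDs and timestamps copied from the sample data in the question
track_id <- c(69303L, 69302L, 69303L, 69302L, 69303L, 69302L)
datetime <- as.POSIXct(
  c("2012-07-21 11:01:54", "2012-07-21 11:02:00", "2012-07-21 11:03:33",
    "2012-07-21 11:03:42", "2012-07-21 11:05:13", "2012-07-21 11:05:26"),
  tz = "UTC"  # assumed, because the source columns are named *_gmt
)

# Seconds since the previous fix of the same bird, kept in the original
# row order. The first fix of each bird has no predecessor, so it is NA.
secs <- as.numeric(datetime)
gap_secs <- ave(secs, track_id, FUN = function(x) c(NA, diff(x)))

data.frame(track_id, datetime, gap_secs)
```

Rows 1 and 2 are the first fixes of birds 69303 and 69302 respectively, so both come out as NA. The 6 seconds between rows 1 and 2 is a gap between two different birds.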

library(tidyverse)

# not grouped by track_id (this gets the 6 second difference you were looking for)

mutate(boobies, difference = difftime(datetime, lag(datetime), units = "secs"))

# Output
  track_id date_gmt   time_gmt longitude latitude lon_colony lat_colony
     <int> <date>     <chr>        <dbl>    <dbl>      <dbl>      <dbl>
1    69303 2012-07-21 11:01:54     -5.73    -16.0      -5.73      -16.0
2    69302 2012-07-21 11:02:00     -5.73    -16.0      -5.73      -16.0
3    69303 2012-07-21 11:03:33     -5.73    -16.0      -5.73      -16.0
4    69302 2012-07-21 11:03:42     -5.73    -16.0      -5.73      -16.0
5    69303 2012-07-21 11:05:13     -5.73    -16.0      -5.73      -16.0
6    69302 2012-07-21 11:05:26     -5.73    -16.0      -5.73      -16.0
  datetime            difference
  <dttm>              <drtn>    
1 2012-07-21 11:01:54 NA secs   
2 2012-07-21 11:02:00  6 secs   
3 2012-07-21 11:03:33 93 secs   
4 2012-07-21 11:03:42  9 secs   
5 2012-07-21 11:05:13 91 secs   
6 2012-07-21 11:05:26 13 secs   

# grouped by track_id (the .by argument requires dplyr 1.1.0 or later)

mutate(boobies, difference = difftime(datetime, lag(datetime), units = "secs"), .by = track_id)

# Output:
# A tibble: 6 × 9
  track_id date_gmt   time_gmt longitude latitude lon_colony lat_colony
     <int> <date>     <chr>        <dbl>    <dbl>      <dbl>      <dbl>
1    69303 2012-07-21 11:01:54     -5.73    -16.0      -5.73      -16.0
2    69302 2012-07-21 11:02:00     -5.73    -16.0      -5.73      -16.0
3    69303 2012-07-21 11:03:33     -5.73    -16.0      -5.73      -16.0
4    69302 2012-07-21 11:03:42     -5.73    -16.0      -5.73      -16.0
5    69303 2012-07-21 11:05:13     -5.73    -16.0      -5.73      -16.0
6    69302 2012-07-21 11:05:26     -5.73    -16.0      -5.73      -16.0
  datetime            difference
  <dttm>              <drtn>    
1 2012-07-21 11:01:54  NA secs  
2 2012-07-21 11:02:00  NA secs  
3 2012-07-21 11:03:33  99 secs  
4 2012-07-21 11:03:42 102 secs  
5 2012-07-21 11:05:13 100 secs  
6 2012-07-21 11:05:26 104 secs
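If the interleaved rows make the grouped result hard to read, one option is to sort by bird first so that each bird's fixes sit next to each other. The following is only a sketch: `boobies_small` and `by_bird` are hypothetical names, the data are rebuilt from the sample in the question, and the UTC time zone is assumed from the `_gmt` column names.

```r
library(dplyr)

# Hypothetical cut-down copy of the sample data (IDs and timestamps only)
boobies_small <- tibble(
  track_id = c(69303L, 69302L, 69303L, 69302L, 69303L, 69302L),
  datetime = as.POSIXct(
    c("2012-07-21 11:01:54", "2012-07-21 11:02:00", "2012-07-21 11:03:33",
      "2012-07-21 11:03:42", "2012-07-21 11:05:13", "2012-07-21 11:05:26"),
    tz = "UTC"  # assumed
  )
)

# Sort by bird and then by time, then take the per-bird difference
by_bird <- boobies_small %>%
  arrange(track_id, datetime) %>%
  mutate(
    difference = difftime(datetime, lag(datetime), units = "secs"),
    .by = track_id
  )

by_bird
```

With this ordering each bird contributes one NA followed by its own gaps (102 and 104 seconds for 69302, then 99 and 100 seconds for 69303), which are the same values as in the grouped output, only rearranged.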
