How can I determine whether a date in one data frame is within n days of a date in another data frame in R?

xxb16uws  posted on 2023-05-04  in Other

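I have two data frames: one of sampling dates and one of storm (AR) events. I need to determine whether each sampling date falls within +/- 3 days of a storm.
I would like the final result to look like this: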

Reprex ~~

Sample data frame:

structure(list(Date = structure(c(7319, 7378, 7439, 7500, 7531, 7562, 7592, 7623, 7653, 7684), class = "Date"), NO3 = c(2.37, 3.42, 3.13, 2.24, 1.97, 2.22, 2.58, 2.15, 2.05, 3.09), cumP = c(122.8, 104, 19.9, 20.4, 0, 8.8, 134.3, 232.8, 168.3, 171.3), Season = c("R", "R", "F", "I", "H", "R", "R", "R", "R", "R")), row.names = c(NA, 10L), class = "data.frame")

Storm events:

structure(list(Date = structure(c(7311, 7313, 7316, 7329, 7338, 7345, 7355, 7451, 7458, 7474, 7580, 7581, 7586, 7598, 7601, 7602, 7615, 7617, 7618, 7619, 7620, 7621, 7630, 7631, 7632, 7637, 7641, 7642, 7646, 7647, 7655), class = "Date")), row.names = c(NA, -31L), class = "data.frame")

li9yvcax1#

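In base R, you can build the set of dates from 3 days before to 3 days after each storm date, then use %in% to check whether each df$Date value appears in that vector of storm-window dates.
(Note: I named the sample data frame df and the storm events data frame events.)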

df$AR_occurence <- df$Date %in% (rep(events$Date, each=7) + -3:3) # thanks @thelatemail

Output:

Date  NO3  cumP Season AR_occurence
1  1990-01-15 2.37 122.8      R         TRUE
2  1990-03-15 3.42 104.0      R        FALSE
3  1990-05-15 3.13  19.9      F        FALSE
4  1990-07-15 2.24  20.4      I        FALSE
5  1990-08-15 1.97   0.0      H        FALSE
6  1990-09-15 2.22   8.8      R        FALSE
7  1990-10-15 2.58 134.3      R        FALSE
8  1990-11-15 2.15 232.8      R         TRUE
9  1990-12-15 2.05 168.3      R         TRUE
10 1991-01-15 3.09 171.3      R        FALSE
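
To see why the one-liner works: rep(events$Date, each = 7) repeats every storm date seven times, and adding -3:3 recycles the seven offsets across each group, so the result contains every day from 3 days before to 3 days after each storm. The sketch below generalizes the window to n days. The helper name within_n_days and the small example vectors are illustrative assumptions, not part of the answer above.

```r
# Hypothetical helper: TRUE where a sample date falls within +/- n days
# of any event date. Assumes whole-day Date vectors, as in the question.
within_n_days <- function(sample_dates, event_dates, n = 3) {
  offsets <- -n:n
  # Repeat each event date once per offset; the offsets recycle across
  # each group, giving every day in every event's window.
  window <- rep(event_dates, each = length(offsets)) + offsets
  sample_dates %in% window
}

# Illustrative data, not the question's full data set
samples <- as.Date(c("1990-01-15", "1990-03-15", "1990-11-15"))
storms  <- as.Date(c("1990-01-12", "1990-11-13"))

within_n_days(samples, storms)         # TRUE FALSE TRUE
within_n_days(samples, storms, n = 1)  # FALSE FALSE FALSE
```

With the names used in the answer, the call would be df$AR_occurence <- within_n_days(df$Date, events$Date, n = 3).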

fjaof16o2#

This works:

library(dplyr)

a = structure(list(Date = structure(c(7319, 7378, 7439, 7500, 7531, 7562, 7592, 7623, 7653, 7684), class = "Date"), NO3 = c(2.37, 3.42, 3.13, 2.24, 1.97, 2.22, 2.58, 2.15, 2.05, 3.09), cumP = c(122.8, 104, 19.9, 20.4, 0, 8.8, 134.3, 232.8, 168.3, 171.3), Season = c("R", "R", "F", "I", "H", "R", "R", "R", "R", "R")), row.names = c(NA, 10L), class = "data.frame")
a2 = structure(list(Date = structure(c(7311, 7313, 7316, 7329, 7338, 7345, 7355, 7451, 7458, 7474, 7580, 7581, 7586, 7598, 7601, 7602, 7615, 7617, 7618, 7619, 7620, 7621, 7630, 7631, 7632, 7637, 7641, 7642, 7646, 7647, 7655), class = "Date")), row.names = c(NA, -31L), class = "data.frame")

# Extract the storm dates as a plain Date vector
storm_dates = a2 %>% pull(Date)

a %>% rowwise() %>%
  # For each sampling date, find the nearest storm date
  mutate(closest_date = storm_dates[which.min(abs(storm_dates - Date))]) %>%
  # TRUE if the nearest storm is within 3 days; .keep = 'unused' drops Date and closest_date
  mutate(AR_ocurrence = abs(Date - closest_date) <= 3, .keep = 'unused')

 # NO3  cumP Season AR_ocurrence
 # <dbl> <dbl> <chr>  <lgl>       
 # 1  2.37 123.  R      TRUE        
 # 2  3.42 104   R      FALSE       
 # 3  3.13  19.9 F      FALSE       
 # 4  2.24  20.4 I      FALSE       
 # 5  1.97   0   H      FALSE       
 # 6  2.22   8.8 R      FALSE       
 # 7  2.58 134.  R      FALSE       
 # 8  2.15 233.  R      TRUE        
 # 9  2.05 168.  R      TRUE        
 # 10  3.09 171.  R      FALSE

Edit: changed my answer. Also added the bit that removes closest_date.
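
The nearest-storm check can also be computed without a row-by-row pipeline. The following is a minimal base R sketch of that idea, not the answer's dplyr code: it builds a matrix of absolute day gaps with outer() and takes the smallest gap per sampling date. The variable names and the small example vectors are assumptions for illustration.

```r
# Illustrative data, not the question's full data set
samples <- as.Date(c("1990-01-15", "1990-03-15", "1990-11-15"))
storms  <- as.Date(c("1990-01-12", "1990-11-13"))

# Absolute day differences: one row per sample date, one column per storm date
gaps <- abs(outer(as.numeric(samples), as.numeric(storms), "-"))

# Distance in days from each sample date to its nearest storm
nearest_gap <- apply(gaps, 1, min)   # 3 62 2

# TRUE where the nearest storm is within 3 days
within_3 <- nearest_gap <= 3         # TRUE FALSE TRUE
```

The gap matrix has one cell per sample/storm pair, so this suits small inputs like the ones in the question; for very long date vectors the %in% approach uses less memory.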
