Suppose we have two data frames:
DF1
structure(list(ID = c("A", "A", "A", "A", "B", "B", "C", "C",
"C", "C"), Date.X = structure(c(1452643200, 1452643200, 1452643200,
1560556800, 1491177600, 1537142400, 1524700800, 1524787200, 1452297600,
1417132800), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Score.X = c(40, 50, 10, 45, 60, 100, 90, 400, 376, 276)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
DF2
structure(list(ID = c("A", "A", "A", "B", "B", "B", "B", "C",
"C"), Date.Y = structure(c(1433462400, 1474588800, 1511740800,
1431475200, 1442188800, 1537747200, 1577404800, 1451606400, 1493942400
), class = c("POSIXct", "POSIXt"), tzone = "UTC"), Score.Y = c(35,
39, 130, 240, 126, 100, 95, 327, 257)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -9L))
I want to bring the scores from DF2 into DF1 by matching rows on the same ID and the nearest date.
Expected result:
structure(list(ID = c("A", "A", "A", "A", "B", "B", "C", "C",
"C", "C"), Date.X = structure(c(1452643200, 1452643200, 1452643200,
1560556800, 1491177600, 1537142400, 1524700800, 1524787200, 1452297600,
1417132800), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
Score.X = c(40, 50, 10, 45, 60, 100, 90, 400, 376, 276),
Score.Y = c(35, 35, 35, 130, 100, 100, 257, 257, 327, 327
), Date.Y = structure(c(1433462400, 1433462400, 1433462400,
1511740800, 1537747200, 1537747200, 1493942400, 1493942400,
1451606400, 1451606400), class = c("POSIXct", "POSIXt"), tzone = "UTC")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
I tried to code this myself in R with dplyr, but did not get the expected result.
library(dplyr)
result <- df1 %>%
  left_join(df2, by = "ID") %>%
  group_by(ID) %>%
  mutate(diff = abs(as.numeric(difftime(Date.X, Date.Y, units = "days")))) %>%
  slice_min(diff) %>%
  select(-diff)
1 Answer
Maybe you can do it with this, but I am not really sure what you are trying to achieve here.
Output:
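For reference, here is a minimal sketch of one way to get the expected result from the question with dplyr. It is not necessarily the code this answer had in mind. The idea is to tag each DF1 row with an index before the join, so that the nearest DF2 date is picked once per DF1 row rather than once per ID (which is what grouping by ID does in the question's attempt). The names utc, row_id, and gap_days are introduced here purely for illustration, and the data frames are rebuilt compactly from the values in the question.

```r
library(dplyr)

# Illustrative helper (not from the question): epoch seconds -> POSIXct in UTC.
utc <- function(x) as.POSIXct(x, origin = "1970-01-01", tz = "UTC")

# Same values as DF1 and DF2 in the question.
DF1 <- data.frame(
  ID = c("A", "A", "A", "A", "B", "B", "C", "C", "C", "C"),
  Date.X = utc(c(1452643200, 1452643200, 1452643200, 1560556800, 1491177600,
                 1537142400, 1524700800, 1524787200, 1452297600, 1417132800)),
  Score.X = c(40, 50, 10, 45, 60, 100, 90, 400, 376, 276),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  ID = c("A", "A", "A", "B", "B", "B", "B", "C", "C"),
  Date.Y = utc(c(1433462400, 1474588800, 1511740800, 1431475200, 1442188800,
                 1537747200, 1577404800, 1451606400, 1493942400)),
  Score.Y = c(35, 39, 130, 240, 126, 100, 95, 327, 257),
  stringsAsFactors = FALSE
)

result <- DF1 %>%
  # Remember which DF1 row each joined row came from.
  mutate(row_id = row_number()) %>%
  # Pair every DF1 row with all DF2 rows that share its ID.
  # dplyr >= 1.1.0 may warn about a many-to-many join here; that is intended.
  left_join(DF2, by = "ID") %>%
  # Absolute gap between the two dates, in days.
  mutate(gap_days = abs(as.numeric(difftime(Date.X, Date.Y, units = "days")))) %>%
  # Keep the single closest DF2 row per DF1 row; on ties, the first one wins.
  group_by(row_id) %>%
  slice_min(gap_days, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(ID, Date.X, Score.X, Score.Y, Date.Y)

print(result)
```

This builds every same-ID pair before filtering, which is fine for small tables like these. For large inputs a rolling join (for example data.table's roll = "nearest") may be a better fit, since it avoids materializing all pairs.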