Replace NAs in a column with the closest non-NA, in terms of a date column, subject to a condition, in R [closed]

Asked by ttp71kqs on 2023-10-13

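I have a data frame similar to the one below (my actual one is larger and grouped). I would like to know how to impute the NAs of an integer variable with the closest non-NA value in terms of date, within 30 days of the observation, either before or after. When there is a tie, I want to choose the earlier date rather than the later one. I found this, but it does not account for consecutive NAs.
Any help would be greatly appreciated!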

df <- data.frame(
  dates = c("2023-09-01", "2023-09-02", "2023-09-05", "2023-09-06", "2023-09-10",
            "2023-09-11", "2023-09-14", "2023-09-16", "2023-09-20", "2023-09-27", "2023-09-28"),
  x = c(10, NA, 20, NA, NA, 30, NA, NA, NA, 40, NA)
)

# desired output for the x column

x = c(10, 10, 20, 20, 30, 30, 30, 30, 40, 40, 40)

Answer 1, by mnemlml8

Here is a tidyverse solution:

library(tidyverse)

df <- df %>% mutate(dates = as.Date(dates))  # convert the character dates from the question's sample

# Dates on which x is observed (non-NA)
ddd <- df %>% filter(!is.na(x)) %>% pull(dates)

# Signed offset in days from d to the nearest date that has a value:
# negative = before d, positive = after d, 0 = d itself or nothing within 30 days
useDatediff <- function(d) {
  gaps <- as.numeric(ddd - d)           # signed distance in days to every observed date
  aftr <- min(gaps[gaps >= 0], Inf)     # Inf when no observed date on or after d
  bfr  <- min(-gaps[gaps <= 0], Inf)    # Inf when no observed date on or before d
  if (aftr > 30 && bfr > 30) return(0)  # nothing close enough: x stays NA after the join
  if (bfr <= aftr) -bfr else aftr       # on a tie the earlier date wins
}

df %>%
  rowwise() %>%
  mutate(useDate = dates + useDatediff(dates)) %>%  # date to borrow the value from
  left_join(df, by = c("useDate" = "dates")) %>%
  transmute(dates,
            x = coalesce(x.x, x.y))                 # keep observed x, else the borrowed one

Result:

# A tibble: 11 × 2
# Rowwise: 
   dates          x
   <date>     <dbl>
 1 2023-09-01    10
 2 2023-09-02    10
 3 2023-09-05    20
 4 2023-09-06    20
 5 2023-09-10    30
 6 2023-09-11    30
 7 2023-09-14    30
 8 2023-09-16    30
 9 2023-09-20    40
10 2023-09-27    40
11 2023-09-28    40
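
The question mentions that the real data is grouped, which the pipeline above does not cover. Below is a minimal base R sketch of the same nearest-date rule that can also be applied per group. The helper name `impute_nearest`, its `max_gap` argument, and the grouping column `g` are hypothetical and not part of the original answer. It assumes that a gap of exactly 30 days still counts as close enough, matching the `> 30` check above.

```r
# Hypothetical helper (not from the original answer): fill each NA in `x` with
# the value observed on the nearest date, looking at most `max_gap` days away.
impute_nearest <- function(dates, x, max_gap = 30) {
  dates <- as.Date(dates)
  obs <- which(!is.na(x))                 # positions that hold an observed value
  if (length(obs) == 0) return(x)         # nothing to borrow from
  obs <- obs[order(dates[obs])]           # sort donors by date so ties resolve to the earlier one
  for (i in which(is.na(x))) {
    gap <- abs(as.numeric(dates[obs] - dates[i]))  # distance in days to every donor
    j <- which.min(gap)                            # first minimum = earliest donor date
    if (gap[j] <= max_gap) x[i] <- x[obs[j]]       # assumption: exactly max_gap days is allowed
  }
  x
}

df <- data.frame(
  dates = c("2023-09-01", "2023-09-02", "2023-09-05", "2023-09-06", "2023-09-10",
            "2023-09-11", "2023-09-14", "2023-09-16", "2023-09-20", "2023-09-27", "2023-09-28"),
  x = c(10, NA, 20, NA, NA, 30, NA, NA, NA, 40, NA)
)

# Ungrouped: reproduces the desired output from the question
df$x_filled <- impute_nearest(df$dates, df$x)

# Grouped: `g` is a made-up grouping column; donors are only searched within each group
df$g <- rep(c("a", "b"), c(6, 5))
df$x_by_group <- unsplit(
  Map(impute_nearest, split(df$dates, df$g), split(df$x, df$g)),
  df$g
)
```

The loop compares each missing row against every observed date, which is simple but quadratic in the worst case. For very large groups a rolling or non-equi join would likely scale better.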
