Bind or merge rows by duplicate dates in R

Posted by 6rvt4ljy on 2023-03-05 in Other

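I have a 6-column data frame like the one below. As you can see, the dates are repeated. How can I merge the rows for each date while keeping the information in the other columns?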

    date         1   2        3   4      5
    2019-01-01   NA  1966439  NA  NA     NA
    2019-01-01   NA  NA       NA  133.6  NA
    2019-01-01   NA  NA       NA  NA     6.2
    2019-02-01   NA  1962946  NA  NA     NA
    2019-02-01   NA  NA       NA  134.5  NA
    2019-02-01   NA  NA       NA  NA     6.1
    2019-03-01   NA  1974072  NA  NA     NA
    2019-03-01   NA  NA       NA  135.4  NA
    2019-03-01   NA  NA       NA  NA     6.3
    2019-04-01   NA  1984086  NA  NA     NA

I would like something like this, with no duplicated dates.

    date         1   2        3   4      5
    2019-01-01   NA  1966439  NA  133.6  6.2
    2019-02-01   NA  1962946  NA  134.5  6.1
    2019-03-01   NA  1974072  NA  135.4  6.3
    2019-04-01   NA  1984086  NA  NA     NA

Thanks a lot.

bvjveswy #1

I have pasted a solution below. Hopefully the comments explain it well enough.

    # Packages used
    library(dplyr)

    # Some reproducible data
    dta <- data.frame(
      date = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
      a = c(NA, NA, NA, NA, NA, NA, NA, NA, NA),
      x = c(123, NA, NA, 3456, NA, NA, 2345, NA, NA),
      y = c(NA, 123, NA, NA, 3456, NA, NA, 2345, NA),
      z = c(NA, NA, 123, NA, NA, 3456, NA, NA, 2345)
    )

    dta <- dta |>
      group_by(date) |> # Group by the dates
      summarise(
        # Each group holds a single non-missing value, so sum() just returns it
        # (min(), mean(), etc. work just as well)
        a = sum(a, na.rm = TRUE),
        x = sum(x, na.rm = TRUE),
        y = sum(y, na.rm = TRUE),
        z = sum(z, na.rm = TRUE)
      ) |>
      # Remove columns with a sum of 0 (columns that were all NA)
      select(where(~ sum(.x) > 0))
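
One thing to watch with `sum(..., na.rm = TRUE)`: a group whose values are all `NA` collapses to 0 rather than `NA`, so a missing value can no longer be told apart from a real zero. Below is a minimal sketch of a variant that keeps `NA` in that case, assuming dplyr 1.0 or later. The helper `first_non_na` and the shortened data frame are made up for illustration and are not part of the answer above.

```r
library(dplyr)

# Hypothetical helper: return the first non-missing value of x,
# or NA when every value in x is missing.
first_non_na <- function(x) {
  non_missing <- x[!is.na(x)]
  if (length(non_missing) == 0) x[1] else non_missing[1]
}

# Shortened illustrative data: y has no value at all for date 2
dta <- data.frame(
  date = c(1, 1, 1, 2, 2, 2),
  a    = c(NA, NA, NA, NA, NA, NA),
  x    = c(123, NA, NA, 3456, NA, NA),
  y    = c(NA, 123, NA, NA, NA, NA)
)

result <- dta |>
  group_by(date) |>
  # across(everything()) covers every non-grouping column
  summarise(across(everything(), first_non_na))

print(result)
```

With this version the all-`NA` column `a` is kept as `NA`, and `y` for date 2 stays `NA` instead of becoming 0. Taking the first non-missing value silently drops any further values in the same group, so it suits data with at most one value per date and column.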

ws51t4hk #2

If each column has at most one non-missing value per date, the following code works:

    library(tidyverse)

    df <- tibble::tribble(
      ~date, ~col1, ~col2, ~col3, ~col4, ~col5,
      "2019-01-01", NA, 1966439, NA, NA, NA,
      "2019-01-01", NA, NA, NA, 133.6, NA,
      "2019-01-01", NA, NA, NA, NA, 6.2,
      "2019-02-01", NA, 1962946, NA, NA, NA,
      "2019-02-01", NA, NA, NA, 134.5, NA,
      "2019-02-01", NA, NA, NA, NA, 6.1,
      "2019-03-01", NA, 1974072, NA, NA, NA,
      "2019-03-01", NA, NA, NA, 135.4, NA,
      "2019-03-01", NA, NA, NA, NA, 6.3,
      "2019-04-01", NA, 1984086, NA, NA, NA
    )

    # Return NA if every value is missing, otherwise drop the missing values
    remove_na <- function(x) {
      if (all(is.na(x))) return(NA)
      discard(x, is.na)
    }

    df |>
      group_by(date) |>
      summarize(across(starts_with("col"), remove_na))
    #> # A tibble: 4 × 6
    #>   date       col1     col2 col3   col4  col5
    #>   <chr>      <lgl>   <dbl> <lgl> <dbl> <dbl>
    #> 1 2019-01-01 NA    1966439 NA     134.   6.2
    #> 2 2019-02-01 NA    1962946 NA     134.   6.1
    #> 3 2019-03-01 NA    1974072 NA     135.   6.3
    #> 4 2019-04-01 NA    1984086 NA      NA   NA

Created on 2023-03-03 with reprex v2.0.2
Please include some code that generates the dataset (as I did here) the next time you post a question!
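
The condition this answer relies on (at most one non-missing value per date in each column) can be checked before collapsing the rows. Here is a minimal sketch, assuming the `col*` naming used above and dplyr 1.0.4 or later for `if_any()`. The small `df` is a cut-down stand-in with one deliberately conflicting row, and `non_missing_counts` and `problem_dates` are illustrative names only.

```r
library(dplyr)

df <- tibble::tribble(
  ~date,        ~col1, ~col2,   ~col3,
  "2019-01-01", NA,    1966439, NA,
  "2019-01-01", NA,    NA,      133.6,
  "2019-02-01", NA,    1962946, NA,
  "2019-02-01", NA,    1962950, NA   # deliberate second col2 value for this date
)

# Count the non-missing values per date for every col* column
non_missing_counts <- df |>
  group_by(date) |>
  summarise(across(starts_with("col"), ~ sum(!is.na(.x))))

# Dates where any column holds more than one non-missing value
problem_dates <- non_missing_counts |>
  filter(if_any(starts_with("col"), ~ .x > 1)) |>
  pull(date)

print(problem_dates)
```

In this stand-in data only "2019-02-01" is flagged, because `col2` has two values for that date. An empty `problem_dates` would mean the condition holds and each date collapses to a single row.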
