R: Merge a list of identical data frames that have missing values in different places

Posted by dgtucam1 on 2023-05-04 in Other

Starting with a list of data frames that are missing different values but are otherwise identical:

df1 <- df2 <- df3 <- df4 <- head(iris)

df1[4,2] <- NA
df2[3,5] <- NA
df3[2,]  <- NA
df4[2,2] <- NA

lst <- list(df1, df2, df3, df4)

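What is the best (computationally fastest) way to merge these data frames so that any missing value is filled in whenever the information is available in at least one of the data frames?
Note: preferably using base R only.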

r

Source: https://stackoverflow.com/questions/76146523/merge-a-list-of-identical-data-frames-that-have-missing-values-in-different-plac

1 answer

k4emjkb11#

I would Reduce them, using is.na to find the missing cells and [<- to replace them with the values from the previous data frame.

done <- Reduce(function(prev, this) {
  isna <- is.na(this)
  if (any(isna)) this[isna] <- prev[isna]
  this
}, lst)
done
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa

FYI, in this example all values are already resolved by the second iteration. We can see the result of each "reduction" by adding accumulate = TRUE:

Reduce(function(prev, this) { isna <- is.na(this); if (any(isna)) this[isna] <- prev[isna]; this; }, lst, accumulate = TRUE)
# [[1]]
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6          NA          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
# [[2]]
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
# [[3]]
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5            5         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
# [[4]]
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
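
To see at which step the NAs are gone without reading every printed frame, one option is to test each accumulated result. This is only a minimal sketch: `fill_na`, `steps`, and `has_na` are names introduced here for illustration, and the function body is the same replacement logic used in the Reduce call above.

```r
# Rebuild the example list from the question
df1 <- df2 <- df3 <- df4 <- head(iris)
df1[4, 2] <- NA
df2[3, 5] <- NA
df3[2, ]  <- NA
df4[2, 2] <- NA
lst <- list(df1, df2, df3, df4)

# Same replacement logic as above, given a (hypothetical) name for reuse
fill_na <- function(prev, this) {
  isna <- is.na(this)
  if (any(isna)) this[isna] <- prev[isna]
  this
}

steps <- Reduce(fill_na, lst, accumulate = TRUE)

# TRUE where an accumulated result still contains at least one NA
has_na <- vapply(steps, function(d) any(is.na(d)), logical(1))
has_na
# [1]  TRUE FALSE FALSE FALSE
```

The first element is just `lst[[1]]` untouched, so it still has its NA; every later element is complete.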

If you want to "escape early" once no further imputation is needed, I believe a custom for loop is required.

for (i in seq_along(lst)[-1]) {
  message(i)
  isna <- is.na(lst[[1]])
  if (!any(isna)) { message("out!"); break }
  lst[[1]][isna] <- lst[[i]][isna]
}
# 2
# 3
# out!
lst[[1]]
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9           3          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
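
One thing worth checking in the printed results above: values such as `5` (rather than `5.0`) and `3` (rather than `3.0`) suggest that some numeric columns were converted to character along the way. That would be consistent with how matrix-style indexing of a data frame works: extracting with a logical matrix goes through `as.matrix()`, and a frame with mixed column types yields a character matrix. If column types matter, filling column by column avoids the conversion. A minimal sketch, assuming all frames share the same columns in the same order; `fill_cols` is a hypothetical helper name, not part of the original answer:

```r
# Rebuild the example list from the question
df1 <- df2 <- df3 <- df4 <- head(iris)
df1[4, 2] <- NA
df2[3, 5] <- NA
df3[2, ]  <- NA
df4[2, 2] <- NA
lst <- list(df1, df2, df3, df4)

# Fill NAs in `this` from `prev`, one column at a time,
# so that each column keeps its own type (numeric, factor, ...)
fill_cols <- function(prev, this) {
  this[] <- Map(function(old, cur) {
    na <- is.na(cur)
    cur[na] <- old[na]
    cur
  }, prev, this)
  this
}

done <- Reduce(fill_cols, lst)

# Inspect the column classes to confirm nothing was coerced
sapply(done, class)
```

The same column-wise replacement could be dropped into the for loop if the early exit is also wanted.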
