Using dplyr to arrange/delete data in R

Posted by 6pp0gazn on 2022-12-27 in Other

._883<-quantmod::getSymbols("0883.HK",from="2022-04-21",to="2022-12-22",auto.assign = FALSE)
._600938<-quantmod::getSymbols("600938.SS",from="2022-04-21",to="2022-12-22",auto.assign = FALSE)

# Keep only column 6 of each series (the adjusted close in quantmod's Yahoo output)
._883<-._883[,6]
._600938<-._600938[,6]
spread<-data.frame(._883,._600938)

> length(._883)
[1] 169
> length(._600938)
[1] 165

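Both series are indexed by date. I would like to use the dplyr library here: I want to find the dates that are missing from ._600938 and then drop those rows entirely, including from ._883, so that both series are aligned on the same dates. How can I do this with dplyr? Many thanks for your help.
Hi all, instead of using the dplyr library, I tried the following approach: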

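# cbind() on xts objects merges by date, leaving NA where one series has no value
x<-cbind(._883,._600938)
# na.omit() returns a new object, so assign the result back to x
x<-na.omit(x)
x$spread<-x[,2]-x[,1]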
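
The approach above can be tried without downloading anything. The following is a minimal sketch on toy data, assuming the xts package is installed; the dates, prices, and the column names hk and ss are invented for illustration and are not the real quotes.

```r
library(xts)

# Toy stand-ins for the two adjusted-close series (all values are made up).
dates_a <- as.Date("2022-04-21") + 0:4
dates_b <- dates_a[c(1, 2, 4)]            # the second series lacks two dates

a <- xts(cbind(hk = c(10.0, 10.2, 10.1, 10.4, 10.3)), order.by = dates_a)
b <- xts(cbind(ss = c(15.0, 15.3, 15.1)), order.by = dates_b)

# merge() on xts objects is an outer join by date: missing dates become NA.
x <- merge(a, b)

# na.omit() returns a new object, so the result must be assigned.
x <- na.omit(x)

# One-step alternative: keep only the dates present in both series.
x_inner <- merge(a, b, join = "inner")

# Spread between the two aligned series.
x$spread <- x[, "ss"] - x[, "hk"]
```

With real data, a and b would be the two series returned by getSymbols(); the inner join avoids creating the NA rows in the first place.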

Source: https://stackoverflow.com/questions/74892622/using-dplyr-to-arrange-delete-data-in-r

1 answer

qq24tv8q1#

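To find the rows where ._600938 is missing and drop them entirely (including the ._883 value) with dplyr, you can use filter() to select only the rows whose ._600938 value is NA, then select(-1) to drop the ._883 column and keep only ._600938. anti_join() then removes the rows with missing data from spread; it matches on the shared ._600938 column, and dplyr joins treat NA as equal to NA by default. This assumes spread already holds both series in date-aligned rows, with NA where ._600938 has no value, and that its columns are named ._883 and ._600938: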

library(dplyr)

# Select rows with missing values in the ._600938 column and exclude the ._883 column
missing_data <- spread %>% filter(is.na(._600938)) %>% select(-1)

# Remove rows with missing data from spread
spread_clean <- spread %>% anti_join(missing_data)
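
As a check on the idea, here is a minimal sketch on a toy data frame. The column names date, hk, and ss and all values are invented, and the anti-join is keyed on an explicit date column rather than on the price column, which is a variation on the snippet above. It also shows that a single filter() on the non-missing rows gives the same result.

```r
library(dplyr)

# Toy data frame standing in for `spread` (names and prices are invented).
spread_toy <- data.frame(
  date = as.Date("2022-04-21") + 0:4,
  hk   = c(10.0, 10.2, 10.1, 10.4, 10.3),
  ss   = c(15.0, 15.3, NA, 15.1, NA)
)

# Two-step route: collect the dates with a missing value, then anti-join them away.
missing_toy <- spread_toy %>% filter(is.na(ss)) %>% select(date)
clean_anti  <- spread_toy %>% anti_join(missing_toy, by = "date")

# Shorter equivalent: keep only the rows where the value is present.
clean_filter <- spread_toy %>% filter(!is.na(ss))
```

Both results keep the same three dates, so the filter() form is usually the simpler choice when the goal is only to drop incomplete rows.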

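Alternatively, the complete() function from the tidyr library turns implicit missing values into explicit rows, filled with NA by default or with a value you supply, such as 0: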

library(tidyr)

# Fill in missing values with NA
spread_complete <- spread %>% complete(._883, ._600938)

# Fill in missing values with 0
spread_complete <- spread %>% complete(._883, ._600938, fill = list(._600938 = 0))
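
To see what complete() does in isolation, the sketch below uses a small long-format table. The layout (one row per date and ticker) and every value in it are assumptions made for illustration, not the shape of the spread data frame above.

```r
library(dplyr)
library(tidyr)

# Toy long-format prices; the 2022-04-22 / "ss" combination is absent.
prices <- data.frame(
  date   = as.Date(c("2022-04-21", "2022-04-21", "2022-04-22")),
  ticker = c("hk", "ss", "hk"),
  close  = c(10.0, 15.0, 10.2)
)

# complete() adds one row per missing date/ticker combination; close is NA there.
with_na <- prices %>% complete(date, ticker)

# `fill` supplies the value to use instead of NA in the listed columns.
with_zero <- prices %>% complete(date, ticker, fill = list(close = 0))
```

The row count grows from three to four here, which is the row-adding behaviour to keep in mind before applying complete() to two price columns.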

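Note that complete() adds a row to the data frame for every combination of ._883 and ._600938 values that is not already present. If you do not want rows added to the data frame, you can use the full_join() function from the dplyr library instead: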

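library(dplyr)

# missing_data no longer has the ._883 column (select(-1) dropped it), so join on ._600938 only.
# distinct() collapses its repeated NA rows so the join does not duplicate rows of spread.

# Join spread with missing data, leaving the gaps as NA
spread_complete <- spread %>% full_join(distinct(missing_data), by = "._600938")

# Join spread with missing data, replacing the gaps with 0
spread_complete <- spread %>% full_join(distinct(missing_data), by = "._600938") %>%
  mutate(._600938 = ifelse(is.na(._600938), 0, ._600938))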
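
If each series is first turned into a data frame with its own date column, the joins can be keyed on the date directly. This is a sketch of that alternative rather than the method above; the data frames hk and ss, their column names, and all values are invented for illustration.

```r
library(dplyr)

# Toy per-series data frames with an explicit date column (values are made up).
hk <- data.frame(date = as.Date("2022-04-21") + 0:4,
                 hk   = c(10.0, 10.2, 10.1, 10.4, 10.3))
ss <- data.frame(date = as.Date("2022-04-21") + c(0, 1, 3),
                 ss   = c(15.0, 15.3, 15.1))

# full_join() keeps every date from either side, so the gaps show up as NA.
all_dates <- full_join(hk, ss, by = "date")

# inner_join() keeps only the dates both series share, then the spread is computed.
aligned <- inner_join(hk, ss, by = "date") %>%
  mutate(spread = ss - hk)
```

For xts input, one possible conversion is data.frame(date = index(x), coredata(x)), using index() and coredata() from the zoo package.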

Hope this helps!

Posted 2022-12-27
