Error when using rbind in R: the same row is added multiple times

Posted by a2mppw5e on 2023-05-26 in Other

I have a data frame in R:

dput(trans_eqtl[1:5,1:6])
structure(list(Gene = c("ENSG00000166086", "ENSG00000166086", 
"ENSG00000265688", "ENSG00000265688", "ENSG00000265688"), `Gene-Chr` = c(11, 
11, 17, 17, 17), `Gene-Pos` = c(133980358, 133980358, 79887167, 
79887167, 79887167), RsId = c("rs3811444", "rs35340377", "rs705705", 
"rs1131017", "rs10876864"), `SNP-Chr` = c(1, 1, 12, 12, 12), 
    `SNP-Pos` = c(248039451, 248038210, 56435504, 56435929, 56401085
    )), row.names = c(NA, 5L), class = "data.frame")

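I want to check whether `Gene-Chr` and `SNP-Chr` are on the same chromosome or on different chromosomes. I wrote code that loops over every row, checks both conditions, and uses rbind inside the loop to append the matching rows to separate intra and inter variables. In my case, however, rbind keeps appending the same row to the final variable. Here is the code for reference: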

##check for intra and inter chrom:
##intra:
inter <- NULL
intra <- NULL
for(i in 1:240){
  if(!(trans_eqtl$`Gene-Chr`[i] %in% trans_eqtl$`SNP-Chr`[i])){
    x <- which(!(trans_eqtl$`Gene-Chr`[i] %in% trans_eqtl$`SNP-Chr`[i]==TRUE))
    value <- trans_eqtl[x,]
  }
  inter <- rbind(inter,value)
}

##check for intra:
for(i in 1:240){
  if(trans_eqtl$`Gene-Chr`[i] %in% trans_eqtl$`SNP-Chr`[i]){
    y <- which((trans_eqtl$`Gene-Chr`[i] %in% trans_eqtl$`SNP-Chr`[i]==TRUE))
    value1 <- trans_eqtl[y,]
  }
  intra <- rbind(intra,value1)
}

My output variables look like this:

dput(intra[1:5,1:6])
structure(list(Gene = c("ENSG00000166086", "ENSG00000166086", 
"ENSG00000166086", "ENSG00000166086", "ENSG00000166086"), `Gene-Chr` = c(11, 
11, 11, 11, 11), `Gene-Pos` = c(133980358, 133980358, 133980358, 
133980358, 133980358), RsId = c("rs3811444", "rs3811444", "rs3811444", 
"rs3811444", "rs3811444"), `SNP-Chr` = c(1, 1, 1, 1, 1), `SNP-Pos` = c(248039451, 
248039451, 248039451, 248039451, 248039451)), row.names = c(NA, 
5L), class = "data.frame")

dput(inter[1:5,1:6])
structure(list(Gene = c("ENSG00000166086", "ENSG00000166086", 
"ENSG00000166086", "ENSG00000166086", "ENSG00000166086"), `Gene-Chr` = c(11, 
11, 11, 11, 11), `Gene-Pos` = c(133980358, 133980358, 133980358, 
133980358, 133980358), RsId = c("rs3811444", "rs3811444", "rs3811444", 
"rs3811444", "rs3811444"), `SNP-Chr` = c(1, 1, 1, 1, 1), `SNP-Pos` = c(248039451, 
248039451, 248039451, 248039451, 248039451)), row.names = c(NA, 
5L), class = "data.frame")

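As far as I understand, rbind should sit inside the loop so that it captures every value of i that matches the condition and keeps appending to the final variable. Does anyone know how to fix this? Thank you.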

Answer 1, by xytpbqjk:

You can try a dplyr approach:

library(dplyr)

inter <- trans_eqtl %>%
  filter(`Gene-Chr` != `SNP-Chr`)

intra <- trans_eqtl %>%
  filter(`Gene-Chr` == `SNP-Chr`)

Output:

#inter: 
             Gene Gene-Chr  Gene-Pos       RsId SNP-Chr   SNP-Pos
1 ENSG00000166086       11 133980358  rs3811444       1 248039451
2 ENSG00000166086       11 133980358 rs35340377       1 248038210
3 ENSG00000265688       17  79887167   rs705705      12  56435504
4 ENSG00000265688       17  79887167  rs1131017      12  56435929
5 ENSG00000265688       17  79887167 rs10876864      12  56401085

# intra:

[1] Gene     Gene-Chr Gene-Pos RsId     SNP-Chr  SNP-Pos 
<0 rows> (or 0-length row.names)
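
As for why the original loop kept appending the same row, a likely explanation is that `Gene-Chr[i] %in% SNP-Chr[i]` compares one element with one element, so it returns a single logical value, and `which()` applied to a length-one `TRUE` is always 1. The loop therefore picks row 1 on every pass, and because `rbind()` sits outside the `if`, it appends on every iteration whether or not the condition held. The sketch below reproduces this on the sample rows from the question and then shows one way to repair the loop; the names `same_chr` and `row_i` are illustrative, and the repaired loop assumes neither column contains `NA`.

```r
# Sample rows taken from the question's dput() output (position columns omitted)
trans_eqtl <- data.frame(
  Gene = c("ENSG00000166086", "ENSG00000166086", "ENSG00000265688",
           "ENSG00000265688", "ENSG00000265688"),
  `Gene-Chr` = c(11, 11, 17, 17, 17),
  RsId = c("rs3811444", "rs35340377", "rs705705", "rs1131017", "rs10876864"),
  `SNP-Chr` = c(1, 1, 12, 12, 12),
  check.names = FALSE
)

# What the original loop computes for a single i: one logical value,
# so which() can only ever return 1 (or nothing), never the row number i
i <- 3
same_chr <- trans_eqtl$`Gene-Chr`[i] %in% trans_eqtl$`SNP-Chr`[i]
x <- which(!(same_chr == TRUE))
print(x)  # index of the first element of a length-one vector, not row i

# Repaired loop: take row i itself, and call rbind() inside the branch
# so a row is appended only when its own condition holds
inter <- NULL
intra <- NULL
for (i in seq_len(nrow(trans_eqtl))) {
  row_i <- trans_eqtl[i, ]
  if (row_i$`Gene-Chr` == row_i$`SNP-Chr`) {
    intra <- rbind(intra, row_i)
  } else {
    inter <- rbind(inter, row_i)
  }
}
```

Growing a data frame with `rbind()` in a loop copies it on each pass, so for larger tables the vectorised filter above is usually the better choice.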

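If adding a dplyr dependency is not desirable, the same split can be sketched in base R with a logical vector computed once for the whole table. This is a minimal sketch on the sample rows from the question; `same_chr` is an illustrative name, and `which()` is used so that rows with a missing chromosome would be dropped from both results, as `filter()` does.

```r
# Sample rows taken from the question's dput() output
trans_eqtl <- data.frame(
  Gene = c("ENSG00000166086", "ENSG00000166086", "ENSG00000265688",
           "ENSG00000265688", "ENSG00000265688"),
  `Gene-Chr` = c(11, 11, 17, 17, 17),
  `Gene-Pos` = c(133980358, 133980358, 79887167, 79887167, 79887167),
  RsId = c("rs3811444", "rs35340377", "rs705705", "rs1131017", "rs10876864"),
  `SNP-Chr` = c(1, 1, 12, 12, 12),
  `SNP-Pos` = c(248039451, 248038210, 56435504, 56435929, 56401085),
  check.names = FALSE
)

# Element-wise comparison over all rows at once: one TRUE/FALSE per row
same_chr <- trans_eqtl$`Gene-Chr` == trans_eqtl$`SNP-Chr`

# Rows where gene and SNP share a chromosome, and rows where they do not
intra <- trans_eqtl[which(same_chr), ]
inter <- trans_eqtl[which(!same_chr), ]

print(nrow(intra))
print(nrow(inter))
```
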