Combining matrices by row/column names in R

Posted by x6h2sr28 on 2023-05-11 in Other

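I have multiple matrices that represent bipartite/affiliation networks at different points in time. The actors in these matrices overlap a lot, but there are also many differences. For further analysis I need all of them to have the same dimensions and the same actors in every row/column, so I need some way to merge the row and column names.
The final matrix will be roughly 8000 by 200, while each individual matrix is about 2000 by 150. Below is an example with two matrices and the result I would like: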

adj1 <- matrix(0, 3, 5)
colnames(adj1) <- c("g1", "g2", "g3", "g5", "g6")
rownames(adj1) <- c("Tim", "John", "Sarah")

adj2 <- matrix(0, 4, 2)
colnames(adj2) <- c("g1", "g4")
rownames(adj2) <- c("Tim", "Mary", "John", "Paolo")

combined_adj <- matrix(0,5,6)
colnames(combined_adj) <- c("g1","g2","g3","g4","g5","g6")
rownames(combined_adj) <- c("John","Mary","Paolo","Sarah","Tim")

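Ideally, the new cells should read "NA" or "10", and the rows and columns should be in alphabetical order. The initial values from each matrix need to be preserved. I don't know how to approach this, so any help is appreciated!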

Answer 1 (efzxgjgh)

You can use merge, joining on the row names (via 'row.names') together with the columns the two matrices share.

combined_adj <- merge(x = adj1,
      y = adj2,
      by = c('row.names', 
             intersect(colnames(adj1), 
                       colnames(adj2))
             ), 
      all = TRUE
)
combined_adj
  Row.names g1 g2 g3 g5 g6 g4
1      John  0  0  0  0  0  0
2      Mary  0 NA NA NA NA  0
3     Paolo  0 NA NA NA NA  0
4     Sarah  0  0  0  0  0 NA
5       Tim  0  0  0  0  0  0

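This converts the result to a data.frame, so convert it back to a matrix if you need one.

# move the merged key column into the row names, drop it, then rebuild a sorted matrix
row.names(combined_adj) <- combined_adj[, 1]
combined_adj <- combined_adj[, -1]
combined_adj <- as.matrix(combined_adj[, sort(colnames(combined_adj))])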

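Edit: merging multiple matrices

We use Reduce to apply the merge across all matrices. We first convert each matrix to a data.frame and add a row_names column to simplify the merge.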

# create sample data
adj1 <- matrix(
  0, 3, 5,
  dimnames = list(c("Tim", "John", "Sarah"), 
                  c("g1", "g2", "g3", "g5", "g6"))
)

adj2 <- matrix(
  0, 4, 2, 
  dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                  c("g1", "g4"))
)

adj3 <- matrix(
  0, 3, 3, 
  dimnames = list(c("Tim2", "Mary2", "John"), c("g1", "g4", 'g7'))
)

# create a list 
list_matrices <- list(adj1, adj2, adj3)

# convert to dataframes and create a column with row.names
list_matrices <- lapply(list_matrices, function(mat){
  mat <- as.data.frame(mat)
  mat$row_names <- row.names(mat)
  mat
})

# successively combine them, merge 1..2 and then merge result with 3 and so on
res <- Reduce(function(mat1, mat2) merge(mat1, mat2, all = TRUE), x = list_matrices)

res
  g1 row_names g4 g2 g3 g5 g6 g7
1  0      John  0  0  0  0  0  0
2  0      Mary  0 NA NA NA NA NA
3  0     Mary2  0 NA NA NA NA  0
4  0     Paolo  0 NA NA NA NA NA
5  0     Sarah NA  0  0  0  0 NA
6  0       Tim  0  0  0  0  0 NA
7  0      Tim2  0 NA NA NA NA  0
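
To get from `res` back to the kind of matrix the question asks for, one possible follow-up is sketched below. It is not part of the original answer: the name `combined` and the choice to fill new cells with 10 (one of the two fill values the question mentions) are assumptions made for illustration.

```r
# Rebuild the sample data and the merged data.frame from the answer above
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))
adj3 <- matrix(0, 3, 3, dimnames = list(c("Tim2", "Mary2", "John"),
                                        c("g1", "g4", "g7")))

dfs <- lapply(list(adj1, adj2, adj3), function(mat) {
  df <- as.data.frame(mat)
  df$row_names <- rownames(mat)
  df
})
res <- Reduce(function(a, b) merge(a, b, all = TRUE), dfs)

# Move the key column into the row names and drop it
rownames(res) <- res$row_names
res$row_names <- NULL

# Sort rows and columns alphabetically and convert to a numeric matrix
combined <- as.matrix(res[order(rownames(res)), sort(colnames(res))])

# Assumption: cells absent from every input get the value 10; skip this line to keep NA
combined[is.na(combined)] <- 10

combined
```

Because the merge uses every shared column as a key, this only gives one row per actor when overlapping cells hold the same value in all inputs.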

Answer 2 (k3bvogb1)

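This may be a solution. However, I assume that wherever a cell exists in more than one matrix, its value is always the same for the same combination of row name and column name. Apart from that, it relies on the tidyverse (dplyr, tidyr, and tibble).

library(tidyverse)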

list_adj <- list(
  adj1, adj2
)

df.adj <- NULL

for (adj in list_adj) {
  df.adj.temp <- adj %>% as_tibble(rownames = "row_names")
  
  if (is.null(df.adj)) {
    df.adj <- df.adj.temp
  } else {
    c.colnames.join.by <- c(intersect(colnames(df.adj), colnames(df.adj.temp)))
    
    df.adj <- df.adj %>% 
      full_join(df.adj.temp, by = c.colnames.join.by) %>% 
      mutate(across(.cols = - row_names, .fns = \(x) replace_na(x, 10)))
  }
}

df.adj %>% 
  arrange(row_names) %>% # ordering rows
  select(all_of(sort(colnames(df.adj)))) %>% # ordering columns
  column_to_rownames(var = "row_names") %>% 
  as.matrix()

# output
      g1 g2 g3 g4 g5 g6
John   0  0  0  0  0  0
Mary   0 10 10  0 10 10
Paolo  0 10 10  0 10 10
Sarah  0  0  0 10  0  0
Tim    0  0  0  0  0  0
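
The assumption stated in this answer can be made concrete with a small base R sketch. The matrices `a` and `b` and the helper `to_df` are made up for illustration. When a shared cell differs between the inputs and the shared columns are used as join keys, the actor ends up in two rows; joining on the row names alone keeps one row per actor but yields suffixed duplicate columns instead.

```r
# Hypothetical helper: matrix -> data.frame with an explicit key column
to_df <- function(mat) {
  df <- as.data.frame(mat)
  df$row_names <- rownames(mat)
  df
}

# Tim/g1 is 1 in `a` but 0 in `b`, which violates the "same value" assumption
a <- matrix(c(1, 0), 2, 1, dimnames = list(c("Tim", "John"), "g1"))
b <- matrix(c(0, 0), 2, 1, dimnames = list(c("Tim", "John"), "g1"))

# Joining on all shared columns (row_names and g1): Tim appears twice
dup <- merge(to_df(a), to_df(b), all = TRUE)
nrow(dup)        # 3 rows for 2 actors

# Joining on the name only: one row per actor, but g1 is split in two
by_name <- merge(to_df(a), to_df(b), by = "row_names", all = TRUE)
names(by_name)   # "row_names" "g1.x" "g1.y"
```

If the time points can disagree on a cell, it is probably safer to keep one matrix per time point and only align their dimensions.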

Answer 3 (w6lpcovy)

Here is a base R option using reshape

df <- unique(
    rbind(
        as.data.frame(as.table(adj1)),
        as.data.frame(as.table(adj2))
    )
)

reshape(
    df,
    direction = "wide",
    idvar = "Var1",
    timevar = "Var2"
)

which gives

    Var1 Freq.g1 Freq.g2 Freq.g3 Freq.g5 Freq.g6 Freq.g4
1    Tim       0       0       0       0       0       0
2   John       0       0       0       0       0       0
3  Sarah       0       0       0       0       0      NA
17  Mary       0      NA      NA      NA      NA       0
19 Paolo       0      NA      NA      NA      NA       0
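
The reshape result is still a data.frame with prefixed column names. A minimal sketch of turning it into the alphabetically ordered matrix from the question follows; the variable names `wide` and `mat_wide` are assumptions, not part of the original answer.

```r
adj1 <- matrix(0, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(0, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))

# Long format: one row per (Var1, Var2) cell, duplicates removed
df <- unique(rbind(as.data.frame(as.table(adj1)),
                   as.data.frame(as.table(adj2))))

wide <- reshape(df, direction = "wide", idvar = "Var1", timevar = "Var2")

# Drop the id column, restore actor names, and strip the "Freq." prefix
mat_wide <- as.matrix(wide[, -1])
rownames(mat_wide) <- as.character(wide$Var1)
colnames(mat_wide) <- sub("^Freq\\.", "", colnames(mat_wide))

# Alphabetical order for rows and columns
mat_wide <- mat_wide[order(rownames(mat_wide)), order(colnames(mat_wide))]
mat_wide
```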

Alternatively, we can use xtabs

mat <- `class<-`(xtabs(Freq ~ ., df) * NA, "matrix")
mat[as.matrix(df[-3])] <- df$Freq

which gives

> mat
       Var2
Var1    g1 g2 g3 g5 g6 g4
  Tim    0  0  0  0  0  0
  John   0  0  0  0  0  0
  Sarah  0  0  0  0  0 NA
  Mary   0 NA NA NA NA  0
  Paolo  0 NA NA NA NA  0
attr(,"call")
xtabs(formula = Freq ~ ., data = df)
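
The same idea of filling a target matrix by name can also be written without xtabs. The sketch below is not from any of the answers. `pad_to_union` is a hypothetical helper that expands every matrix to the union of all row and column names, so each time point keeps its own values while all of them share the same dimensions. The sample matrices hold 1 and 2 rather than 0 only to make the origin of each cell visible.

```r
# Hypothetical helper: pad each matrix in `mats` to the union of all
# row/column names, filling cells that did not exist with `fill`.
pad_to_union <- function(mats, fill = NA) {
  all_rows <- sort(unique(unlist(lapply(mats, rownames))))
  all_cols <- sort(unique(unlist(lapply(mats, colnames))))
  lapply(mats, function(m) {
    out <- matrix(fill, nrow = length(all_rows), ncol = length(all_cols),
                  dimnames = list(all_rows, all_cols))
    # Name-based indexing keeps the original values in the right cells
    out[rownames(m), colnames(m)] <- m
    out
  })
}

adj1 <- matrix(1, 3, 5, dimnames = list(c("Tim", "John", "Sarah"),
                                        c("g1", "g2", "g3", "g5", "g6")))
adj2 <- matrix(2, 4, 2, dimnames = list(c("Tim", "Mary", "John", "Paolo"),
                                        c("g1", "g4")))

padded <- pad_to_union(list(adj1, adj2), fill = NA)
padded[[1]]  # adj1 on the common 5 x 6 grid
padded[[2]]  # adj2 on the same grid
```

This avoids joins entirely, so it does not depend on overlapping cells agreeing across time points; pass `fill = 10` to use the other fill value mentioned in the question.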
