R: How to convert a data frame into an asymmetric matrix with an empty corner

gg58donl  posted on 2022-12-20  in  Other

I have the following data frame:

table <- data.frame(pop_1 = c("AL","AL","AL","AL","AL","AL","AL","ALT","ALT","ALT","ALT","ALT","ALT","BU","BU","BU","BU","BU","IRK","IRK","IRK","IRK","KK","KK","KK","KYA","KYA","TU"),
                    pop_2 = c("ALT","BU","IRK","KK","KYA","TU","ZAB","BU","IRK","KK","KYA","TU","ZAB","IRK","KK","KYA","TU","ZAB","KK","KYA","TU","ZAB","KYA","TU","ZAB","TU","ZAB","ZAB"),
                    value = c(0.43447,0.15267,0.25912,0.10435,0.19238,0.19186,0.18155,0.34969,0.07506,0.29206,0.13597,0.46354,0.17870,0.18658,0.02297,0.08851,0.18950,0.05176,0.12086,0.02690,0.29669,0.05551,0.04910,0.15779,0.03276,0.23422,0.00568,0.22181))

How can I convert it into an asymmetric (triangular) matrix in which the remaining cells are empty (or NA, etc.)?

qnakjoqk1#

Make a small change to your data frame: add an extra "AL", "AL", NA combination at the start. You can also add an extra "ZAB" at the end:

df<- data.frame(pop_1 = c("AL","AL","AL","AL","AL","AL","AL","AL","ALT","ALT","ALT","ALT","ALT","ALT","BU","BU","BU","BU","BU","IRK","IRK","IRK","IRK","KK","KK","KK","KYA","KYA","TU"),
              pop_2 = c("AL","ALT","BU","IRK","KK","KYA","TU","ZAB","BU","IRK","KK","KYA","TU","ZAB","IRK","KK","KYA","TU","ZAB","KK","KYA","TU","ZAB","KYA","TU","ZAB","TU","ZAB","ZAB"),
              value = c(NA,0.43447,0.15267,0.25912,0.10435,0.19238,0.19186,0.18155,0.34969,0.07506,0.29206,0.13597,0.46354,0.17870,0.18658,0.02297,0.08851,0.18950,0.05176,0.12086,0.02690,0.29669,0.05551,0.04910,0.15779,0.03276,0.23422,0.00568,0.22181))

library(tidyverse)
pivot_wider(df, names_from=pop_1, values_from=value)

 pop_2     AL     ALT      BU     IRK      KK      KYA     TU
  <chr>  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>    <dbl>  <dbl>
1 AL    NA     NA      NA      NA      NA      NA       NA    
2 ALT    0.434 NA      NA      NA      NA      NA       NA    
3 BU     0.153  0.350  NA      NA      NA      NA       NA    
4 IRK    0.259  0.0751  0.187  NA      NA      NA       NA    
5 KK     0.104  0.292   0.0230  0.121  NA      NA       NA    
6 KYA    0.192  0.136   0.0885  0.0269  0.0491 NA       NA    
7 TU     0.192  0.464   0.190   0.297   0.158   0.234   NA    
8 ZAB    0.182  0.179   0.0518  0.0555  0.0328  0.00568  0.222
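Note that the result above is a tibble, not a matrix. If an actual matrix is needed, a base R route is also possible. The following is only a minimal sketch, assuming a three-population subset of the question's data; `pairs`, `pops` and `wide` are names made up for illustration, and tapply is swapped in for pivot_wider:

```r
# Subset of the question's data (AL, ALT, BU only) to keep the sketch short
pairs <- data.frame(
  pop_1 = c("AL", "AL", "ALT"),
  pop_2 = c("ALT", "BU", "BU"),
  value = c(0.43447, 0.15267, 0.34969)
)

# One shared set of levels, so rows and columns both list every population
pops <- sort(unique(c(pairs$pop_1, pairs$pop_2)))

# Cross-tabulate: rows are pop_2, columns are pop_1; empty cells become NA
wide <- tapply(
  pairs$value,
  list(pop_2 = factor(pairs$pop_2, levels = pops),
       pop_1 = factor(pairs$pop_1, levels = pops)),
  FUN = function(v) v[1]  # each pair occurs once, so just take its value
)
wide
```

Because both factors share the same levels, the diagonal and the upper triangle stay NA without adding an extra "AL", "AL", NA row by hand.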

Edit:

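df2 <- df # copy the data
names(df2) <- c("pop_2", "pop_1", "value") # swap the pop_1 and pop_2 labels in the copy
x <- rbind(df, df2) %>% # rbind matches columns by name, so every pair is added in reverse
              distinct() %>% # drop the duplicated AL/AL row so each cell holds a single value
              pivot_wider(names_from=pop_1, values_from=value) %>% 
              arrange(pop_2) # sort the rows alphabetically
x[upper.tri(x)] <- NA # fill the upper triangle with NA
x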
dxxyhpgq2#

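Create a vector containing all the unique values in the pop_1 and pop_2 columns of the data frame. These will be the row and column names of the matrix.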

populations <- unique(c(table$pop_1, table$pop_2))

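Use the matrix function to create an empty matrix whose number of rows and columns equals the length of the vector from step 1. Pass NA as the first (data) argument so that every cell defaults to NA.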

matrix <- matrix(NA, nrow = length(populations), ncol = length(populations))

Use the rownames and colnames functions to set the row and column names of the matrix to the values in the populations vector.

rownames(matrix) <- populations
colnames(matrix) <- populations
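The create-and-name steps can also be folded into a single call. This is just a sketch, assuming a short stand-in for the populations vector; `m` is a hypothetical name chosen here so that the base matrix() function is not masked:

```r
# Stand-in for the vector built in step 1 (three populations only)
populations <- c("AL", "ALT", "BU")

# NA_real_ keeps the matrix numeric; dimnames sets row and column names at once
m <- matrix(
  NA_real_,
  nrow = length(populations),
  ncol = length(populations),
  dimnames = list(populations, populations)
)
m
```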

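Use a for loop to iterate over the rows of the data frame. For each row, use the pop_1 and pop_2 columns to locate the corresponding cell in the matrix, and set that cell to the value column.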

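for (i in seq_len(nrow(table))) {        # seq_len is safe even for a zero-row data frame
  row_name <- table[i, "pop_1"]          # row of the target cell
  col_name <- table[i, "pop_2"]          # column of the target cell
  value <- table[i, "value"]
  matrix[row_name, col_name] <- value    # index by name, not by position
}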
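The loop is not strictly necessary: base R lets you index a matrix with a two-column character matrix, where each row names one (row, column) cell. A minimal sketch of that idea, assuming a three-population subset of the question's data; `pairs`, `pops` and `m` are illustrative names, not part of the original code:

```r
# Subset of the question's data (AL, ALT, BU only)
pairs <- data.frame(
  pop_1 = c("AL", "AL", "ALT"),
  pop_2 = c("ALT", "BU", "BU"),
  value = c(0.43447, 0.15267, 0.34969)
)

pops <- unique(c(pairs$pop_1, pairs$pop_2))
m <- matrix(NA_real_, length(pops), length(pops), dimnames = list(pops, pops))

# Each row of idx is a (row name, column name) pair; as.character guards
# against factor columns, which would otherwise be turned into integer codes
idx <- cbind(as.character(pairs$pop_1), as.character(pairs$pop_2))
m[idx] <- pairs$value
m
```

This fills the same cells as the loop in one assignment, which tends to matter more as the number of pairs grows.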

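After these steps, the matrix should be an asymmetric matrix, with the values from the data frame in the appropriate cells and NA in all other cells.
When you view the resulting matrix: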

    AL     ALT      BU     IRK      KK     KYA      TU     ZAB
AL  NA 0.43447 0.15267 0.25912 0.10435 0.19238 0.19186 0.18155
ALT NA      NA 0.34969 0.07506 0.29206 0.13597 0.46354 0.17870
BU  NA      NA      NA 0.18658 0.02297 0.08851 0.18950 0.05176
IRK NA      NA      NA      NA 0.12086 0.02690 0.29669 0.05551
KK  NA      NA      NA      NA      NA 0.04910 0.15779 0.03276
KYA NA      NA      NA      NA      NA      NA 0.23422 0.00568
TU  NA      NA      NA      NA      NA      NA      NA 0.22181
ZAB NA      NA      NA      NA      NA      NA      NA      NA
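The result above has its values in the upper triangle. If you would rather have them in the lower triangle, as in the first answer, transposing should be enough. A small sketch, assuming a 3 x 3 stand-in built from the first three populations; `upper` and `lower` are hypothetical names:

```r
# 3 x 3 stand-in for the matrix above (AL, ALT, BU only)
pops <- c("AL", "ALT", "BU")
upper <- matrix(NA_real_, 3, 3, dimnames = list(pops, pops))
upper["AL", "ALT"] <- 0.43447
upper["AL", "BU"]  <- 0.15267
upper["ALT", "BU"] <- 0.34969

# t() swaps rows and columns, moving the values below the diagonal
lower <- t(upper)
lower
```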
