How do I sum cell values by row name and column name to create a matrix in R?

34gzjxbg  posted on 2022-12-27 in Other

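I have a data matrix whose row names and column names are similar, but the row names are repeated. I want to sum the cell values for each unique combination of row name and column name (for example, the sum of all cells at Row1 × Col1) and build a matrix from the result in R. The new matrix should hold, for every unique combination of row name and column name, the sum of all matching cells. Thanks.
Example dataset: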


nhhxz33t1#

You can do it like this; see also the comments in the code. You will also need the packages that Grothendieck lists in his/her answer.

library(dplyr)

# long-format input: one row per cell; extend each vector with your own data
df <- data.frame(Row = c("r_1", "r_2"),    # etc.
                 Col = c("Col1", "Col2"),  # etc.
                 Value = c(1, 2))          # etc.

row_names <- unique(df$Row)
col_names <- unique(df$Col)
# this will give you all possible combinations
combinations <- expand.grid(Row = row_names, Col = col_names,
                            stringsAsFactors = FALSE)
# sum the values within each Row/Col pair
result <- df %>% group_by(Row, Col) %>% summarize(Value = sum(Value), .groups = "drop")
result <- left_join(combinations, result, by = c("Row", "Col"))
result[is.na(result)] <- 0  # combinations that never occur in df get a sum of 0
names(result)[1] <- "R"  # choose the name that suits you best
names(result)[2] <- "C"  # choose the name that suits you best
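
The code above leaves the sums in long format (one row per combination), whereas the question asks for a matrix. One possible way to get there, sketched here with base R's `xtabs()` rather than the packages used in this answer, is shown below. The sample values in `df` are hypothetical and only illustrate the shape of the data.

```r
# Hypothetical long-format data; names and values are illustrative only
df <- data.frame(Row = c("r_1", "r_2", "r_1", "r_2"),
                 Col = c("Col1", "Col1", "Col1", "Col2"),
                 Value = c(1, 2, 3, 4))

# xtabs() sums Value over every Row/Col combination;
# combinations that do not occur in df are filled with 0
tab <- xtabs(Value ~ Row + Col, data = df)

# Strip the table classes and the stored call to get a plain matrix
m <- unclass(tab)
attr(m, "call") <- NULL
m
```

Rows and columns of `m` are labelled with the unique values of `Row` and `Col`, so a single sum can be looked up with, for example, `m["r_1", "Col1"]`.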

5f0d552i2#

Here is a complete tidyverse solution in a single pipe:

library(dplyr)
library(tidyr)
# keep the row names in a variable: duplicate row names are not allowed
# in a data.frame and will be dropped at type cast
rn <- c("name1", "name2", "name3",
        "name1", "name2", "name3")
matrix(1:18, ncol = 3) %>%
  `colnames<-`(c("name1", "name2", "name3")) %>%
  `rownames<-`(rn) %>%
  # this is a sample matrix like the one in your question, ready for analysis
  data.frame() %>%
  # restore the row names as a regular column
  # (you can store them from your own matrix in a variable before the type cast)
  mutate(rn = rn, .before = 1) %>%
  pivot_longer(-rn) %>%
  group_by(rn, name) %>%
  summarise(value = sum(value)) %>%
  pivot_wider()

This results in:

# A tibble: 3 x 4
# Groups:   rn [3]
  rn    name1 name2 name3
  <chr> <int> <int> <int>
1 name1     5    17    29
2 name2     7    19    31
3 name3     9    21    33
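
For comparison, the same sums can be obtained as a plain matrix without leaving base R. The sketch below uses `rowsum()`, which adds up the rows of a matrix within each group; it is an alternative to the pipe above, not part of this answer, and it assumes that only the row names are duplicated, as in the sample matrix.

```r
# Same sample matrix as above, with duplicated row names
rn <- c("name1", "name2", "name3", "name1", "name2", "name3")
m <- matrix(1:18, ncol = 3,
            dimnames = list(rn, c("name1", "name2", "name3")))

# rowsum() sums the rows that share the same group label (here: the row name)
res <- rowsum(m, group = rownames(m))
res
```

`res` is a 3 x 3 matrix with one row per unique row name and holds the same values as the tibble above.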
