R: How to cbind a list of tables by one column and suffix the column headers with the list item names

evrscar2  posted on 2022-12-20  in Other

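I have a list of data frames that I want to cbind by an index column. Every table has the same column headers, so I can't simply cbind them, or I won't know which list item each column came from. The list item names give the measure used to generate each table, so I'd like to append the list item name to the column headers as a suffix.
Here is a simplified demo list of data frames: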

list_of_tables <- list(number = structure(list(sample_id = structure(1:3, levels = c("CSF_1", 
"CSF_2", "CSF_4"), class = "factor"), total = c(655, 331, 271
), max = c(12, 5, 7)), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame")), concentration_cm_3 = structure(list(sample_id = structure(1:3, levels = c("CSF_1", 
"CSF_2", "CSF_4"), class = "factor"), total = c(121454697, 90959097, 
43080697), max = c(2050000, 2140000, 915500)), row.names = c(NA, 
-3L), class = c("tbl_df", "tbl", "data.frame")), volume_nm_3 = structure(list(
    sample_id = structure(1:3, levels = c("CSF_1", "CSF_2", "CSF_4"
    ), class = "factor"), total = c(2412783009, 1293649395, 438426087
    ), max = c(103500000, 117400000, 23920000)), row.names = c(NA, 
-3L), class = c("tbl_df", "tbl", "data.frame")), area_nm_2 = structure(list(
    sample_id = structure(1:3, levels = c("CSF_1", "CSF_2", "CSF_4"
    ), class = "factor"), total = c(15259297.4, 7655352.2, 3775922
    ), max = c(266500, 289900, 100400)), row.names = c(NA, -3L
), class = c("tbl_df", "tbl", "data.frame")))

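As you can see, this is a list of 4 tables whose list item names are "number", "concentration_cm_3", "volume_nm_3" and "area_nm_2".
Using join_all from plyr, I can merge them by sample_id. But how do I add the list item name as a suffix?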

merged_tables <- plyr::join_all(list_of_tables, by = "sample_id", type = "left")

Answer #1 (dxxyhpgq)

We could do it like this. The trick is to use .id = 'id' in bind_rows, which adds the list item names as a column, and then pivot to wide format:

library(dplyr)
library(tidyr)

bind_rows(list_of_tables, .id = 'id') %>% 
  pivot_wider(names_from = id,
              values_from = c(total, max))

Output:

  sample_id total_number total_concentration_cm_3 total_volume_nm_3 total_area_nm_2 max_number max_concentration_cm_3 max_volume_nm_3 max_area_nm_2
  <fct>            <dbl>                    <dbl>             <dbl>           <dbl>      <dbl>                  <dbl>           <dbl>         <dbl>
1 CSF_1              655                121454697        2412783009       15259297.         12                2050000       103500000        266500
2 CSF_2              331                 90959097        1293649395        7655352.          5                2140000       117400000        289900
3 CSF_4              271                 43080697         438426087        3775922           7                 915500        23920000        100400
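
If you would rather have the columns grouped by measure than by statistic, pivot_wider has a names_vary argument (tidyr 1.2.0 or later). The following is a minimal sketch on a hypothetical two-table list named toy, standing in for list_of_tables; the data values are copied from the question, but the object names toy and wide are only illustrative.

```r
library(dplyr)
library(tidyr)

# Hypothetical two-table stand-in for list_of_tables
toy <- list(
  number = data.frame(sample_id = c("CSF_1", "CSF_2"),
                      total = c(655, 331), max = c(12, 5)),
  area_nm_2 = data.frame(sample_id = c("CSF_1", "CSF_2"),
                         total = c(15259297.4, 7655352.2),
                         max = c(266500, 289900))
)

wide <- bind_rows(toy, .id = "id") %>%
  pivot_wider(names_from = id,
              values_from = c(total, max),
              names_vary = "slowest")  # keep each measure's columns together

names(wide)
```

With "slowest", the total and max columns of one measure should sit next to each other (total_number, max_number, then the area columns) instead of all totals first.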

Answer #2 (u5rb5r59)

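Perhaps we could use reduce2 here. Note that the suffix option of left_join only renames columns whose names clash between the two tables being joined, so with more than two tables it is more reliable to rename each table's columns before joining:

library(dplyr)
library(purrr)
# append the list item name to every column except sample_id (total -> totalnumber)
add_suffix <- function(df, nm) rename_with(df, ~ paste0(.x, nm), -sample_id)
nms <- names(list_of_tables)
reduce2(list_of_tables[-1], nms[-1],
   function(x, y, z) left_join(x, add_suffix(y, z), by = 'sample_id'),
   .init = add_suffix(list_of_tables[[1]], nms[1]))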
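
One caveat when relying on suffix across several joins: left_join applies it only to non-key columns whose names clash between the two inputs. A minimal sketch with three hypothetical toy tables (a, b and c3 are illustrative names, not from the question) shows the effect:

```r
library(dplyr)

# Hypothetical toy tables sharing the key `id` and the value column `total`
a  <- data.frame(id = 1:2, total = c(10, 20))
b  <- data.frame(id = 1:2, total = c(30, 40))
c3 <- data.frame(id = 1:2, total = c(50, 60))

# First join: `total` exists on both sides, so both suffixes are applied
ab <- left_join(a, b, by = "id", suffix = c("_a", "_b"))
names(ab)

# Second join: `ab` no longer has a column called `total`,
# so the incoming column keeps its bare name and the suffix is not used
abc <- left_join(ab, c3, by = "id", suffix = c("_a", "_c"))
names(abc)
```

This is why renaming the columns before joining tends to be the safer route once the list has more than two tables.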

Or, if we want to keep using join_all, we could rename the columns before joining:

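library(stringr)
# imap passes each table as .x and its list item name as .y
imap(list_of_tables, ~ {
   nm <- .y
   # append the name to every column except the first (sample_id)
   .x %>% rename_with(~ str_c(.x, nm), -1)
  }) %>%
   plyr::join_all(by = "sample_id", type = "left")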

Or use a for loop:

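# start from the first table and suffix its non-key columns
tmp <- list_of_tables[[1]]
names(tmp)[-1] <- paste0(names(tmp)[-1], names(list_of_tables)[1])
# suffix each remaining table the same way, then join it on sample_id
for(nm in names(list_of_tables)[-1]) {
   tmp2 <- list_of_tables[[nm]]
   names(tmp2)[-1] <- paste0(names(tmp2)[-1], nm)
   tmp <- left_join(tmp, tmp2, by = "sample_id")
}
tmp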
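
The rename-then-join idea above can also be written without plyr, using only purrr and dplyr. This is a sketch under assumptions: toy is a hypothetical two-table stand-in for list_of_tables, and an underscore separator is added here purely for readability.

```r
library(dplyr)
library(purrr)

# Hypothetical two-table stand-in for list_of_tables
toy <- list(
  number = data.frame(sample_id = c("CSF_1", "CSF_2"),
                      total = c(655, 331), max = c(12, 5)),
  area_nm_2 = data.frame(sample_id = c("CSF_1", "CSF_2"),
                         total = c(15259297.4, 7655352.2),
                         max = c(266500, 289900))
)

joined <- toy %>%
  # .x is the table, .y its list item name; suffix all columns but the key
  imap(~ rename_with(.x, function(col) paste0(col, "_", .y), -sample_id)) %>%
  # fold the renamed tables together with successive left joins
  reduce(left_join, by = "sample_id")

names(joined)
```

Because every table is renamed before any join happens, no two non-key columns ever clash, so the result does not depend on left_join's suffix handling.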
