R语言 沿着多列数据广播长表

bnl4lu3b  于 2023-03-20  发布在  其他
关注(0)|答案(2)|浏览(93)

我有一个困难的时间理解dcast,不能得到正确的命令,以获得我想要的。我会给予一个最小的repro。

#generate the data
ID <- c('a','a','a','b','b','b')
Parameter <- c('p1','p2','p3','p1','p2','p3')
Value <- c('yes','no','3','yes','yes','2')
Comment <- c(NA,'Deduced','To verify',NA,'Deduced','Verified')
Source <- c('Exp.1','Exp.1','Exp.1+2','DB2','DB2','DB2')
Person <- c('X','X','X','Y','Y','Z')
long.data <- data.frame(ID,Parameter,Value,Comment,Source,Person)

  ID Parameter Value   Comment  Source Person
1  a        p1   yes      <NA>   Exp.1      X
2  a        p2    no   Deduced   Exp.1      X
3  a        p3     3 To verify Exp.1+2      X
6  b        p1   yes      <NA>     DB2      Y
7  b        p2   yes   Deduced     DB2      Y
8  b        p3     2  Verified     DB2      Y

我想将其转换为以下wide.data格式:

ID Person  p1 p1-Comment p1-Source   p2 p2-Comment p2-Source p3 p3-Comment p3-Source
1  a      X yes       <NA>     Exp.1   no    Deduced     Exp.1  3  To verify   Exp.1+2
2  b      Y yes       <NA>       DB2  yes    Deduced       DB2  2   Verified       DB2

我可以假设每个ID都有相同的Person,我相信我可以进行dcast,但是我还没有找到一个合理的方法来做到这一点,而不会输出垃圾列,可能有一个相对简单的方法来做到这一点,我只是缺少。

pkmbmrz7

pkmbmrz71#

我们可以使用pivot_wider

library(tidyr)
library(dplyr)
pivot_wider(long.data, names_from = Parameter,
   values_from = c(Value, Comment, Source ), 
     names_glue = "{Parameter}-{.value}", names_vary = "slowest")) %>% 
   filter(!is.na(`p1-Value`))
  • 输出
# A tibble: 2 × 11
  ID    Person `p1-Value` `p1-Comment` `p1-Source` `p2-Value` `p2-Comment` `p2-Source` `p3-Value` `p3-Comment` `p3-Source`
  <chr> <chr>  <chr>      <chr>        <chr>       <chr>      <chr>        <chr>       <chr>      <chr>        <chr>      
1 a     X      yes        <NA>         Exp.1       no         Deduced      Exp.1       3          To verify    Exp.1+2    
2 b     Y      yes        <NA>         DB2         yes        Deduced      DB2         <NA>       <NA>         <NA>
zfycwa2u

zfycwa2u2#

学习reshape

reshape(long.data, timevar='Parameter', idvar='ID', direction='wide')
#   ID Value.p1 Comment.p1 Source.p1 Person.p1 Value.p2 Comment.p2 Source.p2 Person.p2 Value.p3 Comment.p3 Source.p3 Person.p3
# 1  a      yes       <NA>     Exp.1         X       no    Deduced     Exp.1         X        3  To verify   Exp.1+2         X
# 4  b      yes       <NA>       DB2         Y      yes    Deduced       DB2         Y        2   Verified       DB2         Z
  • 数据:*
long.data <- structure(list(ID = c("a", "a", "a", "b", "b", "b"), Parameter = c("p1", 
"p2", "p3", "p1", "p2", "p3"), Value = c("yes", "no", "3", "yes", 
"yes", "2"), Comment = c(NA, "Deduced", "To verify", NA, "Deduced", 
"Verified"), Source = c("Exp.1", "Exp.1", "Exp.1+2", "DB2", "DB2", 
"DB2"), Person = c("X", "X", "X", "Y", "Y", "Z")), class = "data.frame", row.names = c(NA, 
-6L))

相关问题