R - Reshaping data

vfh0ocws  posted on 2023-03-15  in  Other

This is my data frame:

df <- data.frame(SNP = c("rs123","rs567"), 
A1 = c("A","C"),
A2 = c("G","T"),
A1A1.x = c(16,52),
A1A2.x = c(104,342),
A2A2.x = c(127,408),
A1A1.y = c(14,80),
A1A2.y = c(122,186),
A2A2.y = c(107,183),
P_HWE = c(0.4293,0.09432),
A1A1_x = c(6.4777328,6.4837905),
A1A2_x = c(42.10526,42.64339),
A2A2_x = c(51.417,50.87282),
A1A1_y = c(5.7613169,17.8173719),
A1A2_y = c(50.20576,41.42539),
A2A2_y = c(44.03292,40.75724))

This is what I need:

SNP    A1  A2  Genotypes  n_controls  f_controls  n_cases  f_cases     P_HWE
rs123  A   G   AA         16          6.4777328   14       5.7613169   0.4293
rs123  A   G   GA         104         42.10526    122      50.20576    0.4293
rs123  A   G   GG         127         51.417      107      44.03292    0.4293
rs567  C   T   CC         52          6.4837905   80       17.8173719  0.09432
rs567  C   T   TC         342         42.64339    186      41.42539    0.09432
rs567  C   T   TT         408         50.87282    183      40.75724    0.09432

Any suggestions? Thanks in advance!

xa9qqrwz #1

With pivot_longer you can reshape the data to long format, and with expand.grid you can build the genotypes.
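To make the expand.grid step easier to follow, here is a minimal base R sketch for a single SNP. It uses the alleles of rs123 from the question; the variable names `alleles`, `pairs` and `genotypes` are only illustrative and do not appear in the answer's code.

```r
# Alleles for one SNP (taken from the rs123 row of the question)
alleles <- unique(c("A", "G"))

# All ordered allele pairs; the first column varies fastest:
# row 1 = (A, A), row 2 = (G, A), row 3 = (A, G), row 4 = (G, G)
pairs <- expand.grid(alleles, alleles, stringsAsFactors = FALSE)

# Keep rows 1, 2 and 4 so the heterozygote is listed only once,
# then paste the two columns together element-wise
genotypes <- do.call(paste0, pairs[c(1, 2, 4), ])
print(genotypes)
# [1] "AA" "GA" "GG"
```

The pipeline below applies the same idea once per SNP, which is why it groups by SNP before building the Genotypes column.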

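library(dplyr)  # mutate(.by = ) needs dplyr >= 1.1.0
library(tidyr)
df %>% 
  # Stack the genotype columns; the captured suffix (.x, .y, _x, _y, _HWE)
  # becomes the name of the new value column
  pivot_longer(-c(SNP, A1, A2), 
               names_to = c(".value"),
               names_pattern = ".*([.|_].*)") %>% 
  # Per SNP, rows 1, 2 and 4 of the allele grid give A1A1, A2A1 and A2A2
  mutate(Genotypes = do.call(paste0, expand.grid(unique(c(A1, A2)), unique(c(A1, A2)))[c(1, 2, 4), ]),
         .by = SNP, .after = A2) %>% 
  # P_HWE has a single value per SNP, so the remaining rows are NA until filled down
  fill(`_HWE`)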

# A tibble: 6 × 9
  SNP   A1    A2    Genotypes    .x    .y `_HWE`  `_x`  `_y`
  <chr> <chr> <chr> <chr>     <dbl> <dbl>  <dbl> <dbl> <dbl>
1 rs123 A     G     AA           16    14 0.429   6.48  5.76
2 rs123 A     G     GA          104   122 0.429  42.1  50.2 
3 rs123 A     G     GG          127   107 0.429  51.4  44.0 
4 rs567 C     T     CC           52    80 0.0943  6.48 17.8 
5 rs567 C     T     TC          342   186 0.0943 42.6  41.4 
6 rs567 C     T     TT          408   183 0.0943 50.9  40.8
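
The result above still carries the captured suffixes as column names. As a possible final step, select() can rename and reorder the columns to match the layout requested in the question. This is a sketch, not part of the original answer: the mapping of `.x`/`_x` to controls and `.y`/`_y` to cases is inferred from the expected output in the question, and `res` is just an illustrative name.

```r
library(dplyr)  # assumes dplyr >= 1.1.0 for mutate(.by = )
library(tidyr)

# Same input as in the question
df <- data.frame(SNP = c("rs123", "rs567"),
                 A1 = c("A", "C"),
                 A2 = c("G", "T"),
                 A1A1.x = c(16, 52),
                 A1A2.x = c(104, 342),
                 A2A2.x = c(127, 408),
                 A1A1.y = c(14, 80),
                 A1A2.y = c(122, 186),
                 A2A2.y = c(107, 183),
                 P_HWE = c(0.4293, 0.09432),
                 A1A1_x = c(6.4777328, 6.4837905),
                 A1A2_x = c(42.10526, 42.64339),
                 A2A2_x = c(51.417, 50.87282),
                 A1A1_y = c(5.7613169, 17.8173719),
                 A1A2_y = c(50.20576, 41.42539),
                 A2A2_y = c(44.03292, 40.75724))

res <- df %>%
  pivot_longer(-c(SNP, A1, A2),
               names_to = ".value",
               names_pattern = ".*([.|_].*)") %>%
  mutate(Genotypes = do.call(paste0, expand.grid(unique(c(A1, A2)), unique(c(A1, A2)))[c(1, 2, 4), ]),
         .by = SNP, .after = A2) %>%
  fill(`_HWE`) %>%
  # select() renames (new = old) and reorders in one step
  select(SNP, A1, A2, Genotypes,
         n_controls = `.x`, f_controls = `_x`,
         n_cases = `.y`, f_cases = `_y`,
         P_HWE = `_HWE`)

print(res)
```

If the original column order is acceptable, rename() with the same `new = old` pairs would change the names without reordering.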
