R语言 计算组内和时间内的百分比差异

vbopmzt1  于 2023-02-10  发布在  其他
关注(0)|答案(1)|浏览(164)

我确实有 Dataframe ,如下所示:

A <- data.frame(GroupName = c(rep(c("A", "B", "C"), each = 6)),
           ObservationName = c("alpha", "beta", "gamma", "alpha", "beta", "gamma", rep(c("delta", "epsilon"),3), rep(c("zeta", "eta", "theta"),2)),
           Date = rep(rep(seq(as.Date("2010-1-1"), as.Date("2010-3-1"), by = "month"), each =3), 2),
           Value = runif(n = 18, min = 1, max = 10))

现在,我要计算始终在组(组名)内、同一月(日期)内以及该组内所有观测值之间(如果测量在同一月(和组)内)的值的百分比变化。输出结果应如下所示(仅组A的前四种情况,其余情况应如下所示):

B <- data.frame(GroupName = c("A", "A", "A", "A"),
           ObservationName1 = c("alpha", "alpha", "beta", "alpha"),
           ObservationName2 = c("beta", "gamma", "gamma", "beta"),
           Date = c(as.Date("2010-1-1"), as.Date("2010-1-1"), as.Date("2010-1-1"), as.Date("2010-2-1")),
           Diff = c(abs(A[1,4]-A[2,4])/ ((A[1,4]+A[2,4])/2), #alpha beta Jan
                    abs(A[1,4]-A[3,4])/ ((A[1,4]+A[3,4])/2), #alpha gamma Jan
                    abs(A[2,4]-A[4,4])/ ((A[2,4]+A[4,4])/2), #beta gamma Jan
                    abs(A[4,4]-A[5,4])/ ((A[4,4]+A[5,4])/2))) #alpha gamma Feb
2w3rbyxf

2w3rbyxf1#

使用combn

library(dplyr) #1.1.0
A %>% 
  reframe(result = combn(seq_along(Value), 2, function(i)
    list(ObservationName1 = ObservationName[i][1],
         ObservationName2 = ObservationName[i][2],
         Diff = abs(diff(Value[i])) / (sum(Value[i]) / 2)), # The pairs
    simplify = FALSE),
    .by = c(GroupName, Date)) %>% 
  unnest_wider(result)

输出

# A tibble: 18 × 5
   GroupName Date       ObservationName1 ObservationName2     Diff
   <chr>     <date>     <chr>            <chr>               <dbl>
 1 A         2010-01-01 alpha            beta             0.414   
 2 A         2010-01-01 alpha            gamma            0.250   
 3 A         2010-01-01 beta             gamma            0.168   
 4 A         2010-02-01 alpha            beta             0.675   
 5 A         2010-02-01 alpha            gamma            0.480   
 6 A         2010-02-01 beta             gamma            1.07    
 7 B         2010-03-01 delta            epsilon          1.55    
 8 B         2010-03-01 delta            delta            1.11    
 9 B         2010-03-01 epsilon          delta            0.779   
10 B         2010-01-01 epsilon          delta            0.841   
11 B         2010-01-01 epsilon          epsilon          0.559   
12 B         2010-01-01 delta            epsilon          0.321   
13 C         2010-02-01 zeta             eta              0.770   
14 C         2010-02-01 zeta             theta            0.000664
15 C         2010-02-01 eta              theta            0.770   
16 C         2010-03-01 zeta             eta              0.799   
17 C         2010-03-01 zeta             theta            0.350   
18 C         2010-03-01 eta              theta            0.483

dplyr 1.1.0之前:

A %>% 
  group_by(GroupName, Date) %>% 
  summarise(result = combn(seq_along(Value), 2, function(i)
    list(ObservationName1 = ObservationName[i][1],
         ObservationName2 = ObservationName[i][2],
         Diff = abs(diff(Value[i])) / (sum(Value[i]) / 2)), # The pairs
    simplify = FALSE)) %>% 
  ungroup() %>% 
  unnest_wider(result)

相关问题