R: How do I subtract the corresponding mean from every column?

k4ymrczo, posted on 2023-01-18 in Other

I have this data frame:

dat <- structure(
  list(
    date = as.Date(c("1983-01-01", "1984-01-01", "1984-02-01",
                     "1985-01-01", "1985-02-01", "1986-01-01")),
    rig  = c(68.1, 62.4, 67.5, 78.9, 81.7, 72.2),
    pass = c(9.57, 10.49, 11.97, 11.43, 9.54, 8.98)
  ),
  row.names = c(NA, 6L), class = "data.frame"
)

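For each value (i) in a given month of the year, I want to compute the mean (m) of that month across all years and the standard deviation (std) of that month's values across all years, and then replace the value with (i - m)/std.
We can compute the mean and the standard deviation like this: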

> dat %>%
+   mutate(month = month(as.Date(date))) %>%
+   group_by(month) %>%
+   summarise(across(where(is.numeric), mean, na.rm = TRUE)) %>%
+   ungroup()
# A tibble: 2 × 3
  month   rig  pass
  <dbl> <dbl> <dbl>
1     1  70.4  10.1
2     2  74.6  10.8
> dat %>%
+   mutate(month = month(as.Date(date))) %>%
+   group_by(month) %>%
+   summarise(across(where(is.numeric), sd, na.rm = TRUE)) %>%
+   ungroup()
# A tibble: 2 × 3
  month   rig  pass
  <dbl> <dbl> <dbl>
1     1  6.95  1.07
2     2 10.0   1.72

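How can I subtract these means from, and divide by these standard deviations, each corresponding column? Here is the expected result, with a worked example for pass: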

(9.57-10.1175)/1.073169  =   -0.5101713

        date        rig       pass
1 1983-01-01 -0.3310809 -0.5101713
2 1984-01-01 -1.151586   0.3471028
3 1984-02-01 -0.7071068  0.707107
4 1985-01-01  1.22356    1.223013
5 1985-02-01  0.7071068 -0.707107
6 1986-01-01  0.2591068 -1.059945
at0kjp5o #1

You can use scale():

dat <- structure(
  list(
    date = as.Date(c("1983-01-01", "1984-01-01", "1984-02-01",
                     "1985-01-01", "1985-02-01", "1986-01-01")),
    rig  = c(68.1, 62.4, 67.5, 78.9, 81.7, 72.2),
    pass = c(9.57, 10.49, 11.97, 11.43, 9.54, 8.98)
  ),
  row.names = c(NA, 6L), class = "data.frame"
)

library(dplyr)
library(lubridate)
dat%>%     
  mutate(month =month(as.Date(date))) %>% 
  group_by(month) %>% 
  mutate(across(where(is.numeric), ~c(scale(.x))))
#> # A tibble: 6 × 4
#> # Groups:   month [2]
#>   date          rig   pass month
#>   <date>      <dbl>  <dbl> <dbl>
#> 1 1983-01-01 -0.331 -0.510     1
#> 2 1984-01-01 -1.15   0.347     1
#> 3 1984-02-01 -0.707  0.707     2
#> 4 1985-01-01  1.22   1.22      1
#> 5 1985-02-01  0.707 -0.707     2
#> 6 1986-01-01  0.259 -1.06      1
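
A note on why scale() is wrapped in c(): scale() returns a one-column matrix carrying centering and scaling attributes, and c() strips those so that mutate() gets a plain numeric vector. A minimal base-R sketch, where x is just the four January pass values copied from the question:

```r
# January `pass` values from the question's data
x <- c(9.57, 10.49, 11.43, 8.98)

s <- scale(x)   # centers on mean(x), then divides by sd(x)
is.matrix(s)    # scale() hands back a 4 x 1 matrix, not a vector
z <- c(s)       # c() drops the dim and the "scaled:*" attributes

# The plain vector agrees with the explicit formula
all.equal(z, (x - mean(x)) / sd(x))
```

The first element of z corresponds to the (9.57 - 10.1175)/1.073169 example in the question.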

Or, if you prefer to spell out the transformation explicitly:

dat%>%     
  mutate(month =month(as.Date(date))) %>% 
  group_by(month) %>% 
  mutate(across(where(is.numeric), ~(.x - mean(.x))/sd(.x)))
#> # A tibble: 6 × 4
#> # Groups:   month [2]
#>   date          rig   pass month
#>   <date>      <dbl>  <dbl> <dbl>
#> 1 1983-01-01 -0.331 -0.510     1
#> 2 1984-01-01 -1.15   0.347     1
#> 3 1984-02-01 -0.707  0.707     2
#> 4 1985-01-01  1.22   1.22      1
#> 5 1985-02-01  0.707 -0.707     2
#> 6 1986-01-01  0.259 -1.06      1
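
If you would rather avoid the extra month column and the dplyr/lubridate dependencies, the same per-month standardization can probably be done in base R with ave(). This is only a sketch of an alternative, not part of the approach above; the helper name zscore and the variables mon and res are made up for illustration:

```r
dat <- data.frame(
  date = as.Date(c("1983-01-01", "1984-01-01", "1984-02-01",
                   "1985-01-01", "1985-02-01", "1986-01-01")),
  rig  = c(68.1, 62.4, 67.5, 78.9, 81.7, 72.2),
  pass = c(9.57, 10.49, 11.97, 11.43, 9.54, 8.98)
)

# Month of each row as a grouping key ("01", "02", ...)
mon <- format(dat$date, "%m")

# Hypothetical helper: z-score of a numeric vector
zscore <- function(x) (x - mean(x)) / sd(x)

# ave() applies FUN within each month and returns values in the original row order
res <- dat
for (col in c("rig", "pass")) {
  res[[col]] <- ave(dat[[col]], mon, FUN = zscore)
}
res
```

Because the grouping key lives outside the data frame, res keeps only date, rig and pass, which matches the layout of the expected result in the question.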

Created on 2023-01-17 by the reprex package (v2.0.1)
