Pivoting reasons into their own columns in R

zkure5ic  posted on 2023-05-26  in Other

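I am working in R.
Here is my data. Each school reports how many vacancies it has in each subject. A school can give up to four reasons for why it thinks it has that many vacancies.
For example, school 1 gave two reasons (pay and location) for its English vacancies.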

data <- data.frame(school_id = c("1", "1", "2", "3", "3"), 
                   subject = c("English", "English",  "Maths", "French", "French"),
                   vacancies = c(40, 40, 20, 40, 10),
                   Reason = c("Pay", "Location", "Pay", "Experience", "Location"))

| school_id | subject | vacancies | Reason |
| --- | --- | --- | --- |
| 1 | English | 40 | Pay |
| 1 | English | 40 | Location |
| 2 | Maths | 20 | Pay |
| 3 | French | 40 | Pay |
| 3 | English | 10 | Experience |
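I want to pivot the reasons so that there is one row for each subject in which a school reported vacancies, but I can't seem to get it to work with pivot_longer or spread.
This is what I want...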
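| school_id | subject | vacancies | Reason1 | Reason2 | Reason3 | Reason4 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | English | 40 | Pay | Location | NA | NA |
| 2 | Maths | 20 | Pay | NA | NA | NA |
| 3 | French | 40 | Pay | NA | NA | NA |
| 3 | English | 10 | Experience | NA | NA | NA |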
Does anyone have any ideas?
Thanks:)

vc6uscn9 #1

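This matches the OP's desired output, and it is clear what happens at each step.
First, we group the data by school_id and subject to determine which reason is the first, which is the second, and so on. Then we pivot the table on school_id and subject (keeping vacancies as an id column), using Reason as the values and Reason_Name as the column names.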

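library(dplyr)
library(tidyr)

data %>% 
  # number the reasons within each school/subject pair
  group_by(school_id, subject) %>% 
  mutate(Reason_Name = paste0("Reason", row_number())) %>%
  ungroup() %>%
  # one row per school/subject/vacancies, one column per numbered reason
  pivot_wider(
    id_cols = c(school_id, subject, vacancies),
    values_from = Reason,
    names_from = Reason_Name
  )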

# A tibble: 4 × 5
  school_id subject vacancies Reason1    Reason2 
  <chr>     <chr>       <dbl> <chr>      <chr>   
1 1         English        40 Pay        Location
2 2         Maths          20 Pay        NA      
3 3         English        10 Experience NA      
4 3         French         40 Pay        NA

If we want to condense this code a little, we can write:

data %>% 
  mutate(
    Reason_Name = paste0("Reason", row_number()), 
    .by = c(school_id, subject)  # per-operation grouping; requires dplyr >= 1.1.0
  ) %>%
  pivot_wider(
    id_cols = c(school_id, subject, vacancies),
    values_from = Reason,
    names_from = Reason_Name
  )
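
The question asks for four reason columns (Reason1 to Reason4) even when no school uses all four, whereas the code above only creates as many columns as the data contains. One possible way to pad the result, sketched here as an assumption rather than as part of the original answer, is to turn Reason_Name into a factor with four levels and let pivot_wider expand the unused levels through its names_expand argument (available in tidyr 1.2.0 and later). The object name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
data <- data.frame(
  school_id = c("1", "1", "2", "3", "3"),
  subject   = c("English", "English", "Maths", "French", "French"),
  vacancies = c(40, 40, 20, 40, 10),
  Reason    = c("Pay", "Location", "Pay", "Experience", "Location")
)

wide <- data %>%
  group_by(school_id, subject) %>%
  # a factor with four levels records that up to four reasons are possible
  mutate(Reason_Name = factor(paste0("Reason", row_number()),
                              levels = paste0("Reason", 1:4))) %>%
  ungroup() %>%
  pivot_wider(
    id_cols = c(school_id, subject, vacancies),
    values_from = Reason,
    names_from = Reason_Name,
    names_expand = TRUE  # also create columns for unused factor levels
  )

names(wide)
```

The unused columns (Reason3 and Reason4 for this data) are filled with NA.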

twh00eeo #2

library(tidyverse)
data %>%
   mutate(rowid = row_number(), .by = c(school_id, subject)) %>%
   pivot_wider(id_cols = c(school_id, subject, vacancies),
               names_from = rowid, 
               values_from = Reason,
               names_prefix = "Reason")

# A tibble: 4 × 5
  school_id subject vacancies Reason1    Reason2 
  <chr>     <chr>       <dbl> <chr>      <chr>   
1 1         English        40 Pay        Location
2 2         Maths          20 Pay        NA      
3 3         French         40 Experience NA      
4 3         French         10 NA         Location
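
In the output above, the last row has NA in Reason1 and its reason in Reason2. That happens because the reasons are numbered within school_id and subject, while vacancies is also an id column, so the two French rows (40 and 10 vacancies) share one numbering. If each output row should start at Reason1, a possible variation, which is an assumption about the intended result and not part of the original answer, is to number within all three id columns. The object name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
data <- data.frame(
  school_id = c("1", "1", "2", "3", "3"),
  subject   = c("English", "English", "Maths", "French", "French"),
  vacancies = c(40, 40, 20, 40, 10),
  Reason    = c("Pay", "Location", "Pay", "Experience", "Location")
)

wide <- data %>%
  # number reasons within each school/subject/vacancies combination
  # (.by requires dplyr >= 1.1.0)
  mutate(rowid = row_number(), .by = c(school_id, subject, vacancies)) %>%
  pivot_wider(id_cols = c(school_id, subject, vacancies),
              names_from = rowid,
              values_from = Reason,
              names_prefix = "Reason")

wide
```

With this numbering every row has a value in Reason1, and only school 1 has a value in Reason2.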

axr492tv #3

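This is not exactly the output you described, but it is similar to GuedesBF's solution (it seems we had a similar thought process; I arrived at it independently). I think a data frame like this would be the most useful: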

require(tidyverse)

data %>% 
  pivot_wider(names_from = Reason, names_prefix = "Reason_", values_from = Reason) %>% 
  mutate(across(.cols = -c(school_id, subject, vacancies), .fns = \(x) !is.na(x)))

# output
# A tibble: 4 × 6
  school_id subject vacancies Reason_Pay Reason_Location Reason_Experience
  <chr>     <chr>       <dbl> <lgl>      <lgl>           <lgl>            
1 1         English        40 TRUE       TRUE            FALSE            
2 2         Maths          20 TRUE       FALSE           FALSE            
3 3         French         40 FALSE      FALSE           TRUE             
4 3         French         10 FALSE      TRUE            FALSE

o0lyfsai #4

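library(dplyr)  # rename_with() comes from dplyr
library(tidyr)

data |>
    pivot_wider(names_from = Reason,
                values_from = Reason,
                names_prefix = "Reason_") |>
    # renumber the Reason_* columns by position instead of hard-coding 1:3
    rename_with(.cols = starts_with("Reason_"),
                .fn = \(x) glue::glue("Reason_{seq_along(x)}"))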

# A tibble: 4 × 6
  school_id subject vacancies Reason_1 Reason_2 Reason_3  
  <chr>     <chr>       <dbl> <chr>    <chr>    <chr>     
1 1         English        40 Pay      Location NA        
2 2         Maths          20 Pay      NA       NA        
3 3         French         40 NA       NA       Experience
4 3         French         10 NA       Location NA
