Pivot a data frame in R to show metrics as rows per group and years as columns

hkmswyz6 posted 12 months ago in Other

I have the following data frame in R:

df <- data.frame(
  year = c(2018, 2018, 2019, 2019), 
  group = c("A", "A", "B", "B"), 
  metric = c(100, 150, 150, 200), 
  metric_percent = c(0.1, 0.2,0.3, 0.4))

| year | group | metric | metric_percent |
| --|--|--|--|
| 2018 | A | 100 | 0.1 |
| 2018 | A | 150 | 0.2 |
| 2019 | B | 150 | 0.3 |
| 2019 | B | 200 | 0.4 |

When I run the code below, the output gets one column per year for headcount and one column per year for representation:

df |> 
  pivot_wider(names_from = year,
              values_from = c(headcount, representation))

# Output
# A tibble: 2 × 5
  group headcount_2018 headcount_2019 representation_2018 representation_2019
  <chr> <list>         <list>         <list>              <list>             
1 A     <dbl [2]>      <NULL>         <dbl [2]>           <NULL>             
2 B     <NULL>         <dbl [2]>      <NULL>              <dbl [2]>


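In my real data I have several groups and metrics to show, so I need a table in which I can follow how each metric evolves for each group.
This is how I want to display the data (I will use flextable or gt to format it):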

df2 <- data.frame(
  group = c("A", "A", "B", "B"),
  metric = c("headcount", "representation", "headcount", "representation"),
  "2018" = c(100, .1, 150, 0.3),
  "2019" = c(150, 0.2, 200, 0.4),
  check.names = FALSE)  # keep "2018"/"2019" as names instead of X2018/X2019

df2


| group | metric | 2018 | 2019 |
| --|--|--|--|
| A | headcount | 100 | 150 |
| A | representation | 0.1 | 0.2 |
| B | headcount | 150 | 200 |
| B | representation | 0.3 | 0.4 |

Thanks in advance!

Answer 1 (z9gpfhce)

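As @stefan said in the comments, your data as posted cannot be reshaped into the desired format: each group appears in only one year, with two rows for that year. However, if the data instead has one row per year and group, like this: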

dat
# A tibble: 4 × 4
  year  group headcount representation
  <chr> <chr>     <dbl>          <dbl>
1 2018  A           100            0.1
2 2019  A           150            0.2
3 2018  B           150            0.3
4 2019  B           200            0.4

then it can be reshaped into the expected format:

library(dplyr)
library(tidyr)

dat %>% 
  pivot_longer(-c(year, group), names_to="metric") %>% 
  pivot_wider(names_from=year, values_from=value)
# A tibble: 4 × 4
  group metric         `2018` `2019`
  <chr> <chr>           <dbl>  <dbl>
1 A     headcount       100    150  
2 A     representation    0.1    0.2
3 B     headcount       150    200  
4 B     representation    0.3    0.4

Data

dat <- structure(list(year = c("2018", "2019", "2018", "2019"), group = c("A", 
"A", "B", "B"), headcount = c(100, 150, 150, 200), representation = c(0.1, 
0.2, 0.3, 0.4)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-4L))
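
To see why the data as posted in the question produces list columns, it can help to count the rows per year/group pair before pivoting. The following is only a minimal sketch on the question's original `df`; the name `dupes` is introduced here for illustration and is not part of the original answer.

```r
library(dplyr)

# The data exactly as posted in the question (not the corrected `dat`)
df <- data.frame(
  year = c(2018, 2018, 2019, 2019),
  group = c("A", "A", "B", "B"),
  metric = c(100, 150, 150, 200),
  metric_percent = c(0.1, 0.2, 0.3, 0.4))

# Count rows per year/group pair; any n > 1 means pivot_wider()
# has more than one value for a single output cell
dupes <- df |>
  count(year, group) |>
  filter(n > 1)

dupes
```

Any rows returned here are combinations for which `pivot_wider()` cannot pick a single value, which is why it falls back to list columns.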

Answer 2 (rbl8hiat)

Assuming there is a typo in your example data, this is one way to do it:

library(dplyr)  # mutate() and case_match(); case_match() needs dplyr >= 1.1.0
library(tidyr)
df <- data.frame(
  year = c(2018, 2019, 2018, 2019), 
  group = c("A", "A", "B", "B"), 
  metric = c(100, 150, 150, 200), 
  metric_percent = c(0.1, 0.2,0.3, 0.4))

df |>
  pivot_longer(starts_with("metric"), names_to = "metric") |>
  mutate(
    metric = case_match(
      metric,
      "metric" ~ "headcount",
      "metric_percent" ~ "representation"
    )
  ) |>
  pivot_wider(names_from = year, values_from = value)
# # A tibble: 4 × 4
#   group metric         `2018` `2019`
#   <chr> <chr>           <dbl>  <dbl>
# 1 A     headcount       100    150  
# 2 A     representation    0.1    0.2
# 3 B     headcount       150    200  
# 4 B     representation    0.3    0.4
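
If `case_match()` is not available (it needs a recent dplyr), one possible variation is to rename the columns before pivoting. This is a sketch under the same assumption about the typo in the data, not the answerer's code; the name `wide` is hypothetical.

```r
library(dplyr)
library(tidyr)

# Same corrected data as above: one row per year and group
df <- data.frame(
  year = c(2018, 2019, 2018, 2019),
  group = c("A", "A", "B", "B"),
  metric = c(100, 150, 150, 200),
  metric_percent = c(0.1, 0.2, 0.3, 0.4))

wide <- df |>
  # Give the metric columns their display names up front
  rename(headcount = metric, representation = metric_percent) |>
  # Stack the metric columns into metric/value pairs
  pivot_longer(c(headcount, representation), names_to = "metric") |>
  # Spread the years back out into columns
  pivot_wider(names_from = year, values_from = value)

wide
```

Renaming first keeps the reshaping steps free of recoding logic; with many metrics, a named vector passed to `rename()` via `all_of()` would likely scale better.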
