How to create a new column from the names of the columns whose value equals 1 in R

sg2wtvxw  posted on 2024-01-03  in Other

I have a data frame like this:



library(tibble)  # provides tribble()

df <- tribble(
~pro1,~pro2,~pro3,~pro4,~pro5,~pro6,~pro7,~pro8,~pro9,~count,
0,1,0,1,0,0,0,0,0,137,
1,1,0,0,0,0,0,0,0,109,
0,1,0,0,0,1,0,0,0,104,
0,0,0,1,1,0,0,0,0,79,
1,1,0,1,0,0,0,0,0,57,
0,1,1,0,0,0,0,0,0,57,
0,1,1,0,0,1,0,0,0,54,
0,1,0,0,0,0,1,0,0,45,
0,1,0,0,1,0,0,0,0,30,
1,1,0,1,1,0,0,0,0,28,
1,0,0,1,0,0,0,0,0,25,
0,1,0,1,1,0,0,0,0,23,
0,1,0,1,0,0,1,0,0,19,
0,1,0,1,0,1,0,0,0,18,
0,0,1,0,0,1,0,0,0,18,
1,1,0,0,1,0,0,0,0,17,
0,1,1,1,1,1,1,1,0,15,
0,0,0,1,0,0,1,0,0,15,
0,1,1,1,0,1,0,0,0,14,
0,0,0,0,0,0,1,1,0,14)

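Basically, it is a summary of products that were purchased together. For example, the first row means there are 137 cases in which pro2 and pro4 were purchased together. What is the best way to visualize this table? I am thinking of adding a new column called "ID" that concatenates the names of the columns whose value is 1. So for the first row, the ID should be "pro2-pro4". But I can't figure out a way to create such an ID.
Essentially, I just want to show which products are purchased together and the count for each combination.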

Answer #1 (inn6fuwd)

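For each column, you can replace the value 1 with the column's name and the value 0 with NA, then combine all the product columns into a single column by pasting the strings together.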

library(tidyverse)

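df %>%
  # Replace each 1 with the column's own name and every other value with NA
  mutate(across(pro1:pro9, ~ if_else(.x == 1, cur_column(), NA_character_))) %>%
  # Paste the nine columns into one, skipping the NAs
  unite(pro, pro1:pro9, sep = "-", na.rm = TRUE)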

# # A tibble: 20 × 2
#    pro                                count
#    <chr>                              <dbl>
#  1 pro2-pro4                            137
#  2 pro1-pro2                            109
#  3 pro2-pro6                            104
#  4 pro4-pro5                             79
#  5 pro1-pro2-pro4                        57
#  6 pro2-pro3                             57
#  7 pro2-pro3-pro6                        54
#  8 pro2-pro7                             45
#  9 pro2-pro5                             30
# 10 pro1-pro2-pro4-pro5                   28
# 11 pro1-pro4                             25
# 12 pro2-pro4-pro5                        23
# 13 pro2-pro4-pro7                        19
# 14 pro2-pro4-pro6                        18
# 15 pro3-pro6                             18
# 16 pro1-pro2-pro5                        17
# 17 pro2-pro3-pro4-pro5-pro6-pro7-pro8    15
# 18 pro4-pro7                             15
# 19 pro2-pro3-pro4-pro6                   14
# 20 pro7-pro8                             14
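
Since the question also asks how to visualize the combinations, here is a minimal sketch of one option, a horizontal bar chart with ggplot2. The `combos` tibble is an illustrative subset (the first five rows of the output above), and the names `combos`, `plot_data`, and the axis labels are assumptions made for this example, not part of the original answer.

```r
library(dplyr)
library(ggplot2)
library(tibble)

# Illustrative subset of the combined result shown above (first five rows)
combos <- tribble(
  ~pro,             ~count,
  "pro2-pro4",      137,
  "pro1-pro2",      109,
  "pro2-pro6",      104,
  "pro4-pro5",       79,
  "pro1-pro2-pro4",  57
)

# Turn the labels into a factor ordered by count, so the bars are sorted
# and the most frequent combination ends up at the top of the chart
plot_data <- combos %>%
  mutate(pro = reorder(pro, count))

ggplot(plot_data, aes(x = count, y = pro)) +
  geom_col() +
  labs(x = "Number of purchases", y = "Product combination")
```

With all 20 rows the same code should work unchanged; a sorted bar chart is only one reasonable choice for this kind of data.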


Answer #2 (rsaldnfx)

A base R equivalent using apply():

df$ID <- apply(df[, 1:9], 1, function(row) {
  paste(names(df)[1:9][row == 1], collapse = "-")
})

  pro1  pro2  pro3  pro4  pro5  pro6  pro7  pro8  pro9 count ID                                
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>                             
 1     0     1     0     1     0     0     0     0     0   137 pro2-pro4                         
 2     1     1     0     0     0     0     0     0     0   109 pro1-pro2                         
 3     0     1     0     0     0     1     0     0     0   104 pro2-pro6                         
 4     0     0     0     1     1     0     0     0     0    79 pro4-pro5                         
 5     1     1     0     1     0     0     0     0     0    57 pro1-pro2-pro4                    
 6     0     1     1     0     0     0     0     0     0    57 pro2-pro3                         
 7     0     1     1     0     0     1     0     0     0    54 pro2-pro3-pro6                    
 8     0     1     0     0     0     0     1     0     0    45 pro2-pro7                         
 9     0     1     0     0     1     0     0     0     0    30 pro2-pro5                         
10     1     1     0     1     1     0     0     0     0    28 pro1-pro2-pro4-pro5               
11     1     0     0     1     0     0     0     0     0    25 pro1-pro4                         
12     0     1     0     1     1     0     0     0     0    23 pro2-pro4-pro5                    
13     0     1     0     1     0     0     1     0     0    19 pro2-pro4-pro7                    
14     0     1     0     1     0     1     0     0     0    18 pro2-pro4-pro6                    
15     0     0     1     0     0     1     0     0     0    18 pro3-pro6                         
16     1     1     0     0     1     0     0     0     0    17 pro1-pro2-pro5                    
17     0     1     1     1     1     1     1     1     0    15 pro2-pro3-pro4-pro5-pro6-pro7-pro8
18     0     0     0     1     0     0     1     0     0    15 pro4-pro7                         
19     0     1     1     1     0     1     0     0     0    14 pro2-pro3-pro4-pro6               
20     0     0     0     0     0     0     1     1     0    14 pro7-pro8
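
To make the logical indexing inside the `apply()` call easier to follow, the sketch below walks through a single row by hand. The variable names `prod_names`, `row`, `id`, and `empty_id` are hypothetical and exist only for this illustration. It also shows an edge case that does not occur in the question's data: a row with no 1s yields an empty string.

```r
# Names of the nine product indicator columns, as in the question
prod_names <- paste0("pro", 1:9)

# Values of the first row of the question's data (pro2 and pro4 are set)
row <- c(0, 1, 0, 1, 0, 0, 0, 0, 0)

# row == 1 is a logical vector; indexing with it keeps only the matching names
selected <- prod_names[row == 1]

# Collapse the selected names into one string
id <- paste(selected, collapse = "-")
id  # "pro2-pro4"

# Edge case: a row without any 1s selects nothing and collapses to ""
empty_id <- paste(prod_names[rep(0, 9) == 1], collapse = "-")
empty_id  # ""
```

If empty combinations can appear in real data, it may be worth replacing `""` with `NA` or a label such as "none" before plotting.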
