I have a data frame in R with two fields, name and fruit. How can I get a list of the names that have more than one kind of fruit?

Posted by 62lalag4 on 2023-03-27 in Other

Here is the R code that defines the data:

library(tidyverse)

name <- c("John","John","John","Peter","Paul","Paul", "Steven", "Steven")
fruit <- c("Apple","Apple","Banana","Banana","Cherry","Cherry","Apple","Banana")
grunnur <- data.frame(name,fruit)
grunnur$name <- factor(grunnur$name)
grunnur$fruit <- factor(grunnur$fruit)

I know that John and Steven have tried two or more kinds of fruit, but how can I list their names with code?

oyxsuwqo1#

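Here is a dplyr approach:
1. First, group by name (the code below does this with the .by argument of summarise, which requires dplyr 1.1.0 or later).
2. Use n_distinct to count the number of distinct fruit values in each group.
3. Use filter to keep only the names with more than one distinct fruit: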
library(dplyr)

grunnur %>%
  summarise(n_fruits = n_distinct(fruit), .by=name) %>%
  filter(n_fruits > 1) %>%
  select(name)
    name
1   John
2 Steven
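
The three steps above can also be written with an explicit group_by, which may help on dplyr versions older than 1.1.0 where .by is not available. This is a minimal sketch rather than the answer author's code; the variable name multi_fruit is made up for illustration, and name is kept as a character column here so the result is a plain character vector.

```r
library(dplyr)

# Same data as in the question, with name and fruit left as character columns
name <- c("John", "John", "John", "Peter", "Paul", "Paul", "Steven", "Steven")
fruit <- c("Apple", "Apple", "Banana", "Banana", "Cherry", "Cherry", "Apple", "Banana")
grunnur <- data.frame(name, fruit)

# multi_fruit is a hypothetical name for the result
multi_fruit <- grunnur %>%
  group_by(name) %>%                                            # step 1: group by name
  summarise(n_fruits = n_distinct(fruit), .groups = "drop") %>% # step 2: count distinct fruits
  filter(n_fruits > 1) %>%                                      # step 3: keep names with more than one
  pull(name)                                                    # return a vector instead of a data frame

multi_fruit
```

Using pull(name) instead of select(name) returns the names as a vector, which is closer to the "list of names" the question asks for.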

mwg9r5ms2#

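You can use filter with n_distinct to check whether a group has more than one unique value. If you want a list containing Steven and John, you can use pull, unique, and split to extract the unique names of those groups: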

library(tidyverse) 
grunnur %>%
  group_by(name) %>%
  filter(n_distinct(fruit) > 1)
#> # A tibble: 5 × 2
#> # Groups:   name [2]
#>   name   fruit 
#>   <chr>  <chr> 
#> 1 John   Apple 
#> 2 John   Apple 
#> 3 John   Banana
#> 4 Steven Apple 
#> 5 Steven Banana

grunnur %>%
  group_by(name) %>%
  filter(n_distinct(fruit) > 1) %>%
  pull(name) %>%
  unique() %>%
  split(., .)
#> $John
#> [1] "John"
#> 
#> $Steven
#> [1] "Steven"
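
The output above shows name as a character column (<chr>), whereas the question converts it to a factor. With a factor, unique() keeps all levels, so split(., .) would likely also return empty entries for Paul and Peter. One way around this, sketched here as an assumption rather than as part of the original answer, is to convert to character before splitting; names_list is a made-up variable name.

```r
library(dplyr)

# Data as in the question, including the conversion to factors
name <- c("John", "John", "John", "Peter", "Paul", "Paul", "Steven", "Steven")
fruit <- c("Apple", "Apple", "Banana", "Banana", "Cherry", "Cherry", "Apple", "Banana")
grunnur <- data.frame(name = factor(name), fruit = factor(fruit))

# names_list is a hypothetical name for the result
names_list <- grunnur %>%
  group_by(name) %>%
  filter(n_distinct(fruit) > 1) %>%  # keep rows of names with more than one distinct fruit
  pull(name) %>%
  unique() %>%
  as.character() %>%                 # drop the factor so unused levels do not become empty groups
  split(., .)

names_list
```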

Created on 2023-03-24 with reprex v2.0.2


lmyy7pcs3#

Using base R:

subset(grunnur,  name %in% names(Filter(\(x) length(x) > 1, 
  with(unique(grunnur[c("fruit", "name")]), split(fruit, name)))))
Output:
    name  fruit
1   John  Apple
2   John  Apple
3   John Banana
7 Steven  Apple
8 Steven Banana
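
The subset call above returns the matching rows. If only the names are wanted, as the question asks, a shorter base R sketch could count the distinct fruits per name with tapply. This is an alternative under that assumption, not the answer author's code, and n_fruits is a made-up variable name.

```r
# Data as in the question (character columns are enough for this sketch)
name <- c("John", "John", "John", "Peter", "Paul", "Paul", "Steven", "Steven")
fruit <- c("Apple", "Apple", "Banana", "Banana", "Cherry", "Cherry", "Apple", "Banana")
grunnur <- data.frame(name, fruit)

# Number of distinct fruits per name; n_fruits is a hypothetical name
n_fruits <- tapply(grunnur$fruit, grunnur$name, function(x) length(unique(x)))

# Names with more than one distinct fruit
multi_fruit_names <- names(n_fruits[n_fruits > 1])
multi_fruit_names
```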
