R: Selecting the top and bottom 10% to be plotted in ggplot2

Posted by wz1wpwve on 2023-09-27 in Other

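I would like an easy way to select the top and bottom 10% of a variable's means to plot in ggplot. I have a larger dataset in which the "treatments" were repeated over 2 years. I want to compute the mean of each treatment over the 2 years and plot only the treatments whose means are in the top and bottom 10% of all treatments.
Currently, I create a plot of all treatments for a given variable, find the top and bottom 10%, and use subset() to include only those treatments in the final plot. This is too time-consuming and does not transfer easily to another variable.
I have replicated this with the starwars dataset: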

library(dplyr)    # provides the starwars dataset
library(ggplot2)
ggplot(subset(starwars,homeworld %in% c("Quermia","Kashyyyk","Kalee","Kamino","Aleen Minor","Endor","Vulpter","Malastare")), aes(x=`homeworld`, y=`height`, fill = homeworld)) +
  geom_boxplot(outlier.shape = NA) +
  stat_summary(fun=mean, geom="point", shape=20, size=5, color="red", fill="red") +  # fun replaces the deprecated fun.y
  theme(legend.position = 'none') +
  theme(axis.text.x = element_text(angle = 40,hjust = 1, vjust = 1,face = "bold",
                                   colour = "black", size = rel(0.8)))

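Ideally, I would have a line of code that I could copy and reuse for, in the starwars example, mass instead of height. With my current approach, I have to plot all the homeworlds and then pick the ones I want to include in the final plot.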

r

Source: https://stackoverflow.com/questions/77180824/selecting-top-and-bottom-10-to-be-plotted-in-ggplot2

1 answer

vdgimpew1#

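One way is to write a function that creates the subset for you, so you don't have to do it by hand, like the best_worst() function below. It takes the data, the grouping variable, and the variable whose mean you want to calculate, and it returns the rows belonging to the prop*n groups with the highest means and the prop*n groups with the lowest means (rounded down). You can then use this data in the plot.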

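library(dplyr)
library(ggplot2)
# Keep only the rows of .data whose group is among the floor(n*prop) groups
# with the lowest mean of .vbl or the floor(n*prop) groups with the highest.
best_worst <- function(.data, .group, .vbl, prop = .1, ...){
  # Mean of .vbl per group, sorted ascending; rows with missing .vbl are dropped first
  sum_data <- .data %>% 
    filter(!is.na({{.vbl}})) %>% 
    group_by({{.group}}) %>% 
    summarise(x = mean({{.vbl}})) %>% 
    arrange(x)
  n <- nrow(sum_data)
  n_keep <- floor(n*prop)
  # sum_data is sorted ascending, so the head holds the lowest means
  lowest <- sum_data %>% 
    slice_head(n=n_keep) %>% 
    pull({{.group}})
  # ... and the tail holds the highest means
  highest <- sum_data %>% 
    slice_tail(n=n_keep) %>% 
    pull({{.group}})
  .data %>% filter({{.group}} %in% c(lowest, highest))
}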
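
To check which groups the function would select before plotting, the summary step can be run on its own. The following is only a sketch that repeats the same arithmetic for height by homeworld. The names means, lowest and highest are illustrative and not part of the answer. Note that group_by() treats a missing homeworld as a group of its own, so NA can appear among the selected groups.

```r
library(dplyr)

# Mean height per homeworld, sorted ascending; rows with missing height
# are dropped first, as in the summary step of best_worst().
means <- starwars %>%
  filter(!is.na(height)) %>%
  group_by(homeworld) %>%
  summarise(x = mean(height)) %>%
  arrange(x)

n <- nrow(means)            # number of groups with at least one height value
n_keep <- floor(n * 0.1)    # number of groups taken from each end

lowest  <- slice_head(means, n = n_keep)   # groups with the lowest means
highest <- slice_tail(means, n = n_keep)   # groups with the highest means

n_keep
lowest
highest
```

Because of floor(), fewer than 10 groups with prop = .1 gives n_keep = 0 and therefore an empty subset.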

ggplot(best_worst(starwars, homeworld, height), aes(x=`homeworld`, y=`height`, fill = homeworld)) +
  geom_boxplot(outlier.shape = NA) +
  stat_summary(fun=mean, geom="point", shape=20, size=5, color="red", fill="red") +
  theme(legend.position = 'none') +
  theme(axis.text.x = element_text(angle = 40,hjust = 1, vjust = 1,face = "bold",
                                   colour = "black", size = rel(0.8)))
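
Switching to another variable, as the question asks, should only require changing the third argument. The sketch below uses mass instead of height. So that it runs on its own, it includes a condensed restatement of best_worst(); mass_data and p_mass are illustrative names, not part of the answer.

```r
library(dplyr)
library(ggplot2)

# Condensed restatement of best_worst() so this block is self-contained.
best_worst <- function(.data, .group, .vbl, prop = .1) {
  means <- .data %>%
    filter(!is.na({{ .vbl }})) %>%
    group_by({{ .group }}) %>%
    summarise(x = mean({{ .vbl }})) %>%
    arrange(x)
  n_keep <- floor(nrow(means) * prop)
  # Group labels from both ends of the sorted means
  keep <- bind_rows(slice_head(means, n = n_keep),
                    slice_tail(means, n = n_keep)) %>%
    pull({{ .group }})
  .data %>% filter({{ .group }} %in% keep)
}

# Only the variable name changes compared with the height plot.
mass_data <- best_worst(starwars, homeworld, mass)

p_mass <- ggplot(mass_data, aes(x = homeworld, y = mass, fill = homeworld)) +
  geom_boxplot(outlier.shape = NA) +
  stat_summary(fun = mean, geom = "point", shape = 20, size = 5, color = "red") +
  theme(legend.position = "none",
        axis.text.x = element_text(angle = 40, hjust = 1, vjust = 1, face = "bold",
                                   colour = "black", size = rel(0.8)))
p_mass
```

Rows with missing mass inside the selected homeworlds are still returned by the function, so ggplot2 may warn that it removed non-finite values when drawing.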

Created on 2023-09-26 with reprex v2.0.2

Answered on 2023-09-27
