R: Problem grouping data for a bar chart with multiple variables in ggplot2

yjghlzjz, posted 2022-12-20 in Other

I have a data frame df:

df=data.frame("temp"=c(60.80,46.04,26.96,24.98),"humid"=c(93.79,53.33,50.34,54.65),"wind_speed"=c(40.27,39.12,14.96, 13.81), "date" =c("2013-01-01","2013-01-03","2013-02-01", "2013-02-02"))

df$date <- as.Date(df$date, "%Y-%m-%d")

   temp humid wind_speed       date
1 60.80 93.79      40.27 2013-01-01
2 46.04 53.33      39.12 2013-01-03
3 26.96 50.34      14.96 2013-02-01
4 24.98 54.65      13.81 2013-02-02

I converted it to the long format shown below with this code:

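df_mod <- cbind(df[4], stack(df[1:3]))
# stack() names its columns `values` and `ind`; rename `ind` to match the output below
names(df_mod)[names(df_mod) == "ind"] <- "metric"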

  metric      values          date
  temp        60.80          2013-01-01
  temp        46.04          2013-01-03
  temp        26.96          2013-02-01
  temp        24.98          2013-02-02
  humid       93.79          2013-01-01
  humid       53.33          2013-01-03
  humid       50.34          2013-02-01
  humid       54.65          2013-02-02
  wind_speed  40.27          2013-01-01
  wind_speed  39.12          2013-01-03
  wind_speed  14.96          2013-02-01
  wind_speed  13.81          2013-02-02

Then I extracted the month:

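# month() comes from lubridate; the result is assigned back so df_mod keeps the new column
df_mod <- transform(df_mod, month = month(date, label = TRUE))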

  metric      values         month
  temp        60.80          Jan
  temp        46.04          Jan
  temp        26.96          Feb
  temp        24.98          Feb
...

Now I am trying to build a grouped bar chart from this data.

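I want the height of each bar to be a mean, so I want to group by month and by variable, then take the mean within each month.
I am trying the code below, but it gives me an error.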

df_mod %>%
group_by(metric) %>% 
  summarize(mean= mean(values)) %>% 
ggplot(aes(fill=metric, y=mean, x=month)) + 
  geom_bar(position="dodge", stat="identity") +
  theme_bw()+
  labs(title="Weather metrics", 
       x="", y = "values")

Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error in `f()`:
! Aesthetics must be valid data columns. Problematic aesthetic(s): x = month. 
Did you mistype the name of a data column or forget to add after_stat()?
Run `rlang::last_error()` to see where the error occurred.

I also tried group_by(month, metric), which gives:

`summarise()` has grouped output by '.groups'. You can override using the `.groups` argument.

Can anyone help me with this?

nwnhqdif #1

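The problem is that you have to group_by both metric and month. Otherwise you only compute the mean for each metric, and the month column is dropped from the result. That is why you get the error: with no month column in the data, ggplot2 thinks you want to map the month function onto x.
Note: I also switched to the tidyverse approach for reshaping the data.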

library(tidyr)
library(ggplot2)
library(lubridate)
library(dplyr)

df_mod <- df %>%
  tidyr::pivot_longer(-date, names_to = "metric", values_to = "values") %>%
  mutate(month = month(date, label = TRUE))

df_mod %>%
  group_by(month, metric) %>%
  summarize(mean = mean(values)) %>%
  ggplot(aes(fill = metric, y = mean, x = month)) +
  geom_col(position = "dodge") +
  theme_bw() +
  labs(
    title = "Weather metrics",
    x = "", y = "values"
  )
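
As a quick check of the explanation above, here is a minimal sketch that compares the columns left after each grouping. It rebuilds df from the question; the names by_metric and by_month_metric are only illustrative. It also assumes dplyr 1.0.0 or later, where summarise() accepts a .groups argument. The "`summarise()` has grouped output by ..." text from the question is an informational message, not an error, and passing .groups = "drop" should silence it by returning an ungrouped result.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Same data as in the question
df <- data.frame(
  temp = c(60.80, 46.04, 26.96, 24.98),
  humid = c(93.79, 53.33, 50.34, 54.65),
  wind_speed = c(40.27, 39.12, 14.96, 13.81),
  date = as.Date(c("2013-01-01", "2013-01-03", "2013-02-01", "2013-02-02"))
)

df_mod <- df %>%
  pivot_longer(-date, names_to = "metric", values_to = "values") %>%
  mutate(month = month(date, label = TRUE))

# Grouping by metric only: summarise() keeps the grouping column and the
# new summary column, so month is no longer in the data
by_metric <- df_mod %>%
  group_by(metric) %>%
  summarise(mean = mean(values))
names(by_metric)
# "metric" "mean"

# Grouping by month and metric keeps month; .groups = "drop" returns an
# ungrouped tibble and suppresses the informational message
by_month_metric <- df_mod %>%
  group_by(month, metric) %>%
  summarise(mean = mean(values), .groups = "drop")
names(by_month_metric)
# "month" "metric" "mean"
```

With two months and three metrics, by_month_metric has six rows, one per bar in the dodged chart.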
