My data are not unlike foo below. The observations are ordered in time and are measured for different groups.
library(tidyverse)
n <- 500
r <- 0.001
a <- 5
foo <- data.frame(Time = 1:n) %>%
  mutate(A = a*(1+r)^Time + rnorm(n), B = rnorm(n,mean=a,sd=2)) %>%
  pivot_longer(cols = -Time, names_to = "Group", values_to = "Measurement")
foo %>% ggplot(mapping = aes(x=Time,y=Measurement,color=Group)) +
  geom_line()
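What I want to know is, for a given time period (say Time > 250 in this example), how many measurements in a group are above or below given quantile values computed for that group over the entire dataset. For example, the 10th, 50th, and 90th quantiles calculated using the whole dataset.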
fooQuants <- foo %>% group_by(Group) %>%
  reframe(Measurement=quantile(Measurement, probs = c(0.1,0.5,0.9))) %>%
  add_column(Quant = rep(paste0("Q",c(0.1,0.5,0.9)),2))
# e.g. for group A with Time > 250, the number of times Measurement is at or above the 90th quantile
A90th <- fooQuants %>% filter(Group == "A", Quant == "Q0.9") %>% pull(Measurement)
foo %>% filter(Time > 250, Group == "A") %>% summarise(n=sum(Measurement >= A90th))
# e.g. for group B with Time > 250, the number of times Measurement is at or above the 90th quantile
B90th <- fooQuants %>% filter(Group == "B", Quant == "Q0.9") %>% pull(Measurement)
foo %>% filter(Time > 250, Group == "B") %>% summarise(n=sum(Measurement >= B90th))
I would like to do this for all the groups in foo and for all the different quantile levels in fooQuants.
#foo %>% filter(Time > 250) %>% select(Group,Measurement) %>%
# group_by(Group) %>%
# summarise(`the number of instances where the measurement is gt or lt each quantile in fooQuants`)
Thanks for any tips!
1 Answer
gojuced71#
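A minimal sketch of one way to do this, not necessarily the approach originally posted: compute the per-group quantiles on the whole dataset, join them back onto the filtered rows, and count per Group and Quant. The names probs, Threshold, counts, n_at_or_above and n_below are made up for illustration, and set.seed() is added only so the example is reproducible. It assumes dplyr 1.1.0 or later, which reframe() in the question already requires.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed, only for reproducibility of this sketch
n <- 500
r <- 0.001
a <- 5
foo <- data.frame(Time = 1:n) %>%
  mutate(A = a * (1 + r)^Time + rnorm(n),
         B = rnorm(n, mean = a, sd = 2)) %>%
  pivot_longer(cols = -Time, names_to = "Group", values_to = "Measurement")

probs <- c(0.1, 0.5, 0.9)  # quantile levels of interest

# Per-group quantiles over the whole dataset: one row per Group x Quant.
# "Threshold" is an illustrative column name, not one used in the question.
fooQuants <- foo %>%
  group_by(Group) %>%
  reframe(Quant = paste0("Q", probs),
          Threshold = quantile(Measurement, probs = probs, names = FALSE))

# Pair every measurement in the time window with each quantile of its group,
# then count how many fall at/above and below each threshold.
counts <- foo %>%
  filter(Time > 250) %>%
  inner_join(fooQuants, by = "Group", relationship = "many-to-many") %>%
  group_by(Group, Quant) %>%
  summarise(n_at_or_above = sum(Measurement >= Threshold),
            n_below = sum(Measurement < Threshold),
            .groups = "drop")

counts
```

The join repeats each filtered row once per quantile level, so the two counts in every row of counts sum to the number of observations in the window for that group. Labelling Quant inside reframe() also avoids relying on rep(..., 2) matching the number of groups.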
Result