R: Adding breaks to the x-axis in ggplot2

kwvwclae, posted 2023-11-14 in Other

I am plotting coverage against position, and I have a very simple piece of code, shown below:

library(ggplot2)

# Simulated data: four short covered regions separated by long zero-coverage stretches
positions <- 1:200000
coverage <- rep(0, length(positions))
coverage[1:200] <- 2000
coverage[30001:30100] <- 5000
coverage[50001:50100] <- 500
coverage[170001:170300] <- 500
cov <- data.frame(position = positions, coverage = coverage)

ggplot(data = cov, aes(x = position, y = coverage)) +
  geom_line() +
  xlab("Position") +
  ylab("Coverage") +
  ggtitle("Coverage vs. Position")

This gives me something like this:
[image: coverage vs. position plot]
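The problem is that the regions with high coverage are separated by very long stretches of zero coverage. I would like to shorten those stretches so that the covered regions are visible, for example by cutting the x-axis wherever there are more than 100 consecutive zeros. Is this possible? Thanks in advance!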

5lwkijsr #1

This one is a bit of a doozy!
Before plotting, you need to work out where the runs of consecutive zeros are and how long each one is.

library(tidyverse)
library(ggplot2)
library(ggbreak)

# gs labels each run of identical coverage values; summarise one row per run
breakers <- cov %>%
  mutate(gr = cumsum(coverage == 0),
         gs = cumsum(coverage != lag(coverage, default = 0))) %>%
  group_by(gs, coverage) %>%
  summarise(min_pos = min(position),
            max_pos = max(position),
            min_gr = min(gr),
            max_gr = max(gr),
            .groups = "drop") %>%
  mutate(diff_pos = max_pos - min_pos, diff_gr = max_gr - min_gr) %>%
  filter(coverage == 0)  # keep only the zero-coverage runs
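
As a cross-check on the run detection, the same start and end positions can be found with base R's rle(). This is only a sketch, not part of the original answer: it rebuilds the question's coverage vector so that it runs on its own, and the names min_run and zero_runs are made up for illustration. It also applies the "more than 100 consecutive zeros" threshold from the question, which the pipeline above does not.

```r
# Rebuild the coverage vector from the question so the sketch is self-contained.
coverage <- rep(0, 200000)
coverage[1:200] <- 2000
coverage[30001:30100] <- 5000
coverage[50001:50100] <- 500
coverage[170001:170300] <- 500

# rle() collapses the logical vector into runs of TRUE (zero) / FALSE (non-zero).
runs <- rle(coverage == 0)
run_end <- cumsum(runs$lengths)          # last position of each run
run_start <- run_end - runs$lengths + 1  # first position of each run

# Keep only zero runs longer than the threshold (min_run is a hypothetical name).
min_run <- 100
keep <- runs$values & runs$lengths > min_run
zero_runs <- data.frame(min_pos = run_start[keep],
                        max_pos = run_end[keep],
                        run_length = runs$lengths[keep])
zero_runs
```

For this data, zero_runs has four rows, one per zero-coverage stretch, with the same min_pos and max_pos values as the zero rows of breakers.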

You can then build the plot from the breakers table. To be clear, I could not find a way to do this part programmatically.

ggplot(data = cov, aes(x = position, y = coverage)) +
  geom_line() +
  xlab("Position") +
  ylab("Coverage") +
  ggtitle("Coverage vs. Position") +
  # one scale_x_break() per zero-coverage run
  scale_x_break(breaks = c(breakers$min_pos[1], breakers$max_pos[1]), scales = 'free') +
  scale_x_break(breaks = c(breakers$min_pos[2], breakers$max_pos[2]), scales = 'free') +
  scale_x_break(breaks = c(breakers$min_pos[3], breakers$max_pos[3]), scales = 'free') +
  scale_x_break(breaks = c(breakers$min_pos[4], breakers$max_pos[4]), scales = 'free')


The final plot looks like this:
[image: resulting plot]
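
One possible way to avoid writing each scale_x_break() call by hand is to fold over the gap table with Reduce(), adding one break per row. This is a sketch rather than a tested recipe: it assumes that adding breaks in a loop behaves the same as the hand-written chain above, which may depend on the ggbreak version. The names gaps, base_plot and broken_plot are made up for illustration, and the gaps are found with base R's rle() so that the block runs on its own.

```r
library(ggplot2)
library(ggbreak)

# Data from the question.
cov <- data.frame(position = 1:200000, coverage = 0)
cov$coverage[1:200] <- 2000
cov$coverage[30001:30100] <- 5000
cov$coverage[50001:50100] <- 500
cov$coverage[170001:170300] <- 500

# Zero-coverage runs longer than 100 positions (same idea as the breakers table).
runs <- rle(cov$coverage == 0)
run_end <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1
keep <- runs$values & runs$lengths > 100
gaps <- data.frame(min_pos = run_start[keep], max_pos = run_end[keep])

base_plot <- ggplot(cov, aes(x = position, y = coverage)) +
  geom_line() +
  labs(x = "Position", y = "Coverage", title = "Coverage vs. Position")

# Start from base_plot and add one scale_x_break() per row of gaps.
broken_plot <- Reduce(
  function(p, i) {
    p + scale_x_break(breaks = c(gaps$min_pos[i], gaps$max_pos[i]),
                      scales = "free")
  },
  seq_len(nrow(gaps)),
  base_plot
)
```

Calling print(broken_plot) renders the figure. Because the number of breaks comes from nrow(gaps), the same code should work unchanged when the data has more or fewer zero-coverage stretches.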

dgtucam1 #2

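My approach is to take a moving average over +/- bandwidth positions, start a new "section" each time that average becomes zero or stops being zero, and filter out the boring all-zero sections. That leaves a view of just the regions around the spikes.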

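library(dplyr)
library(ggplot2)

bandwidth <- 200  # half-width of the averaging window, in positions
cov %>%
  # centered moving average over +/- bandwidth positions
  mutate(avg = slider::slide_dbl(coverage, mean,
                                 .before = bandwidth, .after = bandwidth)) %>%
  # new section each time the average switches between zero and non-zero
  mutate(section = cumsum((avg > 0) != lag(avg > 0, 1, default = FALSE))) %>%
  filter(avg != 0) %>%
  ggplot(aes(position, coverage)) +
  geom_line() +
  facet_wrap(~section, scales = "free_x")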
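
To see how the section numbering works, here is a small base R walk-through on a toy vector. It is a sketch under assumptions, not the answer's code: the toy data and the names active and kept are made up, and the moving average is written by hand with truncated windows at both ends, which is intended to mirror slide_dbl() with its default handling of incomplete windows.

```r
# Toy coverage vector: two spikes separated by a run of zeros.
coverage <- c(0, 0, 0, 5, 5, 0, 0, 0, 0, 0, 0, 3, 0, 0)
bandwidth <- 1  # window half-width; the answer uses 200

# Centered moving average over +/- bandwidth positions, truncated at the ends.
n <- length(coverage)
avg <- sapply(seq_len(n), function(i) {
  mean(coverage[max(1, i - bandwidth):min(n, i + bandwidth)])
})

# The section counter increases whenever "average is positive" flips.
active <- avg > 0
section <- cumsum(active != c(FALSE, head(active, -1)))

# Drop the all-zero stretches, keeping the spikes plus a margin of `bandwidth`.
kept <- data.frame(position = seq_len(n), coverage, avg, section)[active, ]
kept
```

Here positions 3 to 6 end up in section 1 and positions 11 to 13 in section 3. The even-numbered sections are the all-zero stretches that the filter removes, so each facet shows one spike with a margin of bandwidth positions on either side.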
