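In a data frame of differential expression values, I want to count the significantly up- and down-regulated genes per group. Significance should be defined by FDR (false discovery rate = Benjamini-adjusted p-value) and by fold change. The result should be a plot of the up- and down-regulated counts per group. (Sweet bonus: show different FC levels in the plot, e.g. 0.5, 1, 2, 4, >4.) My solution seems too complicated; there must be a simpler way.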
Example data
# create dex df
gene_creator <- paste("gene",1:1000,sep="")
genes = sample(gene_creator,100)
dex_A <- data.frame(
gene = genes,
group = "group_A",
logFC = sample(c(-5:5), replace=T, size=100),
FDR = sample(c(0.01,1), replace=T, size=100)
)
dex_B <- data.frame(
gene = genes,
group = "group_B",
logFC = sample(c(-5:5), replace=T, size=100),
FDR = sample(c(0.01,1), replace=T, size=100)
)
dex_C <- data.frame(
gene = genes,
group = "group_C",
logFC = sample(c(-5:5), replace=T, size=100),
FDR = sample(c(0.01,1), replace=T, size=100)
)
dex_D <- data.frame(
gene = genes,
group = "group_D",
logFC = sample(c(-5:5), replace=T, size=100),
FDR = sample(c(0.01,1), replace=T, size=100)
)
dex_df <- rbind(dex_A, dex_B, dex_C, dex_D)
Solution
library("tidyverse")
# FC up
dex_up <- dex_df %>%
group_by(group) %>%
filter(FDR <= 0.05) %>%
filter(logFC > 0.5 ) %>%
summarise(n_up = n())
# Fc down
dex_down <- dex_df %>%
group_by(group) %>%
filter(FDR <= 0.05) %>%
filter(logFC < -0.5 ) %>%  # down-regulated: threshold must be negative
summarise(n_down = n())
# format
dex_comb <- left_join(dex_up, dex_down, by = c("group"))
dex_comb$n_down <- dex_comb$n_down * -1
dex_comb_long <- dex_comb %>% pivot_longer(!group, names_to = "direction", values_to = "n")
# plot
dex_comb_long %>%
ggplot(aes(x = group, y = n, fill = direction)) +
geom_bar(stat="identity", position="identity") +
geom_text(aes(label=n, vjust = -sign(n))) +
ggtitle("Dex numbers per group")
1 Answer
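A common way to count how many times a condition is met is to sum() the condition: each TRUE counts as 1, so the sum of a logical vector is the number of matching rows. Here is simplified example data: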
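The answer's own code did not survive on this page, so the following is only a minimal sketch of the sum() idea, not the answerer's original. The simplified data frame, the seed, and the name dex_counts are assumptions made for illustration; the thresholds (FDR <= 0.05, |logFC| > 0.5) are taken from the question.

```r
library(dplyr)

# Hypothetical simplified data (not the answerer's original example)
set.seed(1)
dex_df <- data.frame(
  group = rep(c("group_A", "group_B"), each = 50),
  logFC = sample(-5:5, size = 100, replace = TRUE),
  FDR   = sample(c(0.01, 1), size = 100, replace = TRUE)
)

# A logical vector sums to its number of TRUE values, so both
# counts can be computed in a single summarise() call per group.
dex_counts <- dex_df %>%
  group_by(group) %>%
  summarise(
    n_up   = sum(FDR <= 0.05 & logFC > 0.5),
    n_down = sum(FDR <= 0.05 & logFC < -0.5)
  )

dex_counts
```

This replaces the two separate filter/summarise pipelines and the left_join from the question, and a group with no significant genes still appears with a count of 0 instead of being dropped.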
I suggest including the logFC values in the plot:
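The plotting code is likewise missing here, so what follows is one possible way to do it rather than the answerer's plot. It assumes the FC levels from the question (0.5, 1, 2, 4, >4) are applied to |logFC| with cut(); the bin labels, the column names fc_level and n_signed, and the simulated data are all made up for this sketch.

```r
library(dplyr)
library(ggplot2)

# Hypothetical data in the same shape as the question's dex_df
set.seed(1)
dex_df <- data.frame(
  group = rep(paste0("group_", c("A", "B", "C", "D")), each = 100),
  logFC = sample(-5:5, size = 400, replace = TRUE),
  FDR   = sample(c(0.01, 1), size = 400, replace = TRUE)
)

dex_levels <- dex_df %>%
  # keep only significant genes that pass the fold-change threshold
  filter(FDR <= 0.05, abs(logFC) > 0.5) %>%
  mutate(
    direction = ifelse(logFC > 0, "up", "down"),
    # bin |logFC| into (0.5,1], (1,2], (2,4], (4,Inf]
    fc_level = cut(abs(logFC),
                   breaks = c(0.5, 1, 2, 4, Inf),
                   labels = c("0.5-1", "1-2", "2-4", ">4"))
  ) %>%
  count(group, direction, fc_level) %>%
  # draw down-regulated counts below zero
  mutate(n_signed = ifelse(direction == "down", -n, n))

p <- ggplot(dex_levels, aes(x = group, y = n_signed, fill = fc_level)) +
  geom_col() +
  geom_hline(yintercept = 0) +
  labs(title = "Dex numbers per group",
       y = "number of genes (down < 0 < up)",
       fill = "|logFC|")
p
```

geom_col() stacks positive and negative values on opposite sides of zero, so each bar shows the up-regulated genes above the axis and the down-regulated genes below it, split by fold-change level.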