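I have run a series of pairwise chi-squared comparisons between species and have a data frame containing the p-value for each species pair. I would like to use ggplot to create a visualization similar to a correlation matrix, so that we can see which species pairs have significant p-values: one colour for p-values > 0.05 and another colour for p-values <= 0.05, similar to the figure here: (https://www.researchgate.net/publication/293654442_cooccur_Probabilistic_Species_Co-Occurrence_Analysis_in_R)
I have tried following these guides: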
http://www.sthda.com/english/wiki/ggplot2-quick-correlation-matrix-heatmap-r-software-and-data-visualization
https://www.youtube.com/watch?v=E3De2A73ako
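without much success. I am having trouble with 1) turning it into a triangle and 2) making the colours binary rather than a gradient. I would also prefer to keep the species in the order they have in my data frame. Any suggestions would be amazing. I have been driving myself crazy with this one.
I have attached the dput output of the data frame in both wide and long format!
Thank you so much!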
Wide.df:
structure(list(Species1 = c(NA, 8.29661485364936e-14, 0.197328035330918,
3.73447143215061e-05, 0.0200245227839388, 0.000158518795659732,
0.999999999999999, 0.999999999999999, 0.200661174254572, 0.0259711510883781,
0.624415639985824, 0.999999999999995, 0.999999999999991), Species2 = c(8.29661485364936e-14,
NA, 0.911548596172133, 0.0808672833723648, 0.00232408080140171,
0.00257625010714883, 0.999999999999777, 0.999999999999777, 0.789829339795786,
0.198077829941541, 0.768466049890891, 0.999999999999885, 0.999999999999994
), Species3 = c(0.197328035330918, 0.911548596172133, NA, 0.181696316252709,
0.575238254258972, 0.939866701307512, 0.999999999999999, 0.999999999999999,
0.999999999999999, 0.507611967012476, 0.673527222144056, 0.999999999999996,
1), Species4 = c(3.73447143215061e-05, 0.0808672833723648, 0.181696316252709,
NA, 1.40996305374498e-10, 0.405410680293625, 0.999999999999999,
0.999999999999999, 1.56797538624063e-08, 8.50447159522988e-05,
0.00644405295214749, 0.88381428087806, 0.99999999999993), Species5 = c(0.0200245227839388,
0.00232408080140171, 0.575238254258972, 1.40996305374498e-10,
NA, 0.622095128306733, 0.999999999999959, 0.999999999999959,
0.000115476294641169, 1.60499061530966e-19, 4.67520836455185e-05,
0.528580893876124, 0.795044191844885), Species6 = c(0.000158518795659732,
0.00257625010714883, 0.939866701307512, 0.405410680293625, 0.622095128306733,
NA, 0.999999999999999, 0.999999999999999, 0.200661174254572,
0.410636112084533, 0.999999999999929, 0.999999999999995, 0.999999999999991
), Species7 = c(0.999999999999999, 0.999999999999777, 0.999999999999999,
0.999999999999999, 0.999999999999959, 0.999999999999999, NA,
0.999999999999299, 0.779426832974571, 0.999999999999933, 0.999999999999999,
0.999999999999998, 0.999999999999687), Species8 = c(0.999999999999999,
0.999999999999777, 0.999999999999999, 0.999999999999999, 0.999999999999959,
0.999999999999999, 0.999999999999299, NA, 0.999999999999999,
0.611136265859179, 0.999999999999999, 0.999999999999998, 0.999999999999687
), Species9 = c(0.200661174254572, 0.789829339795786, 0.999999999999999,
1.56797538624063e-08, 0.000115476294641169, 0.200661174254572,
0.779426832974571, 0.999999999999999, NA, 0.0311037604732729,
0.0122054515551129, 0.999999999999984, 0.999999999999999), Species10 = c(0.0259711510883781,
0.198077829941541, 0.507611967012476, 8.50447159522988e-05, 1.60499061530966e-19,
0.410636112084533, 0.999999999999933, 0.611136265859179, 0.0311037604732729,
NA, 0.0403275386741277, 0.508244635418544, 0.999999999999999),
Species11 = c(0.624415639985824, 0.768466049890891, 0.673527222144056,
0.00644405295214749, 4.67520836455185e-05, 0.999999999999929,
0.999999999999999, 0.999999999999999, 0.0122054515551129,
0.0403275386741277, NA, 1, 0.999999999999823), Species12 = c(0.999999999999995,
0.999999999999885, 0.999999999999996, 0.88381428087806, 0.528580893876124,
0.999999999999995, 0.999999999999998, 0.999999999999998,
0.999999999999984, 0.508244635418544, 1, NA, 0.999999999998991
), Species13 = c(0.999999999999991, 0.999999999999994, 1,
0.99999999999993, 0.795044191844885, 0.999999999999991, 0.999999999999687,
0.999999999999687, 0.999999999999999, 0.999999999999999,
0.999999999999823, 0.999999999998991, NA)), row.names = c("Species1",
"Species2", "Species3", "Species4", "Species5", "Species6", "Species7",
"Species8", "Species9", "Species10", "Species11", "Species12",
"Species13"), class = "data.frame")
long.df:
structure(list(SpeciesA = c("Species1", "Species2", "Species3",
"Species4", "Species5", "Species6", "Species7", "Species8", "Species9",
"Species10", "Species11", "Species12", "Species13", "Species1",
"Species2", "Species3", "Species4", "Species5", "Species6", "Species7",
"Species8", "Species9", "Species10", "Species11", "Species12",
"Species13", "Species1", "Species2", "Species3", "Species4",
"Species5", "Species6", "Species7", "Species8", "Species9", "Species10",
"Species11", "Species12", "Species13", "Species1", "Species2",
"Species3", "Species4", "Species5", "Species6", "Species7", "Species8",
"Species9", "Species10", "Species11", "Species12", "Species13",
"Species1", "Species2", "Species3", "Species4", "Species5", "Species6",
"Species7", "Species8", "Species9", "Species10", "Species11",
"Species12", "Species13", "Species1", "Species2", "Species3",
"Species4", "Species5", "Species6", "Species7", "Species8", "Species9",
"Species10", "Species11", "Species12", "Species13", "Species1",
"Species2", "Species3", "Species4", "Species5", "Species6", "Species7",
"Species8", "Species9", "Species10", "Species11", "Species12",
"Species13", "Species1", "Species2", "Species3", "Species4",
"Species5", "Species6", "Species7", "Species8", "Species9", "Species10",
"Species11", "Species12", "Species13", "Species1", "Species2",
"Species3", "Species4", "Species5", "Species6", "Species7", "Species8",
"Species9", "Species10", "Species11", "Species12", "Species13",
"Species1", "Species2", "Species3", "Species4", "Species5", "Species6",
"Species7", "Species8", "Species9", "Species10", "Species11",
"Species12", "Species13", "Species1", "Species2", "Species3",
"Species4", "Species5", "Species6", "Species7", "Species8", "Species9",
"Species10", "Species11", "Species12", "Species13", "Species1",
"Species2", "Species3", "Species4", "Species5", "Species6", "Species7",
"Species8", "Species9", "Species10", "Species11", "Species12",
"Species13", "Species1", "Species2", "Species3", "Species4",
"Species5", "Species6", "Species7", "Species8", "Species9", "Species10",
"Species11", "Species12", "Species13"), SpeciesB = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L,
12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L,
13L, 13L, 13L, 13L), .Label = c("Species1", "Species2", "Species3",
"Species4", "Species5", "Species6", "Species7", "Species8", "Species9",
"Species10", "Species11", "Species12", "Species13"), class = "factor"),
p_value = c(NA, 8.29661485364936e-14, 0.197328035330918,
3.73447143215061e-05, 0.0200245227839388, 0.000158518795659732,
0.999999999999999, 0.999999999999999, 0.200661174254572,
0.0259711510883781, 0.624415639985824, 0.999999999999995,
0.999999999999991, 8.29661485364936e-14, NA, 0.911548596172133,
0.0808672833723648, 0.00232408080140171, 0.00257625010714883,
0.999999999999777, 0.999999999999777, 0.789829339795786,
0.198077829941541, 0.768466049890891, 0.999999999999885,
0.999999999999994, 0.197328035330918, 0.911548596172133,
NA, 0.181696316252709, 0.575238254258972, 0.939866701307512,
0.999999999999999, 0.999999999999999, 0.999999999999999,
0.507611967012476, 0.673527222144056, 0.999999999999996,
1, 3.73447143215061e-05, 0.0808672833723648, 0.181696316252709,
NA, 1.40996305374498e-10, 0.405410680293625, 0.999999999999999,
0.999999999999999, 1.56797538624063e-08, 8.50447159522988e-05,
0.00644405295214749, 0.88381428087806, 0.99999999999993,
0.0200245227839388, 0.00232408080140171, 0.575238254258972,
1.40996305374498e-10, NA, 0.622095128306733, 0.999999999999959,
0.999999999999959, 0.000115476294641169, 1.60499061530966e-19,
4.67520836455185e-05, 0.528580893876124, 0.795044191844885,
0.000158518795659732, 0.00257625010714883, 0.939866701307512,
0.405410680293625, 0.622095128306733, NA, 0.999999999999999,
0.999999999999999, 0.200661174254572, 0.410636112084533,
0.999999999999929, 0.999999999999995, 0.999999999999991,
0.999999999999999, 0.999999999999777, 0.999999999999999,
0.999999999999999, 0.999999999999959, 0.999999999999999,
NA, 0.999999999999299, 0.779426832974571, 0.999999999999933,
0.999999999999999, 0.999999999999998, 0.999999999999687,
0.999999999999999, 0.999999999999777, 0.999999999999999,
0.999999999999999, 0.999999999999959, 0.999999999999999,
0.999999999999299, NA, 0.999999999999999, 0.611136265859179,
0.999999999999999, 0.999999999999998, 0.999999999999687,
0.200661174254572, 0.789829339795786, 0.999999999999999,
1.56797538624063e-08, 0.000115476294641169, 0.200661174254572,
0.779426832974571, 0.999999999999999, NA, 0.0311037604732729,
0.0122054515551129, 0.999999999999984, 0.999999999999999,
0.0259711510883781, 0.198077829941541, 0.507611967012476,
8.50447159522988e-05, 1.60499061530966e-19, 0.410636112084533,
0.999999999999933, 0.611136265859179, 0.0311037604732729,
NA, 0.0403275386741277, 0.508244635418544, 0.999999999999999,
0.624415639985824, 0.768466049890891, 0.673527222144056,
0.00644405295214749, 4.67520836455185e-05, 0.999999999999929,
0.999999999999999, 0.999999999999999, 0.0122054515551129,
0.0403275386741277, NA, 1, 0.999999999999823, 0.999999999999995,
0.999999999999885, 0.999999999999996, 0.88381428087806, 0.528580893876124,
0.999999999999995, 0.999999999999998, 0.999999999999998,
0.999999999999984, 0.508244635418544, 1, NA, 0.999999999998991,
0.999999999999991, 0.999999999999994, 1, 0.99999999999993,
0.795044191844885, 0.999999999999991, 0.999999999999687,
0.999999999999687, 0.999999999999999, 0.999999999999999,
0.999999999999823, 0.999999999998991, NA)), row.names = c(NA,
-169L), class = "data.frame")
2 answers
Answer 1
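Here is a rough start. The basic plot is a simple binned heatmap, but there are several stages we need to go through:
- `cut` to bin the data; I arbitrarily binned them at 0, 0.25, 0.5 and 1, so feel free to change the numbers to suit your needs;
- the species names need to be `factor`s, and to order them correctly we need to parse out the number, hence `as.integer(gsub(..))` first and `factor` afterwards;
- for the triangle, keep only the pairs where the x-axis `num*` is greater than the y-axis value; this is arbitrary, but it matches your sample plot;
- for the diagonal labels, keep `numA == numB` in the data passed to `ggplot(.)`, subset the diagonal out for the tiles, and then keep *only* the diagonal for the labels (`geom_text`);
- `expand=` both axes to make room for the labels; the actual amount may depend on the canvas (PDF, HTML, etc.).
I used base-R (R-4) transform/subset, which is easy to convert to dplyr's `mutate`/`filter`.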
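The steps above can be sketched roughly as follows. This is not the answerer's original code (which did not survive on this page) but a minimal illustration under stated assumptions: the data is a small made-up stand-in with the same columns as long.df (`SpeciesA`, `SpeciesB`, `p_value`), the breaks are set to 0, 0.05 and 1 to get the two-colour version the question asks for rather than the 0, 0.25, 0.5, 1 bins mentioned above, and the colours, label size and `expansion()` amounts are arbitrary choices.

```r
library(ggplot2)

# Toy stand-in for long.df (hypothetical values): every ordered pair of
# 5 species, NA on the diagonal, symmetric made-up p-values.
set.seed(1)
sp <- paste0("Species", 1:5)
pmat <- matrix(NA_real_, 5, 5, dimnames = list(sp, sp))
pmat[lower.tri(pmat)] <- runif(10)^3
pmat[upper.tri(pmat)] <- t(pmat)[upper.tri(pmat)]
long <- data.frame(
  SpeciesA = rep(sp, times = 5),   # row names cycle fastest
  SpeciesB = rep(sp, each = 5),    # column names change in blocks
  p_value  = as.vector(pmat)
)

# Parse out the numeric suffix so that Species10 sorts after Species9.
long$numA <- as.integer(gsub("\\D", "", long$SpeciesA))
long$numB <- as.integer(gsub("\\D", "", long$SpeciesB))

# Bin the p-values; two bins give a binary fill instead of a gradient.
long$bin <- cut(long$p_value, breaks = c(0, 0.05, 1), include.lowest = TRUE,
                labels = c("p <= 0.05", "p > 0.05"))

# Turn the names into factors ordered by number (y reversed, top to bottom).
lev <- unique(long$SpeciesA[order(long$numA)])
long$SpeciesA <- factor(long$SpeciesA, levels = lev)
long$SpeciesB <- factor(long$SpeciesB, levels = rev(lev))

# One triangle for the tiles, only the diagonal for the labels.
tiles  <- subset(long, numA > numB)
labels <- subset(long, numA == numB)

p <- ggplot(tiles, aes(SpeciesA, SpeciesB)) +
  geom_tile(aes(fill = bin), colour = "white") +
  geom_text(data = labels, aes(label = SpeciesA), hjust = 1, size = 3) +
  scale_fill_manual(values = c("p <= 0.05" = "firebrick", "p > 0.05" = "grey80"),
                    drop = FALSE, name = NULL) +
  # Extra room on the left so the right-aligned diagonal labels are not clipped.
  scale_x_discrete(drop = FALSE, expand = expansion(add = c(2, 0.6))) +
  scale_y_discrete(drop = FALSE, expand = expansion(add = 0.6)) +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "bottom")
p
```

To try it on the real data, the toy `long` would be replaced by long.df; passing `breaks = c(0, 0.25, 0.5, 1)` without `labels` (and a matching set of fill colours) gives the binned version described in the list instead of the binary one.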
Answer 2
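Solved with the help of r2evans! The code I ended up using is below!
This produces the following plot:
Thanks everyone!
The dput output of the data frame is shown below: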