R: Shading a region of a density plot for a set of data points

Posted by ujv3wf0j on 2024-01-03 in Other

I have the following data points:

res_cost <- c(1553872.0297353, 3987850.7333411, 3149751.04758742, 139690.256184131, 
2659359.58444917, 4424034.10243952, -164716.863161922, 1032117.65430564, 
1012016.7065469, 4846572.29657933, 3226129.15792111, 4017430.43881163, 
3828193.63192272, 3829708.57524592, 1332388.79071537, 2089023.58090538, 
1644671.18495059, 4460304.42245758, 3371828.92991641, 4113191.0440754, 
3113751.73357746, 617900.379054606, 1317373.8428064, 6300890.66407368, 
3296661.03616896, 4118518.49087673, 4542392.60001633, 2195633.29577509, 
3704255.4295885, 3256168.15662825, 3859226.70793027, 4302788.70683497, 
1212610.90730169, 1100156.02674204, 3877506.61645749, 2679633.27812409, 
3873805.93153843, -260211.296112984, 3245236.76979178, 1287142.02860096, 
940739.460417479, 2994090.66052949, 4945187.39388016, 4245739.22159749, 
1834826.91901863, 4007125.98655838, 2441363.58320388, 3927976.3634389, 
3311368.65232602, 7075557.77947983, 3243219.14157882, 2517657.51519752, 
3171624.62320739, 3402860.88835126, 4842785.56583616, 3855876.39565095, 
2405788.12178841, 2880195.38919339, 3290479.8769342, 5214395.40981439, 
4303049.42485616, 2195917.90046817, 3177092.87433431, 2380356.21216434, 
3387837.07527694, 1638340.56836534, 4622169.45155907, 2364584.07782942, 
3739518.62696525, 3297125.04237121, 1406550.84702262, 4524851.84638035, 
5300405.1815232, 2307646.3613227, 2102213.83460057, 2520455.84518903, 
4988206.87073815, 2121162.4699674, 4603996.13556966, 4977903.73829612, 
5327575.83245304, 4454316.67896575, 3115751.54495466, 3802810.69212559, 
-719107.265338242, 879256.548205465, 3757467.72037339, 1397266.77760947, 
2683252.17093566, 2267063.20041564, 3507007.12497479, 2671586.12385416, 
2883476.2559073, 1646404.25714463, 1480966.75076908, 6262630.29895663, 
2270844.80306551, 4490116.75684258, 3300223.17061254, 1470747.71921301
)

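I would like to produce a plot and shade the part of the corresponding density plot where x is negative. A histogram would also be fine. Something like this: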

library(ggplot2)
ggplot(mapping = aes(res_cost)) + geom_density(fill = "skyblue")


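In this case, though, I only want to shade the region from 0 to the left. I would also like the plot to show the percentage of the total area that is shaded. I am rusty on all of this: I know I could use the quantile function, but I do not remember these concepts well.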


I have been following another post:

plt<-ggplot(mapping=aes(res_cost))+geom_density(fill="skyblue")

d <- ggplot_build(plt)$data[[1]]
p <- plt +
  geom_area(data = subset(d, x <= 0), aes(x = x, y = y), fill = "red") +
  geom_segment(x = 0, xend = 0,
               y = 0, yend = approx(x = d$x, y = d$y, xout = 0)$y,
               colour = "blue", linewidth = 3)



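But there are a few things I do not know how to fix. First, how do I get the percentage of the area under the red part of the curve? Second, why can I not see the outline? I want the complete kernel density drawn, but without the extremes. Finally, I need to apply the same idea to a bar chart or histogram, because when I try it this way I get a red polygon.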

Answer #1 (lnvxswe2)

I think the neatest way is to precompute the density and plot it with geom_area:

library(ggplot2)

df <- within(as.data.frame(density(res_cost, n = 1e4)[c("x", "y")]), z <- x < 0)

ggplot(df) +
  geom_area(aes(x, y, fill = z), color = "black") +
  scale_fill_manual(values = c("skyblue", "red3"), guide = "none")

The proportion of the area that lies under the red part of the curve is

sum(df$y[df$z])/sum(df$y)
#> [1] 0.02912582


That is, about 2.9%.
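The ratio of sums works because density() evaluates the curve on an equally spaced grid, so the grid spacing cancels out of the ratio. The sketch below is one possible way to make the integration explicit with the trapezoidal rule and to print the percentage on the plot, which the question also asked for. It is an illustration only: res_cost_demo is simulated stand-in data and trapz() is a hypothetical helper, neither of which appears in the original answer. Substitute the real res_cost vector to reproduce the figure.

```r
library(ggplot2)

# Stand-in data (assumption): replace with the real res_cost vector
set.seed(1)
res_cost_demo <- rnorm(100, mean = 3e6, sd = 1.5e6)

# Kernel density on a fine, equally spaced grid
dens <- density(res_cost_demo, n = 1e4)
df <- data.frame(x = dens$x, y = dens$y, z = dens$x < 0)

# Hypothetical helper: trapezoidal rule for the area under (x, y)
trapz <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

area_neg <- trapz(df$x[df$z], df$y[df$z])  # area to the left of 0
area_all <- trapz(df$x, df$y)              # total area, close to 1
pct_neg  <- 100 * area_neg / area_all

# Same plot as in the answer, with the percentage written in the top-left corner
p <- ggplot(df) +
  geom_area(aes(x, y, fill = z), color = "black") +
  scale_fill_manual(values = c("skyblue", "red3"), guide = "none") +
  annotate("text", x = min(df$x), y = max(df$y),
           label = sprintf("%.1f%% below 0", pct_neg),
           hjust = 0, vjust = 1)
```

Printing p draws the figure. Note that this percentage describes the smoothed density, so it depends on the bandwidth and need not equal the raw share of negative observations.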
To do the same thing with a histogram, we can do this:

ggplot() +
  geom_histogram(aes(res_cost, fill = after_stat(x < 0))) +
  scale_fill_manual(values = c("skyblue", "red3"), guide = "none")
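With the default binning, one bar may straddle zero, and after_stat(x < 0) colours each bar according to its centre. A possible refinement, sketched here under assumptions, is to place a bin edge exactly at zero with boundary = 0 and to report the raw share of negative observations alongside the plot. The data frame demo is simulated stand-in data and the bin width of 5e5 is an arbitrary choice for illustration; neither comes from the original answer.

```r
library(ggplot2)

# Stand-in data (assumption): substitute the real res_cost vector here
set.seed(1)
demo <- data.frame(res_cost = rnorm(100, mean = 3e6, sd = 1.5e6))

# Empirical share of negative observations, independent of any smoothing
share_neg <- mean(demo$res_cost < 0)

# boundary = 0 aligns the bins so that one edge sits exactly at zero;
# binwidth = 5e5 is an arbitrary illustrative value
h <- ggplot(demo) +
  geom_histogram(aes(res_cost, fill = after_stat(x < 0)),
                 binwidth = 5e5, boundary = 0, color = "black") +
  scale_fill_manual(values = c("skyblue", "red3"), guide = "none") +
  labs(subtitle = sprintf("%.1f%% of observations are below 0",
                          100 * share_neg))
```

Because no bar crosses zero here, the red bars contain exactly the negative observations, and their share of the total count matches share_neg.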

