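Is there a simpler way to create a predicted probability plot with confidence intervals for a multinomial logistic regression with an interaction term? I have the code below, which seems like a long way round to do something I can do in a single line of code for other regression types. I would also like to know whether creating three plots, one for each level of my y, is the best way to present the results.
In addition, I would like to know whether there is a way to make geom_ribbon look smoother.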
# data frame
df <- data.frame(
rating = c("1 Better", "1 Better", "1 Better", "2 Medium", "2 Medium", "2 Medium", "3 Worse", "3 Worse", "3 Worse","1 Better", "1 Better", "1 Better", "2 Medium", "2 Medium", "2 Medium", "3 Worse", "3 Worse", "3 Worse"),
count = c(2, 0, 5, 8, 10, 3, 2, 1, 0, 0, 9, 1, 0, 5, 7, 2, 9, 0),
case = c("Y", "N", "Y", "Y", "N", "Y", "N", "Y", "N", "N", "Y", "N", "Y", "N", "N", "Y", "N", "Y"),
cool = c(1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1)
)
# regression model (multinom() comes from the nnet package)
library(nnet)
fit <- multinom(rating ~ count * case + cool, data = df)
The next step seems to be generating data for prediction.
# new data for prediction
new_data <- expand.grid(
count = seq(min(df$count), max(df$count), length.out = 100),
case = unique(df$case),
cool = unique(df$cool)
)
# simulations
n_sims <- 1000
# simulate predictions
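simulate_preds <- function(model, newdata, n) {
  # predicted probabilities: one row per newdata row, one column per rating level
  preds <- predict(model, newdata = newdata, type = "probs")
  simulated_preds <- replicate(n, {
    # add N(0, 0.1) noise to every predicted probability
    beta_sims <- matrix(rnorm(length(preds), 0, 0.1), ncol = nrow(preds))
    preds + t(beta_sims)
  })
  simulated_preds
}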
simulated_probs <- simulate_preds(fit, newdata = new_data, n = n_sims)
# confidence intervals
quantiles <- apply(simulated_probs, c(1, 2), function(x) {
quantile(x, c(0.025, 0.975))
})
So when I merge these quantiles, I will have three separate sets of upper and lower bounds:
# combine to create new df
new_data <- cbind(new_data, quantiles[1,,], quantiles[2,,])
# rename columns
colnames(new_data) <- c("count", "case", "cool", "Better_Low", "Medium_Low", "Worse_Low", "Better_High", "Medium_High", "Worse_High")
Here is my plot:
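library(ggplot2)
# facet by `case`; the original `~ yes` refers to a column that new_data does not have
ggplot(new_data, aes(x = count, color = factor(cool))) +
facet_wrap(~ case) +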
geom_ribbon(aes(ymin = Better_Low, ymax = Better_High)) +
labs(x = "Count", y = "Predicted Probability", color = "Cool", linetype = "Case") +
ggtitle("Predicted Probabilities by Rating, Case, and Coolness with Confidence Intervals") +
theme_bw()
I repeat this for Medium and Worse...
However, when I add the stat and method arguments to geom_ribbon, it throws an error:
# facet by `case`; the original `~ yes` refers to a column that new_data does not have
ggplot(new_data, aes(x = count, color = factor(cool))) +
facet_wrap(~ case) +
geom_ribbon(aes(ymin = Better_Low, ymax = Better_High), alpha = 0.3, stat = "smooth", method = "loess") +
labs(x = "Count", y = "Predicted Probability", color = "Cool", linetype = "Case") +
ggtitle("Predicted Probabilities by Rating, Case, and Coolness with Confidence Intervals") +
theme_bw()
`geom_smooth()` using formula = 'y ~ x'
Error in `geom_ribbon()`:
! Problem while computing stat.
ℹ Error occurred in the 1st layer.
Caused by error in `compute_layer()`:
! `stat_smooth()` requires the following missing aesthetics: y
Run `rlang::last_trace()` to see where the error occurred.
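The error above arises because stat = "smooth" fits y against x, while the ribbon layer only maps ymin and ymax. One possible workaround, sketched here on made-up data, is to smooth each bound with loess() before plotting and pass the fitted values to geom_ribbon. The data frame `demo` and its columns `low`, `high`, `low_s` and `high_s` are hypothetical stand-ins for Better_Low and Better_High, not part of the original code.

```r
library(ggplot2)

# Hypothetical jagged bounds standing in for Better_Low / Better_High
set.seed(1)
demo <- data.frame(count = seq(0, 10, length.out = 100))
demo$low  <- 0.3 + rnorm(100, 0, 0.02)
demo$high <- 0.6 + rnorm(100, 0, 0.02)

# Smooth each bound separately; predict() without newdata returns fitted values
demo$low_s  <- predict(loess(low ~ count, data = demo))
demo$high_s <- predict(loess(high ~ count, data = demo))

# Draw the ribbon from the pre-smoothed bounds (no stat = "smooth" needed)
p_smooth <- ggplot(demo, aes(x = count)) +
  geom_ribbon(aes(ymin = low_s, ymax = high_s), alpha = 0.3) +
  labs(x = "Count", y = "Predicted Probability") +
  theme_bw()
p_smooth
```

With the real new_data, the smoothing would have to be done separately within each combination of case and cool, since the bounds are different curves in each panel. Note also that the jaggedness in the question comes from the random noise added in simulate_preds, so intervals derived from the model's standard errors should not need this step at all.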
1 Answer
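You can do this with the marginaleffects package. Your example data is too small and too odd to produce a sensible plot, but the code below shows the result anyway. If you want to learn more, you can read the package's vignettes.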
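As a rough sketch of that approach (not necessarily the exact code from the answer), the following assumes that predictions() from marginaleffects accepts an nnet::multinom fit with type = "probs" and returns one row per grid row and outcome level, with the level stored in a `group` column and interval bounds in `conf.low` and `conf.high`. The object names `grid` and `pred` are illustrative.

```r
library(nnet)
library(marginaleffects)
library(ggplot2)

# Same data as in the question, written compactly
df <- data.frame(
  rating = rep(rep(c("1 Better", "2 Medium", "3 Worse"), each = 3), times = 2),
  count = c(2, 0, 5, 8, 10, 3, 2, 1, 0, 0, 9, 1, 0, 5, 7, 2, 9, 0),
  case = c("Y", "N", "Y", "Y", "N", "Y", "N", "Y", "N",
           "N", "Y", "N", "Y", "N", "N", "Y", "N", "Y"),
  cool = c(1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1)
)
fit <- multinom(rating ~ count * case + cool, data = df, trace = FALSE)

# Grid of predictor values to predict over
grid <- expand.grid(
  count = seq(min(df$count), max(df$count), length.out = 100),
  case = unique(df$case),
  cool = unique(df$cool),
  stringsAsFactors = FALSE
)

# Predicted probabilities in long format, one row per grid row and rating level
pred <- predictions(fit, newdata = grid, type = "probs")

# One panel per rating level and case, so all three outcomes share one figure
ggplot(pred, aes(x = count, y = estimate,
                 color = factor(cool), fill = factor(cool))) +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.3, color = NA) +
  geom_line() +
  facet_grid(group ~ case) +
  labs(x = "Count", y = "Predicted Probability", color = "Cool", fill = "Cool") +
  theme_bw()
```

Faceting on `group` puts all three rating levels in one figure, which avoids building a separate plot per outcome. With data this small the intervals may be very wide or fall outside [0, 1].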
Disclaimer: I am the author of marginaleffects.