R语言 如何在gamlss中拟合比例数据(非计数)的beta-二项式模型

qxgroojn  于 2023-01-22  发布在  其他
关注(0)|答案(1)|浏览(98)

我想拟合β-二项式回归。我没有计数,只有比例。下面是一个例子:

library(dplyr)
library(gamlss)

df <- tibble(
  cluster = LETTERS[1:20]
) |>
  mutate(
    p = rbeta(n(), 1, 1),
    n = as.integer(3 * runif(n()))
  )

fit <- gamlss(
  p ~ log(n),
  weights = n,
  data = df,
  family = BB(mu.link='identity')
)

我得到错误:

Error in while (abs(olddv - dv) > cc && itn < cyc) { : 
  missing value where TRUE/FALSE needed
In addition: There were 50 or more warnings (use warnings() to see the first 50)

警告如下所示:

In dbinom(x, size = bd, prob = mu, log = log) : non-integer x = 0.834502

请注意,我不想四舍五入成功的数量,如mutate(y = round(p * n))

5jvtdoz2

5jvtdoz21#

BB()家族的帮助文件建议,因变量应该是成功和失败次数的两列矩阵。(成功的概率)得到n(试验次数),那么你可以使成功次数为k=floor(p*n),失败次数为notk = n-k,然后,你可以像我下面做的那样做。

library(dplyr)
library(gamlss)

df <- tibble(
  cluster = LETTERS[1:20]
) |>
  mutate(
    p = rbeta(n(), 1, 1),
    n = as.integer(100 * runif(n()))
  )

df <- df %>% 
  mutate(k = floor(p*n), 
         notk = n-k)

fit <- gamlss(
  cbind(k, notk) ~ cluster,
  data = df,
  family = BB(mu.link='logit')
)

#> ******************************************************************
#> Family:  c("BB", "Beta Binomial") 
#> 
#> Call:  
#> gamlss(formula = cbind(k, notk) ~ cluster, family = BB(mu.link = "logit"),  
#>     data = df) 
#> 
#> Fitting method: RS() 
#> 
#> ------------------------------------------------------------------
#> Mu link function:  logit
#> Mu Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  8.130e-01  3.836e-01   2.119  0.03406 *  
#> clusterB    -8.130e-01  2.036e+00  -0.399  0.68973    
#> clusterC    -3.686e+01  1.000e+05   0.000  0.99971    
#> clusterD    -2.970e+00  6.922e-01  -4.291 1.78e-05 ***
#> clusterE     3.618e-01  4.843e-01   0.747  0.45508    
#> clusterF    -3.381e-01  5.317e-01  -0.636  0.52479    
#> clusterG    -3.569e+00  6.506e-01  -5.485 4.13e-08 ***
#> clusterH    -1.118e+00  4.356e-01  -2.566  0.01030 *  
#> clusterI    -1.712e+00  4.453e-01  -3.845  0.00012 ***
#> clusterJ     1.825e+00  6.315e-01   2.889  0.00386 ** 
#> clusterK    -3.686e+01  1.000e+05   0.000  0.99971    
#> clusterL    -5.247e-01  4.602e-01  -1.140  0.25419    
#> clusterM     1.439e+00  7.167e-01   2.008  0.04464 *  
#> clusterN     9.161e-02  4.721e-01   0.194  0.84613    
#> clusterO    -2.405e+00  1.000e+05   0.000  0.99998    
#> clusterP     3.034e-01  5.583e-01   0.543  0.58686    
#> clusterQ    -1.523e+00  5.389e-01  -2.826  0.00471 ** 
#> clusterR    -2.498e+00  6.208e-01  -4.024 5.73e-05 ***
#> clusterS     1.006e+00  5.268e-01   1.910  0.05619 .  
#> clusterT    -6.228e-02  4.688e-01  -0.133  0.89433    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> ------------------------------------------------------------------
#> Sigma link function:  log
#> Sigma Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) -36.034363   0.005137   -7014   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> ------------------------------------------------------------------
#> No. of observations in the fit:  20 
#> Degrees of Freedom for the fit:  21
#>       Residual Deg. of Freedom:  -1 
#>                       at cycle:  9 
#>  
#> Global Deviance:     69.49748 
#>             AIC:     111.4975 
#>             SBC:     132.4079 
#> ******************************************************************

reprex package(v2.0.1)于2023年1月20日创建

相关问题