Using rmultinom to estimate the allocation of gifts at a holiday party across a list of estimates

cu6pst1q  posted on 2024-01-03  in Other

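There is a holiday party with 6 people. Each of the 6 has an estimate of the number of gifts they bring (gifts_brought, stored as giftsBrought in the code), and each has an estimate of their chance of receiving a gift from another person (gifts_received_pct). There is no limit on how many gifts one person can receive from another. People can only give gifts to, or receive gifts from, members of their own team, and they cannot give the gifts they brought to themselves.
My real problem has 1000 different iterations/estimates of the parameters gifts_brought and gifts_received_pct, and I want to estimate the total number of gifts each person receives in each of the 1000 estimates. For the purposes of this exercise I will make all of the estimates identical, but to be clear, in reality every data frame has different parameter estimates, which is why I can't just call rmultinom(1000, ...).
First, build the dummy data.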

    name <- c('Aaron', 'Susie', 'Sam', 'Emma', 'Jennifer', 'Steve')
    giftsBrought <- c(5, 3, 4, 2, 3, 6)
    team <- c('Sales', 'Sales', 'Sales', 'IT', 'IT', 'IT')
    gifts_received_pct <- c(.2, .3, .1, .2, .2, .1) # rmultinom does not require normalization
    giftsDF <- data.frame(name, team, giftsBrought, gifts_received_pct, stringsAsFactors = FALSE)

    # one data frame per iteration (identical here, different in the real problem)
    giftsEstimationList <- list()
    for (i in 1:1000) {
      giftsEstimationList[[i]] <- giftsDF
    }

Next, here is how I compute gifts_received for one of the data frames:

    giftsReceivedDF <- lapply(1:nrow(giftsDF), function(i) {
      probs <- giftsDF
      # set other_team_pct and own_pct to 0
      probs$gifts_received_pct[probs$team != giftsDF$team[i] | probs$name == giftsDF$name[i]] <- 0
      rmultinom(1, giftsDF$giftsBrought[i], probs$gifts_received_pct)
    })
    Reduce(`+`, giftsReceivedDF)


I believe this is correct: looking closely at giftsReceivedDF, nobody appears to receive their own gift, and nobody receives a gift from the other team.
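The visual check described above can also be done programmatically. The following is a minimal sketch, assuming the same dummy data as in the question; the names `flow` and `same_team` are introduced here for illustration only and are not part of the original code.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

name <- c('Aaron', 'Susie', 'Sam', 'Emma', 'Jennifer', 'Steve')
giftsBrought <- c(5, 3, 4, 2, 3, 6)
team <- c('Sales', 'Sales', 'Sales', 'IT', 'IT', 'IT')
gifts_received_pct <- c(.2, .3, .1, .2, .2, .1)
giftsDF <- data.frame(name, team, giftsBrought, gifts_received_pct,
                      stringsAsFactors = FALSE)

# flow[r, g]: number of gifts that giver g hands to recipient r
flow <- sapply(seq_len(nrow(giftsDF)), function(i) {
  p <- giftsDF$gifts_received_pct
  # zero out the other team and the giver themselves
  p[giftsDF$team != giftsDF$team[i] | giftsDF$name == giftsDF$name[i]] <- 0
  rmultinom(1, giftsDF$giftsBrought[i], p)
})
dimnames(flow) <- list(recipient = giftsDF$name, giver = giftsDF$name)

# TRUE where recipient and giver are on the same team
same_team <- outer(giftsDF$team, giftsDF$team, `==`)

all(diag(flow) == 0)                        # nobody receives their own gift
all(flow[!same_team] == 0)                  # no gift crosses a team boundary
all(colSums(flow) == giftsDF$giftsBrought)  # every gift brought is handed out
rowSums(flow)                               # gifts received per person
```

All three checks should return TRUE for any seed, since the zeroed probabilities make the excluded outcomes impossible. `rowSums(flow)` is the same quantity as `Reduce(`+`, giftsReceivedDF)`.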
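What I am stuck on is how to run this across all 1000 data frames in giftsEstimationList in a timely manner. I first tried to brute-force everything with a pile of for loops, but I don't believe that is the most efficient approach, and time matters quite a bit here.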

Answer #1 (nfzehxib)

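Here is a function that simulates the gift exchange. It is vectorized across all teams and iterations, so it only needs to loop over the maximum number of participants in an exchange group (3 in the example data).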

    library(matrixStats) # for `rowCumsums` and `colCumsums`
    library(data.table)

    f <- function(giftsEstimationList) {
      # combine the list into a single table
      dt <- rbindlist(giftsEstimationList, TRUE, FALSE, "iter")
      # get the size of each exchange group (a team within an iteration)
      n <- dt[,.N, .(iter, team)][[3]]
      maxcol <- max(n) # the maximum sized exchange group
      # a matrix of relative probabilities for the destination of each
      # participant's gifts (row 1, column 3 = probability that a gift from
      # Aaron will go to Sam in iteration 1; row 11, column 1 = probability that
      # Jennifer's gift will go to Emma in iteration 2)
      m <- matrix(
        unlist(
          dt[
            ,.(.(c(gifts_received_pct*(1 - diag(maxcol)),
                   numeric(.N*(maxcol - .N))))),
            .(iter, team)
          ][[3]]
        ),
        nrow(dt), maxcol, 1
      )
      # get the probability to use in `rbinom`
      # (see https://en.wikipedia.org/wiki/Multinomial_distribution#Sampling_using_repeated_conditional_binomial_samples)
      m[,2:maxcol] <- m[,2:maxcol]/rowCumsums(m)[,2:maxcol]
      giftsRemaining <- dt$giftsBrought
      # distribute the gifts
      for (j in maxcol:2) {
        m[,j] <- rbinom(nrow(dt), giftsRemaining, m[,j])
        giftsRemaining <- giftsRemaining - m[,j]
      }
      m[,1] <- giftsRemaining
      # aggregate the gifts received for each participant
      dt[
        ,giftsReceived := diff(rbind(0, colCumsums(m)[cumsum(n),]))[
          sequence(n, 1:nrow(dt), length(n))
        ]
      ]
    }
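The key step in the function is drawing a multinomial sample as a chain of conditional binomial draws, which is what allows `rbinom` to be vectorized over every row at once. The following is a minimal single-vector sketch of that idea in base R; the helper name `rmultinom_via_binom` is hypothetical and is not part of the answer's code.

```r
# Draw one multinomial sample of `size` items with relative weights `prob`
# using repeated conditional binomial draws (hypothetical helper).
rmultinom_via_binom <- function(size, prob) {
  k <- length(prob)
  out <- numeric(k)
  remaining <- size
  cum <- cumsum(prob)
  # walk from the last category down to the second
  for (j in rev(seq_len(k)[-1])) {
    # among categories 1..j, category j has share prob[j] / sum(prob[1:j])
    p <- if (cum[j] > 0) prob[j] / cum[j] else 0
    out[j] <- rbinom(1, remaining, p)
    remaining <- remaining - out[j]
  }
  out[1] <- remaining  # whatever is left falls into the first category
  out
}

set.seed(42)  # arbitrary seed
# Aaron's case from the example: 5 gifts, weights 0 (self), .3 (Susie), .1 (Sam)
draws <- replicate(20000, rmultinom_via_binom(5, c(0, .3, .1)))
rowMeans(draws)  # should be close to 5 * c(0, .3, .1) / .4
```

Each column of `draws` sums to 5, and the first row is always 0 because Aaron's own weight is zero. In `f`, the same recursion runs column by column on the matrix `m`, so the loop length is the group size rather than the number of rows.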

Demonstration:

    set.seed(474180891)
    system.time(dt <- f(giftsEstimationList))
    #>    user  system elapsed
    #>    0.06    0.00    0.06
    dim(dt)
    #> [1] 6000    6
    # show the first 12 rows of the result
    dt[1:12]
    #>     iter     name  team giftsBrought gifts_received_pct giftsReceived
    #>  1:    1    Aaron Sales            5                0.2             3
    #>  2:    1    Susie Sales            3                0.3             7
    #>  3:    1      Sam Sales            4                0.1             2
    #>  4:    1     Emma    IT            2                0.2             4
    #>  5:    1 Jennifer    IT            3                0.2             7
    #>  6:    1    Steve    IT            6                0.1             0
    #>  7:    2    Aaron Sales            5                0.2             4
    #>  8:    2    Susie Sales            3                0.3             6
    #>  9:    2      Sam Sales            4                0.1             2
    #> 10:    2     Emma    IT            2                0.2             7
    #> 11:    2 Jennifer    IT            3                0.2             3
    #> 12:    2    Steve    IT            6                0.1             1
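In the output above, the gifts received within each team add up to the gifts that team brought (12 for Sales and 11 for IT in both iterations shown). The following sketch checks the same conservation property with a plain base-R baseline that applies the question's per-data-frame logic to every list element. The helper name `simulate_one` is hypothetical, and this loop is meant as a slow reference for cross-checking, not as a replacement for the vectorized function.

```r
name <- c('Aaron', 'Susie', 'Sam', 'Emma', 'Jennifer', 'Steve')
giftsBrought <- c(5, 3, 4, 2, 3, 6)
team <- c('Sales', 'Sales', 'Sales', 'IT', 'IT', 'IT')
gifts_received_pct <- c(.2, .3, .1, .2, .2, .1)
giftsDF <- data.frame(name, team, giftsBrought, gifts_received_pct,
                      stringsAsFactors = FALSE)
giftsEstimationList <- rep(list(giftsDF), 1000)

# hypothetical helper: gifts received per person for one data frame
simulate_one <- function(df) {
  received <- integer(nrow(df))
  for (i in seq_len(nrow(df))) {
    p <- df$gifts_received_pct
    # exclude the other team and the giver themselves
    p[df$team != df$team[i] | df$name == df$name[i]] <- 0
    received <- received + rmultinom(1, df$giftsBrought[i], p)[, 1]
  }
  received
}

set.seed(1)  # arbitrary seed
received <- sapply(giftsEstimationList, simulate_one)  # 6 x 1000 matrix

# within each team, gifts received must add up to gifts brought
brought_by_team <- tapply(giftsDF$giftsBrought, giftsDF$team, sum)
received_by_team <- rowsum(received, giftsDF$team)  # one row per team
all(received_by_team == as.vector(brought_by_team[rownames(received_by_team)]))

rowMeans(received)  # average gifts received per person across iterations
```

Because both approaches sample from the same distribution, the per-person averages from this baseline should be close to the per-person averages of `giftsReceived` from `f`, although individual draws will differ.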
