I'm running a function that performs multiple weighted t-tests on different subsets of a data frame. My function is basically as follows:
library(weights)
group_list <- list(unique(df$group))
t_tests <- for (g in group_list) {
  wtd.t.test(x = df[df$group == g, ]$var2[df[df$group == g, ]$var1 == "A"],
             y = df[df$group == g, ]$var2[df[df$group == g, ]$var1 == "B"],
             weight = df[df$group == g, ]$weight[df[df$group == g, ]$var1 == "A"],
             weighty = df[df$group == g, ]$weight[df[df$group == g, ]$var1 == "B"],
             samedata = FALSE)
}
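Here var2 is the outcome variable of interest. I want to test the significance of the difference in means between var1 == "A" and var1 == "B", and to do this for each subset of the data defined by the different values of the variable group.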
I used the code above, but it fails with:
Error in wtd.t.test(x = df[df$group == g, : object 'out' not found
Is my function structured incorrectly? How can I run a weighted t-test on each subset of a data frame?
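For what it's worth, two things in the loop look off regardless of the error. list(unique(df$group)) wraps all group values in a one-element list, so the loop body runs once with g set to the whole vector. A for loop also always returns NULL, so t_tests never receives the results. Below is a minimal base R sketch of per-group testing. The toy data is invented purely for illustration (only the column names come from the question), and skipping groups with fewer than two rows per arm is my own assumption about what you would want.

```r
library(weights)

# Toy data, invented for illustration only; column names follow the question
set.seed(1)
df <- data.frame(
  group  = rep(c("g1", "g2"), each = 20),
  var1   = rep(c("A", "B"), times = 20),
  var2   = rnorm(40),
  weight = runif(40, 0.5, 2)
)

# split() yields one data frame per group; lapply() keeps every result
t_tests <- lapply(split(df, df$group), function(d) {
  a <- d[d$var1 == "A", ]
  b <- d[d$var1 == "B", ]
  # Skip groups where either arm has too few rows to test (assumption)
  if (nrow(a) < 2 || nrow(b) < 2) return(NULL)
  wtd.t.test(x = a$var2, y = b$var2,
             weight = a$weight, weighty = b$weight,
             samedata = FALSE)
})
```

t_tests is then a named list with one entry per group, e.g. t_tests$g1.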
Update: a new approach using nested tibbles, as suggested
My new approach is as follows:
library(weights)
library(tidyverse)
df %>%
  nest(-group) %>%
  mutate(fit = map(data, ~ wtd.t.test(
           x = . %>% filter(var1 == "A")$var2, y = . %>% filter(var1 == "B")$var2,
           weight = . %>% filter(var1 == "A")$weight, weighty = . %>% filter(var1 == "B")$weight,
           samedata = FALSE)),
         results = map(fit, glance)) %>%
  unnest(results)
The new error message is:
Error in `mutate()`:
ℹ In argument: `fit = map(...)`.
Caused by error in `map()`:
ℹ In index: 1.
Caused by error in `weight / mean(weight, na.rm = TRUE)`:
! non-numeric argument to binary operator
Backtrace:
1. ... %>% unnest(results)
10. purrr::map(...)
11. purrr:::map_("list", .x, .f, ..., .progress = .progress)
15. .f(.x[[i]], ...)
16. weights::wtd.t.test(...)
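All of my variables are numeric except var1, which isn't used in the calculation, so it's not clear to me why this error appears. Any suggestions would be greatly appreciated.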
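One possible explanation, offered as a guess rather than a diagnosis: in magrittr, a pipeline whose left-hand side is a bare `.` does not evaluate anything. It defines a function. If that is what happens here, wtd.t.test() would receive functions instead of numeric vectors, which would be consistent with "non-numeric argument to binary operator". A small sketch of that behaviour (the name get_first and the vector x are made up for illustration):

```r
library(magrittr)

# A pipeline that starts with a bare `.` is a function definition, not a value
get_first <- . %>% head(1)
is.function(get_first)

# Calling that function, or piping from a real value, yields data
x <- c(3, 1, 2)
get_first(x)
x %>% head(1)
```

Passing the data explicitly, as in filter(., var1 == "A")$var2, avoids building a function by accident.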
If I reformat the code as follows:
df %>%
  nest(-group) %>%
  mutate(fit = map(data, ~ wtd.t.test(
           x = filter(., var1 == "A")$var2, y = filter(., var1 == "B")$var2,
           weight = filter(., var1 == "A")$weight, weighty = filter(., var1 == "B")$weight,
           samedata = FALSE)),
         results = map(fit, glance)) %>%
  unnest(results)
the error now becomes:
Error in `mutate()`:
ℹ In argument: `fit = map(...)`.
Caused by error in `map()`:
ℹ In index: 1.
Caused by error in `wtd.t.test()`:
! object 'out' not found
Backtrace:
1. ... %>% unnest(results)
10. purrr::map(...)
11. purrr:::map_("list", .x, .f, ..., .progress = .progress)
15. .f(.x[[i]], ...)
16. weights::wtd.t.test(...)
Update 2
Below is the new code, updated with a reproducible example:
library(weights)
library(tidyverse)
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~ wtd.t.test(
           x = . %>% filter(gear == 3)$disp, y = . %>% filter(gear == 4)$disp,
           weight = . %>% filter(gear == 3)$wt, weighty = . %>% filter(gear == 4)$wt,
           samedata = FALSE)),
         results = map(fit, glance)) %>%
  unnest(results)
And reformatted:
mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~ wtd.t.test(
           x = filter(., gear == 3)$disp, y = filter(., gear == 4)$disp,
           weight = filter(., gear == 3)$wt, weighty = filter(., gear == 4)$wt,
           samedata = FALSE)),
         results = map(fit, glance)) %>%
  unnest(results)
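A quick check that may help explain "object 'out' not found" in the mtcars example: as far as I can tell, the error appears when y ends up empty for one of the subsets, so it is worth counting rows per cell before testing. This sketch only uses base R and the built-in mtcars data.

```r
# Cross-tabulate the grouping variable against the comparison variable
tab <- with(mtcars, table(cyl, gear))
tab

# Number of 8-cylinder cars with 4 gears: an empty cell means
# filter(., gear == 4) returns zero rows for the cyl == 8 subset
tab["8", "4"]
```

Any group with an empty (or single-row) cell needs to be skipped or handled separately before calling wtd.t.test().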
1 Answer
k5ifujac #1
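For those interested, the solution (using the mtcars dataset as example data) is as follows:
This produces a data frame containing the t-test statistics for each subset of the data, one per level of cyl.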
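The code that went with this answer is missing here. As a rough sketch of one way to get a data frame like that (my own assumption, not necessarily what the answerer wrote): nest by cyl, run the test in a small helper, and unnest the statistics. The helper wtd_t_by_gear is hypothetical and not part of weights. It assumes the list returned by wtd.t.test() has a coefficients element holding t.value, df and p.value, and it returns NA for subsets that lack enough rows in either gear group.

```r
library(weights)
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical helper: one weighted t-test per subset, returned as a one-row tibble
wtd_t_by_gear <- function(d) {
  a <- filter(d, gear == 3)
  b <- filter(d, gear == 4)
  # Too few rows in either arm: return NA instead of calling wtd.t.test()
  if (nrow(a) < 2 || nrow(b) < 2) {
    return(tibble(t.value = NA_real_, df = NA_real_, p.value = NA_real_))
  }
  fit <- wtd.t.test(x = a$disp, y = b$disp,
                    weight = a$wt, weighty = b$wt,
                    samedata = FALSE)
  # Assumes fit$coefficients is a named numeric vector (t.value, df, p.value)
  as_tibble(as.list(fit$coefficients))
}

res <- mtcars %>%
  nest(data = -cyl) %>%
  mutate(results = map(data, wtd_t_by_gear)) %>%
  select(-data) %>%
  unnest(results)
```

res has one row per level of cyl. broom's glance() is left out on purpose, since I'm not aware of a tidier for this kind of result.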