R: Recoding dummy variables by combining and excluding

smdnsysy, posted 2023-03-05 in Other

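I have a dataset with 8 ethnicity variables that I need to recode into 7 under various conditions. Respondents were instructed to select as many ethnicities as applied. For example, ethnicity_2 indicates "White". Some "mixed" respondents selected "White" plus another ethnicity. I want to create a new variable, "white", meaning that the respondent selected "White" and no other ethnicity. I tried but failed. Below is part of the code I tried.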

ethnicities.19 <- c("ethnicity_1", "ethnicity_2", "ethnicity_3", "ethnicity_4",
                    "ethnicity_5", "ethnicity_6", "ethnicity_7", "ethnicity_8")
bar <- foo %>%
    select(ID, ethnicity_1:ethnicity_8) %>%
    mutate(across(.cols=ethnicity_1:ethnicity_8, .fns=function(x) { ifelse(is.na(x), 0, x)} )) %>%
    rowwise() %>%
     mutate(dnresp=ifelse(sum(eval(as.name(ethnicities.19)))==0, 1, 0),
##         dnresp=ifelse(!any(eval(as.name(ethnicities.19))==1), 1, 0),
            white=ifelse(eval(as.name(ethnicities.19[2]))==1 & sum(eval(as.name(ethnicities.19[-c(2)])))==0, 1, 0))

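And so on. With this code, the dnresp variable is set incorrectly in almost every case. In addition, several people who selected ethnicity_2 together with another ethnicity are flagged as "white".
I also tried this: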

dnresp=ifelse(!any({{ethnicities.19}}==1), 1, 0))

But this tells me that no respondent selected any ethnicity. Oddly, I get the same result from this code as well:

dnresp=ifelse(!any({{ethnicities.19}}==0), 1, 0))

Thanks for your help.
Here is a sample of the data:

structure(list(ID = c("ATL_01", "ATL_02", "ATL_03", "ATL_04", 
"ATL_05", "ATL_06", "ATL_07", "ATL_08", "ATL_09", "ATL_10", "ATL_11", 
"ATL_12", "ATL_13", "ATL_14", "ATL_15", "ATL_16", "ATL_17", "ATL_18", 
"ATL_19", "ATL_20"), ethnicity_1 = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_), ethnicity_2 = c(1, 1, 1, NA, 1, 
NA, 1, NA, NA, 1, NA, NA, 1, NA, NA, NA, NA, 1, NA, NA), ethnicity_3 = c(NA, 
NA, NA, 1, NA, 1, NA, 1, 1, NA, 1, 1, NA, 1, 1, 1, 1, NA, 1, 
1), ethnicity_4 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), ethnicity_5 = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_), ethnicity_6 = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_), ethnicity_7 = c(NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_), ethnicity_8 = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame"))
ovfsdjhp #1

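You could consider vectorizing this. In R, using rowSums we can build a logical vector that is TRUE where all of the non-ethnicity_2 variables sum to 0, and combine it with the condition that ethnicity_2 equals 1.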

foo$white <- +((rowSums(foo[setdiff(ethnicities.19, 'ethnicity_2')], na.rm=TRUE) == 0) &
                 foo$ethnicity_2 == 1)
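
One edge case is worth flagging: for a respondent with NA in every column, `foo$ethnicity_2 == 1` is NA, and `TRUE & NA` is NA, so `white` comes out as NA rather than 0. A minimal sketch of an NA-safe variant follows, using `%in%`, which returns FALSE for NA. The data frame `toy`, the vector `eth_cols`, and the three-column layout are made up for illustration and are not from the question.

```r
# Hypothetical toy data (not from the question):
# row 1 = White only, row 2 = White plus another, row 3 = no answer at all
toy <- data.frame(ethnicity_1 = c(NA, 1, NA),
                  ethnicity_2 = c(1, 1, NA),
                  ethnicity_3 = c(NA_real_, NA_real_, NA_real_))
eth_cols <- c("ethnicity_1", "ethnicity_2", "ethnicity_3")
others   <- setdiff(eth_cols, "ethnicity_2")

# Form used in the answer: the all-NA row yields NA because TRUE & NA is NA
white_orig <- +((rowSums(toy[others], na.rm = TRUE) == 0) & toy$ethnicity_2 == 1)

# NA-safe form: %in% treats a missing ethnicity_2 as "not selected"
white_safe <- +((rowSums(toy[others], na.rm = TRUE) == 0) & toy$ethnicity_2 %in% 1)
```

Whether an all-NA respondent should be 0 or NA on `white` is a coding decision; the `%in%` form simply makes the 0 explicit.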

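For non-response, i.e. rows with no values at all (I assume that is what dnrsp means), we can apply !is.na to the whole set of variables and look for rows where the rowSums is 0.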

foo$dnrsp <- +(rowSums(!is.na(foo[ethnicities.19])) == 0)

The unary + coerces the logical values to integers; you could also use as.integer(), but it is longer.
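
As a small illustration of that coercion, the sketch below applies both forms to a throwaway logical vector (`flags` is a hypothetical name used only here):

```r
flags <- c(TRUE, FALSE, NA)   # hypothetical example vector

via_plus <- +flags            # unary plus converts logical to integer; NA stays NA
via_cast <- as.integer(flags) # explicit equivalent

# Both approaches produce the same integer vector
same <- identical(via_plus, via_cast)
```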

This gives:

head(as.data.frame(foo))
#       ID ethnicity_1 ethnicity_2 ethnicity_3 ethnicity_4 ethnicity_5 ethnicity_6 ethnicity_7 ethnicity_8 white dnrsp
# 1 ATL_01          NA           1          NA          NA          NA          NA          NA          NA     1     0
# 2 ATL_02          NA           1          NA          NA          NA          NA          NA          NA     1     0
# 3 ATL_03          NA           1          NA          NA          NA          NA          NA          NA     1     0
# 4 ATL_04          NA          NA           1          NA          NA          NA          NA          NA     0     0
# 5 ATL_05          NA           1          NA          NA          NA          NA          NA          NA     1     0
# 6 ATL_06          NA          NA           1          NA          NA          NA          NA          NA     0     0
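
As for why the attempts in the question misbehaved, two base R behaviours seem relevant. First, `as.name()` uses only the first element of a character vector, so `eval(as.name(ethnicities.19))` would look at ethnicity_1 alone. Second, comparing the character vector of column names with 1 or 0 is FALSE for every element, so `!any(...)` is TRUE either way. The snippet below is a small illustration with a shortened, hypothetical name vector `eth_names`; it does not reproduce the full dplyr pipeline.

```r
# Hypothetical, shortened stand-in for ethnicities.19
eth_names <- c("ethnicity_1", "ethnicity_2", "ethnicity_3")

# as.name() keeps only the first element of a character vector
sym <- as.name(eth_names)
first_only <- as.character(sym)

# Comparing column *names* (strings) with a number never matches,
# so both checks are FALSE and !any(...) is TRUE in either case
any_one  <- any(eth_names == 1)
any_zero <- any(eth_names == 0)
```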
