I am trying to assign "1" to selected columns of df if the value is greater than 100, and "0" otherwise.
vz201408 <- as.numeric(c("0.00","300.00","500.00","0.00","0.00","210","0.00","0.00","0.00","0.00"))
vz201409 <- as.numeric(c("0.00","200.00","200.00","500.00","0.00","500","100","0.00","0.00","0.00"))
vz201410 <- as.numeric(c("100","100","100","0.00","0.00","21.00","0.00","0.00","0.00","482"))
vz201411 <- as.numeric(c("0.00","600.00","900.00","100.00","120.00","0.00","50.00","0.00","800.00","664 "))
vz201412 <- as.numeric(c("30.00","40.00","50.00","0.00","9.00","8.00","1.00","4.00","0.00","0.00"))
vz201501 <- as.numeric(c("500.00","100.00","100.00","500.00","0.00","200.00","0.00","0.00","100.00","200.00"))
vz201502 <- as.numeric(c("500.00","100.00","100.00","500.00","0.00","200.00","0.00","0.00","100.00","200.00"))
vz201503 <- as.numeric(c("500.00","100.00","100.00","600.00","0.00","200.00","0.00","0.00","90.00","100.00"))
vz201604 <- as.numeric(c("50.00","10.00","10.00","400.00","100.00","200.00","0.00","0.00","10.00","20.00"))
vz201701 <- as.numeric(c("100.00","10.00","10.00","50.00","0.00","200.00","0.00","300.00","100.00","200.00"))
df <- data.frame(vz201408,vz201409, vz201410, vz201411,vz201412,vz201501,vz201502,vz201503,vz201604,vz201701)
I am using this loop, but it assigns the 1/0 incorrectly: it gives "1" even when the value is below the threshold of 100. Any suggestions? Thanks!
# Select only the following columns
col_indices <- grep("^vz2014(1[0-2])$|^vz2015\\d{2}$|^vz2016(0[1-9]|1[0-2])$", names(df))
col_names <- names(df)[col_indices]
for (c in col_names) {
for (i in 1:length(col_names)) {
df[, paste0('Empl_vz', i)] <- ifelse(df[, c] >= 100,
1,
0)
}
}
2 Answers
yvt65v4c #1
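If your `c` already represents a column name, why iterate over all the columns again? Instead, I think we can `seq_along` the column names and use a single loop. (Also, you are deriving a logical 0/1, so we can simplify the condition and drop the `ifelse`.)
FYI, the reason your process fails is that, while it calculates each column correctly, it overwrites the results of the previous iteration of the loop. The basic operation itself is fine; the problem is only that your loop should fire exactly `length(col_names)` times (7 with the data shown above) and instead fires 7 * 7 times, and in passes 43 to 49 it overwrites every new column with the column in `col_names[7]`.
A faster approach, without a loop: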
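A minimal sketch of both suggestions, assuming the new columns keep the question's `Empl_vz1`, `Empl_vz2`, ... naming. The toy `df` is a cut-down copy of the question's data (first four rows of five columns), and `looped`, `vectorized`, `pattern`, and `new_names` are illustrative names, not taken from the original answer.

```r
# Toy data: the first four rows of a few of the question's columns
df <- data.frame(
  vz201409 = c(0, 200, 200, 500),   # not matched by the pattern
  vz201410 = c(100, 100, 100, 0),
  vz201501 = c(500, 100, 100, 500),
  vz201604 = c(50, 10, 10, 400),
  vz201701 = c(100, 10, 10, 50)     # not matched by the pattern
)

# Same regular expression as in the question
pattern <- "^vz2014(1[0-2])$|^vz2015\\d{2}$|^vz2016(0[1-9]|1[0-2])$"
col_names <- grep(pattern, names(df), value = TRUE)

# Single loop: one pass per selected column, indexed with seq_along()
looped <- df
for (i in seq_along(col_names)) {
  # unary + turns the logical comparison into integer 0/1
  looped[[paste0("Empl_vz", i)]] <- +(looped[[col_names[i]]] >= 100)
}

# No explicit loop: build all flag columns at once and assign them together
vectorized <- df
new_names <- paste0("Empl_vz", seq_along(col_names))
vectorized[new_names] <- lapply(df[col_names], function(x) +(x >= 100))
```

Both versions run the comparison once per selected column, so no result is overwritten by a later pass.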
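`+(.)` works because of R's class-to-class coercion. You can do basic math on logicals (`TRUE`/`FALSE`) because R converts them to `1`/`0`. The simplest math operation is a unary `+` in front of a number or expression. (Try `+TRUE` and `+FALSE` to confirm.)
nzk0hqpo #2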
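Not sure, but perhaps you want the output to be a data frame with the same dimensions as the original data frame. Use an `if`/`else` condition to identify the desired columns.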
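One possible reading of this suggestion, sketched in base R under the assumption that the selected columns are replaced in place and every other column is left untouched. The toy `df` is a cut-down copy of the question's data, and `flagged` and `pattern` are illustrative names; the answer's own code is not shown here and may have differed.

```r
# Toy data: the first four rows of a few of the question's columns
df <- data.frame(
  vz201409 = c(0, 200, 200, 500),   # not matched by the pattern
  vz201410 = c(100, 100, 100, 0),
  vz201501 = c(500, 100, 100, 500),
  vz201604 = c(50, 10, 10, 400),
  vz201701 = c(100, 10, 10, 50)     # not matched by the pattern
)

# Same regular expression as in the question
pattern <- "^vz2014(1[0-2])$|^vz2015\\d{2}$|^vz2016(0[1-9]|1[0-2])$"
col_names <- grep(pattern, names(df), value = TRUE)

# Walk over every column name; if/else decides whether to flag or keep it
flagged <- df
flagged[] <- lapply(names(df), function(nm) {
  if (nm %in% col_names) {
    +(df[[nm]] >= 100)   # selected column: 0/1 flag
  } else {
    df[[nm]]             # other column: unchanged
  }
})
```

Assigning into `flagged[]` keeps the original column names and row count, so the result has the same shape as `df`.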