Skipping specific observations when using row_number() in R

Posted by vjhs03f7 on 2023-05-20 in Other

I am essentially looking for a "next" statement that I can use inside a dplyr ifelse statement, although other R alternatives are welcome too.
Here is the code so far:

df1 <- data%>%
  arrange(Var1, Var2, Var3, Var4, Var5)%>%
  group_by(Var1)%>%
  distinct(Var1, Var2, Var3, Var4, Var5)%>%
  mutate(Var6 = ifelse(Var4 == "COMPLETE", row_number(), row_number()+1))

The output is (relevant columns only):

| Var4           | Var6  |
| -------------- | ----- |
| COMPLETE       | 1     |
| **INCOMPLETE** | **3** |
| COMPLETE       | 3     |
| COMPLETE       | 4     |
| COMPLETE       | 5     |
| **INCOMPLETE** | **7** |
| COMPLETE       | 7     |
| COMPLETE       | 8     |
| COMPLETE       | 9     |

The expected output is:

| Var4           | Var6  |
| -------------- | ----- |
| COMPLETE       | 1     |
| **INCOMPLETE** | **2** |
| COMPLETE       | 2     |
| COMPLETE       | 3     |
| COMPLETE       | 4     |
| **INCOMPLETE** | **5** |
| COMPLETE       | 5     |
| COMPLETE       | 6     |
| COMPLETE       | 7     |

In short, my goal is that when Var4 == "INCOMPLETE", that row is ignored by the counter and row_number() carries on from the next row.

Answer #1 (u0njafvf)

Here is one approach, using data.table together with tidyr::fill:

library(data.table)
library(dplyr)
library(tidyr)
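# Number only the COMPLETE rows; INCOMPLETE rows keep NA in Var6
setDT(df1)[Var4 == "COMPLETE", Var6 := seq_len(.N)]
# Fill each NA from the next COMPLETE row (falling back to the previous one)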
df1 %>% 
   fill(Var6, .direction = "updown")
Output:
         Var4 Var6
1:   COMPLETE    1
2: INCOMPLETE    2
3:   COMPLETE    2
4:   COMPLETE    3
5:   COMPLETE    4
6: INCOMPLETE    5
7:   COMPLETE    5
8:   COMPLETE    6
9:   COMPLETE    7

Or with the tidyverse only:

df1 %>% 
   mutate(Var6 = na_if(replace(Var4, Var4 == "COMPLETE", 
     seq_len(sum(Var4 == "COMPLETE"))), "INCOMPLETE")) %>%
   fill(Var6, .direction = "updown")
        Var4 Var6
1   COMPLETE    1
2 INCOMPLETE    2
3   COMPLETE    2
4   COMPLETE    3
5   COMPLETE    4
6 INCOMPLETE    5
7   COMPLETE    5
8   COMPLETE    6
9   COMPLETE    7
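
One detail worth noting about the tidyverse snippet above: because replace() writes the counter into a copy of the character column Var4, the resulting Var6 is a character column. A minimal sketch that keeps Var6 as an integer, assuming the same nine-row sample data (the name `out` is only illustrative):

```r
library(dplyr)
library(tidyr)

# Same sample values as the "Data" block of this answer.
df1 <- data.frame(Var4 = c("COMPLETE", "INCOMPLETE", "COMPLETE",
                           "COMPLETE", "COMPLETE", "INCOMPLETE",
                           "COMPLETE", "COMPLETE", "COMPLETE"))

out <- df1 %>%
  # Running count on COMPLETE rows, NA on INCOMPLETE rows
  mutate(Var6 = if_else(Var4 == "COMPLETE",
                        cumsum(Var4 == "COMPLETE"),
                        NA_integer_)) %>%
  # Fill each NA from the row below first, then from the row above
  fill(Var6, .direction = "updown")

print(out)
```

With `.direction = "updown"`, an INCOMPLETE row takes the number of the COMPLETE row that follows it, and only a trailing INCOMPLETE row would fall back to the number above it.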

Data

df1 <- structure(list(Var4 = c("COMPLETE", "INCOMPLETE", "COMPLETE", 
"COMPLETE", "COMPLETE", "INCOMPLETE", "COMPLETE", "COMPLETE", 
"COMPLETE")), class = "data.frame", row.names = c(NA, -9L))

Answer #2 (kqhtkvqz)

We can use cumsum together with replace or case_when:

library(dplyr)

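df1 %>% mutate(Var6 = cumsum(Var4 == 'COMPLETE') %>%
                      # replacement values must match the number of rows selected
                      replace(., Var4 == 'INCOMPLETE', .[Var4 == 'INCOMPLETE'] + 1L))

# OR

df1 %>% mutate(Var6 = cumsum(Var4 == 'COMPLETE'),
               # case_when() takes `condition ~ value` pairs; TRUE ~ is the fallback
               Var6 = case_when(Var4 == 'INCOMPLETE' ~ Var6 + 1L,
                                TRUE ~ Var6))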
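
To see why the cumsum idea produces the expected numbering, here is a minimal base R sketch on the sample values from the "Data" block of the first answer. The intermediate name `running` is only illustrative.

```r
# Sample values copied from the "Data" block of the first answer.
Var4 <- c("COMPLETE", "INCOMPLETE", "COMPLETE", "COMPLETE", "COMPLETE",
          "INCOMPLETE", "COMPLETE", "COMPLETE", "COMPLETE")

# Running count of COMPLETE rows; an INCOMPLETE row repeats the previous count.
running <- cumsum(Var4 == "COMPLETE")

# Adding the logical vector bumps every INCOMPLETE row by one, so it shares
# the number of the COMPLETE row that comes right after it.
Var6 <- running + (Var4 == "INCOMPLETE")

print(data.frame(Var4, running, Var6))
```

Note that under this scheme several INCOMPLETE rows in a row would all share the same number, which may or may not be what is wanted.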

Answer #3 (von4xj4u)

Thanks for your answer. I could not get it to work as written, but it helped me arrive at this, which does work:

df1 <- data%>%
      arrange(Var1, Var2, Var3, Var4, Var5)%>%
      group_by(Var1)%>%
      distinct(Var1, Var2, Var3, Var4, Var5)%>%
      mutate(Var6 = cumsum(Var4=='COMPLETE') %>%
               ifelse(Var4=='INCOMPLETE', .+1, .))
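
Because the mutate() runs after group_by(Var1), the counter restarts within each Var1 group. A small sketch of that behaviour on made-up data (the data frame `toy` and its values "a" and "b" are hypothetical, not from the original post):

```r
library(dplyr)

# Hypothetical toy data with two Var1 groups; not from the original post.
toy <- data.frame(
  Var1 = c("a", "a", "a", "b", "b", "b"),
  Var4 = c("COMPLETE", "INCOMPLETE", "COMPLETE",
           "INCOMPLETE", "COMPLETE", "COMPLETE")
)

res <- toy %>%
  group_by(Var1) %>%
  # cumsum() is evaluated per group; INCOMPLETE rows borrow the next number
  mutate(Var6 = cumsum(Var4 == "COMPLETE") %>%
           ifelse(Var4 == "INCOMPLETE", . + 1, .)) %>%
  ungroup()

print(res)
```

An INCOMPLETE row at the start of a group gets 1 here, the same number as the first COMPLETE row after it.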
