R: How can I fill in the rows between a first day and a last day as a sequence of days?

Posted by 7hiiyaii on 2023-04-18 in Other

Suppose I have this table:

a <- as.data.frame(matrix(c("1", "1", "1", "1", "2", "2", "A first day", "A last day", "B first day", "B last day", "A first day", "A last day", 3, 5, 10, 14, 2, 5), ncol = 3))
colnames(a) = c("Patient", "Treatment", "Day")

It looks like this:

| Patient | Treatment | Day |
| --------------|--------------|--------------|
| 1 | A first day | 3 |
| 1 | A last day | 5 |
| 1 | B first day | 10 |
| 1 | B last day | 14 |
| 2 | A first day | 2 |
| 2 | A last day | 5 |

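How can I transform it so that every day from the first day to the last day gets its own row, so that the table looks like this?

| Patient | Treatment | Day |
| --------------|--------------|--------------|
| 1 | A | 3 |
| 1 | A | 4 |
| 1 | A | 5 |
| 1 | B | 10 |
| 1 | B | 11 |
| 1 | B | 12 |
| 1 | B | 13 |
| 1 | B | 14 |
| 2 | A | 2 |
| 2 | A | 3 |
| 2 | A | 4 |
| 2 | A | 5 |

Thanks, everyone!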

lo8azlld1#

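We need a unique treatment group for each patient, so we first strip the " first day" and " last day" strings from the `Treatment` column. Then we `group_by` the `Patient` and `Treatment` columns to form those groups, and use `reframe` to generate a new `Day` sequence running from the `min` to the `max` of `Day` within each group. Note that `reframe` requires dplyr 1.1.0 or later.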

library(tidyverse)

a %>% 
  mutate(Treatment = sub(" first day| last day", "", Treatment),
         Day = as.integer(Day)) %>% 
  group_by(Patient, Treatment) %>% 
  reframe(Day = min(Day):max(Day))

# A tibble: 12 × 3
   Patient Treatment   Day
   <chr>   <chr>     <int>
 1 1       A             3
 2 1       A             4
 3 1       A             5
 4 1       B            10
 5 1       B            11
 6 1       B            12
 7 1       B            13
 8 1       B            14
 9 2       A             2
10 2       A             3
11 2       A             4
12 2       A             5
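
The `as.integer(Day)` step above matters because the question builds `a` with `matrix()`, which coerces every value to character. The following base R sketch illustrates why, assuming R 4.0 or later (where `as.data.frame` keeps strings as character rather than factor); the names `day_class_before`, `lexical_max`, and `numeric_max` are only illustrative.

```r
# Rebuild the question's data: matrix() coerces all values to character
a <- as.data.frame(matrix(c("1", "1", "1", "1", "2", "2",
                            "A first day", "A last day", "B first day",
                            "B last day", "A first day", "A last day",
                            3, 5, 10, 14, 2, 5), ncol = 3))
colnames(a) <- c("Patient", "Treatment", "Day")

# Day is stored as text, not as a number
day_class_before <- class(a$Day)

# Character comparison is lexicographic, so "10" sorts before "3"
lexical_max <- max(c("3", "10"))
numeric_max <- max(as.integer(c("3", "10")))

# Converting first makes min() and max() behave numerically
a$Day <- as.integer(a$Day)
```

Without the conversion, `min(Day)` and `max(Day)` would compare strings, which can give the wrong endpoints as soon as the day numbers differ in length.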

bxgwgixi2#

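First, remove the superfluous text using `sub`. Then use `by` to split on the ID variables (in this case columns `1:2`), and `cbind` the first row of each piece, minus the day variable (`-3`), with the sequence obtained by applying `:` through `do.call`. Finally, `rbind` everything. I added some extra columns to demonstrate that they are retained.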

a$Treatment <- sub('\\s.*', '', a$Treatment)
by(a, a[1:2], \(x) cbind(x[1, -3], Day=do.call(`:`, as.list(as.integer(sort(x[[3]])))), row.names=NULL)) |> 
  do.call(what=rbind)
#    Patient Treatment Sex   X Day
# 1        1         A   m 0.5   3
# 2        1         A   m 0.5   4
# 3        1         A   m 0.5   5
# 4        2         A   f 0.4   2
# 5        2         A   f 0.4   3
# 6        2         A   f 0.4   4
# 7        2         A   f 0.4   5
# 8        1         B   f 0.3  10
# 9        1         B   f 0.3  11
# 10       1         B   f 0.3  12
# 11       1         B   f 0.3  13
# 12       1         B   f 0.3  14
*Data:*
a <- structure(list(Patient = c(1L, 1L, 1L, 1L, 2L, 2L), Treatment = c("A first day", 
"A last day", "B first day", "B last day", "A first day", "A last day"
), Day = c(3L, 5L, 10L, 14L, 2L, 5L), Sex = c("m", "m", "f", 
"f", "f", "f"), X = c(0.5, 0.5, 0.3, 0.3, 0.4, 0.4)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))
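
The one-liner above is fairly dense, so here is a step-by-step sketch of the same idea in base R. It is not the answerer's code: it swaps `by` for `split` plus `lapply`, and the helper name `expand_days` is hypothetical. The data is the same as in this answer.

```r
a <- data.frame(
  Patient = c(1L, 1L, 1L, 1L, 2L, 2L),
  Treatment = c("A first day", "A last day", "B first day",
                "B last day", "A first day", "A last day"),
  Day = c(3L, 5L, 10L, 14L, 2L, 5L),
  Sex = c("m", "m", "f", "f", "f", "f"),
  X = c(0.5, 0.5, 0.3, 0.3, 0.4, 0.4)
)

# Step 1: keep only the treatment label ("A first day" becomes "A")
a$Treatment <- sub("\\s.*", "", a$Treatment)

# Step 2: one data frame per Patient/Treatment combination that occurs
pieces <- split(a, list(a$Patient, a$Treatment), drop = TRUE)

# Step 3: repeat the first row of a piece once per day and attach the days
# (expand_days is an illustrative helper, not part of any package)
expand_days <- function(x) {
  days <- seq(min(x$Day), max(x$Day))
  keep <- x[rep(1, length(days)), names(x) != "Day", drop = FALSE]
  keep$Day <- days
  keep
}

# Step 4: stack the expanded pieces and reset the row names
out <- do.call(rbind, lapply(pieces, expand_days))
rownames(out) <- NULL
```

Using `min` and `max` instead of sorting the two values makes the intent a little more explicit, and the extra columns (`Sex`, `X`) are carried along because the whole first row is repeated.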

6fe3ivhb3#

Here is a solution that combines `complete` and `fill`:

library(dplyr)
library(tidyr)

a %>%
  separate(Treatment, into = c("Treatment", "first_last"), sep = " ", extra = "drop") %>% 
  group_by(Patient, Treatment) %>%
  mutate(Day = as.integer(Day)) %>% 
  complete(Day = seq(min(Day), max(Day))) %>%
  fill(Treatment, .direction = "updown") %>% 
  select(-first_last) %>% 
  ungroup()
# A tibble: 12 × 3
   Patient Treatment   Day
   <chr>   <chr>     <int>
 1 1       A             3
 2 1       A             4
 3 1       A             5
 4 1       B            10
 5 1       B            11
 6 1       B            12
 7 1       B            13
 8 1       B            14
 9 2       A             2
10 2       A             3
11 2       A             4
12 2       A             5
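
To see what the complete-then-fill pattern is doing, here is a rough base R analogue for a single patient/treatment group. It is only a sketch of the idea, not how tidyr implements `complete` or `fill`; the data frame names `grp`, `all_days`, and `completed` are illustrative.

```r
# One group after the treatment label has been separated from "first"/"last"
# (values taken from the question: patient 1, treatment A)
grp <- data.frame(Patient = "1", Treatment = "A",
                  first_last = c("first", "last"), Day = c(3L, 5L))

# Every day between the first and the last day of the group
all_days <- data.frame(Day = seq(min(grp$Day), max(grp$Day)))

# "Complete": left-join the group onto the full day range;
# day 4 has no match, so its other columns come back as NA
completed <- merge(all_days, grp, by = "Day", all.x = TRUE)
na_before <- is.na(completed$Treatment)

# "Fill": the group columns are constant, so copy them into the gaps,
# then drop the helper column that is no longer needed
completed$Patient <- grp$Patient[1]
completed$Treatment <- grp$Treatment[1]
completed$first_last <- NULL
```

The missing values created by the join are the gaps that the fill step closes.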
