R: Reshape a data frame from wide to long, creating a new column with the event name

eivnm1vs  posted on 2023-03-27 in Other

I have a data frame that contains the start and end times of various events on different dates. The data frame looks like this:

Date    Event 1 start   Event 1 end Event 2 start   Event 2 end Event 3 start   Event 3 end
01/01/2020  06:35   18:02   06:13   18:24   03:15   15:18
02/01/2020  06:36   18:02   06:13   18:24   04:02   15:57
03/01/2020  06:36   18:01   06:13   18:24   04:50   16:37
04/01/2020  06:36   18:01   06:14   18:24   05:39   17:21
05/01/2020  06:36   18:01   06:14   18:23   06:28   18:06
06/01/2020  06:37   18:01   06:14   18:23   07:18   18:54
07/01/2020  06:37   18:01   06:14   18:23   08:07   19:43
08/01/2020  06:37   18:01   06:15   18:23   08:56   20:34
09/01/2020  06:38   18:01   06:15   18:23   09:43   21:26
10/01/2020  07:38   19:01   07:15   19:23   10:43   22:26
11/01/2020  08:38   20:01   08:15   20:23   11:43   23:26
12/01/2020  09:38   21:01   09:15   21:23   12:43   00:26

I want to reshape it from wide to long, creating a new "Event" column, so I would expect to get something like this:

Date    Event   Start   End
01/01/2020  Event 1 06:24   18:10
01/01/2020  Event 2 18:10   18:54
01/01/2020  Event 3 18:54   23:59
02/01/2020  Event 1 06:24   18:10
02/01/2020  Event 2 18:10   18:54
02/01/2020  Event 3 18:54   23:59
03/01/2020  Event 1 06:24   18:10
03/01/2020  Event 2 18:10   18:54
03/01/2020  Event 3 18:54   23:59
04/01/2020  Event 1 06:24   18:10
04/01/2020  Event 2 18:10   18:54
04/01/2020  Event 3 18:54   23:59

I tried using the gather function (from tidyr), but it did not give me the result I wanted.
Any help would be much appreciated.
Thanks!

mspsb9vt1#

The column names are a bit messy, so one possible approach is to tidy them up before pivoting.

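library(dplyr)  # provides %>% and mutate()
library(tidyr)  # provides pivot_longer() and pivot_wider()

tibble::tribble(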
         ~Date, ~Event.1.start, ~Event.1.end, ~Event.2.start, ~Event.2.end, ~Event.3.start, ~Event.3.end,
  "01/01/2020",        "06:35",      "18:02",        "06:13",      "18:24",        "03:15",      "15:18",
  "02/01/2020",        "06:36",      "18:02",        "06:13",      "18:24",        "04:02",      "15:57",
  "03/01/2020",        "06:36",      "18:01",        "06:13",      "18:24",        "04:50",      "16:37",
  "04/01/2020",        "06:36",      "18:01",        "06:14",      "18:24",        "05:39",      "17:21",
  "05/01/2020",        "06:36",      "18:01",        "06:14",      "18:23",        "06:28",      "18:06",
  "06/01/2020",        "06:37",      "18:01",        "06:14",      "18:23",        "07:18",      "18:54",
  "07/01/2020",        "06:37",      "18:01",        "06:14",      "18:23",        "08:07",      "19:43",
  "08/01/2020",        "06:37",      "18:01",        "06:15",      "18:23",        "08:56",      "20:34",
  "09/01/2020",        "06:38",      "18:01",        "06:15",      "18:23",        "09:43",      "21:26",
  "10/01/2020",        "07:38",      "19:01",        "07:15",      "19:23",        "10:43",      "22:26",
  "11/01/2020",        "08:38",      "20:01",        "08:15",      "20:23",        "11:43",      "23:26",
  "12/01/2020",        "09:38",      "21:01",        "09:15",      "21:23",        "12:43",      "00:26"
  ) %>%
  pivot_longer(-Date) %>%
  # create separation to be able to pivot wider using values from start and end 
  mutate(event = readr::parse_number(stringr::str_replace(pattern = "\\.", replacement="", string=name)),
         name = stringr::str_extract(name, pattern="start|end")) %>%
  pivot_wider(id_cols = c(Date, event), names_from = name, values_from = value)

This produces:

# A tibble: 36 × 4
   Date       event start end  
   <chr>      <dbl> <chr> <chr>
 1 01/01/2020     1 06:35 18:02
 2 01/01/2020     2 06:13 18:24
 3 01/01/2020     3 03:15 15:18
 4 02/01/2020     1 06:36 18:02
 5 02/01/2020     2 06:13 18:24
 6 02/01/2020     3 04:02 15:57
 7 03/01/2020     1 06:36 18:01
 8 03/01/2020     2 06:13 18:24
 9 03/01/2020     3 04:50 16:37
10 04/01/2020     1 06:36 18:01
# … with 26 more rows
# ℹ Use `print(n = ...)` to see more rows

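Note that your actual dataset may differ slightly. I had to clean up the column names from the copy-paste you provided. It would probably be easier if you provided the str() of your data, but the main idea is here:
1. Pivot everything longer, keeping Date as the identifier
2. Create new variables from the name variable
3. Pivot wider, so that the new column supplies the names and values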
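
As a possible alternative to the three steps, pivot_longer() can split the column names itself through the special ".value" entry in names_to, which also keeps the "Event 1" style labels asked for in the question. This is only a minimal sketch: it assumes the dot-separated column names used in the answer's code, and the data frame df below is a hypothetical two-row, two-event subset of the question's data.

```r
library(tidyr)

# Hypothetical subset of the question's data (assumption: column names
# follow the "Event.<n>.<start|end>" form used in the answer's code).
df <- tibble::tribble(
  ~Date,        ~Event.1.start, ~Event.1.end, ~Event.2.start, ~Event.2.end,
  "01/01/2020", "06:35",        "18:02",      "06:13",        "18:24",
  "02/01/2020", "06:36",        "18:02",      "06:13",        "18:24"
)

# The first capture group becomes the `event` column; the second one
# (".value") names the output columns, here `start` and `end`.
long <- pivot_longer(
  df,
  cols = -Date,
  names_to = c("event", ".value"),
  names_pattern = "^(Event\\.\\d+)\\.(start|end)$"
)

# Turn "Event.1" into "Event 1" to match the requested labels.
long$event <- gsub(".", " ", long$event, fixed = TRUE)

print(long)
```

If the real column names contain spaces rather than dots, the regular expression in names_pattern would need to be adjusted to match.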
