Generate a column containing time (minutes) in R

mdfafbf1, posted on 2023-02-06 in Other

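In R, I have a data frame with 24400 observations (rows). In this data frame I have a column df$hours that contains the hour index (shown below). Basically, each hour repeats about 60 times before moving on to the next hour, until hour 23 is reached, and then it goes back to "00" again (as shown below).
df$hours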

[1] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[12] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[23] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[34] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[45] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[56] "00" "00" "00" "00" "00" "01" "01" "01" "01" "01" "01"
[67] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[78] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[89] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[100] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[111] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "02"
[122] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[133] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[144] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[155] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[166] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[177] "02" "02" "02" "02" "03" "03" "03" "03" "03" "03" "03"
[188] "03" "03" "03" "03" "03" "03" "03" "03" "03" "03" "03"
[199] "03" "03" "03" "03" "03" "03" "03" "03" "03" "03" "03"
[210] "03" "03" "03" "03" "03" "03" "03" "03" "03" "03" "03"
[221] "03" "03" "03" "03" "03" "03" "03" "03" "03" "03" "03"
[232] "03" "03" "03" "03" "03" "03" "03" "03" "03" "04" "04"
[243] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[254] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[265] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[276] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[287] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[298] "04" "04" "04" "05" "05" "05" "05" "05" "05" "05" "05"
[309] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05"
[320] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05"
[331] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05"
[342] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05"
[353] "05" "05" "05" "05" "05" "05" "05" "05" "06" "06" "06"
[364] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[375] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[386] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[397] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[408] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[419] "06" "06" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[430] "07" "07" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[441] "07" "07" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[452] "07" "07" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[463] "07" "07" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[474] "07" "07" "07" "07" "07" "07" "07" "08" "08" "08" "08"
[485] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[496] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[507] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[518] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[529] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[540] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[551] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[562] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[573] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[584] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[595] "18" "18" "18" "18" "18" "19" "19" "19" "19" "19" "19"
[606] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "19"
[617] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "19"
[628] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "19"
[639] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "19"
[650] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "20"
[661] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[672] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[683] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[694] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[705] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[716] "20" "20" "20" "20" "21" "21" "21" "21" "21" "21" "21"
[727] "21" "21" "21" "21" "21" "21" "21" "21" "21" "21" "21"
[738] "21" "21" "21" "21" "21" "21" "21" "21" "21" "21" "21"
[749] "21" "21" "21" "21" "21" "21" "21" "21" "21" "21" "21"
[760] "21" "21" "21" "21" "21" "21" "21" "21" "21" "21" "21"
[771] "21" "21" "21" "21" "21" "21" "21" "21" "21" "22" "22"
[782] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[793] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[804] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[815] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[826] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[837] "22" "22" "22" "23" "23" "23" "23" "23" "23" "23" "23"
[848] "23" "23" "23" "23" "23" "23" "23" "23" "23" "23" "23"
[859] "23" "23" "23" "23" "23" "23" "23" "23" "23" "23" "23"
[870] "23" "23" "23" "23" "23" "23" "23" "23" "23" "23" "23"
[881] "23" "23" "23" "23" "23" "23" "23" "23" "23" "23" "23"
[892] "23" "23" "23" "23" "23" "23" "00" "00" "00" "00" "00"
[903] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[914] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[925] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[936] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[947] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[958] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[969] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[980] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[991] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[ reached getOption("max.print") -- omitted 23459 entries ]

My goal is to generate another column, df$minutes, that shows the minutes within each hour. For example, following the order shown above, it would be:
| Column A | Column B |
| --- | --- |
| 0 | 1 |
| 0 | 2 |
| 0 | 3 |
| . | . |
| . | . |
| . | . |
| 1 | 61 |
| 1 | 62 |
| . | . |
| . | . |
| . | . |
| 23 | 1380 |
| 0 | 1 |
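Please note two things: (1) I do not have the exact number of minutes in every hour (sometimes it is 60, sometimes fewer); (2) I do not have every hour of the day (there is a gap between 8 and 18).
I tried the following, but I think it assumes that every hour has the same number of minutes (that is, 60):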

```r
timepf <- as.numeric(df$hours)
n <- length(timepf)

xtime <- numeric()
start <- 1
while (start <= n) {
  end <- min(start + 59, n)
  part <- timepf[start:end]
  xtime_part <- part * 60 + (1:length(part))
  xtime <- c(xtime, xtime_part)
  start <- end + 1
}
xtime
length(xtime)
```

xtime is the minutes column we want to create.
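To make the suspected problem concrete, here is a minimal sketch on made-up data. The vector `hours` below is hypothetical (it is not the real `df$hours`): hour "00" has only 58 rows, so the first chunk of 60 rows also takes in the first two rows of hour "01".

```r
# Hypothetical input: hour "00" has 58 rows, hour "01" has 60 rows
hours <- rep(c("00", "01"), times = c(58, 60))

timepf <- as.numeric(hours)
n <- length(timepf)

# Same fixed-size chunking as in the attempt above
xtime <- numeric()
start <- 1
while (start <= n) {
  end <- min(start + 59, n)
  part <- timepf[start:end]
  xtime_part <- part * 60 + seq_along(part)
  xtime <- c(xtime, xtime_part)
  start <- end + 1
}

# Rows 59, 60 and 61 are the first three rows of hour "01"
xtime[59:61]
# 119 120 61
```

Rows 59 and 60 should be 61 and 62, but they are numbered by their position inside the 60-row chunk rather than inside their hour, so the counter only restarts at row 61.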

yfjy0ee7


I'll demonstrate with much smaller data: only 3 hours per day and only 4 minutes per hour.

df <- data.frame(hour = rep(sprintf("%02i", rep(0:2, each=4)), 3))

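Note that my code does not require a full 60 rows per hour (60 is used only as the offset between hours), so it should work as-is on your data.
The first thing we need to do, though, is account for the "day": every time hour changes from a value other than "00" to "00", increment the day. After that, it simply becomes a summarize-by-group operation.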

library(dplyr)
df %>%
  mutate(day = cumsum(hour == "00" & lag(hour != "00", default = FALSE))) %>%
  group_by(day, hour) %>%
  mutate(minute = as.integer(hour) * 60 + row_number()) %>%
  ungroup()
# # A tibble: 36 × 3
#    hour    day minute
#    <chr> <int>  <dbl>
#  1 00        0      1
#  2 00        0      2
#  3 00        0      3
#  4 00        0      4
#  5 01        0     61
#  6 01        0     62
#  7 01        0     63
#  8 01        0     64
#  9 02        0    121
# 10 02        0    122
# # … with 26 more rows
# # ℹ Use `print(n = ...)` to see more rows

Take my word for it that at day = 1 (the second day), minute resets to 1 (or look at the base R code below, which prints all the values).
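For readers who would rather check than trust, the following sketch reruns the same pipeline on the small demo data and keeps only the first row of each day. The names `res` and `first_rows` are introduced here for illustration only.

```r
library(dplyr)

# Same demo data as above: 3 days, 3 hours per day, 4 rows per hour
df <- data.frame(hour = rep(sprintf("%02i", rep(0:2, each = 4)), 3))

res <- df %>%
  mutate(day = cumsum(hour == "00" & lag(hour != "00", default = FALSE))) %>%
  group_by(day, hour) %>%
  mutate(minute = as.integer(hour) * 60 + row_number()) %>%
  ungroup()

# Keep the first row of each day; minute should be 1 in every one of them
first_rows <- res %>%
  group_by(day) %>%
  slice(1) %>%
  ungroup()

print(first_rows)
```

If the day counter were wrong, some of these rows would show a minute value other than 1.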

Base R

df$day <- cumsum(df$hour == "00" & c(FALSE, df$hour[-nrow(df)] != "00"))

I'll note that if you think a day might restart without any "00" hour, you can determine day another way, by assuming that any *decrease* in hour means a new day:

df$day2 <- cumsum(c(FALSE, diff(as.integer(df$hour)) < 0))
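The difference between the two day counters only shows up when a day lacks "00" rows. Here is a small sketch on a hypothetical `hour` vector (not the real data) in which the second day starts at "01". The decrease-based counter is padded with a leading `FALSE` so that it yields one value per row.

```r
# Hypothetical hours: day 1 = 00,00,01,02; day 2 starts at "01"; day 3 = 00,01
hour <- c("00", "00", "01", "02", "01", "01", "02", "00", "01")

# Counter 1: new day only when hour switches from a non-"00" value to "00"
day <- cumsum(hour == "00" & c(FALSE, hour[-length(hour)] != "00"))

# Counter 2: new day whenever hour decreases
day2 <- cumsum(c(FALSE, diff(as.integer(hour)) < 0))

data.frame(hour, day, day2)
# day  : 0 0 0 0 0 0 0 1 1
# day2 : 0 0 0 0 1 1 1 2 2
```

The first counter misses the day that begins at "01", while the second one catches it. The second counter, in turn, relies on hours never decreasing within a day.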

From here, we'll use ave:

df$minute <- ave(as.integer(df$hour), df[c("day","hour")], FUN = function(z) as.integer(z) * 60 + seq_along(z))
df
#    hour day minute
# 1    00   0      1
# 2    00   0      2
# 3    00   0      3
# 4    00   0      4
# 5    01   0     61
# 6    01   0     62
# 7    01   0     63
# 8    01   0     64
# 9    02   0    121
# 10   02   0    122
# 11   02   0    123
# 12   02   0    124
# 13   00   1      1
# 14   00   1      2
# 15   00   1      3
# 16   00   1      4
# 17   01   1     61
# 18   01   1     62
# 19   01   1     63
# 20   01   1     64
# 21   02   1    121
# 22   02   1    122
# 23   02   1    123
# 24   02   1    124
# 25   00   2      1
# 26   00   2      2
# 27   00   2      3
# 28   00   2      4
# 29   01   2     61
# 30   01   2     62
# 31   01   2     63
# 32   01   2     64
# 33   02   2    121
# 34   02   2    122
# 35   02   2    123
# 36   02   2    124
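The demo data above is perfectly regular, so as a further check here is a sketch on hypothetical data that resembles the situation in the question: hours have different numbers of rows and there is a gap between "08" and "18". The `hours` and `counts` vectors are invented for illustration.

```r
# Hypothetical data: unequal rows per hour, no hours between "08" and "18"
hours  <- c("00", "01", "08", "18", "23", "00", "01")
counts <- c(3, 2, 4, 1, 2, 2, 3)
df <- data.frame(hour = rep(hours, times = counts))

# Day counter: increment when hour switches from a non-"00" value to "00"
df$day <- cumsum(df$hour == "00" & c(FALSE, df$hour[-nrow(df)] != "00"))

# Minute: 60 * hour plus the row position within each (day, hour) group
df$minute <- ave(as.integer(df$hour), df[c("day", "hour")],
                 FUN = function(z) z * 60 + seq_along(z))
df
```

Each hour starts at `hour * 60 + 1` no matter how many rows the previous hour had, and the counter restarts at 1 on the second day.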
