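In R
I have a data frame with 24,400 observations (rows). It contains a column, `df$hours`, that holds the hour index (shown below). Each hour repeats about 60 times before moving on to the next hour, up to hour 23, and then the sequence starts again at "00":
df$hours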
[1] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[12] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[23] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[34] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[45] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[56] "00" "00" "00" "00" "00" "01" "01" "01" "01" "01" "01"
[67] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[78] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[89] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[100] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[111] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "02"
[122] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[133] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[144] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[155] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[166] "02" "02" "02" "02" "02" "02" "02" "02" "02" "02" "02"
[177] "02" "02" "02" "02" "03" "03" "03" "03" "03" "03" "03"
[188] "03" "03" "03" "03" "03" "03" "03" "03" "03" "03" "03"
[199] "03" "03" "03" "03" "03" "03" "03" "03" "03" "03" "03"
[210] "03" "03" "03" "03" "03" "03" "03" "03" "03" "03" "03"
[221] "03" "03" "03" "03" "03" "03" "03" "03" "03" "03" "03"
[232] "03" "03" "03" "03" "03" "03" "03" "03" "03" "04" "04"
[243] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[254] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[265] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[276] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[287] "04" "04" "04" "04" "04" "04" "04" "04" "04" "04" "04"
[298] "04" "04" "04" "05" "05" "05" "05" "05" "05" "05" "05"
[309] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05"
[320] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05"
[331] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05"
[342] "05" "05" "05" "05" "05" "05" "05" "05" "05" "05" "05"
[353] "05" "05" "05" "05" "05" "05" "05" "05" "06" "06" "06"
[364] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[375] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[386] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[397] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[408] "06" "06" "06" "06" "06" "06" "06" "06" "06" "06" "06"
[419] "06" "06" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[430] "07" "07" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[441] "07" "07" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[452] "07" "07" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[463] "07" "07" "07" "07" "07" "07" "07" "07" "07" "07" "07"
[474] "07" "07" "07" "07" "07" "07" "07" "08" "08" "08" "08"
[485] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[496] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[507] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[518] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[529] "08" "08" "08" "08" "08" "08" "08" "08" "08" "08" "08"
[540] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[551] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[562] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[573] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[584] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
[595] "18" "18" "18" "18" "18" "19" "19" "19" "19" "19" "19"
[606] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "19"
[617] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "19"
[628] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "19"
[639] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "19"
[650] "19" "19" "19" "19" "19" "19" "19" "19" "19" "19" "20"
[661] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[672] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[683] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[694] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[705] "20" "20" "20" "20" "20" "20" "20" "20" "20" "20" "20"
[716] "20" "20" "20" "20" "21" "21" "21" "21" "21" "21" "21"
[727] "21" "21" "21" "21" "21" "21" "21" "21" "21" "21" "21"
[738] "21" "21" "21" "21" "21" "21" "21" "21" "21" "21" "21"
[749] "21" "21" "21" "21" "21" "21" "21" "21" "21" "21" "21"
[760] "21" "21" "21" "21" "21" "21" "21" "21" "21" "21" "21"
[771] "21" "21" "21" "21" "21" "21" "21" "21" "21" "22" "22"
[782] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[793] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[804] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[815] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[826] "22" "22" "22" "22" "22" "22" "22" "22" "22" "22" "22"
[837] "22" "22" "22" "23" "23" "23" "23" "23" "23" "23" "23"
[848] "23" "23" "23" "23" "23" "23" "23" "23" "23" "23" "23"
[859] "23" "23" "23" "23" "23" "23" "23" "23" "23" "23" "23"
[870] "23" "23" "23" "23" "23" "23" "23" "23" "23" "23" "23"
[881] "23" "23" "23" "23" "23" "23" "23" "23" "23" "23" "23"
[892] "23" "23" "23" "23" "23" "23" "00" "00" "00" "00" "00"
[903] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[914] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[925] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[936] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[947] "00" "00" "00" "00" "00" "00" "00" "00" "00" "00" "00"
[958] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[969] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[980] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[991] "01" "01" "01" "01" "01" "01" "01" "01" "01" "01"
[ reached getOption("max.print") -- omitted 23459 entries ]
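My goal is to generate another column, `df$minutes`, that gives the minute number for each row within its hour. For example, following the order shown above, it would be: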
| Column A | Column B |
|----------|----------|
| 0 | 1 |
| 0 | 2 |
| 0 | 3 |
| . | . |
| . | . |
| . | . |
| 1 | 61 |
| 1 | 62 |
| . | . |
| . | . |
| . | . |
| 23 | 138 |
| 0 | 1 |
| 0 | 1 |
| 0 | 2 |
| 0 | 3 |
| . | . |
| . | . |
| . | . |
| 1 | 61 |
| 1 | 62 |
| . | . |
| . | . |
| . | . |
| 23 | 138 |
| 0 | 1 |
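Please note two things:
1. I do not have the same number of rows for every hour (sometimes 60 minutes, sometimes fewer).
2. I do not have every hour of the day (there is a gap between hours 8 and 18).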
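Both points can be checked directly with `rle()`, which reports the length of each consecutive run of equal values. A minimal sketch on a small made-up vector (the `hours` values here are hypothetical, not the real data):

```r
# Hypothetical stand-in for df$hours: uneven run lengths and a jump from "01" to "18".
hours <- c("00", "00", "00", "01", "01", "18", "18", "18", "18")

# rle() collapses consecutive repeats into (value, length) pairs.
runs <- rle(hours)

# One row per run: which hour it was and how many rows it covered.
data.frame(hour = runs$values, n = runs$lengths)
```

On the real column, `rle(df$hours)$lengths` would show which hours have fewer than 60 rows.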
I tried the following, but I think it assumes that every hour has the same number of minutes (that is, 60):
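```r
timepf <- as.numeric(df$hours)
n <- length(timepf)
xtime <- numeric()
start <- 1
while (start <= n) {
  # take the next chunk of at most 60 rows (this is where 60 rows per hour is assumed)
  end <- min(start + 59, n)
  part <- timepf[start:end]
  # hour * 60 plus the position within the chunk
  xtime_part <- part * 60 + (1:length(part))
  xtime <- c(xtime, xtime_part)
  start <- end + 1
}
xtime
length(xtime)
```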
`xtime` is the minutes column I want to create.
1 Answer
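I'll demonstrate on much smaller data, with only 3 hours per day and only 4 minutes per hour.
Note that my code assumes a full 60 minutes per hour, so it should work as-is on your data.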
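A minimal sketch of demo data with that shape, assuming a data frame named `df` with a character column `hour` (the values and the two-day length are hypothetical, chosen only for illustration):

```r
# Hypothetical demo data: 2 days x 3 hours ("00", "01", "02") x 4 rows per hour.
hrs <- sprintf("%02d", 0:2)
df <- data.frame(
  hour = rep(rep(hrs, each = 4), times = 2),
  stringsAsFactors = FALSE
)

# Quick look: 24 rows, each hour appearing 8 times in total.
nrow(df)
table(df$hour)
```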
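The first thing we need to do, though, is account for "days": every time `hour` changes from a value other than "00" to "00", increment the day. After that, it simply becomes a summarize-by-group problem.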
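One way this could look, as a sketch rather than a definitive implementation. It assumes the dplyr package is available, uses the hypothetical demo data described above, and assumes the target value is `hour * 60` plus the row's position within its hour, as in the question's attempt:

```r
library(dplyr)

# Hypothetical demo data: 2 days x 3 hours x 4 rows per hour.
hrs <- sprintf("%02d", 0:2)
df <- data.frame(hour = rep(rep(hrs, each = 4), times = 2),
                 stringsAsFactors = FALSE)

out <- df %>%
  # New day whenever hour is "00" and the previous row was not "00";
  # default = "00" keeps the first row in day 0.
  mutate(day = cumsum(hour == "00" & dplyr::lag(hour, default = "00") != "00")) %>%
  # Within each day/hour group, count rows and offset by hour * 60.
  group_by(day, hour) %>%
  mutate(minute = as.integer(hour) * 60L + row_number()) %>%
  ungroup()

out
```

Because the counter is taken per group, an hour with fewer than 60 rows simply stops early, and a missing hour leaves a gap in `minute` instead of shifting later values.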
Trust me that at `day = 1` (the second day), `minute` resets to `1` (or look at the base R code below, which shows the values).
Base R
I'll note that if you think a day might restart without any "00" hour, you can determine `day` another way, by assuming that any *decrease* in `hour` means a new day.
From here, we'll use `ave`:
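A minimal base R sketch of both steps, under the same assumptions as before: hypothetical demo data in `df$hour`, and a target of `hour * 60` plus the position within the hour. The `day` and `minute` column names are illustrative.

```r
# Hypothetical demo data: 2 days x 3 hours x 4 rows per hour.
hrs <- sprintf("%02d", 0:2)
df <- data.frame(hour = rep(rep(hrs, each = 4), times = 2),
                 stringsAsFactors = FALSE)

h <- as.integer(df$hour)

# Any decrease in hour starts a new day (zero-based counter).
df$day <- cumsum(c(FALSE, diff(h) < 0))

# ave() applies seq_along() within each day/hour group and returns
# a vector aligned with the original rows.
df$minute <- h * 60L + ave(h, df$day, df$hour, FUN = seq_along)

df
```

The decrease-based rule also handles data where a day begins at an hour other than "00", since it only depends on the hour going down from one row to the next.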