pandas: Get the time in hours of a time series in Python

Posted by kse8i1jr on 2023-01-11 in Python


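This might seem trivial, but I have a list of data points that are recorded every 5 minutes with an overlap of 2.5 minutes (two and a half minutes). I also have the timestamp of the start of the recording and another timestamp from which I need to start counting the time (e.g. the chronometer start).
I need to calculate how many hours have passed from the chronometer start to the end of each recording, and create a DataFrame with the recordings in one column and, in the other, the hour since the chronometer start to which each recording belongs. For example: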
| recording | hours since chronometer start |
| --- | --- |
| 0.262 | 0 |
| 0.243 | 0 |
| 0.263 | 0 |
| 0.342 | 1 |
| 0.765 | 1 |
| 0.111 | 1 |
| ... | ... |
This is how I do it in Python:

import numpy as np
import pandas as pd
from math import floor

recordings = list(np.random.rand(1000)) # an example of recording

chronometer_start = 1670000000 #timestamp
start_recording = 1673280570 #timestamp
gap_in_seconds = start_recording - chronometer_start

# given that the recordings are of 5 minutes each but with 2.5 minutes overlap,
# I can calculate how many Null values to add at the beginning of the recording to
# fill the gap from the chronometer start:
gap_in_n_records = round(gap_in_seconds / 60 / 2.5)

# fill the gap with null values
recordings = [np.nan for _ in range(gap_in_n_records)] + recordings 

minutes = [5] # the first recording has no overlap
for _ in range(len(recordings)-1):
    minutes += [minutes[-1]+2.5]
hours = pd.Series(minutes).apply(lambda x: floor(x/60))

df = pd.DataFrame({
    'recording' : recordings,
    'hour' : hours
})

But I am afraid I am making a mistake somewhere, because my data do not match my results. Is there a better way to do this?

pandas

Source: https://stackoverflow.com/questions/75060194/get-the-time-in-hours-of-a-time-series-in-python

1 Answer


bq3bfh9z1#

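First, let me summarize to check that I understood correctly. You have a chronometer that started at some point in time (possibly days or weeks ago), and you have data points that each take 5 minutes. What you are looking for is the hour (counted from the chronometer start) in which each data point ends.
For the first 5 recordings, that would be: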
| recording index | minutes since recording start |
| --- | --- |
| 1 | 5 |
| 2 | 7.5 |
| 3 | 10 |
| 4 | 12.5 |
| 5 | 15 |
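So we can summarize this in a formula:
Time elapsed since the start of the recording when data point n ends (in minutes): 5 + (n - 1) * 2.5
We can use this formula together with the index of the DataFrame to calculate the time elapsed since the start of the recording, and then add the time that passed between the chronometer start and the recording start: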

import numpy as np
import pandas as pd

df = pd.DataFrame({"recordings": np.random.rand(1000)})

chronometer_start = 1670000000  # timestamp
start_recording = 1673280570  # timestamp
gap_in_seconds = start_recording - chronometer_start  

# since the index of a pandas DataFrame starts at 0, we can make use of that (idx = n - 1)
# all terms are in seconds: 5 min for the first recording, then 2.5 min per further recording
df["seconds_passed_since_chronometer_start"] = 5 * 60 + df.index * (2.5 * 60) + gap_in_seconds

# assuming that the first hour after the chronometer starts is hour 0, the column would be:
df["hours"] = (df["seconds_passed_since_chronometer_start"] // 3600).astype(int)

final_df = df[["recordings", "hours"]]
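As a cross-check, the same calculation can be wrapped in a small function and tested against the table of end times above. This is only a sketch: the function name `hours_since_chronometer` and the parameters `length_s` and `step_s` are illustrative and not part of the original answer, and it assumes hour 0 is the first hour after the chronometer start.

```python
import numpy as np
import pandas as pd


def hours_since_chronometer(n_records, chronometer_start, start_recording,
                            length_s=5 * 60, step_s=2.5 * 60):
    """Return the 0-based hour (since chronometer start) in which each recording ends.

    Hypothetical helper: length_s is the duration of one recording and
    step_s is the offset between consecutive recordings, both in seconds.
    """
    idx = np.arange(n_records)  # 0-based, so idx = n - 1
    # End of each recording, in seconds since the recording started.
    end_since_recording = length_s + idx * step_s
    # Shift by the gap between chronometer start and recording start.
    end_since_chronometer = end_since_recording + (start_recording - chronometer_start)
    return pd.Series((end_since_chronometer // 3600).astype(int), name="hours")


# Same timestamps as in the question.
hours = hours_since_chronometer(1000, 1670000000, 1673280570)

# Sanity check: with no gap, the first five end times in minutes
# should reproduce the table above.
first_five = (5 * 60 + np.arange(5) * 2.5 * 60) / 60
print(first_five.tolist())  # [5.0, 7.5, 10.0, 12.5, 15.0]
```

Keeping every term in seconds until the final integer division avoids mixing minutes and seconds, which is an easy mistake to make in this kind of calculation.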

Posted on 2023-01-11
