matplotlib: How to plot binned datetimes from pd.cut

lnxxn5zx  posted on 2023-05-07  in Other

I have data like this:

LFrame  Date_Time   DoW run_time    az  el  distance    pass_ID SV_ID   Direction   PFD_Jy
0   3114360965  2023-03-29 17:25:20 Wednesday   62720.0 349.254117  12.199639   2.171043e+06    2023_03_29_154  154 SB  8.505332
1   3114360977  2023-03-29 17:25:21 Wednesday   62721.0 349.216316  12.294878   2.164688e+06    2023_03_29_154  154 SB  1085.548185
2   3114360988  2023-03-29 17:25:22 Wednesday   62722.0 349.178240  12.390492   2.158335e+06    2023_03_29_154  154 SB  515.828602
3   3114360999  2023-03-29 17:25:23 Wednesday   62723.0 349.139888  12.486484   2.151987e+06    2023_03_29_154  154 SB  344.120530
4   3114361010  2023-03-29 17:25:24 Wednesday   62724.0 349.101256  12.582857   2.145641e+06    2023_03_29_154  154 SB  37.207705

...
I bin it, because it is sampled at 1-second intervals and I don't want to plot it that densely:

binned = SV_pfd_data.groupby(pd.cut(SV_pfd_data.Date_Time, SV_pfd_data.shape[0]//5), as_index=True).mean() # ~5 s bins (about 5 samples per bin)
binned = binned.reset_index()

This gives the following data:

Date_Time   LFrame  run_time    az  el  distance    SV_ID   PFD_Jy
0   (2023-03-29 17:25:19.408999936, 2023-03-29 17:...   3.114361e+09    62722.5 349.158693  12.438994   2.155165e+06    154.0   448.731927
1   (2023-03-29 17:25:25.008474624, 2023-03-29 17:...   3.114361e+09    62728.0 348.943580  12.972617   2.120295e+06    154.0   213.259464
2   (2023-03-29 17:25:30.016949248, 2023-03-29 17:...   3.114361e+09    62733.0 348.740192  13.468287   2.088688e+06    154.0   556.595627
3   (2023-03-29 17:25:35.025423616, 2023-03-29 17:...   3.114361e+09    62738.0 348.529055  13.974280   2.057173e+06    154.0   872.418091
  • Note that the binned date/time edges have better-than-microsecond resolution; I'm not sure why.
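The odd-looking bin edges are probably a side effect of how pd.cut treats an integer bins argument: it divides the full data range into that many equal-width bins, so the edges land wherever the arithmetic puts them, at the resolution of the underlying datetime64 dtype. A minimal sketch with synthetic 1-second data (the name `times` and its values are made up for illustration):

```python
import pandas as pd

# Synthetic stand-in for Date_Time: 60 samples at 1-second spacing.
times = pd.Series(pd.date_range("2023-03-29 17:25:20", periods=60, freq="s"))

# An integer `bins` asks for that many equal-width bins over the data range.
cut = pd.cut(times, bins=len(times) // 5)

# Each category is an Interval whose endpoints are Timestamps, not datetimes
# that matplotlib can place on an axis directly.
edges = cut.cat.categories
print(edges[:3])
```

Printing `edges` should show interval endpoints that fall between the original samples, which matches the fractional-second edges in the table above.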

But when I plot it:

timeOnly = matdates.DateFormatter('%H:%M:%S')

fig, ax = plt.subplots(figsize=(10,5))
ax.plot_date(binned.Date_Time, binned.PFD_Jy,
       label=r"$\nu$=1611.1")
ax.set_ylabel("PFD [Jy]")
ax.set_xlabel("Date-time")
ax.set_xticklabels(SV_pfd_data.Date_Time, rotation = 65, fontsize=10)
ax.xaxis.set_major_formatter(timeOnly)

I get an error:

OverflowError: int too big to convert

af7jpaap #1

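You need to convert the binned Date_Time column back to datetime objects. pd.cut labels each bin with an Interval, which pd.to_datetime cannot parse, so take each bin's midpoint instead:

binned = SV_pfd_data.groupby(pd.cut(SV_pfd_data.Date_Time, SV_pfd_data.shape[0]//5)).mean(numeric_only=True)
binned = binned.reset_index()  # move the Interval bins from the index into a column

binned['Date_Time'] = pd.IntervalIndex(binned['Date_Time']).mid  # bin midpoints as Timestamps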

timeOnly = matdates.DateFormatter('%H:%M:%S')
fig, ax = plt.subplots(figsize=(10,5))
ax.plot(binned.Date_Time, binned.PFD_Jy, 'o', label=r"$\nu$=1611.1")  # plot_date is deprecated
ax.set_ylabel("PFD [Jy]")
ax.set_xlabel('Date-time')
ax.xaxis.set_major_formatter(timeOnly)
ax.tick_params(axis='x', labelrotation=65, labelsize=10)  # rotate ticks; the formatter supplies the text
plt.show()
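To check the conversion step on its own, here is a small self-contained sketch. It assumes a synthetic frame `df` standing in for SV_pfd_data (the values are random, not the asker's data) and uses the interval midpoint as the representative time of each bin; the left or right edge would work just as well.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for SV_pfd_data (hypothetical values, 1-second sampling).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Date_Time": pd.date_range("2023-03-29 17:25:20", periods=60, freq="s"),
    "PFD_Jy": rng.uniform(0.0, 1000.0, size=60),
})

# Bin into ~5-sample bins, average, and move the bins back into a column.
bins = pd.cut(df["Date_Time"], df.shape[0] // 5)
binned = df.groupby(bins, observed=False)["PFD_Jy"].mean().reset_index()

# Replace each Interval with its midpoint, which is an ordinary Timestamp.
binned["Date_Time"] = pd.IntervalIndex(binned["Date_Time"]).mid

print(binned.dtypes)
```

After this, `binned["Date_Time"]` is a regular datetime column and can be passed to matplotlib like any other.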

sg24os4d #2

About this line:

binned = SV_pfd_data.groupby(pd.cut(SV_pfd_data.Date_Time, SV_pfd_data.shape[0]//5), as_index=True).mean() # ~5 s bins (about 5 samples per bin)

Don't use cut, because it returns intervals as bins; use pd.Grouper to group by time interval instead:

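# '5min' gives 5-minute bins ('5T' is the older alias); use '5s' for ~5-second bins
binned = SV_pfd_data.groupby(pd.Grouper(key='Date_Time', freq='5min')).mean(numeric_only=True)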
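As a quick illustration of why this helps, the sketch below groups a synthetic frame (hypothetical values, with `df` standing in for SV_pfd_data) by fixed 1-minute windows and confirms that the result is indexed by plain timestamps. It also shows `resample`, which I believe is an equivalent way to write the same grouping.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for SV_pfd_data: 10 minutes of 1-second samples.
df = pd.DataFrame({
    "Date_Time": pd.date_range("2023-03-29 17:25:20", periods=600, freq="s"),
    "PFD_Jy": np.linspace(0.0, 599.0, 600),
})

# Group by fixed 1-minute windows; the bin labels are plain Timestamps.
binned = df.groupby(pd.Grouper(key="Date_Time", freq="1min"))["PFD_Jy"].mean()

# resample() is another spelling of the same time-based grouping.
resampled = df.resample("1min", on="Date_Time")["PFD_Jy"].mean()

print(type(binned.index).__name__)
```

Because the index is a DatetimeIndex and not a set of Interval objects, no conversion step is needed before plotting.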

Note that you don't need to reset the index of binned to plot, and that plot_date is deprecated:

timeOnly = matdates.DateFormatter('%H:%M:%S')

fig, ax = plt.subplots(figsize=(10,5))

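# pass ax to pandas plot; select PFD_Jy so the other columns are not drawn,
# and set x_compat=True so the matplotlib date formatter below applies
binned['PFD_Jy'].plot(label=r"$\nu$=1611.1", ax=ax, x_compat=True)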
ax.set_ylabel("PFD [Jy]")
ax.set_xlabel("Date-time")

# why ticks as original series when your data is grouped?
# ax.set_xticklabels(SV_pfd_data.Date_Time, rotation = 65, fontsize=10)

ax.xaxis.set_major_formatter(timeOnly)
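If you would rather stay with Matplotlib's own date handling and skip the pandas plotting wrapper, a sketch along these lines should also work. The series `binned` here is a made-up stand-in for the grouped PFD_Jy values, and the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the binned result (hypothetical values).
idx = pd.date_range("2023-03-29 17:25:00", periods=12, freq="5s")
binned = pd.Series(np.linspace(10.0, 120.0, 12), index=idx, name="PFD_Jy")

fig, ax = plt.subplots(figsize=(10, 5))
# Plain ax.plot accepts datetimes directly, so plot_date is not needed.
ax.plot(binned.index, binned.to_numpy(), "o-", label=r"$\nu$=1611.1")
ax.set_ylabel("PFD [Jy]")
ax.set_xlabel("Date-time")
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
# Rotate tick labels without replacing their text.
ax.tick_params(axis="x", labelrotation=65, labelsize=10)
ax.legend()
fig.canvas.draw()  # render once; call plt.show() or fig.savefig(...) as needed
```

Using `tick_params` for the rotation avoids `set_xticklabels`, which would pin the labels to fixed strings that no longer follow the data.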
