Plotting time on the x-axis with pandas

Posted by 9lowa7mx on 2023-05-12 in Other

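I am working with a dataset that contains only datetime objects. I have extracted the day of the week and reformatted the time into separate columns, as shown below (the conversion functions follow further down):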

    datetime            day_of_week time_of_day
0   2021-06-13 12:56:16 Sunday      20:00:00
5   2021-06-13 12:56:54 Sunday      20:00:00
6   2021-06-13 12:57:27 Sunday      20:00:00
7   2021-07-16 18:55:42 Friday      20:00:00
8   2021-07-16 18:56:03 Friday      20:00:00
9   2021-06-04 18:42:06 Friday      20:00:00
10  2021-06-04 18:49:05 Friday      20:00:00
11  2021-06-04 18:58:22 Friday      20:00:00

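What I want is a KDE plot with x-axis = time_of_day (from 00:00:00 to 23:59:59), where the y-axis is the count for each day_of_week in each hour of the day, and hue = day_of_week. In essence, I should end up with seven different distributions, one for the events on each day of the week.
Below are a data sample and my code. Any help would be greatly appreciated: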

import calendar

import pandas as pd

df = pd.DataFrame([
    '2021-06-13 12:56:16',
    '2021-06-13 12:56:16',
    '2021-06-13 12:56:16',
    '2021-06-13 12:56:16',
    '2021-06-13 12:56:54',
    '2021-06-13 12:56:54',
    '2021-06-13 12:57:27',
    '2021-07-16 18:55:42',
    '2021-07-16 18:56:03',
    '2021-06-04 18:42:06',
    '2021-06-04 18:49:05',
    '2021-06-04 18:58:22',
    '2021-06-08 21:31:44',
    '2021-06-09 02:14:30',
    '2021-06-09 02:20:19',
    '2021-06-12 18:05:47',
    '2021-06-15 23:46:41',
    '2021-06-15 23:47:18',
    '2021-06-16 14:19:08',
    '2021-06-17 19:08:17',
    '2021-06-17 22:37:27',
    '2021-06-21 23:31:32',
    '2021-06-23 20:32:09',
    '2021-06-24 16:04:21',
    '2020-05-22 18:29:02',
    '2020-05-22 18:29:02',
    '2020-05-22 18:29:02',
    '2020-05-22 18:29:02',
    '2020-08-31 21:38:07',
    '2020-08-31 21:38:22',
    '2020-08-31 21:38:42',
    '2020-08-31 21:39:03',
], columns=['datetime'])

def convert_date(date):
    return calendar.day_name[date.weekday()]

def convert_hour(time):
    return time[:2]+':00:00'

df['day_of_week'] = pd.to_datetime(df['datetime']).apply(convert_date)
df['time_of_day'] = df['datetime'].astype(str).apply(convert_hour)
Answer #1 (xzv2uavs)

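Let's try the following:
1. Convert datetime with to_datetime.
2. Create a categorical column from the day_of_week codes (so that the days sort in calendar order).
3. Normalize time_of_day onto a single day (so that comparisons work correctly). This makes every event appear to happen within the same day, which greatly simplifies the plotting logic.
4. Draw the kdeplot.
5. Set the x-axis formatter to show only HH:MM:SS.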

import calendar

import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt, dates as mdates

# df = pd.DataFrame({...})

# Convert to datetime
df['datetime'] = pd.to_datetime(df['datetime'])
# Create Categorical Column
cat_type = pd.CategoricalDtype(list(calendar.day_name), ordered=True)
df['day_of_week'] = pd.Categorical.from_codes(
    df['datetime'].dt.day_of_week, dtype=cat_type
)
# Create Normalized Date Column
df['time_of_day'] = pd.to_datetime('2000-01-01 ' +
                                   df['datetime'].dt.time.astype(str))

# Plot
ax = sns.kdeplot(data=df, x='time_of_day', hue='day_of_week')

# X axis format
ax.set_xlim([pd.to_datetime('2000-01-01 00:00:00'),
             pd.to_datetime('2000-01-01 23:59:59')])
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))

plt.tight_layout()
plt.show()

Note that the sample size here is small.

If you are looking for counts on the y-axis, histplot may be a better fit:

ax = sns.histplot(data=df, x='time_of_day', hue='day_of_week')
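
Since the question asks for the count per hour for each day_of_week, it may help to look at those counts as a table before plotting. The following is a minimal sketch, not part of the original answer: it assumes a small subset of the sample data, and the names hour, day and counts are introduced here for illustration only.

```python
import calendar

import pandas as pd

# Assumption: a small subset of the question's sample data
df = pd.DataFrame({"datetime": pd.to_datetime([
    "2021-06-13 12:56:16",
    "2021-06-13 12:56:54",
    "2021-07-16 18:55:42",
    "2021-06-04 18:42:06",
    "2021-06-09 02:14:30",
])})

# Hour of day (0-23) and weekday name for every event
hour = df["datetime"].dt.hour.rename("hour")
day = df["datetime"].dt.day_name().rename("day_of_week")

# Count events per (hour, weekday), then fill in the hours and days
# that never occur so the table always has 24 rows and 7 columns
counts = (
    pd.crosstab(hour, day)
    .reindex(index=range(24), columns=list(calendar.day_name), fill_value=0)
)
print(counts.loc[[2, 12, 18]])
```

Calling counts.plot() on this table would draw one line per weekday with raw counts on the y-axis, which is closer to what the question describes than a density estimate.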

Answer #2 (gfttwv5a)

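I would use pandas' Timestamp directly. By the way, your convert_hour function looks wrong: it returns 20:00:00 as time_of_day for every row, because time[:2] takes the first two characters of the full datetime string (the "20" of the year) rather than the hour.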
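
The convert_hour problem can be reproduced in isolation. This is only a sketch on two of the sample strings; the variable names raw, buggy and fixed are hypothetical, and the strftime-based fix is one possible option rather than the approach used in the code that follows.

```python
import pandas as pd

# Two raw strings from the question's sample data
raw = pd.Series(["2021-06-13 12:56:16", "2021-07-16 18:55:42"])

# What convert_hour effectively does: slice the first two characters,
# which are the "20" of the year, not the hour
buggy = raw.apply(lambda s: s[:2] + ":00:00")

# One possible fix (an assumption, not the asker's code):
# parse first, then format only the hour
fixed = pd.to_datetime(raw).dt.strftime("%H:00:00")

print(buggy.tolist())
print(fixed.tolist())
```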

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt


sns.set_context("paper", font_scale=2)
sns.set_style('whitegrid')

df['day_of_week'] = df['datetime'].apply(lambda x: pd.Timestamp(x).day_name())
df['time_of_day'] = df['datetime'].apply(lambda x: pd.Timestamp(x).hour)

plt.figure(figsize=(8, 4))

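# `days` was not defined in the original snippet; as an assumption,
# use the weekdays that actually occur in the data
days = df['day_of_week'].unique()
for day in days:
    sns.kdeplot(df[df.day_of_week == day]['time_of_day'], label=day)

plt.legend()
plt.show()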

The KDE for Wednesday looks a bit odd because its times range from 2 to 20, so the curve has long tails that run from about -20 to 40.

Answer #3 (a7qyws3x)

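Here is a simpler version that uses df.plot.kde.
More data has been added so that every day_of_week has more than one value for the KDE to work with, and the code has been simplified to drop the helper functions.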

df1 = pd.DataFrame([
    '2020-09-01 16:39:03',
    '2020-09-02 16:39:03',
    '2020-09-03 16:39:03',
    '2020-09-04 16:39:03',
    '2020-09-05 16:39:03',
    '2020-09-06 16:39:03',
    '2020-09-07 16:39:03',
    '2020-09-08 16:39:03',
], columns=['datetime'])
df = pd.concat([df,df1]).reset_index(drop=True)
df['day_of_week'] = pd.to_datetime(df['datetime']).dt.day_name()
df['time_of_day'] = df['datetime'].str.split(expand=True)[1].str.split(':',expand=True)[0].astype(int)
df.pivot(columns='day_of_week').time_of_day.plot.kde()
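
To see why the pivot makes plot.kde draw one curve per weekday, it can help to look at the intermediate wide table. This is a minimal sketch on a four-row subset of the data; the name wide is hypothetical, and dt.hour is used here in place of the string splitting above as an equivalent way to get the hour.

```python
import pandas as pd

# Assumption: a four-row subset of the question's sample data
df = pd.DataFrame({"datetime": [
    "2021-06-13 12:56:16",
    "2021-06-13 12:57:27",
    "2021-07-16 18:55:42",
    "2021-06-04 18:42:06",
]})

ts = pd.to_datetime(df["datetime"])
df["day_of_week"] = ts.dt.day_name()
df["time_of_day"] = ts.dt.hour

# One column per weekday; each row keeps its hour only in the column
# of its own weekday and is NaN everywhere else
wide = df.pivot(columns="day_of_week", values="time_of_day")
print(wide)
```

Because plot.kde treats each column as a separate series and ignores the NaN entries, every weekday gets its own density curve.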
