pandas: group a DataFrame and sort by weekday

Posted by eblbsuwk on 2022-12-28 in Other

I have a pandas DataFrame that includes a Day of Week column.

df_weekday = df.groupby(['Day of Week']).sum()
df_weekday[['Spent', 'Clicks', 'Impressions']].plot(figsize=(16,6), subplots=True);

When I plot the DataFrame, the days of the week appear in alphabetical order: 'Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday', 'Tuesday', 'Wednesday'.
How can I sort and display df_weekday in the correct weekday order: 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'?

ubof19bj #1

You can first convert the column to an ordered categorical. In pandas versions before 0.21.0:

cats = [ 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

df['Day of Week'] = df['Day of Week'].astype('category', categories=cats, ordered=True)

In pandas 0.21.0+, use:

from pandas.api.types import CategoricalDtype
cat_type = CategoricalDtype(categories=cats, ordered=True)
df['Day of Week'] = df['Day of Week'].astype(cat_type)
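To see the effect of the ordered categorical on the grouped result, here is a minimal sketch. The sample rows are made up for illustration; only the column names come from the question. Passing `observed=False` explicitly is an assumption about what is wanted here: it keeps all seven days in the result and avoids depending on the default, which differs between pandas versions.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# Hypothetical sample data; the values are invented for this sketch.
df = pd.DataFrame({
    'Day of Week': ['Friday', 'Monday', 'Sunday', 'Monday', 'Wednesday',
                    'Tuesday', 'Saturday', 'Thursday'],
    'Spent': [10.0, 20.0, 5.0, 15.0, 8.0, 12.0, 7.0, 9.0],
    'Clicks': [1, 2, 3, 4, 5, 6, 7, 8],
})

# Convert the column to an ordered categorical with the weekday order.
cat_type = CategoricalDtype(categories=cats, ordered=True)
df['Day of Week'] = df['Day of Week'].astype(cat_type)

# groupby sorts group keys by category order rather than alphabetically.
# observed=False keeps every category, even one with no rows.
df_weekday = df.groupby('Day of Week', observed=False).sum()
weekday_order = list(df_weekday.index)
```

Plotting `df_weekday` afterwards should then follow the Monday-to-Sunday order of the index.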

Alternatively, use reindex:

df_weekday = df.groupby(['Day of Week']).sum().reindex(cats)
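A small sketch of the reindex variant on plain string labels, again with invented sample data. One thing worth knowing: a day that never occurs in the data shows up as a NaN row after `reindex`. Using `fill_value=0` is one possible way to handle that, not something the answer itself prescribes.

```python
import pandas as pd

cats = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# Hypothetical sample data with no rows for Sunday.
df = pd.DataFrame({
    'Day of Week': ['Friday', 'Monday', 'Saturday', 'Monday',
                    'Wednesday', 'Tuesday', 'Thursday'],
    'Clicks': [1, 2, 3, 4, 5, 6, 7],
})

# Group alphabetically first, then reorder the rows to the weekday order.
df_weekday = df.groupby('Day of Week').sum().reindex(cats)

# Same, but days missing from the data get 0 instead of NaN.
df_weekday_filled = df.groupby('Day of Week').sum().reindex(cats, fill_value=0)
```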

jogvjijk #2

A simple and robust solution is to include the day number in a MultiIndex, so that the groups are sorted by it automatically.

birthdays = df.groupby([df['date'].dt.day_of_week,df['date'].dt.day_name()])['births'].sum()
birthdays = birthdays.droplevel(0,'index')
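The same idea can be tried offline on a tiny synthetic frame, sketched below. The data and the level names `day_number` and `day_name` are assumptions made for this illustration; the answer's code leaves both levels named `date`. The groupby sorts on the numeric level first (Monday is 0), and dropping that level leaves the day names in weekday order.

```python
import pandas as pd

# Hypothetical data: one row per day for two weeks, starting Saturday 2000-01-01.
df = pd.DataFrame({
    'date': pd.date_range('2000-01-01', periods=14, freq='D'),
    'births': range(1, 15),
})

# Name the two grouping keys so the index levels are easy to tell apart.
day_number = df['date'].dt.day_of_week.rename('day_number')  # Monday=0 ... Sunday=6
day_name = df['date'].dt.day_name().rename('day_name')

# Groups are sorted by day_number first, which gives the weekday order.
birthdays = df.groupby([day_number, day_name])['births'].sum()

# Drop the numeric level; the day names keep their sorted position.
birthdays = birthdays.droplevel('day_number')
```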

Full example with births data

# group and sort by day-of-week

import pandas as pd
host = 'raw.github.com'
user = 'fivethirtyeight'
repo = 'data'
branch = 'master'
file = 'births/US_births_2000-2014_SSA.csv'
url = f'https://{host}/{user}/{repo}/{branch}/{file}'
df = pd.read_csv(url,sep=',',header=0)
df['date'] = df[['year','month','date_of_month']].astype(str).apply('-'.join,axis=1)
df['date'] = pd.to_datetime(df['date'])
df = df[['date','births']]
df.head()

import seaborn as sns
birthdays = df.groupby([df['date'].dt.day_of_week,df['date'].dt.day_name()])['births'].sum()
birthdays = birthdays.droplevel(0,'index')
sns.barplot(data=birthdays.reset_index(),x='date',y='births')
