How to use pandas.value_counts to count how many times events occur in (column a), grouped by the year given in (column b)

cu6pst1q  posted on 2021-09-08  in Java

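I have preprocessed a df containing historical information on US emergencies and disasters. It now contains [location, disaster type, start date, end date, disaster length, year] for 1960-2017.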
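Now I want to create 2 new dfs:
1 = the number of disasters that occurred each year,
2 = the number of each type of disaster that occurred each year.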
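This is my current attempt at counting the number of disasters per year and creating a new df, but I'm not sure how to make it count specifically the number of disasters in each year.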


# Number of each Disaster each year

df_yearly_dcount=df_time.groupby(df_time['Start_year']).count()

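As for the second one, I'm not sure how to count each type of disaster per year, because I need to figure out the first count before I can go on and split it further by type.
Here is the full code: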

import numpy as np
import matplotlib.pyplot as plt 
import pandas as pd 
import seaborn as sns 

from scipy.stats import zscore

# Import Dataset

df = pd.read_csv('database.csv')

df_time = (df[['County','Disaster Type','Start Date', 'End Date']][0: :])

# Preprocessing

# Number of NaN values

df_nan = df[['County','Disaster Type','Start Date', 'End Date']].isna().sum()

# NaN values as a percentage as total

df_nan_number = [(df_nan.sum(axis=0)), str((((539/45330)*100))) +'%']

# Remove NaN values

df_time.dropna(subset = ["County", 'End Date'], inplace=True)

# Set Date Format

df_time['Start_Date_A'] = pd.to_datetime(df['Start Date'], format='%m/%d/%Y')
df_time['End_Date_A'] = pd.to_datetime(df['End Date'], format='%m/%d/%Y')

# Create new column == Disaster Length

df_time['Disaster_Length'] = (df_time.Start_Date_A - df_time.End_Date_A).dt.days

# Create new column == start year

df_time['Start_year'] = df_time['Start_Date_A'].dt.year

# Dropped  Old Date Formats from df

df_time = df_time.drop(columns=['Start Date', 'End Date'], axis=1)

# Replace 0 day values with 1 to indicate a Disaster length of 1 Day

df_time['Disaster_Length'] = df_time['Disaster_Length'].replace({0:1})

# Replace all values with absolute values so all days are represented as positive numeric values

df_time['Disaster_Length'] = df_time['Disaster_Length'].abs()

# Locating man-made and non-'natural' disasters, sorting Disaster types, and analyzing value counts

df_DTypes= df_time['Disaster Type'].values

df_DTypes=pd.DataFrame(df_DTypes)

df_DType_VCounts=(df_DTypes.value_counts()).sort_values(ascending=True)

Df_DType_Natural=(df_DType_VCounts.drop(['Human Cause', 'Chemical', 'Dam/Levee Break', 'Terrorism','Other'],axis=0)).sort_values(ascending=True)

df_time = df_time.rename(columns={'Disaster Type': 'Disaster_Type'})

# Removing non-natural disasters from main df_time

df_time = df_time[(df_time.Disaster_Type != 'Human Cause') & (df_time.Disaster_Type != 'Chemical') & (df_time.Disaster_Type != 'Dam/Levee Break') & (df_time.Disaster_Type != 'Terrorism') & (df_time.Disaster_Type != 'Other') ]

# Analysis

# Dataframe with mean disaster length for each year

df_yearly_mean = df_time.groupby(['Start_year']).mean()

# Number of Disasters per year

df_yearly_dcount=df_time.groupby(df_time['Start_year']).count().reset_index(name='Disaster_Type')

# Number of each Disaster each year

Here is a reproducible sample of the df:

,County,Disaster_Type,Start_Date_A,End_Date_A,Disaster_Length,Start_year
89,Clay County,Flood,1959-01-29,1959-01-29,1,1959
181,Alpine County,Flood,1964-12-24,1964-12-24,1,1964
182,Amador County,Flood,1964-12-24,1964-12-24,1,1964
183,Butte County,Flood,1964-12-24,1964-12-24,1,1964
184,Colusa County,Flood,1964-12-24,1964-12-24,1,1964
185,Del Norte County,Flood,1964-12-24,1964-12-24,1,1964
186,El Dorado County,Flood,1964-12-24,1964-12-24,1,1964
187,Glenn County,Flood,1964-12-24,1964-12-24,1,1964
188,Humboldt County,Flood,1964-12-24,1964-12-24,1,1964
189,Lake County,Flood,1964-12-24,1964-12-24,1,1964
190,Lassen County,Flood,1964-12-24,1964-12-24,1,1964
191,Marin County,Flood,1964-12-24,1964-12-24,1,1964
192,Mendocino County,Flood,1964-12-24,1964-12-24,1,1964
193,Modoc County,Flood,1964-12-24,1964-12-24,1,1964
194,Napa County,Flood,1964-12-24,1964-12-24,1,1964
195,Nevada County,Flood,1964-12-24,1964-12-24,1,1964
196,Placer County,Flood,1964-12-24,1964-12-24,1,1964
197,Plumas County,Flood,1964-12-24,1964-12-24,1,1964
198,Sacramento County,Flood,1964-12-24,1964-12-24,1,1964
199,San Joaquin County,Flood,1964-12-24,1964-12-24,1,1964
200,Shasta County,Flood,1964-12-24,1964-12-24,1,1964
201,Sierra County,Flood,1964-12-24,1964-12-24,1,1964
202,Siskiyou County,Flood,1964-12-24,1964-12-24,1,1964
203,Solano County,Flood,1964-12-24,1964-12-24,1,1964
204,Sonoma County,Flood,1964-12-24,1964-12-24,1,1964
205,Stanislaus County,Flood,1964-12-24,1964-12-24,1,1964
206,Sutter County,Flood,1964-12-24,1964-12-24,1,1964
207,Tehama County,Flood,1964-12-24,1964-12-24,1,1964
208,Trinity County,Flood,1964-12-24,1964-12-24,1,1964
209,Tuolumne County,Flood,1964-12-24,1964-12-24,1,1964
210,Yolo County,Flood,1964-12-24,1964-12-24,1,1964
211,Yuba County,Flood,1964-12-24,1964-12-24,1,1964
212,Baker County,Flood,1964-12-24,1964-12-24,1,1964
213,Benton County,Flood,1964-12-24,1964-12-24,1,1964
214,Clackamas County,Flood,1964-12-24,1964-12-24,1,1964
215,Clatsop County,Flood,1964-12-24,1964-12-24,1,1964
216,Columbia County,Flood,1964-12-24,1964-12-24,1,1964
217,Coos County,Flood,1964-12-24,1964-12-24,1,1964
218,Crook County,Flood,1964-12-24,1964-12-24,1,1964
219,Curry County,Flood,1964-12-24,1964-12-24,1,1964
220,Deschutes County,Flood,1964-12-24,1964-12-24,1,1964
221,Douglas County,Flood,1964-12-24,1964-12-24,1,1964
222,Gilliam County,Flood,1964-12-24,1964-12-24,1,1964
223,Grant County,Flood,1964-12-24,1964-12-24,1,1964
224,Harney County,Flood,1964-12-24,1964-12-24,1,1964
225,Hood River County,Flood,1964-12-24,1964-12-24,1,1964
226,Jackson County,Flood,1964-12-24,1964-12-24,1,1964
227,Jefferson County,Flood,1964-12-24,1964-12-24,1,1964
228,Josephine County,Flood,1964-12-24,1964-12-24,1,1964
229,Klamath County,Flood,1964-12-24,1964-12-24,1,1964
230,Lake County,Flood,1964-12-24,1964-12-24,1,1964
231,Lane County,Flood,1964-12-24,1964-12-24,1,1964
232,Lincoln County,Flood,1964-12-24,1964-12-24,1,1964
233,Linn County,Flood,1964-12-24,1964-12-24,1,1964
234,Malheur County,Flood,1964-12-24,1964-12-24,1,1964
235,Marion County,Flood,1964-12-24,1964-12-24,1,1964
236,Morrow County,Flood,1964-12-24,1964-12-24,1,1964
237,Multnomah County,Flood,1964-12-24,1964-12-24,1,1964
238,Polk County,Flood,1964-12-24,1964-12-24,1,1964
239,Sherman County,Flood,1964-12-24,1964-12-24,1,1964
240,Tillamook County,Flood,1964-12-24,1964-12-24,1,1964
241,Umatilla County,Flood,1964-12-24,1964-12-24,1,1964
242,Union County,Flood,1964-12-24,1964-12-24,1,1964
243,Wallowa County,Flood,1964-12-24,1964-12-24,1,1964
244,Wasco County,Flood,1964-12-24,1964-12-24,1,1964
245,Washington County,Flood,1964-12-24,1964-12-24,1,1964
246,Wheeler County,Flood,1964-12-24,1964-12-24,1,1964
247,Yamhill County,Flood,1964-12-24,1964-12-24,1,1964
248,Asotin County,Flood,1964-12-29,1964-12-29,1,1964
249,Benton County,Flood,1964-12-29,1964-12-29,1,1964
250,Clark County,Flood,1964-12-29,1964-12-29,1,1964
251,Columbia County,Flood,1964-12-29,1964-12-29,1,1964
252,Cowlitz County,Flood,1964-12-29,1964-12-29,1,1964
253,Garfield County,Flood,1964-12-29,1964-12-29,1,1964
254,Grays Harbor County,Flood,1964-12-29,1964-12-29,1,1964
255,King County,Flood,1964-12-29,1964-12-29,1,1964
256,Kittitas County,Flood,1964-12-29,1964-12-29,1,1964
257,Klickitat County,Flood,1964-12-29,1964-12-29,1,1964
258,Lewis County,Flood,1964-12-29,1964-12-29,1,1964
259,Mason County,Flood,1964-12-29,1964-12-29,1,1964
260,Pacific County,Flood,1964-12-29,1964-12-29,1,1964
261,Pierce County,Flood,1964-12-29,1964-12-29,1,1964
262,Skamania County,Flood,1964-12-29,1964-12-29,1,1964
263,Snohomish County,Flood,1964-12-29,1964-12-29,1,1964
264,Spokane County,Flood,1964-12-29,1964-12-29,1,1964
265,Wahkiakum County,Flood,1964-12-29,1964-12-29,1,1964
266,Walla Walla County,Flood,1964-12-29,1964-12-29,1,1964
267,Whitman County,Flood,1964-12-29,1964-12-29,1,1964
268,Yakima County,Flood,1964-12-29,1964-12-29,1,1964
269,Ada County,Flood,1964-12-31,1964-12-31,1,1964
270,Bannock County,Flood,1964-12-31,1964-12-31,1,1964
271,Benewah County,Flood,1964-12-31,1964-12-31,1,1964
272,Blaine County,Flood,1964-12-31,1964-12-31,1,1964
273,Boise County,Flood,1964-12-31,1964-12-31,1,1964
274,Bonneville County,Flood,1964-12-31,1964-12-31,1,1964
275,Butte County,Flood,1964-12-31,1964-12-31,1,1964
276,Camas County,Flood,1964-12-31,1964-12-31,1,1964
277,Caribou County,Flood,1964-12-31,1964-12-31,1,1964
278,Cassia County,Flood,1964-12-31,1964-12-31,1,1964
279,Clearwater County,Flood,1964-12-31,1964-12-31,1,1964
tktrz96b  #1

You can call size on the groupby to get the counts.


# Number of Disasters each year.

df.groupby('Start_year').size()
Start_year
1959     1
1964    99
dtype: int64
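
A possible explanation of why `size` works here where the `count` attempt in the question did not: `count()` on a groupby returns the number of non-null values in every remaining column, while `size()` returns one number of rows per group. The sketch below illustrates this on a small made-up frame (the `sample` data and the `Disaster_Count` column name are assumptions for illustration, not part of the original dataset), and shows one way to turn the result into a new df with `reset_index(name=...)`, which is available on a Series.

```python
import pandas as pd

# Hypothetical miniature of the question's frame, with one missing County
sample = pd.DataFrame({
    "County": ["Clay County", "Alpine County", None, "Butte County"],
    "Disaster_Type": ["Flood", "Flood", "Flood", "Fire"],
    "Start_year": [1959, 1964, 1964, 1964],
})

# count(): one result column per remaining column, counting non-null values only
per_column = sample.groupby("Start_year").count()

# size(): a single Series holding the number of rows in each group
per_year = sample.groupby("Start_year").size()

# A Series accepts reset_index(name=...), giving a two-column DataFrame
df_yearly_dcount = per_year.reset_index(name="Disaster_Count")
print(df_yearly_dcount)
```

Because `count()` skips missing values, its columns can disagree with each other for the same year, which is one reason `size()` is usually the simpler choice for counting rows.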

# Number of each disaster type for each year.

df.groupby(['Start_year', 'Disaster_Type']).size()
Start_year  Disaster_Type
1959        Flood             1
1964        Flood            99
dtype: int64
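
Since the title asks about `value_counts`, here is a hedged sketch of how the per-type count might also be written with it, plus a wide year-by-type table. The `sample` frame is made up for illustration (the posted sample only contains floods), so treat the variable names and data as assumptions rather than the original dataset.

```python
import pandas as pd

# Hypothetical data with more than one disaster type
sample = pd.DataFrame({
    "Disaster_Type": ["Flood", "Flood", "Fire", "Flood", "Fire", "Fire"],
    "Start_year": [1959, 1964, 1964, 1964, 1965, 1965],
})

# value_counts within each year: a Series indexed by (Start_year, Disaster_Type)
by_type = sample.groupby("Start_year")["Disaster_Type"].value_counts()

# Wide layout: one row per year, one column per type, 0 where a type did not occur
wide = (
    sample.groupby(["Start_year", "Disaster_Type"])
    .size()
    .unstack(fill_value=0)
)

# pd.crosstab builds the same kind of frequency table in one call
xtab = pd.crosstab(sample["Start_year"], sample["Disaster_Type"])
print(wide)
```

The long Series is convenient for further filtering, while the wide table is often easier to plot, for example one line per disaster type over the years.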
