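I have a DataFrame in pandas that contains a list of countries, and I want to classify each country by continent and by region. I would like to add these as two more columns to the original DataFrame.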
import numpy as np
import pandas as pd
import pandasql as psql
import matplotlib.pyplot as plt
import plotly.express as px
from pandasql import sqldf
pysqldf = lambda q: sqldf(q, globals())
suicide_df = pd.read_csv(r"C:\Users\slaye\Downloads\suicide_rates\master.csv")
display(suicide_df)
       country     year  sex     age          suicides_no  population  suicides/100k pop  country-year    HDI for year  gdp_for_year ($)  gdp_per_capita ($)  generation
0      Albania     1987  male    15-24 years  21           312900      6.71               Albania1987     NaN           2,156,624,900     796                 Generation X
1      Albania     1987  male    35-54 years  16           308000      5.19               Albania1987     NaN           2,156,624,900     796                 Silent
2      Albania     1987  female  15-24 years  14           289700      4.83               Albania1987     NaN           2,156,624,900     796                 Generation X
3      Albania     1987  male    75+ years    1            21800       4.59               Albania1987     NaN           2,156,624,900     796                 G.I. Generation
4      Albania     1987  male    25-34 years  9            274300      3.28               Albania1987     NaN           2,156,624,900     796                 Boomers
...    ...         ...   ...     ...          ...          ...         ...                ...             ...           ...               ...                 ...
27815  Uzbekistan  2014  female  35-54 years  107          3620833     2.96               Uzbekistan2014  0.675         63,067,077,179    2309                Generation X
27816  Uzbekistan  2014  female  75+ years    9            348465      2.58               Uzbekistan2014  0.675         63,067,077,179    2309                Silent
27817  Uzbekistan  2014  male    5-14 years   60           2762158     2.17               Uzbekistan2014  0.675         63,067,077,179    2309                Generation Z
27818  Uzbekistan  2014  female  5-14 years   44           2631600     1.67               Uzbekistan2014  0.675         63,067,077,179    2309                Generation Z
27819  Uzbekistan  2014  female  55-74 years  21           1438935     1.46               Uzbekistan2014  0.675         63,067,077,179    2309                Boomers
Right now I have been compiling a nested dictionary by hand, but I am wondering whether there is a faster way to do this.
regions = {'Asia': {'Central Asia', 'South Asia', 'East Asia', 'SouthEast Asia', 'West Asia', 'Middle East'},
           'Europe': {'Western Europe', 'Eastern Europe', 'Mediterranean', 'Nordic'},
           'North America': {'United States'}}
I apologize if this is a sloppy post; I am not familiar with the Stack Overflow format.
1 Answer
You can use this code.
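As a minimal sketch of one way to approach this (an illustration under assumptions, not a complete mapping): keep a flat country-to-(continent, region) lookup, turn it into a small DataFrame, and left-merge it onto the data. The `country_info` dictionary, its entries, and the tiny stand-in `suicide_df` below are made up for the example; the region labels reuse names from the question.

```python
import pandas as pd

# Hypothetical lookup: country -> (continent, region).
# Entries are illustrative only; extend it to cover every country in the data.
country_info = {
    "Albania": ("Europe", "Mediterranean"),
    "Uzbekistan": ("Asia", "Central Asia"),
}

# Turn the lookup into a three-column DataFrame that can be merged.
lookup_df = pd.DataFrame(
    [(country, continent, region)
     for country, (continent, region) in country_info.items()],
    columns=["country", "continent", "region"],
)

# Small stand-in for the real suicide_df loaded from master.csv.
# "Atlantis" is a placeholder for a country missing from the lookup.
suicide_df = pd.DataFrame(
    {
        "country": ["Albania", "Albania", "Uzbekistan", "Atlantis"],
        "year": [1987, 1987, 2014, 2000],
    }
)

# A left merge keeps every original row and adds the two new columns;
# countries absent from the lookup get NaN in both.
suicide_df = suicide_df.merge(lookup_df, on="country", how="left")

# List the countries that still need an entry in country_info.
missing = suicide_df.loc[suicide_df["continent"].isna(), "country"].unique()

print(suicide_df)
print("Unmapped countries:", list(missing))
```

Because the merge is a left join, no rows are dropped, and any country that is not yet in the lookup is easy to spot from the NaN values in the new columns.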