I have a data set that details polling data in different states and the percentage of people who have voted for either DEM or REP in that state. What my data frame looks like:
I'm essentially trying to find the average percentage of people in X state voting for either DEM or REP. So my output would be something like:
New Hampshire | DEM | 55%
New Hampshire | REP | 45%
Maine | DEM | 45%
Maine | REP | 54%
etc.
I initially thought of simply iterating over the entire dataset and assigning new pct variables for each state's DEM or REP percentage, but that felt inefficient.
I'm thinking of sorting the data such that it has state1, DEM | state1, REP | state2, DEM | state3, REP etc. and then finding averages. But I am not too experienced with pandas (which is what I'm attempting to use). Perhaps someone can point me in the right direction.
2 Answers
Answer 1 (njthzxwz)
IIUC, use pandas.concat together with GroupBy.mean:
This returns a pandas.core.frame.DataFrame, which you can assign to a variable:
Answer 2 (4jb9z9bj)
Try using:
df.groupby(['state','party'])['pct'].mean()
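To make that one-liner concrete, here is a minimal sketch on made-up data. The column names state, party and pct come from the answer itself; the rows and the avg_pct column name are invented for illustration, since the original data frame is not shown, so adjust them to match your real data.

```python
import pandas as pd

# Hypothetical polling rows: two polls per state and party.
# Only the column names come from the answer; the values are invented.
df = pd.DataFrame({
    "state": ["New Hampshire"] * 4 + ["Maine"] * 4,
    "party": ["DEM", "REP", "DEM", "REP"] * 2,
    "pct":   [54.0, 46.0, 56.0, 44.0, 44.0, 55.0, 46.0, 53.0],
})

# Average pct per (state, party) pair; the result is a Series
# with a two-level index of state and party.
avg = df.groupby(["state", "party"])["pct"].mean()

# Turn the index levels back into ordinary columns to get a flat DataFrame.
avg_df = avg.reset_index(name="avg_pct")
print(avg_df)
```

No explicit loop or manual sorting is needed here: groupby collects the rows for each state/party pair and mean() averages pct within each group.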