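I want to collapse the quarter columns of a dataframe into a single column and disaggregate the values, so that each count becomes that many rows:
Data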
state  range  type  Q1 24  Q2 24  stat
NY     up     AA    2      2      grow
NY     up     AA    1      0      re
NY     up     BB    1      1      grow
NY     up     BB    0      0      re
NY     up     DD    2      3      grow
NY     up     DD    0      1      re
CA     low    AA    0      2      grow
CA     low    AA    1      0      re
CA     low    BB    0      1      grow
CA     low    BB    0      0      re
CA     low    DD    0      3      grow
CA     low    DD    1      0      re
dataframe:
import pandas as pd
data = {
'state': ['NY', 'NY', 'NY', 'NY', 'NY', 'NY', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA'],
'range': ['up', 'up', 'up', 'up', 'up', 'up', 'low', 'low', 'low', 'low', 'low', 'low'],
'type': ['AA', 'AA', 'BB', 'BB', 'DD', 'DD', 'AA', 'AA', 'BB', 'BB', 'DD', 'DD'],
'Q1 24': [2, 1, 1, 0, 2, 0, 0, 1, 0, 0, 0, 1],
'Q2 24': [2, 0, 1, 0, 3, 1, 2, 0, 1, 0, 3, 0],
'stat': ['grow', 're', 'grow', 're', 'grow', 're', 'grow', 're', 'grow', 're', 'grow', 're']
}
df = pd.DataFrame(data)
print(df)
Expected output
state  qtr    type  range  stat
NY     Q1 24  AA01  up     grow
NY     Q1 24  AA02  up     grow
NY     Q1 24  AA03  up     re
NY     Q1 24  BB01  up     grow
NY     Q1 24  DD01  up     grow
NY     Q1 24  DD02  up     grow
CA     Q1 24  AA01  low    re
CA     Q1 24  DD01  low    re
NY     Q2 24  AA01  up     grow
NY     Q2 24  AA02  up     grow
NY     Q2 24  BB01  up     grow
NY     Q2 24  DD01  up     grow
NY     Q2 24  DD02  up     grow
NY     Q2 24  DD03  up     grow
NY     Q2 24  DD04  up     re
CA     Q2 24  AA01  low    grow
CA     Q2 24  AA02  low    grow
CA     Q2 24  BB01  low    grow
CA     Q2 24  DD01  low    grow
CA     Q2 24  DD02  low    grow
CA     Q2 24  DD03  low    grow
What I have tried
import janitor  # pyjanitor, which provides DataFrame.pivot_longer
newdf = (df
.pivot_longer(
index = slice('state', 'type'),
names_to = ("qtr", ".value"),
names_sep = " ")
)
Updating the count
newdf=newdf.assign(count=newdf['type']+(newdf.groupby(['state','type'])['type'].cumcount()+1).astype(str))
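However, this is only a fragment, and I am still trying to figure out how to disaggregate the counts. I am working on it. Any suggestions are appreciated.
1 Answer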
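You can melt the existing DataFrame on the Q1 24 and Q2 24 values, then filter out every row whose count is 0. Next, use index.repeat to create count copies of each remaining row. Finally, use groupby to number the rows within each ('state', 'type', 'qtr') group and use that number to generate the new type values.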
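The steps described in this answer can be sketched roughly as follows. This is a minimal sketch, not the answerer's original code, which did not survive on this page. The variable names long, out and seq and the temporary column name count are assumptions chosen for illustration, and the two-digit zero padding is inferred from the AA01-style values in the expected output.

```python
import pandas as pd

data = {
    'state': ['NY'] * 6 + ['CA'] * 6,
    'range': ['up'] * 6 + ['low'] * 6,
    'type': ['AA', 'AA', 'BB', 'BB', 'DD', 'DD'] * 2,
    'Q1 24': [2, 1, 1, 0, 2, 0, 0, 1, 0, 0, 0, 1],
    'Q2 24': [2, 0, 1, 0, 3, 1, 2, 0, 1, 0, 3, 0],
    'stat': ['grow', 're'] * 6,
}
df = pd.DataFrame(data)

# Step 1: melt the two quarter columns into (qtr, count) pairs.
long = df.melt(
    id_vars=['state', 'range', 'type', 'stat'],
    value_vars=['Q1 24', 'Q2 24'],
    var_name='qtr',
    value_name='count',  # assumed name for the melted values
)

# Step 2: drop rows whose count is 0.
long = long[long['count'] > 0]

# Step 3: repeat each row "count" times.
out = long.loc[long.index.repeat(long['count'])].reset_index(drop=True)

# Step 4: number the rows inside each (state, type, qtr) group
# and append the zero-padded number to the type code.
seq = out.groupby(['state', 'type', 'qtr']).cumcount() + 1
out['type'] = out['type'] + seq.astype(str).str.zfill(2)

# Keep only the columns shown in the expected output, in that order.
out = out[['state', 'qtr', 'type', 'range', 'stat']]
print(out)
```

Because melt stacks all Q1 24 rows before the Q2 24 rows and cumcount preserves the existing row order, the sequence numbers follow the original row order within each group.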