Python: summing values for the same key in a list of dictionaries using pandas

jucafojl  posted on 2023-09-29 in Python

import pandas as pd
customer1 = {'name': 'John Smith',"qty": 10, 'income': 35, 'email': '[email protected]'}
customer2 = {'name': 'John Smith', "qty": 10,'income': 28, 'phone': '555-555-5555',"other": "something", 'email': '[email protected]'}
customer3 = {'name': 'Bob Johnson',"qty": 10,'income': 20, 'address': '123 Main St', 'email': '[email protected]',"c2":"kanel","c3":"pong"}
customer4 = {'name': 'Joe Johnson', "qty": 10,'income': 8, 'address': '123 Main St', 'email': '[email protected]',"c2":"kanel","c3":"pong"}

data = [customer1, customer2, customer3, customer4]
df = pd.DataFrame.from_dict(data)
print(df)

I want to sum qty and income for entries that share the same name, so that the result looks like this:

result = [
     {'name': 'John Smith', "qty": 20,'income': 63, 'phone': '555-555-5555',"other": "something", 'email': '[email protected]','email2': '[email protected]'},
     {'name': 'Bob Johnson',"qty": 10,'income': 20, 'address': '123 Main St', 'email': '[email protected]',"c2":"kanel","c3":"pong"},
     {'name': 'Joe Johnson', "qty": 10,'income': 8, 'address': '123 Main St', 'email': '[email protected]',"c2":"kanel","c3":"pong"}

]

kmbjn2e3  #1

If you only want the totals of qty and income, use groupby.agg with 'sum' as the aggregation for those two columns and 'first' for all other columns:

funcs = ({'qty': 'sum', 'income': 'sum'}
        |dict.fromkeys(['email', 'phone', 'other', 'address', 'c2', 'c3'], 'first')
        )
out = df.groupby('name', as_index=False, sort=False).agg(funcs)

For Python < 3.9, where the | operator for dicts is not available, you can use:

funcs = {'qty': 'sum', 'income': 'sum'}
funcs = {**funcs, **dict.fromkeys(['email', 'phone', 'other', 'address', 'c2', 'c3'], 'first')}

Output:

          name  qty  income                  email         phone      other      address     c2    c3
0   John Smith   20      63  [email protected]  555-555-5555  something         None   None  None
1  Bob Johnson   10      20  [email protected]          None       None  123 Main St  kanel  pong
2  Joe Johnson   10       8  bob.johns[email protected]          None       None  123 Main St  kanel  pong
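
The column list passed to dict.fromkeys above is typed out by hand. As a minimal sketch, assuming every column other than name, qty and income should take its first non-null value, the same mapping can be derived from df.columns. The e-mail addresses below are hypothetical placeholders, since the originals are obfuscated in the question, and the names sum_cols and first_cols are introduced only for this illustration.

```python
import pandas as pd

# Hypothetical sample data: the e-mail addresses are placeholders,
# because the originals are obfuscated in the question.
data = [
    {'name': 'John Smith', 'qty': 10, 'income': 35, 'email': 'a@example.com'},
    {'name': 'John Smith', 'qty': 10, 'income': 28, 'phone': '555-555-5555',
     'other': 'something', 'email': 'b@example.com'},
    {'name': 'Bob Johnson', 'qty': 10, 'income': 20, 'address': '123 Main St',
     'email': 'c@example.com', 'c2': 'kanel', 'c3': 'pong'},
    {'name': 'Joe Johnson', 'qty': 10, 'income': 8, 'address': '123 Main St',
     'email': 'd@example.com', 'c2': 'kanel', 'c3': 'pong'},
]
df = pd.DataFrame(data)

sum_cols = ['qty', 'income']
# Every column that is neither the grouping key nor summed gets 'first'.
first_cols = [c for c in df.columns if c not in ['name', *sum_cols]]

# {**a, **b} merges the two dicts and also works on Python < 3.9.
funcs = {**dict.fromkeys(sum_cols, 'sum'), **dict.fromkeys(first_cols, 'first')}
out = df.groupby('name', as_index=False, sort=False).agg(funcs)
print(out)
```

Because 'first' in a groupby returns the first non-null value per group, John Smith's phone is picked up from his second record even though the first record has none.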

Emails as new columns

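If you also want the additional emails to appear as new columns in the output, aggregate them as a list, convert that list column separately into a DataFrame, and join it back to the output: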

funcs = ({'qty': 'sum', 'income': 'sum', 'email': list}
        |dict.fromkeys(['phone', 'other', 'address', 'c2', 'c3'], 'first')
        )
out = df.groupby('name', as_index=False, sort=False).agg(funcs)
out = out.join(pd.DataFrame(out.pop('email').tolist()).rename(columns=lambda x: f'email_{x+1}'))

Output:

          name  qty  income         phone      other      address     c2    c3                email_1               email_2
0   John Smith   20      63  555-555-5555  something         None   None  None  [email protected]  [email protected]
1  Bob Johnson   10      20          None       None  123 Main St  kanel  pong  [email protected]                  None
2  Joe Johnson   10       8          None       None  123 Main St  kanel  pong  [email protected]                  None
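
The question asks for a list of dictionaries rather than a DataFrame. One possible way to get back to that shape, sketched here under the assumption that keys with missing values should simply be left out, is to_dict('records') followed by a filter on pd.notna. The small frame below is a hypothetical stand-in for out, with placeholder e-mail addresses.

```python
import pandas as pd

# Hypothetical stand-in for the aggregated frame `out`,
# with placeholder e-mail addresses.
out = pd.DataFrame({
    'name': ['John Smith', 'Bob Johnson'],
    'qty': [20, 10],
    'income': [63, 20],
    'phone': ['555-555-5555', None],
    'address': [None, '123 Main St'],
    'email_1': ['a@example.com', 'c@example.com'],
    'email_2': ['b@example.com', None],
})

# One dict per row, keeping only the keys whose value is not missing.
result = [
    {key: value for key, value in row.items() if pd.notna(value)}
    for row in out.to_dict('records')
]
print(result)
```

Note that the keys come out as email_1 and email_2, following the answer's column naming, rather than email and email2 as in the question's desired result.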
