pandas: merge 2 DataFrames in Python but keep the columns

j0pj023g  posted on 2023-08-01  in Python

I want to merge two DataFrames in Python.

df1
Product_Type Product_Colour   Notes
0 Shirt        Black           2 L and 2 S
1 Shirt        Black           1 XL and 2 M
2 Shirt        Black           2 XS and 2 S

df2
   Product_Type Product_Colour   Code
13 Shirt        Black             KI
14 Shirt        Black             KI
25 Shirt        Black             KI

......

df2 has many more rows than shown. If a Product_Type and Product_Colour combination appears in both DataFrames, I just want to take the Code information from df2 and build df3.

This is the expected result:

Product_Type Product_Colour   Notes             Code
0 Shirt        Black          2 L and 2 S         KI
1 Shirt        Black          1 XL and 2 M        KI
2 Shirt        Black          2 XS and 2 S        KI

But I got this instead:
Product_Type Product_Colour   Notes             Code
0 Shirt        Black          2 L and 2 S         KI

I did this with pd.merge(df1, df2, on=['Product_Type','Product_Colour'], how='left'), and it does not seem to handle non-unique values of product type and product colour.

zengzsys #1

Since you are only interested in the unique values, you can use drop_duplicates to make the rows of df2 distinct before merging.

df1.merge(df2.drop_duplicates(), how = 'left')
      Product_Type Product_Colour         Notes Code
    0        Shirt          Black   2 L and 2 S   KI
    1        Shirt          Black  1 XL and 2 M   KI
    2        Shirt          Black  2 XS and 2 S   KI

The full solution, with the key columns specified explicitly:

df1.merge(df2.drop_duplicates(), on=['Product_Type','Product_Colour'], how='left')
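
To illustrate why the deduplication matters, here is a minimal sketch that assumes only the three df2 rows listed in the question (its remaining rows are not shown there). A left merge on non-unique keys pairs every matching row of df1 with every matching row of df2, so the row count multiplies. Deduplicating df2 on the key columns first keeps exactly one row per df1 row. The names keys, naive and lookup are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample frames rebuilt from the question; df2 keeps its original index labels.
df1 = pd.DataFrame({
    "Product_Type": ["Shirt"] * 3,
    "Product_Colour": ["Black"] * 3,
    "Notes": ["2 L and 2 S", "1 XL and 2 M", "2 XS and 2 S"],
})
df2 = pd.DataFrame(
    {
        "Product_Type": ["Shirt"] * 3,
        "Product_Colour": ["Black"] * 3,
        "Code": ["KI"] * 3,
    },
    index=[13, 14, 25],
)

keys = ["Product_Type", "Product_Colour"]

# Without deduplication each of the 3 df1 rows matches all 3 df2 rows.
naive = df1.merge(df2, on=keys, how="left")
print(len(naive))  # 9

# One lookup row per key combination, then a left merge keeps every df1 row.
lookup = df2.drop_duplicates(subset=keys)
df3 = df1.merge(lookup, on=keys, how="left")
print(df3)
```

Note that drop_duplicates(subset=keys) keeps the first Code seen for each key, so it silently hides the case where one product/colour pair carries two different codes.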

np8igboo #2

import pandas as pd

# Sample data
df1 = pd.DataFrame({
    'Product_Type': ['Shirt', 'Shirt', 'Shirt'],
    'Product_Colour': ['Black', 'Black', 'Black'],
    'Notes': ['2 L and 2 S', '1 XL and 2 M', '2 XS and 2 S']
})

df2 = pd.DataFrame({
    'Product_Type': ['Shirt', 'Shirt', 'Shirt'],
    'Product_Colour': ['Black', 'Black', 'Black'],
    'Code': ['KI', 'KI', 'KI']
})

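# Keep one df2 row per (Product_Type, Product_Colour) so the merge cannot multiply rows,
# then left-merge so that every row of df1 (and its Notes) is preserved
codes = df2.drop_duplicates(subset=['Product_Type', 'Product_Colour'])
df3 = pd.merge(df1, codes, on=['Product_Type', 'Product_Colour'], how='left')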

print(df3)

Output:

Product_Type Product_Colour          Notes Code
0        Shirt          Black   2 L and 2 S   KI
1        Shirt          Black  1 XL and 2 M   KI
2        Shirt          Black  2 XS and 2 S   KI
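
If df2 might contain conflicting codes for the same product and colour, the merge can be asked to check that. The following is a sketch under that assumption, not something either answer proposes: attach_code is a hypothetical helper name, and it relies on the validate parameter of DataFrame.merge, which raises pandas.errors.MergeError when the right-hand keys are not unique.

```python
import pandas as pd
from pandas.errors import MergeError


def attach_code(left, right, keys, value_col="Code"):
    """Left-merge value_col from right onto left, one lookup row per key.

    Hypothetical helper: exact duplicate rows are dropped first, and
    validate="many_to_one" makes pandas raise MergeError if a key still
    maps to more than one distinct value.
    """
    lookup = right[keys + [value_col]].drop_duplicates()
    return left.merge(lookup, on=keys, how="left", validate="many_to_one")


keys = ["Product_Type", "Product_Colour"]
left = pd.DataFrame({
    "Product_Type": ["Shirt", "Shirt"],
    "Product_Colour": ["Black", "Black"],
    "Notes": ["2 L and 2 S", "1 XL and 2 M"],
})
right = pd.DataFrame({
    "Product_Type": ["Shirt", "Shirt"],
    "Product_Colour": ["Black", "Black"],
    "Code": ["KI", "KI"],
})

result = attach_code(left, right, keys)
print(result)
```

With consistent codes the call returns one row per row of left. If the same key appeared with two different codes, the merge would raise MergeError instead of quietly picking one.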
