How can I unify column names with pandas so that I can append DataFrames?

bvjxkvbb, posted on 2021-09-29 in Java

I have two DataFrames, as shown below:

    import pandas as pd

    df1 = pd.DataFrame({'person_id': [101,101,101,101,202,202,202],
                        'person_type': ['A','A','B','C','D','B','A'],
                        'test_id': [1,2,3,3,4,4,5],
                        'login_date': ['5/7/2013 09:27:00 AM','09/08/2013 11:21:00 AM','06/06/2014 08:00:00 AM','06/06/2014 05:00:00 AM','12/11/2011 10:00:00 AM','13/10/2012 12:00:00 AM','13/12/2012 11:45:00 AM']})
    df2 = pd.DataFrame({'subject_id': [101,101,101,101,202,202,202],
                        'test_date': ['5/7/2013 09:27:00 AM','09/08/2013 11:21:00 AM','06/06/2014 08:00:00 AM','06/06/2014 05:00:00 AM','12/11/2011 10:00:00 AM','13/10/2012 12:00:00 AM','13/12/2012 11:45:00 AM']})

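I want to reshape df2 so that it matches df1. By "shape" I mean only the column names.
For example, I want df2 to look exactly like df1 in terms of column names, while keeping the values of df2.
I tried the following: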

    df2.rename(columns={'subject_id':'person_id', 'test_date':'login_date'}, inplace=True)
    final_columns = df1.columns
    previous_columns = df2.columns.tolist()
    mapping = {previous_columns[i]: final_columns[i] for i in range(2)}
    df2.rename(mapping, inplace=True)
    final_df = df1.append(df2)
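
One likely reason the second rename in the attempt above has no visible effect: a dict passed to rename() positionally is applied to the index labels, not to the columns. Building the mapping by position also pairs the second column of df2 with person_type, which is not the intended target. The following is a minimal sketch on a shortened, two-row copy of df2; the names mapping, renamed_index and renamed_columns are introduced here for illustration only.

```python
import pandas as pd

# Shortened copy of df2 from the question (two rows only, for illustration).
df2 = pd.DataFrame({
    "subject_id": [101, 202],
    "test_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})

mapping = {"subject_id": "person_id", "test_date": "login_date"}

# A positional dict targets the index labels (axis=0), so the column
# names are left unchanged.
renamed_index = df2.rename(mapping)
print(renamed_index.columns.tolist())    # ['subject_id', 'test_date']

# Passing the dict through the columns= keyword renames the columns.
renamed_columns = df2.rename(columns=mapping)
print(renamed_columns.columns.tolist())  # ['person_id', 'login_date']
```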

I would like my output to look like this:


7xzttuei #1

Try pd.concat:

```
import pandas as pd

pd.concat([
    df1.assign(Data_From="df1"),
    df2.assign(Data_From="df2")
       .rename(columns={"subject_id": "person_id", "test_date": "login_date"})
])

   person_id person_type  test_id              login_date Data_From
0        101           A      1.0    5/7/2013 09:27:00 AM       df1
1        101           A      2.0  09/08/2013 11:21:00 AM       df1
2        101           B      3.0  06/06/2014 08:00:00 AM       df1
3        101           C      3.0  06/06/2014 05:00:00 AM       df1
4        202           D      4.0  12/11/2011 10:00:00 AM       df1
5        202           B      4.0  13/10/2012 12:00:00 AM       df1
6        202           A      5.0  13/12/2012 11:45:00 AM       df1
0        101         NaN      NaN    5/7/2013 09:27:00 AM       df2
1        101         NaN      NaN  09/08/2013 11:21:00 AM       df2
2        101         NaN      NaN  06/06/2014 08:00:00 AM       df2
3        101         NaN      NaN  06/06/2014 05:00:00 AM       df2
4        202         NaN      NaN  12/11/2011 10:00:00 AM       df2
5        202         NaN      NaN  13/10/2012 12:00:00 AM       df2
6        202         NaN      NaN  13/12/2012 11:45:00 AM       df2
```
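
In the output above the row labels 0 to 6 appear twice, once per source frame. If unique row labels are preferred, a possible variation is to pass ignore_index=True to pd.concat. This is a minimal sketch on shortened, two-row copies of the question's frames; column_map and combined are names made up for the example.

```python
import pandas as pd

# Shortened copies of the question's frames (two rows each, for illustration).
df1 = pd.DataFrame({
    "person_id": [101, 202],
    "person_type": ["A", "D"],
    "test_id": [1, 4],
    "login_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})
df2 = pd.DataFrame({
    "subject_id": [101, 202],
    "test_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})

column_map = {"subject_id": "person_id", "test_date": "login_date"}

combined = pd.concat(
    [
        df1.assign(Data_From="df1"),
        df2.rename(columns=column_map).assign(Data_From="df2"),
    ],
    ignore_index=True,  # renumber the rows 0..n-1 instead of repeating labels
)
print(combined)
```

Columns that exist only in df1 (person_type, test_id) are filled with NaN for the rows that come from df2, which is why test_id is shown as a float.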


n6lpvg4x #2

Use concat with the keys argument:

    df3 = pd.concat([df1, df2.rename(columns={'subject_id': 'person_id',
                                              'test_date': 'login_date'})],
                    join='outer',
                    keys=['df1', 'df2'])

Then use .loc to slice out the part you need:

    print(df3.loc['df1'])

       person_id person_type  test_id              login_date
    0        101           A      1.0    5/7/2013 09:27:00 AM
    1        101           A      2.0  09/08/2013 11:21:00 AM
    2        101           B      3.0  06/06/2014 08:00:00 AM
    3        101           C      3.0  06/06/2014 05:00:00 AM
    4        202           D      4.0  12/11/2011 10:00:00 AM
    5        202           B      4.0  13/10/2012 12:00:00 AM
    6        202           A      5.0  13/12/2012 11:45:00 AM

    print(df3)

           person_id person_type  test_id              login_date
    df1 0        101           A      1.0    5/7/2013 09:27:00 AM
        1        101           A      2.0  09/08/2013 11:21:00 AM
        2        101           B      3.0  06/06/2014 08:00:00 AM
        3        101           C      3.0  06/06/2014 05:00:00 AM
        4        202           D      4.0  12/11/2011 10:00:00 AM
        5        202           B      4.0  13/10/2012 12:00:00 AM
        6        202           A      5.0  13/12/2012 11:45:00 AM
    df2 0        101         NaN      NaN    5/7/2013 09:27:00 AM
        1        101         NaN      NaN  09/08/2013 11:21:00 AM
        2        101         NaN      NaN  06/06/2014 08:00:00 AM
        3        101         NaN      NaN  06/06/2014 05:00:00 AM
        4        202         NaN      NaN  12/11/2011 10:00:00 AM
        5        202         NaN      NaN  13/10/2012 12:00:00 AM
        6        202         NaN      NaN  13/12/2012 11:45:00 AM
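
If the source label is wanted as an ordinary column rather than as an index level, one option is to name the index levels through the names argument of pd.concat and then move the outer level into the columns with reset_index. This is a minimal sketch on two-row copies of the frames; the level names "source" and "row" and the variables column_map and flat are assumptions made for the example.

```python
import pandas as pd

# Shortened copies of the question's frames (two rows each, for illustration).
df1 = pd.DataFrame({
    "person_id": [101, 202],
    "person_type": ["A", "D"],
    "test_id": [1, 4],
    "login_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})
df2 = pd.DataFrame({
    "subject_id": [101, 202],
    "test_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})

column_map = {"subject_id": "person_id", "test_date": "login_date"}

# keys= labels each input frame; names= names the two index levels.
df3 = pd.concat(
    [df1, df2.rename(columns=column_map)],
    keys=["df1", "df2"],
    names=["source", "row"],
)

# Slice one source frame back out of the combined result.
only_df2 = df3.loc["df2"]

# Move the outer index level into a regular column named "source".
flat = df3.reset_index(level="source")
print(flat)
```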

l7mqbcuq #3

First, add a source column to both DataFrames:

    df1['DATA FROM'] = 'df1'
    df2['DATA FROM'] = 'df2'

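Finally, combine them.
Via append() + rename() (DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this only works on older versions):

    df1.append(df2.rename(columns={'subject_id':'person_id','test_date':'login_date'}))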


Via concat() + rename():

    pd.concat([df1, df2.rename(columns={'subject_id':'person_id','test_date':'login_date'})])

Output:

       person_id person_type  test_id              login_date DATA FROM
    0        101           A      1.0    5/7/2013 09:27:00 AM       df1
    1        101           A      2.0  09/08/2013 11:21:00 AM       df1
    2        101           B      3.0  06/06/2014 08:00:00 AM       df1
    3        101           C      3.0  06/06/2014 05:00:00 AM       df1
    4        202           D      4.0  12/11/2011 10:00:00 AM       df1
    5        202           B      4.0  13/10/2012 12:00:00 AM       df1
    6        202           A      5.0  13/12/2012 11:45:00 AM       df1
    0        101         NaN      NaN    5/7/2013 09:27:00 AM       df2
    1        101         NaN      NaN  09/08/2013 11:21:00 AM       df2
    2        101         NaN      NaN  06/06/2014 08:00:00 AM       df2
    3        101         NaN      NaN  06/06/2014 05:00:00 AM       df2
    4        202         NaN      NaN  12/11/2011 10:00:00 AM       df2
    5        202         NaN      NaN  13/10/2012 12:00:00 AM       df2
    6        202         NaN      NaN  13/12/2012 11:45:00 AM       df2
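
The question also asks for df2 itself to carry exactly the same column names as df1. One way to get that before combining anything is to rename the shared columns and then reindex against df1.columns, which adds the missing columns filled with NaN. This is a minimal sketch on two-row copies of the frames; column_map and aligned are hypothetical names.

```python
import pandas as pd

# Shortened copies of the question's frames (two rows each, for illustration).
df1 = pd.DataFrame({
    "person_id": [101, 202],
    "person_type": ["A", "D"],
    "test_id": [1, 4],
    "login_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})
df2 = pd.DataFrame({
    "subject_id": [101, 202],
    "test_date": ["5/7/2013 09:27:00 AM", "12/11/2011 10:00:00 AM"],
})

column_map = {"subject_id": "person_id", "test_date": "login_date"}

# Rename the shared columns, then take on df1's column set and order.
# Columns that df2 never had (person_type, test_id) are created as NaN.
aligned = df2.rename(columns=column_map).reindex(columns=df1.columns)
print(aligned)
```

The values of df2 are kept as they were; only the column labels and the column order change.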