如何合并Pandas Dataframe 中的两列,这两列都包含年份,但年份不同?

ukdjmx9f  于 2023-01-07  发布在  其他
关注(0)|答案(3)|浏览(111)

我有一个Pandas Dataframe 类似于这个(编辑):

Fruits  Year    Farm A  Fruits Year    Farm B
Apples  2021    2000    Apples 2022    2200
Apples  2020    1500    Apples 2021    2100
Apples  2019    1200    Apples 2020    1900
Orange  2021    1150    Apples 2019    1800
Orange  2020    1200   Oranges 2022    1200
Orange  2019    1300   Oranges 2021    1100
                       Oranges 2020    1250
                       Oranges 2019    1250

我想将此 Dataframe 转换为如下所示的内容(编辑):

Fruits  Year    Farm A   Farm B
    Apples  2022    -         2200
    Apples  2021    2000      2100
    Apples  2020    1500      1900
    Apples  2019    1200      1800
    Orange  2022    -         1200
    Orange  2021    1150      1100
    Orange  2020    1200      1250
    Orange  2019    1300      1250

有没有人帮忙?

    • 新编辑**:为了回答@Cole的问题,由于我创建了上面的数据框来简化问题,造成了很大的混乱,因此我将原始数据框粘贴到下面。从这个原始数据框中,我想将列***"财务KPI.1"、"年.1"*与列"财务KPI"、"年"***合并,用NaN填充剩余的空"单元格"。我为混乱道歉。
Financial KPI        Year AMZN (All numbers in thousands) BAC (All numbers in thousands) C (All numbers in thousands)                                      Financial KPI.1     Year.1 MSFT (All numbers in thousands)
0                                         Total Revenue         TTM                  502,191,000.00                  92,478,000.00                74,307,000.00                                        Total Revenue        TTM                  203,075,000.00
1                                         Total Revenue  12/31/2021                  469,822,000.00                  89,113,000.00                71,887,000.00                                        Total Revenue  6/29/2022                  198,270,000.00
2                                         Total Revenue  12/31/2020                  386,064,000.00                  85,528,000.00                75,494,000.00                                        Total Revenue  6/29/2021                  168,088,000.00
3                                         Total Revenue  12/31/2019                  280,522,000.00                  91,244,000.00                75,067,000.00                                        Total Revenue  6/29/2020                  143,015,000.00
4   Net Income from Continuing & Discontinued Operation         TTM                   11,323,000.00                  27,409,000.00                15,505,000.00                                        Total Revenue  6/29/2019                  125,843,000.00
5   Net Income from Continuing & Discontinued Operation  12/31/2021                   33,364,000.00                  31,978,000.00                21,952,000.00  Net Income from Continuing & Discontinued Operation        TTM                   69,789,000.00
6   Net Income from Continuing & Discontinued Operation  12/31/2020                   21,331,000.00                  17,894,000.00                11,047,000.00  Net Income from Continuing & Discontinued Operation  6/29/2022                   72,738,000.00
7   Net Income from Continuing & Discontinued Operation  12/31/2019                   11,588,000.00                  27,430,000.00                19,401,000.00  Net Income from Continuing & Discontinued Operation  6/29/2021                   61,271,000.00
8                                     Normalized Income         TTM                   12,157,600.00                  27,409,000.00                15,734,000.00  Net Income from Continuing & Discontinued Operation  6/29/2020                   44,281,000.00
9                                     Normalized Income  12/31/2021                   20,551,997.00                  31,978,000.00                21,945,000.00  Net Income from Continuing & Discontinued Operation  6/29/2019                   39,240,000.00
10                                    Normalized Income  12/31/2020                   21,331,000.00                  17,894,000.00                11,067,000.00                                    Normalized Income        TTM                   69,806,290.00
11                                    Normalized Income  12/31/2019                   11,588,000.00                  27,430,000.00                19,405,000.00                                    Normalized Income  6/29/2022                   72,447,420.00
12                                            Basic EPS         TTM                             1.1                           3.16                         7.45                                    Normalized Income  6/29/2021                   60,150,420.00
13                                            Basic EPS  12/31/2021                             3.3                            3.6                         10.8                                    Normalized Income  6/29/2020                   44,257,620.00
14                                            Basic EPS  12/31/2020                            2.13                           1.88                         5.45                                    Normalized Income  6/29/2019                   38,602,420.00
15                                            Basic EPS  12/31/2019                            1.17                           2.77                         8.63                                            Basic EPS        TTM                            9.29
16                                    Net_Profit_Margin         TTM                         2.42091                        29.6384                     21.17432                                            Basic EPS  6/29/2022                             9.7
17                                    Net_Profit_Margin  12/31/2021                         4.37442                       35.88478                     30.52708                                            Basic EPS  6/29/2021                            8.12
18                                    Net_Profit_Margin  12/31/2020                         5.52525                        20.9218                     14.65944                                            Basic EPS  6/29/2020                            5.82
19                                    Net_Profit_Margin  12/31/2019                         4.13087                       30.06225                     25.85024                                            Basic EPS  6/29/2019                            5.11
20                                    Price To Earnings         TTM                        76.36364                       10.48101                      6.07114                                    Net_Profit_Margin        TTM                        34.37463
21                                         Total Assets         TTM                             NaN                            NaN                          NaN                                    Net_Profit_Margin  6/29/2022                        36.53978
22                                         Total Assets  12/31/2021                  420,549,000.00               3,169,495,000.00             2,291,413,000.00                                    Net_Profit_Margin  6/29/2021                        35.78508
23                                         Total Assets  12/31/2020                  321,195,000.00               2,819,627,000.00             2,260,090,000.00                                    Net_Profit_Margin  6/29/2020                        30.94614
24                                         Total Assets  12/31/2019                  225,248,000.00               2,434,079,000.00             1,951,158,000.00                                    Net_Profit_Margin  6/29/2019                        30.67506
25              Total Liabilities Net Minority Interest         TTM                             NaN                            NaN                          NaN                                    Price To Earnings        TTM                        25.81485
26              Total Liabilities Net Minority Interest  12/31/2021                  282,304,000.00               2,899,429,000.00             2,088,741,000.00                                         Total Assets        TTM                             NaN
27              Total Liabilities Net Minority Interest  12/31/2020                  227,791,000.00               2,546,703,000.00             2,059,890,000.00                                         Total Assets  6/29/2022                  364,840,000.00
28              Total Liabilities Net Minority Interest  12/31/2019                  163,188,000.00               2,169,269,000.00             1,757,212,000.00                                         Total Assets  6/29/2021                  333,779,000.00
29                 Total Equity Gross Minority Interest         TTM                             NaN                            NaN                          NaN                                         Total Assets  6/29/2020                  301,311,000.00
30                 Total Equity Gross Minority Interest  12/31/2021                  138,245,000.00                 270,066,000.00               202,672,000.00                                         Total Assets  6/29/2019                  286,556,000.00
31                 Total Equity Gross Minority Interest  12/31/2020                   93,404,000.00                 272,924,000.00               200,200,000.00              Total Liabilities Net Minority Interest        TTM                             NaN
32                 Total Equity Gross Minority Interest  12/31/2019                   62,060,000.00                 264,810,000.00               193,946,000.00              Total Liabilities Net Minority Interest  6/29/2022                  198,298,000.00
33                                           Total Debt         TTM                             NaN                            NaN                          NaN              Total Liabilities Net Minority Interest  6/29/2021                  191,791,000.00
34                                           Total Debt  12/31/2021                  116,395,000.00                 303,870,000.00               282,347,000.00              Total Liabilities Net Minority Interest  6/29/2020                  183,007,000.00
35                                           Total Debt  12/31/2020                   84,389,000.00                 282,255,000.00               301,200,000.00              Total Liabilities Net Minority Interest  6/29/2019                  184,226,000.00
36                                           Total Debt  12/31/2019                   63,205,000.00                 265,060,000.00               293,809,000.00                 Total Equity Gross Minority Interest        TTM                             NaN
37                   Current_Ratio (assets/liabilities)  12/31/2021                          1.4897                        1.09314                      1.09703                 Total Equity Gross Minority Interest  6/29/2022                  166,542,000.00
38                   Current_Ratio (assets/liabilities)  12/31/2020                         1.41004                        1.10717                      1.09719                 Total Equity Gross Minority Interest  6/29/2021                  141,988,000.00
39                   Current_Ratio (assets/liabilities)  12/31/2019                          1.3803                        1.12207                      1.11037                 Total Equity Gross Minority Interest  6/29/2020                  118,304,000.00
40                                 Debt_to_Assets_Ratio  12/31/2021                         0.27677                        0.09587                      0.12322                 Total Equity Gross Minority Interest  6/29/2019                  102,330,000.00
41                                 Debt_to_Assets_Ratio  12/31/2020                         0.26273                         0.1001                      0.13327                                           Total Debt        TTM                             NaN
42                                 Debt_to_Assets_Ratio  12/31/2019                          0.2806                         0.1089                      0.15058                                           Total Debt  6/29/2022                   61,270,000.00
43                                                  NaN         NaN                             NaN                            NaN                          NaN                                           Total Debt  6/29/2021                   67,775,000.00
44                                                  NaN         NaN                             NaN                            NaN                          NaN                                           Total Debt  6/29/2020                   70,998,000.00
45                                                  NaN         NaN                             NaN                            NaN                          NaN                                           Total Debt  6/29/2019                   78,366,000.00
46                                                  NaN         NaN                             NaN                            NaN                          NaN                   Current_Ratio (assets/liabilities)  6/29/2022                         1.83986
47                                                  NaN         NaN                             NaN                            NaN                          NaN                   Current_Ratio (assets/liabilities)  6/29/2021                         1.74033
48                                                  NaN         NaN                             NaN                            NaN                          NaN                   Current_Ratio (assets/liabilities)  6/29/2020                         1.64645
49                                                  NaN         NaN                             NaN                            NaN                          NaN                   Current_Ratio (assets/liabilities)  6/29/2019                         1.55546
50                                                  NaN         NaN                             NaN                            NaN                          NaN                                 Debt_to_Assets_Ratio  6/29/2022                         0.16794
51                                                  NaN         NaN                             NaN                            NaN                          NaN                                 Debt_to_Assets_Ratio  6/29/2021                         0.20305
52                                                  NaN         NaN                             NaN                            NaN                          NaN                                 Debt_to_Assets_Ratio  6/29/2020                         0.23563
53                                                  NaN         NaN                             NaN                            NaN                          NaN                                 Debt_to_Assets_Ratio  6/29/2019                         0.27348
mspsb9vt

mspsb9vt1#

如果有两个 Dataframe :

Fruits  Year   FarmA
Apples  2021    2000
Apples  2020    1500
Apples  2019    1200
Fruits  Year   FarmB
Apples  2022    2200
Apples  2021    2100
Apples  2019    1900

比如leftright,那么可以按如下方式合并它们:

out = pd.merge(left, right, on=["Fruits", "Year"], how="outer")

它看起来像:

Fruits  Year   FarmA   FarmB
0  Apples  2021  2000.0  2100.0
1  Apples  2020  1500.0     NaN
2  Apples  2019  1200.0  1900.0
3  Apples  2022     NaN  2200.0

Fruits上已经有两个 Dataframe 连接在一起,这很难处理。您可以得到两个独立的 Dataframe ,如下所示:

left  = df[["Fruits", "Year1", "FarmA"]].rename(columns={"Year1": "Year"})
right = df[["Fruits", "Year1", "FarmB"]].rename(columns={"Year2": "Year"})

注意,我稍微修改了一下原始df的列名。(重复"Year",在"Farm X"中使用空格)

mqkwyuun

mqkwyuun2#

除非使用后缀,否则一个 Dataframe 中的两列不能具有相同的名称。

import pandas as pd

df = pd.DataFrame({"Fruits": ["Apple", "Apple", "Apple"],
 "Year_A": [2021, 2020, 2019],
 "Farm_A": [2000, 1500, 1200],
 "Year_B": [2022, 2021, 2019],
 "Farm_B": [2200, 2100, 1900]})
Fruits  Year_A  Farm_A  Year_B  Farm_B
0  Apple    2021    2000    2022    2200
1  Apple    2020    1500    2021    2100
2  Apple    2019    1200    2019    1900
    • 完整代码:**
import pandas as pd

df = pd.DataFrame({"Fruits": ["Apple", "Apple", "Apple"],
 "Year_A": [2021, 2020, 2019],
 "Farm_A": [2000, 1500, 1200],
 "Year_B": [2022, 2021, 2019],
 "Farm_B": [2200, 2100, 1900]})

# Extract two dataframes
df1 = df.loc[:,["Fruits", "Year_A", "Farm_A"]]
df2 = df.loc[:,["Fruits", "Year_B", "Farm_B"]]

# Rename year column and get rid of the suffix
df1.columns = df1.columns.str.replace('Year_A', 'Year')
df2.columns = df2.columns.str.replace('Year_B', 'Year')

# Apply merge on two columns
df3 = df1.merge(df2, on=["Fruits", "Year"], how="outer")
    • 结果:**
Fruits  Year  Farm_A  Farm_B
0  Apple  2021  2000.0  2100.0
1  Apple  2020  1500.0     NaN
2  Apple  2019  1200.0  1900.0
3  Apple  2022     NaN  2200.0
uqcuzwp8

uqcuzwp83#

如果数据与原始结构匹配,我们可以使用pands.wide_to_long()对数据进行反透视,然后,为了与OP的要求完全匹配,我们可以将pivot()扩展为宽。
请注意,如@Laurent B.所示,OP的数据结构不正确-在DataFrame中不能有两个相同的列名。

import pandas as pd

df = pd.DataFrame({"Fruits": ["Apple", "Apple", "Apple"],
 "Year1": [2021, 2020, 2019],
 "Farm1": [2000, 1500, 1200],
 "Year2": [2022, 2021, 2019],
 "Farm2": [2200, 2100, 1900]})
df["id"] = df.index

molten_df = pd.wide_to_long(df, ["Farm", "Year"], i = "id", j = "FarmId").reset_index()
formatted_df = molten_df.pivot(index = ["Fruits", "Year"], columns = "FarmId", values = "Farm").reset_index()

print(formatted_df)

##  FarmId Fruits  Year       1       2
##  0       Apple  2019  1200.0  1900.0
##  1       Apple  2020  1500.0     NaN
##  2       Apple  2021  2000.0  2100.0
##  3       Apple  2022     NaN  2200.0

相关问题