numpy Pandas DataFrame: splitting one column into multiple columns

ndasle7k  posted on 2022-11-24  in  Other

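I am trying to split the column Class into multiple columns, one per distinct value, and name each new column after that value.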

   ID    Name    Class
0  12    John    A
1  13    Mark    A
2  14    Tony    B
3  15    Marcus  C
4  16    Phill   D
5  17    Jack    A

Desired final df:

   ID    Name    Class  A  B  C  D
0  12    John    A      A
1  13    Mark    A      A
2  14    Tony    B         B
3  15    Marcus  C            C
4  16    Phill   D               D
5  17    Jack    A      A
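
For anyone who wants to run the answers, the example frame can be rebuilt as below. This is only a minimal sketch: the values are copied from the table above, and the variable name df is assumed because that is what the answers use.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "ID": [12, 13, 14, 15, 16, 17],
    "Name": ["John", "Mark", "Tony", "Marcus", "Phill", "Jack"],
    "Class": ["A", "A", "B", "C", "D", "A"],
})
print(df)
```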
klsxnrf1 1#

import numpy as np

# unique class values, in order of first appearance
uniq_class = df['Class'].unique().tolist()
# diagonal matrix with each class on the diagonal and '' everywhere else
D = np.diag(uniq_class).tolist()
# dictionary mapping each class to its row of the diagonal matrix
temp = dict(zip(uniq_class, D))
# look up the row for each record and assign it to the new columns
df[uniq_class] = df['Class'].map(temp).tolist()
df

Output:

   ID    Name Class  A  B  C  D
0  12    John     A  A
1  13    Mark     A  A
2  14    Tony     B     B
3  15  Marcus     C        C
4  16   Phill     D           D
5  17    Jack     A  A
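
The blanks in that output come from how np.diag treats a list of strings: the off-diagonal cells are filled with empty strings. The sketch below shows this on a hypothetical two-class list (labels, rows and lookup are illustrative names, not part of the answer).

```python
import numpy as np

# Hypothetical two-class example, only to show the shape of the lookup table
labels = ["A", "B"]

# Each label sits on the diagonal; every other cell is an empty string
rows = np.diag(labels).tolist()

# Map each label to its row, as the answer does with `temp`
lookup = dict(zip(labels, rows))
print(lookup)
```

Each row of the diagonal matrix therefore acts as the set of new column values for one class.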
des4xlb0 2#

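A possibly slower way to do this is to define a function and then loop over every possible value of the original column, applying the function to each row.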

# define a function that returns the value if the row matches it, else None
def new_column_val(row, value, column):
    if row[column] == value:
        return value
    else:
        return None

# create one new column per unique value of "Class"
for class_name in df["Class"].unique():
    # axis=1 passes each row to the function
    df[class_name] = df.apply(new_column_val, args=(class_name, "Class"), axis=1)
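
If the row-wise apply turns out to be too slow, the same loop can probably be written with Series.where, which compares the whole column at once. This is a sketch rather than the answer's own method, and the sample frame is rebuilt from the question so that the block runs on its own. Note that non-matching cells come out as missing values (NaN) rather than None.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [12, 13, 14, 15, 16, 17],
    "Name": ["John", "Mark", "Tony", "Marcus", "Phill", "Jack"],
    "Class": ["A", "A", "B", "C", "D", "A"],
})

for class_name in df["Class"].unique():
    # Keep the value where it equals class_name; other rows become missing
    df[class_name] = df["Class"].where(df["Class"] == class_name)

print(df)
```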
plicqrtu 3#

You can use get_dummies:

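import numpy as np
import pandas as pd

# one-hot encode Class as integers, then turn the 1s into NaN so they can be filled
mask = pd.get_dummies(df.Class, dtype=int).replace(1, np.nan)
# fill each NaN with the name of its own column
for col in mask.columns:
    mask[col] = mask[col].fillna(col)

# blank out the remaining 0s and attach the new columns to df
final = df.join(mask.replace(0, np.nan))
final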

Output:

   ID    Name    Class  A  B  C  D
0  12    John    A      A
1  13    Mark    A      A
2  14    Tony    B         B
3  15    Marcus  C            C
4  16    Phill   D               D
5  17    Jack    A      A
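
A variant of the same idea, offered as a sketch and not as part of the answer: take the boolean dummies and let np.where pick either the column name or an empty string for each cell. The names dummies and labels are illustrative, and the sample frame is rebuilt from the question so that the block runs on its own.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [12, 13, 14, 15, 16, 17],
    "Name": ["John", "Mark", "Tony", "Marcus", "Phill", "Jack"],
    "Class": ["A", "A", "B", "C", "D", "A"],
})

# Boolean indicator columns, one per distinct class
dummies = pd.get_dummies(df["Class"], dtype=bool)

# Where the indicator is True use the column name, otherwise an empty string;
# the column names broadcast across the rows
labels = np.where(dummies.to_numpy(), dummies.columns.to_numpy(), "")

final = df.join(pd.DataFrame(labels, index=df.index, columns=dummies.columns))
print(final)
```

This yields empty strings instead of NaN in the non-matching cells, which may or may not be what you want.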
