pandas: How can I split one column into multiple columns in Python so that each column holds only its own unique value?

sgtfey8w  posted on 2023-06-20  in Python

I want to split one column into multiple columns and, using Python, place each value in the column whose name matches it.

Here is a sample of the code:

    import pandas as pd

    df = pd.DataFrame({"grade": ["a,b,c", "d,e,f", "b,d,a", "a,b,c,d,e,f"]})

I used the split function:

    # split the column into multiple columns by delimiter
    df[['Grade_A', 'Grade_B', 'Grade_C', 'Grade_D', 'Grade_E', 'Grade_F']] = (
        df['grade'].str.split(',', expand=True)
    )

but the values I got do not match the column names. For example, in column Grade_A I got *a, d, b, a*, whereas I want *a, NA, a, a*.
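The mismatch comes from how splitting with expand=True lays out its result: columns correspond to positions in the split list, not to the values themselves. A minimal sketch that makes this visible on the sample data (the names `parts` and `first_tokens` are introduced here for illustration only):

```python
import pandas as pd

df = pd.DataFrame({"grade": ["a,b,c", "d,e,f", "b,d,a", "a,b,c,d,e,f"]})

# expand=True creates one column per position in the split list,
# so column 0 holds whatever value came first in each row.
parts = df["grade"].str.split(",", expand=True)

first_tokens = parts[0].tolist()
print(first_tokens)  # ['a', 'd', 'b', 'a']
```

Because the first column is simply "first value in the row", it cannot line up with Grade_A unless every row starts with a.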

The output I want is the DataFrame produced by this code:

    df = pd.DataFrame({"grade": ["a,b,c,d,e,f", "d,e,f", "b,d,a", "a,b,c,d,e,f"],
                       "Grade_A": ["a", "NA", "a", "a"],
                       "Grade_B": ["b", "NA", "b", "b"],
                       "Grade_C": ["c", "NA", "NA", "c"],
                       "Grade_D": ["d", "d", "d", "d"],
                       "Grade_E": ["e", "e", "NA", "e"],
                       "Grade_F": ["f", "f", "NA", "f"],
                       })

I have solved this problem in Excel and R, but I really want to do it in Python. Can anyone help me?

bnl4lu3b1#

A possible solution:

    from string import ascii_lowercase

    # map each letter to its position in the alphabet: {"a": 0, "b": 1, ...}
    d = {l: i for i, l in enumerate(ascii_lowercase)}

    # largest number of comma-separated values in any row
    N = df["grade"].str.count(",").max() + 1

    grades = (
        # one row per single non-comma character (assumes one-letter grades)
        df["grade"].str.extractall("([^,])")
        .assign(idx=lambda x: x[0].map(d)).set_index("idx", append=True)
        .droplevel(1).apply(lambda g: g.reindex(range(N), level=1))
        .unstack().droplevel(0, axis=1)  # .fillna("NA")  # optional
    )

    # rename columns 0, 1, ... to Grade_A, Grade_B, ...
    f = lambda x: dict(enumerate(ascii_lowercase)).get(x).upper()
    grades.columns = "Grade_" + grades.columns.map(f)

    out = df.join(grades)

Output:

    print(out)

             grade Grade_A Grade_B Grade_C Grade_D Grade_E Grade_F
    0        a,b,c       a       b       c     NaN     NaN     NaN
    1        d,e,f     NaN     NaN     NaN       d       e       f
    2        b,d,a       a       b     NaN       d     NaN     NaN
    3  a,b,c,d,e,f       a       b       c       d       e       f
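For comparison, a shorter route to the same shape of result may be `Series.str.get_dummies`, which builds one 0/1 indicator column per distinct value. This is a sketch of an alternative technique, not the method used in the answer above; the names `flags` and `grades` and the choice to leave missing entries as NaN are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"grade": ["a,b,c", "d,e,f", "b,d,a", "a,b,c,d,e,f"]})

# One indicator column per distinct value: 1 if the row contains it, else 0.
flags = df["grade"].str.get_dummies(sep=",")

# Turn each 1 into the column's own value; 0 is absent from the mapping,
# so it becomes a missing value.
grades = pd.DataFrame(
    {f"Grade_{token.upper()}": flags[token].map({1: token}) for token in flags.columns}
)

out = df.join(grades)
print(out.fillna("NA"))
```

Unlike the regex-based version, this does not depend on grades being single letters, since `get_dummies` splits on the separator and names the columns after whatever values it finds. Columns are created only for values that actually occur in the data.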
