pandas: dictionary whose keys sometimes map to multiple values

euoag5mw posted on 2023-03-06 in Other

I have a pandas DataFrame and want to create new columns from the values of a dictionary.
Here are my df and the dictionary:

import pandas as pd

data = ['One', 'Two', 'Three', 'Four']

df = pd.DataFrame(data, columns=['Count'])

dictionary = {'One': 'Red', 'Two': ['Red', 'Blue'], 'Three': 'Green', 'Four': ['Green', 'Red', 'Blue']}

This is the result I want to achieve:

Blank fields would be preferable to None values. Does anyone know a way to do this?
I tried the following:

df = pd.DataFrame([(k, *v) for k, v in dictionary.items()])
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]
df

However, for keys that do not have multiple values, it seems to split the string itself, putting one letter into each column, like this:

A solution that maps the values into a single column, separated by a delimiter (,), would also be helpful.


olqngx59 #1

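Some of the dictionary values are lists and others are plain strings, so an if-else expression is needed to keep * from unpacking the strings into characters: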

df = pd.DataFrame([(k, *v) 
                   if isinstance(v, list) 
                   else (k, v) for k, v in dictionary.items()])
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]
print (df)
    name     n1    n2    n3
0    One    Red  None  None
1    Two    Red  Blue  None
2  Three  Green  None  None
3   Four  Green   Red  Blue

Details:

print((*'Red',))
('R', 'e', 'd')

print((*['Red', 'Blue'],))
('Red', 'Blue')
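The same check can also be pulled into a small helper that wraps a bare string in a one-element list before unpacking. This is only a minimal sketch; `as_list` is a hypothetical name used for illustration, not a pandas function, and the dictionary is a shortened version of the one in the question:

```python
def as_list(value):
    """Return value unchanged if it is a list, otherwise wrap it in a list."""
    return value if isinstance(value, list) else [value]


dictionary = {'One': 'Red', 'Two': ['Red', 'Blue']}

# Each row unpacks whole strings, never single characters.
rows = [(key, *as_list(value)) for key, value in dictionary.items()]
print(rows)
# [('One', 'Red'), ('Two', 'Red', 'Blue')]
```

The resulting list of tuples can be passed to pd.DataFrame in the same way as in the answer above.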

"Blank fields would be preferable to None values. Does anyone know a way to do this?"
Add DataFrame.fillna:

df = pd.DataFrame([(k, *v) 
                   if isinstance(v, list) 
                   else (k, v) 
                   for k, v in dictionary.items()]).fillna('')
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]
print (df)
    name     n1    n2    n3
0    One    Red            
1    Two    Red  Blue      
2  Three  Green            
3   Four  Green   Red  Blue

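If both DataFrames have the same index and the same number of rows, use DataFrame.join.
If the original df_orig has an index other than the default RangeIndex, pass index=df_orig.index to the DataFrame constructor: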

df = pd.DataFrame([(k, *v) 
                   if isinstance(v, list) 
                   else (k, v) 
                   for k, v in dictionary.items()], index=df_orig.index).fillna('')
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]

df = df_orig.join(df)

If you need to combine on the name column instead, use a left join with DataFrame.merge:

df = df_orig.merge(df, on='name', how='left')
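Applied to the DataFrame from the question, the key column is called Count rather than name, so the merge needs left_on and right_on. The following is a sketch under that assumption; the names `wide` and `out` are illustrative, and df_orig is assumed to be the question's df:

```python
import pandas as pd

df_orig = pd.DataFrame({'Count': ['One', 'Two', 'Three', 'Four']})
dictionary = {'One': 'Red', 'Two': ['Red', 'Blue'],
              'Three': 'Green', 'Four': ['Green', 'Red', 'Blue']}

# Build the wide table; only lists are unpacked, strings are kept whole.
wide = pd.DataFrame([(k, *v) if isinstance(v, list) else (k, v)
                     for k, v in dictionary.items()]).fillna('')
wide.columns = ['name'] + [f'n{x}' for x in wide.columns[1:]]

# The key is 'Count' on the left and 'name' on the right; after the
# left join the redundant 'name' column is dropped.
out = (df_orig.merge(wide, left_on='Count', right_on='name', how='left')
              .drop(columns='name'))
print(out)
```

A left join keeps every row of df_orig, including rows whose Count value has no entry in the dictionary.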

mwg9r5ms #2

Use isinstance to check whether v is a list, so that strings are not unpacked into individual characters:

df = pd.DataFrame([(k, *v) if isinstance(v, list) else (k, v)
                   for k, v in dictionary.items()])
df.columns = ['name'] + [f'n{x}' for x in df.columns[1:]]

Output:

    name     n1    n2    n3
0    One    Red  None  None
1    Two    Red  Blue  None
2  Three  Green  None  None
3   Four  Green   Red  Blue

Joining with another DataFrame

Use join or merge, depending on whether you want to combine on the index or on the "name" column:

df2 = pd.DataFrame([(k, *v) if isinstance(v, list) else (k, v)
                   for k, v in dictionary.items()]
                   ).fillna('')
df2.columns = ['name'] + [f'n{x}' for x in df2.columns[1:]]

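# If both frames share the same index: drop the duplicate 'name' column first,
# because join raises on overlapping column names unless a suffix is given
out = df.join(df2.drop(columns='name'))

# Or merge on a common column
out = df.merge(df2, on='name', how='left')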

Output:

  original   name     n1    n2    n3
0        A    One    Red            
1        B    Two    Red  Blue      
2        C  Three  Green            
3        D   Four  Green   Red  Blue

Using this df:

  original   name
0        A    One       
1        B    Two      
2        C  Three            
3        D   Four

mqkwyuun #3

Another possible solution:

df2 = (pd.DataFrame.from_records([[x, dictionary[x]] for x in dictionary])[1]
       .apply(pd.Series))

df2.columns = [f'Color{x+1}' for x in df2.columns]

pd.concat([df['Count'], df2], axis=1)

Output:

   Count Color1 Color2 Color3
0    One    Red    NaN    NaN
1    Two    Red   Blue    NaN
2  Three  Green    NaN    NaN
3   Four  Green    Red   Blue
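The question also asks for a variant that puts all values into one delimiter-separated column. One way to sketch this, assuming the df and dictionary from the question, is to join the list values first and then look them up with Series.map; the column name Color and the variable `joined` are illustrative choices:

```python
import pandas as pd

df = pd.DataFrame({'Count': ['One', 'Two', 'Three', 'Four']})
dictionary = {'One': 'Red', 'Two': ['Red', 'Blue'],
              'Three': 'Green', 'Four': ['Green', 'Red', 'Blue']}

# Join list values with ', '; plain strings are left untouched.
joined = {k: ', '.join(v) if isinstance(v, list) else v
          for k, v in dictionary.items()}

# Series.map looks up each 'Count' value in the dict; keys that are
# missing from the dict become NaN, which fillna('') turns into blanks.
df['Color'] = df['Count'].map(joined).fillna('')
print(df)
```

For this data the Color column holds 'Red', 'Red, Blue', 'Green' and 'Green, Red, Blue'.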
