How to look up a pandas DataFrame column holding lists of tuples against another column

Posted by nlejzf6q on 2023-05-21 in Other

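I am looking for the best way to look up the first element of each tuple in the marks column against the Name column, and append the corresponding Score to that tuple. Below are the DataFrame and the expected output.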

import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Kate', 'Evan', 'Naik'],
    'Age': [33, 34, 43, 44],
    'Score': [90, 95, 93, 92],
    'marks': [
        [('Nik', 100), ('Naik', 85)],
        [('Kate', 100)],
        [('Evan', 100)],
        [('Nik', 77), ('Naik', 100)],
    ]
})

Expected output:


Answer #1 (zmeyuzjn)

Update

Using a loop that mutates the lists in place:

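# Map each name to its score; assumes the names in 'Name' are unique
scores = df.set_index('Name')['Score']

# Replace each tuple in place with a copy that has the looked-up score appended
# (scores.get returns None for a name that is not in 'Name')
for l in df['marks']:
    for i, t in enumerate(l):
        l[i] = t + (scores.get(t[0]),)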

Output:

   Name  Age  Score                             marks
0   Nik   33     90  [(Nik, 100, 90), (Naik, 85, 92)]
1  Kate   34     95                 [(Kate, 100, 95)]
2  Evan   43     93                 [(Evan, 100, 93)]
3  Naik   44     92  [(Nik, 77, 90), (Naik, 100, 92)]
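
The loop above changes the list objects stored in df['marks'], so the original two-element tuples are lost. If that is not wanted, a non-mutating variant is possible. The following is only a minimal sketch, not part of the original answer: the column name marks_with_score and the plain dict lookup are illustrative choices, and it assumes each name appears only once in Name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Nik', 'Kate', 'Evan', 'Naik'],
    'Age': [33, 34, 43, 44],
    'Score': [90, 95, 93, 92],
    'marks': [
        [('Nik', 100), ('Naik', 85)],
        [('Kate', 100)],
        [('Evan', 100)],
        [('Nik', 77), ('Naik', 100)],
    ],
})

# Plain dict of Name -> Score (assumption: names are unique in 'Name')
scores = dict(zip(df['Name'], df['Score']))

# Build new lists of 3-tuples; the lists in 'marks' are left untouched.
# scores.get returns None for a name that does not occur in 'Name'.
# 'marks_with_score' is a hypothetical column name chosen for this sketch.
df['marks_with_score'] = [
    [(name, mark, scores.get(name)) for name, mark in marks]
    for marks in df['marks']
]
```

Writing to a separate column keeps the input data intact, at the cost of holding both versions in memory.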

Previous answer

I would use a list comprehension over zip, with next on a generator expression:

df['first'] = [next((x for x in m if x[0]==n), None)
               for n, m in zip(df['Name'], df['marks'])]

Output:

   Name  Age  Score                     marks        first
0   Nik   33     90  [(Nik, 100), (Naik, 85)]   (Nik, 100)
1  Kate   34     95             [(Kate, 100)]  (Kate, 100)
2  Evan   43     93             [(Evan, 100)]  (Evan, 100)
3  Naik   44     92  [(Nik, 77), (Naik, 100)]  (Naik, 100)
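
The second argument to next is the default it returns when the generator yields nothing. A row whose Name has no matching tuple in its own marks list therefore produces None instead of raising StopIteration. The sketch below illustrates this with hypothetical data: the name 'Zed' and the shortened frame are not from the original question.

```python
import pandas as pd

# Hypothetical data: 'Zed' has no tuple carrying his own name
df = pd.DataFrame({
    'Name': ['Nik', 'Zed'],
    'marks': [
        [('Nik', 100), ('Naik', 85)],
        [('Kate', 100)],
    ],
})

# For each row, take the first tuple whose first element equals the row's Name;
# fall back to None when no tuple matches
first = [next((x for x in m if x[0] == n), None)
         for n, m in zip(df['Name'], df['marks'])]

df['first'] = first
```

Here first is a list holding the tuple ('Nik', 100) for the first row and None for the second.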

If you only want the number:

df['first'] = [next((x[1] for x in m if x[0]==n), None)
               for n, m in zip(df['Name'], df['marks'])]

Output:

   Name  Age  Score                     marks  first
0   Nik   33     90  [(Nik, 100), (Naik, 85)]    100
1  Kate   34     95             [(Kate, 100)]    100
2  Evan   43     93             [(Evan, 100)]    100
3  Naik   44     92  [(Nik, 77), (Naik, 100)]    100

Or using pure pandas (probably less efficient):

df['first'] = (df
   .explode('marks')
   .loc[lambda d: d['marks'].str[0].eq(d['Name'])]
   ['marks'].str[1]
   .groupby(level=0).first()
)
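
To make the chained version easier to follow, it can be split into named steps. This is only a sketch on hypothetical data (the name 'Zed' and the intermediate variable names are assumptions made for illustration). It also shows that a row without a matching tuple is missing from the grouped result and ends up as NaN once the result is aligned back on the index.

```python
import pandas as pd

# Hypothetical data: 'Zed' has no tuple carrying his own name
df = pd.DataFrame({
    'Name': ['Nik', 'Zed'],
    'marks': [
        [('Nik', 100), ('Naik', 85)],
        [('Kate', 100)],
    ],
})

# One row per tuple; the original index label is repeated for each tuple
exploded = df.explode('marks')

# First element of every tuple (.str[...] also indexes into tuples)
names = exploded['marks'].str[0]

# Keep only the tuples whose name equals the row's own Name
matches = exploded.loc[names.eq(exploded['Name']), 'marks']

# Second element of each kept tuple, then the first one per original row
first = matches.str[1].groupby(level=0).first()

# Assignment aligns on the index; rows absent from 'first' become NaN
df['first'] = first
```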
