pandas 如何解析字符串并将其应用于 Dataframe

3phpmpom  于 2022-11-27  发布在  其他
关注(0)|答案(1)|浏览(161)

我有一个Excel表,用作逻辑运算符的参考,所以我可以稍后将它们连接起来,将逻辑字符串应用到Pandas Dataframe 。

Dataframe

GOOD  BAD UGLY
0   101   60    0
1    22   61    0
2   103   62  NaN
3   104   63    0

我可以从Excel工作表中获取值并将它们添加到列表中。但如何将此逻辑公式解析为df

import pandas as pd
import openpyxl

def create_dataframe():
    df = pd.DataFrame({'GOOD': [101,22,103,104],
                      'BAD': [60,61,62,63],
                      'UGLY': [0,0,'NaN',0],

                      })
    print(df)
    read_filter = pd.read_excel('test.xlsx')
    print(read_filter)
    formulas = []
    logicals = ['>','<']
    for i, filter_col in enumerate(read_filter['col1']):
        if read_filter['Logic'][i] in logicals:
            formula =  f"df['{filter_col}'][{i}]" + read_filter['Logic'][i] + str(read_filter['value'][i])
            formulas.append(formula)
        else:
            formula =  f"{read_filter['Logic'][i]}(df['{filter_col}'])"
            formulas.append(formula)
#      
    print(formulas)
    #df['Result'] = df.apply(lambda x: eval(formulas) , axis=1)
    return df

公式----

["df['GOOD'][0]>100", "df['BAD'][1]<50", "pd.isna(df['UGLY'])"]

预期结果:

GOOD  BAD UGLY  Result
0   101   60    0   False
1    22   61    0   False
2   103   62        True    
3   104   63    0   False
yws3nbqq

yws3nbqq1#

您可以像这样建立完整条件:

>>> ' & '.join(f"({f})" for f in formulas)
"(df['GOOD'][0]>100) & (df['BAD'][1]>50) & (pd.isna(df['UGLY']))"

每个表达式都应放在括号中。否则a > b & c > d将被解析为a > (b & c) > d,* 而不是 * (a > b) & (c > d)
然后eval它:

>>> import pandas as pd
>>> df = pd.DataFrame({'GOOD': [101,22,103,104], 'BAD': [60,61,62,63], 'UGLY': [0,0,float('nan'),0]})
>>> formulas = ["df['GOOD'][0]>100", "df['BAD'][1]<50", "pd.isna(df['UGLY'])"]
>>> eval(' & '.join(f"({f})" for f in formulas), {'df': df, 'pd': pd})
0    False
1    False
2     True
3    False
Name: UGLY, dtype: bool

然后,您可以使用此结果创建列:

>>> df.assign(Result=eval(' & '.join(f"({f})" for f in formulas), {'df': df, 'pd': pd}))
   GOOD  BAD  UGLY  Result
0   101   60   0.0   False
1    22   61   0.0   False
2   103   62   NaN    True
3   104   63   0.0   False

相关问题