我的CSV示例:
open_local_data,country,competition,match_id,match_name,market_id,market_name,runner_id,runner_name,status,total_matched,odds,market_matched,percentage,above_odds,result,back,lay
2022-08-24 15:00:00,UY,Uruguayan Segunda Division,31684262,Uruguay Montevideo FC v Progreso,1.202440748,Match Odds,11076801,Uruguay Montevideo FC,OPEN,197.2,2.88,448.52,43.96682422188531,9.24460199966309,WINNER,1.7578,-1
2022-08-24 15:00:00,AT,Austrian Matches,31685733,SV Gerasdorf Stammersdorf v Dinamo Helfort,1.202453470,Match Odds,10299781,SV Gerasdorf Stammersdorf,OPEN,15.99,3.05,27.12,58.96017699115043,26.17329174524879,LOSER,-1,0.45609756097560983
2022-08-24 15:00:00,UY,Uruguayan Segunda Division,31684267,Villa Espanola v Sud America,1.202440560,Match Odds,58805,The Draw,OPEN,458.35,3.5,651.11,70.39517132281796,41.82374275138939,LOSER,-1,0.37400000000000005
2022-08-24 15:00:00,UY,Uruguayan Segunda Division,31684266,Miramar Misiones v Central Espanol,1.202440654,Match Odds,5300627,Miramar Misiones,OPEN,642.05,2.1,1075.66,59.68893516538684,12.069887546339224,LOSER,-1,0.85
2022-08-24 15:00:00,UY,Uruguayan Segunda Division,31684266,Miramar Misiones v Central Espanol,1.202440654,Match Odds,5330852,Central Espanol,OPEN,356.65,4.5,1075.66,33.1563877061525,10.93416548393028,LOSER,-1,0.2671428571428571
2022-08-24 15:00:00,BR,Brazilian U20,31688169,Santa Cruz FC U20 v Retro U20,1.202492350,Match Odds,12819986,Santa Cruz FC U20,OPEN,74.82,2.18,78.72999999999999,95.03365934205512,49.16209970902761,LOSER,-1,0.7923728813559321
2022-08-24 15:00:00,CH,Swiss 1 Liga Promotion,31684279,Etoile Carouge v SC Bruhl St Gallen,1.202455897,Match Odds,544529,Etoile Carouge,OPEN,108.24,1.98,169.58999999999995,63.82451795506812,13.319467450017612,WINNER,0.9163,-1
2022-08-24 15:00:00,CH,Swiss 1 Liga Promotion,31684279,Etoile Carouge v SC Bruhl St Gallen,1.202455897,Match Odds,21005121,SC Bruhl St Gallen,OPEN,57.21,3.75,169.58999999999995,33.734300371484174,7.067633704817506,LOSER,-1,0.34
当我想要筛选与第一行具有相同数据的行,而且我知道要使用哪些数据行来筛选DataFrame时,我可以用下列方式找到最终结果:
import pandas as pd
df = pd.read_csv('history_p&l.csv')
def results(a,b):
filter = df[
(df['percentage'] == a) &
(df['odds'] == b)
]
back_sum = filter['back'].sum()
print(back_sum)
results(
df.iloc[0]['percentage'],
df.iloc[0]['odds']
)
但是,目前我希望发现DataFrame中具有值的列的所有可能组合的结果,在本例中,这些是选项,这就是我如何创建可能组合的列表:
def combs(a):
if len(a) == 0:
return [[]]
cs = []
for c in combs(a[1:]):
cs += [c, c+[a[0]]]
return cs
options = [
'country',
'competition',
'match_id',
'match_name',
'market_id',
'market_name',
'runner_id',
'runner_name',
'total_matched',
'odds',
'market_matched',
'percentage',
'above_odds'
]
combinations = combs(options)
如何使函数(def results()
)具有延展性,以便分析每个组合的结果?
也在想我如何能够根据不同的参数数量来调用函数。
注意:我考虑过使用eval()
并从字符串创建代码,但我知道这并没有指明,而且这样做并不明智。
1条答案
按热度按时间r7s23pms1#
要获取
filter
,可以根据combinations
中的单个combination
动态构建 predicate 沿着如下所示不确定
results()
的最佳参数是什么,但它取决于用例。例如,results()
可以只使用combination
而不是a, b
来调用,或者使用两个序列来调用:combination
和相应的值。