Need help scraping data from the PrizePicks API

Posted by nnvyjq4y on 2022-10-23 in Other

I am trying to fetch data from this API endpoint (https://api.prizepicks.com/projections), but I keep getting a 403 error. Is there a way around it?
Here is my code:
'''

import pandas as pd
import requests
from pandas.io.json import json_normalize

params = (
    ('league_id', '7'),
    ('per_page', '250'),
    ('projection_type_id', '1'),
    ('single_stat', 'true'),
)

session = requests.Session() 
response = session.get('https://api.prizepicks.com/projections', data=params)
print(response.status_code)

# df1 = json_normalize(response.json()['included'])

# df1 = df1[df1['type'] == 'new_player']

# df2 = json_normalize(response.json()['data'])

# df = pd.DataFrame(zip(df1['attributes.name'], df2['attributes.line_score']), columns=['name', 'points'])

'''


z9smfwbn1#

Here is one way to access that data:

import requests
import pandas as pd

# Show every column and the full cell contents when printing the DataFrame
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

# Browser-like User-Agent header sent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.79 Safari/537.36'
}

url = 'https://api.prizepicks.com/projections'

r = requests.get(url, headers=headers)
# Flatten the nested 'data' records into one row per projection
df = pd.json_normalize(r.json()['data'])
print(df)

Result printed in the terminal:

type    id  attributes.board_time   attributes.custom_image attributes.description  attributes.end_time attributes.flash_sale_line_score    attributes.is_promo attributes.line_score   attributes.projection_type  attributes.rank attributes.refundable   attributes.start_time   attributes.stat_type    attributes.status   attributes.tv_channel   attributes.updated_at   relationships.duration.data relationships.league.data.type  relationships.league.data.id    relationships.new_player.data.type  relationships.new_player.data.id    relationships.projection_type.data.type relationships.projection_type.data.id   relationships.stat_type.data.type   relationships.stat_type.data.id relationships.duration.data.type    relationships.duration.data.id
0   projection  812524  2022-10-21T00:00:00-04:00   None    RD 4 - PAR 71   None    None    False   4.0 Single Stat 1   True    2022-10-23T05:50:00-04:00   Birdies Or Better   pre_game    None    2022-10-22T20:04:02-04:00   NaN league  131 new_player  87385   projection_type 2   stat_type   32  NaN NaN
1   projection  812317  2022-10-22T22:11:00-04:00   None    LAL None    None    False   6.0 Single Stat 1   True    2022-10-23T15:40:00-04:00   Assists pre_game    None    2022-10-22T22:19:25-04:00   NaN league  7   new_player  1738    projection_type 2   stat_type   20  NaN NaN
2   projection  812975  2020-04-23T12:30:00-04:00   None    NRG (Maps 1-4)  None    None    False   2.0 Single Stat 1   True    2022-10-23T13:00:00-04:00   Goals   pre_game    https://www.twitch.tv/rocketleague  2022-10-23T00:37:14-04:00   NaN league  161 new_player  37461   projection_type 2   stat_type   29  NaN NaN
3   projection  802798  2021-09-01T10:00:00-04:00   None    United States GP Full   None    None    False   2.7 Single Stat 1   True    2022-10-23T15:00:00-04:00   1st Pit Stop Time (sec) pre_game    None    2022-10-23T01:14:10-04:00   NaN league  125 new_player  16369   projection_type 2   stat_type   188 duration    11
4   projection  812467  2022-10-21T00:00:00-04:00   None    RD 4 - 14 Fairways  None    None    False   11.5    Single Stat 1   True    2022-10-23T11:23:00-04:00   Fairways Hit    pre_game    None    2022-10-22T20:22:05-04:00   NaN league  1   new_player  10919   projection_type 2   stat_type   96  NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1534    projection  813017  2021-03-05T19:00:00-05:00   None    Napoli  None    None    False   3.0 Single Stat 1163    True    2022-10-23T14:45:00-04:00   Shots   pre_game    None    2022-10-23T01:20:53-04:00   NaN league  82  new_player  47513   projection_type 2   stat_type   50  NaN NaN
1535    projection  813018  2021-03-05T19:00:00-05:00   None    Roma    None    None    False   2.0 Single Stat 1164    True    2022-10-23T14:45:00-04:00   Shots On Target pre_game    None    2022-10-23T01:20:53-04:00   NaN league  82  new_player  47910   projection_type 2   stat_type   104 NaN NaN
1536    projection  813019  2021-03-05T19:00:00-05:00   None    Napoli  None    None    False   2.0 Single Stat 1165    True    2022-10-23T14:45:00-04:00   Shots   pre_game    None    2022-10-23T01:20:53-04:00   NaN league  82  new_player  47512   projection_type 2   stat_type   50  NaN NaN
1537    projection  812997  2021-03-05T19:00:00-05:00   None    Atalanta    None    None    False   1.5 Single Stat 2013    True    2022-10-23T12:00:00-04:00   Shots   pre_game    None    2022-10-23T00:45:53-04:00   NaN league  82  new_player  47710   projection_type 2   stat_type   50  NaN NaN
1538    projection  812998  2021-03-05T19:00:00-05:00   None    Lazio   None    None    False   1.5 Single Stat 2013    True    2022-10-23T12:00:00-04:00   Shots   pre_game    None    2022-10-23T00:45:53-04:00   NaN league  82  new_player  60433   projection_type 2   stat_type   50  NaN NaN
1539 rows × 28 columns
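
As background on why the header matters: when no User-Agent is supplied, requests identifies itself as python-requests/<version>, which is easy for a server to filter on. The sketch below only builds the request locally (nothing is sent) to show the default value and the overridden one. `BROWSER_UA` is an illustrative name, and whether the site keeps accepting this header is an assumption.

```python
import requests

# Illustrative constant: the browser-like User-Agent string used in this answer.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/104.0.5112.79 Safari/537.36"
)

session = requests.Session()

# Default header that requests attaches when none is given.
default_ua = session.headers["User-Agent"]
print(default_ua)

# Set the header once on the session so every later request carries it.
session.headers.update({"User-Agent": BROWSER_UA})

# Build the request without sending it, to inspect the final headers.
prepared = session.prepare_request(
    requests.Request("GET", "https://api.prizepicks.com/projections")
)
print(prepared.headers["User-Agent"])
```

Setting the header on the session rather than on each call keeps it in one place if several endpoints are queried.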

ykejflvf2#

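Your code itself is fine. The problem is that the site has a security feature that checks the User-Agent of incoming requests. All you need to do is add a User-Agent header that mimics a browser. After that you can uncomment the rest of your code and it will work as expected.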

import pandas as pd
import requests

params = (
    ('league_id', '7'),
    ('per_page', '250'),
    ('projection_type_id', '1'),
    ('single_stat', 'true'),
)

# Browser-like User-Agent so the request is not rejected with a 403
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36"
}

session = requests.Session()
# Note: data= sends the pairs as a request body; params= would put them in the query string
response = session.get('https://api.prizepicks.com/projections', data=params, headers=headers)
print(response.status_code)

# Entries in 'included' with type 'new_player' carry the player names
df1 = pd.json_normalize(response.json()['included'])
df1 = df1[df1['type'] == 'new_player']

# Entries in 'data' are the projections themselves
df2 = pd.json_normalize(response.json()['data'])

df = pd.DataFrame(zip(df1['attributes.name'], df2['attributes.line_score']), columns=['name', 'points'])
print(df)

Output:

name  points
0               Jared Goff     0.0
1    Pierre-Emile Højbjerg    13.5
2           Mecole Hardman    11.5
3            Merih Demiral    12.5
4             Ashley Young     2.7
..                     ...     ...
682             Nick Chubb     6.5
683             Derek Carr     0.5
684         Darnell Mooney     1.5
685          Daniel Suarez     2.0
686     Alexander Schwolow     2.5

[687 rows x 2 columns]
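
One detail worth checking in the code above: in requests, `data=` places the pairs in the request body, while `params=` appends them to the URL as a query string. The sketch below prepares both variants without sending them, so the difference is visible. Whether the API actually applies these filters is an assumption that this thread does not confirm.

```python
import requests

url = "https://api.prizepicks.com/projections"
params = (
    ("league_id", "7"),
    ("per_page", "250"),
    ("projection_type_id", "1"),
    ("single_stat", "true"),
)

# Prepare (but do not send) the same GET request in both styles.
with_data = requests.Request("GET", url, data=params).prepare()
with_params = requests.Request("GET", url, params=params).prepare()

# data=: the URL is unchanged and the pairs travel in the body.
print(with_data.url)
print(with_data.body)

# params=: the pairs become the query string and there is no body.
print(with_params.url)
print(with_params.body)
```

If the filters are meant to narrow the result to one league, passing `params=params` to `session.get` is the variant to try.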

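Pairing names and lines with zip assumes that the new_player entries in included arrive in the same order and number as the projections in data, which the response does not guarantee. A possible alternative, sketched below, joins on the player id instead. It relies on the `relationships.new_player.data.id` column visible in the first answer's output. `projections_with_names` and `sample_payload` are hypothetical; the sample stands in for `response.json()` and assumes ids have the same type on both sides.

```python
import pandas as pd


def projections_with_names(payload: dict) -> pd.DataFrame:
    """Attach each projection to its player via the relationship id."""
    projections = pd.json_normalize(payload["data"]).rename(
        columns={
            "relationships.new_player.data.id": "player_id",
            "attributes.line_score": "points",
            "attributes.stat_type": "stat_type",
        }
    )
    included = pd.json_normalize(payload["included"])
    # Keep only player records and the two columns needed for the join.
    players = included.loc[
        included["type"] == "new_player", ["id", "attributes.name"]
    ].rename(columns={"id": "player_id", "attributes.name": "name"})
    # Left join keeps one row per projection, in the original order.
    merged = projections.merge(players, on="player_id", how="left")
    return merged[["name", "stat_type", "points"]]


# Made-up payload shaped like the columns shown in the first answer.
sample_payload = {
    "data": [
        {"type": "projection", "id": "1",
         "attributes": {"line_score": 6.0, "stat_type": "Assists"},
         "relationships": {"new_player": {"data": {"type": "new_player", "id": "20"}}}},
        {"type": "projection", "id": "2",
         "attributes": {"line_score": 2.0, "stat_type": "Goals"},
         "relationships": {"new_player": {"data": {"type": "new_player", "id": "10"}}}},
        {"type": "projection", "id": "3",
         "attributes": {"line_score": 24.5, "stat_type": "Points"},
         "relationships": {"new_player": {"data": {"type": "new_player", "id": "20"}}}},
    ],
    "included": [
        {"type": "new_player", "id": "10", "attributes": {"name": "Player A"}},
        {"type": "league", "id": "7", "attributes": {"name": "NBA"}},
        {"type": "new_player", "id": "20", "attributes": {"name": "Player B"}},
    ],
}

df = projections_with_names(sample_payload)
print(df)
```

Including the stat type in the result also helps tell apart several lines for the same player.
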