pandas: web scraping NBA Draft Combine data

ecbunoof posted 12 months ago in Other

I am trying to scrape the table from this webpage. So far, however, my script has not found a table.

import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://www.nba.com/stats/draft/combine-anthro?SeasonYear=2021-22"

# Download the page once and check that the request succeeded
response = requests.get(url)
print(response.status_code)

# Parse the returned HTML and look for <table> elements
soup = BeautifulSoup(response.content, "html.parser")
print(soup.find_all("table"))

Does anyone have a suggestion for how I can retrieve the table? Thanks in advance.

Answer #1, by ldioqlga

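The table does not seem to be part of the HTML that requests downloads; the page appears to fill it in with JavaScript, which would explain why soup.find_all("table") finds nothing. You can instead fetch the JSON returned by the external URL that the page gets its data from: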

import pandas as pd
import requests

# JSON endpoint behind the Draft Combine table
url = "https://stats.nba.com/stats/draftcombineplayeranthro"

payload = {"LeagueID": "00", "SeasonYear": "2022-23"}

# Send browser-like headers with the request
headers = {
    "Referer": "https://www.nba.com/",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"
}

data = requests.get(url, params=payload, headers=headers).json()

# The first result set holds the column names and the rows
result = data["resultSets"][0]
df = pd.DataFrame(result["rowSet"], columns=result["headers"])

Output:

print(df)

 TEMP_PLAYER_ID  PLAYER_ID FIRST_NAME  ... BODY_FAT_PCT HAND_LENGTH HAND_WIDTH
        1630534    1630534      Ochai  ...         5.40        8.75       9.50
        1631116    1631116    Patrick  ...         8.90        8.75       9.50
        1631094    1631094      Paolo  ...          NaN         NaN        NaN
            ...        ...        ...  ...          ...         ...        ...
        1631109    1631109       Mark  ...         5.40        9.00       9.75
        1630592    1630592      Jalen  ...          NaN         NaN        NaN
        1630855    1630855      Fanbo  ...          NaN         NaN        NaN

[83 rows x 18 columns]
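
If you want to check the headers/rowSet pairing without hitting the network, the conversion can be sketched with the standard library alone. This is only an illustration: the helper name result_set_to_records and the tiny sample payload are made up for the example (the values are trimmed from the output above), and it assumes each result set has the "headers" and "rowSet" keys used in the answer.

```python
def result_set_to_records(result_set):
    """Pair each row in "rowSet" with the column names in "headers"."""
    headers = result_set["headers"]
    return [dict(zip(headers, row)) for row in result_set["rowSet"]]


# Hypothetical, hand-written sample in the same shape as the JSON above
sample = {
    "resultSets": [
        {
            "headers": ["PLAYER_ID", "FIRST_NAME", "HAND_LENGTH"],
            "rowSet": [
                [1630534, "Ochai", 8.75],
                [1631094, "Paolo", None],  # missing measurement
            ],
        }
    ]
}

records = result_set_to_records(sample["resultSets"][0])
print(records[0])
```

The resulting list of dictionaries can be passed straight to pd.DataFrame(records), where missing values such as None show up as NaN in numeric columns.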
