I have successfully scraped the links from a website, and I want to save the pages to a local folder I have already created called "HerHoops" so I can parse them later. I have done this successfully in the past, but the links on this site need a bit more cleanup.

Here is my code so far. I want to keep everything after "box_score" in the link so that the saved filename includes the date and the teams that played. I am also saving in write mode ("w+").
import time

import requests
from bs4 import BeautifulSoup

url = "https://herhoopstats.com/stats/wnba/schedule_date/2004/6/1/"
data = requests.get(url)
soup = BeautifulSoup(data.text, "html.parser")
matchup_table = soup.find_all("div", {"class": "schedule"})[0]
links = matchup_table.find_all("a")
links = [l.get("href") for l in links]
links = [l for l in links if "/box_score/" in l]
box_scores_urls = [f"https://herhoopstats.com{l}" for l in links]

for box_scores_url in box_scores_urls:
    data = requests.get(box_scores_url)
    # within the loop, fetch the page and save it to the folder in write mode
    with open("HerHoops/{}".format(box_scores_url[46:]), "w+") as f:
        # write to the file
        f.write(data.text)
    time.sleep(3)
The error is:
FileNotFoundError: [Errno 2] No such file or directory: 'HerHoops/2004/06/01/new-york-liberty-vs-charlotte-sting/'
1 Answer
From the error itself, it is clear that you are trying to write to the file "HerHoops/2004/06/01/new-york-liberty-vs-charlotte-sting/", but part of that directory tree does not exist. You can create the necessary directories with os.makedirs() before writing the file. Full code:
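The answer's "full code" did not survive the page extraction. Below is a sketch of what the corrected script would look like, assuming the same 46-character URL prefix (`https://herhoopstats.com/stats/wnba/box_score/`) and folder layout as the question; the helper name `box_score_path` is mine, not from the original answer.

```python
import os
import time

import requests
from bs4 import BeautifulSoup


def box_score_path(box_scores_url, root="HerHoops"):
    # Keep everything after "https://herhoopstats.com/stats/wnba/box_score/"
    # (46 characters), e.g. "2004/06/01/new-york-liberty-vs-charlotte-sting/".
    # Strip the trailing slash so the matchup becomes the filename rather
    # than yet another directory level.
    rel = box_scores_url[46:].rstrip("/")
    return os.path.join(root, rel)


def main():
    url = "https://herhoopstats.com/stats/wnba/schedule_date/2004/6/1/"
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    matchup_table = soup.find_all("div", {"class": "schedule"})[0]
    links = [l.get("href") for l in matchup_table.find_all("a")]
    links = [l for l in links if "/box_score/" in l]
    box_scores_urls = [f"https://herhoopstats.com{l}" for l in links]

    for box_scores_url in box_scores_urls:
        data = requests.get(box_scores_url)
        out_path = box_score_path(box_scores_url)
        # Create "HerHoops/2004/06/01/..." before opening the file;
        # exist_ok=True lets the script be re-run without raising an error.
        os.makedirs(os.path.dirname(out_path), exist_ok=True)
        with open(out_path, "w+") as f:
            f.write(data.text)
        time.sleep(3)


# main()  # uncomment to run the scrape (hits the network)
```

Note that the original error was caused by the trailing slash in the URL suffix: `open()` was handed a path ending in "/", which can never name a file. Stripping it and creating the parent directories first addresses both problems.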