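I am trying to extract data from this table, but I cannot do it for the pages after the first one.
URL: https://securities.stanford.edu/filings.html?page=1
It only works for page=1.
I tried Beautiful Soup, but I could not get a response for page 2, page 3, and so on. I need some help converting all of the table data to Excel.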
import requests
from bs4 import BeautifulSoup

def opencodezscraping(webpage, page_number):
    # Build the URL for the requested page and download it
    next_page = webpage + str(page_number)
    response = requests.get(next_page)
    soup = BeautifulSoup(response.content, "html.parser")
    soup_table = soup.find('table', {"class": "table table-bordered table-striped table-hover"})
    # Skip the header row and print the cell text of every data row
    for j in soup_table.find_all('tr')[1:]:
        row_data = j.find_all('td')
        row = [i.text for i in row_data]
        print(row)
    # Generating the next page url
    if page_number < 16:
        page_number = page_number + 1
        opencodezscraping(webpage, page_number)

# calling the function with relevant parameters
opencodezscraping('https://securities.stanford.edu/filings.html?page=', 2)
1 Answer
No need to make this hard on yourself. pandas has a read_html() function that does what you want.
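The answer's original snippet did not survive on this page, so what follows is only a minimal sketch of the read_html() approach, not the answerer's code. It assumes the filings table is the first table on each page, that pages 1 through 16 exist (the limit used in the question), and that a browser-like User-Agent header helps; the helper names `parse_first_table`, `fetch_page`, and `fetch_all` are made up for illustration.

```python
import io

import pandas as pd
import requests

# URL prefix taken from the question; the page number is appended to it.
BASE_URL = "https://securities.stanford.edu/filings.html?page="


def parse_first_table(html: str) -> pd.DataFrame:
    """Return the first <table> found in an HTML string as a DataFrame."""
    # read_html returns a list with one DataFrame per <table> element.
    tables = pd.read_html(io.StringIO(html))
    return tables[0]


def fetch_page(page_number: int) -> pd.DataFrame:
    """Download one listing page and parse its table."""
    # Assumption: some sites reject the default requests User-Agent.
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(BASE_URL + str(page_number), headers=headers, timeout=30)
    response.raise_for_status()
    return parse_first_table(response.text)


def fetch_all(first_page: int = 1, last_page: int = 16) -> list[pd.DataFrame]:
    """Fetch every page in the inclusive range, one DataFrame per page."""
    return [fetch_page(n) for n in range(first_page, last_page + 1)]
```

Calling `fetch_all()` would return a list of per-page DataFrames. Note that read_html() needs an HTML parser such as lxml, or bs4 together with html5lib, to be installed.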
Then you can .stack() the per-page DataFrames, append them to a list of dicts, whatever suits you.
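As a rough illustration of the combining step, the sketch below joins the per-page DataFrames and writes them to Excel, which is what the question asks for. It uses pandas.concat rather than .stack() to join the pages end to end; that substitution is an assumption about what the answer intended. The helper names `combine_pages` and `save_to_excel` and the output path are hypothetical.

```python
import pandas as pd


def combine_pages(frames: list[pd.DataFrame]) -> pd.DataFrame:
    """Join per-page tables vertically into one DataFrame with a fresh index."""
    return pd.concat(frames, ignore_index=True)


def save_to_excel(table: pd.DataFrame, path: str) -> None:
    """Write the combined table to an .xlsx file (needs openpyxl installed)."""
    # index=False keeps the row counter out of the spreadsheet.
    table.to_excel(path, index=False)
```

For example, `save_to_excel(combine_pages(frames), "filings.xlsx")` would produce a single sheet, where `frames` is whatever list of page tables was collected earlier.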