How to web scrape from multiple pages using Python to Excel?

Posted by yftpprvb on 2023-08-02 in Python


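I am trying to extract the data from this table, but I can't do it for the subsequent pages.
URL: https://securities.stanford.edu/filings.html?page=1
I can only do it for page=1.
I tried using Beautiful Soup but could not get a response for page 2, page 3, and so on. I need some help converting all of the table data to Excel.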

import requests
from bs4 import BeautifulSoup

def opencodezscraping(webpage, page_number):
    # Build the URL for the requested page
    next_page = webpage + str(page_number)
    response = requests.get(next_page)
    soup = BeautifulSoup(response.content, "html.parser")
    soup_table = soup.find('table', {"class": "table table-bordered table-striped table-hover"})
    # Skip the header row and print the cell text of every data row
    for j in soup_table.find_all('tr')[1:]:
        row_data = j.find_all('td')
        row = [i.text for i in row_data]
        print(row)

    # Generate the next page URL and recurse until page 16 is reached
    if page_number < 16:
        page_number = page_number + 1
        opencodezscraping(webpage, page_number)

# Call the function with the relevant parameters
opencodezscraping('https://securities.stanford.edu/filings.html?page=', 2)


python

Source: https://stackoverflow.com/questions/76813576/how-to-web-scrape-from-multiple-pages-using-python-to-excel

1 Answer


vulvrdjw1#

There's no need to make this hard on yourself. pandas has a read_html() function that does what you want.

>>> import pandas as pd
>>> 
>>> dfs = pd.read_html('https://securities.stanford.edu/filings.html?page=1')
>>> 
>>> len(dfs)
1
>>> df = dfs[0]
>>> df
                                          Filing Name Filing Date        District Court     Exchange Ticker
0                                           AT&T Inc.  07/28/2023         D. New Jersey  New York SE      T
1                                 Syneos Health, Inc.  07/27/2023         S.D. New York       NASDAQ   SYNH
...

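You can then combine the per-page DataFrames, for example with pd.concat(), or collect the rows into a list of dicts, whichever you prefer.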
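A minimal sketch of that idea, under a few assumptions: the helper names page_urls and scrape_pages are hypothetical, the upper bound of 16 pages is taken from the question's code, and the site is assumed to answer pd.read_html for every page the same way it does for page 1 (if it rejects the default request, the HTML may have to be fetched separately with suitable headers first). Writing .xlsx files with to_excel also assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd

# URL pattern taken from the question; the page number is appended at the end.
BASE_URL = "https://securities.stanford.edu/filings.html?page="


def page_urls(first, last):
    """Return the URLs for pages first..last (inclusive)."""
    return [BASE_URL + str(n) for n in range(first, last + 1)]


def scrape_pages(urls, read=pd.read_html):
    """Read the first table on each page and join all rows into one DataFrame.

    `read` defaults to pd.read_html; it is a parameter only so that a
    different reader can be swapped in (hypothetical design choice).
    """
    frames = [read(url)[0] for url in urls]
    # ignore_index=True renumbers the rows 0..n-1 across all pages
    return pd.concat(frames, ignore_index=True)


# Example usage (performs network requests, so it is left commented out):
# df = scrape_pages(page_urls(1, 16))
# df.to_excel("filings.xlsx", index=False)
```

Building the list of frames first and calling pd.concat once is generally cheaper than growing a DataFrame page by page inside the loop.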


Answered on 2023-08-02
