Whenever I change the parse function, Scrapy doesn't work and throws an error?

Posted by nkcskrwz on 2022-11-09 in Other


from scrapy import Spider
from selenium import webdriver
from scrapy.selector import Selector
from scrapy.http import Request

from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options

class BookSeleniumSpider(Spider):
    name = 'book_selenium'
    allowed_domains = ['books.toscrape.com']

    def start_req(self):
        s = Service('C:\\Users\\aps\\Documents\\chromedriver.exe')
        chrome_options = webdriver.ChromeOptions()
        chrome_options.add_argument('--no-sandbox')
        self.driver = webdriver.Chrome(service=s, options=chrome_options)

        # Get the site we want to start scraping
        self.driver.get('http://books.toscrape.com')

        sel = Selector(text=self.driver.page_source)
        books = sel.xpath('//h3/a/@href').extract()

        for book in books:
            url = 'http://books.toscrape.com/' + book
            print(url)

    def parse_book(self, response):
        pass

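When I rename the parse function to start_req, the spider stops working. When I change the name back to parse, it works fine. I don't know why. Can someone explain this to me?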

scrapy

Source: https://stackoverflow.com/questions/72622026/whenever-i-change-parse-function-scrapy-doesnt-work-and-throws-error

1 Answer


ncecgwcz #1

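When Scrapy creates a request, the request carries a callback function that handles the response. Unless you explicitly specify that callback, Scrapy falls back to the spider's parse method. So when you rename the method, an error is thrown because your spider no longer defines that default callback (the base Spider.parse only raises NotImplementedError).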
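
The fallback can be pictured with a small toy model. This is only a sketch: ToyRequest, ToySpider, and dispatch are hypothetical stand-ins written for illustration, not Scrapy's real classes or internals.

```python
# Toy model of default-callback dispatch. Every name here is a hypothetical
# stand-in for illustration; none of this is Scrapy's actual code.

class ToyRequest:
    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback  # None means "use the spider's default"


class ToySpider:
    def parse(self, response):
        # The base class has no useful default, so it fails loudly.
        raise NotImplementedError(
            f"{type(self).__name__}.parse callback is not defined"
        )


def dispatch(spider, request, response):
    """Call the request's callback, falling back to spider.parse."""
    callback = request.callback or spider.parse
    return callback(response)


class RenamedSpider(ToySpider):
    # parse was renamed, so only the failing base-class parse remains.
    def start_req(self, response):
        return f"handled {response}"


class AliasedSpider(RenamedSpider):
    # Pointing parse at the renamed method restores the default callback.
    parse = RenamedSpider.start_req
```

Dispatching a callback-less ToyRequest to RenamedSpider raises NotImplementedError, while the same request sent to AliasedSpider reaches start_req.
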
If you do want a different name for some reason, a low-effort option is to assign your function to the parse attribute in the class body. For example:

class BookSeleniumSpider(Spider):
    name = 'book_selenium'
    allowed_domains = ['books.toscrape.com']

    def start_req(self):
        s = Service('C:\\Users\\aps\\Documents\\chromedriver.exe')
        chrome_options = webdriver.ChromeOptions()
        chrome_options.add_argument('--no-sandbox')
        self.driver = webdriver.Chrome(service=s, options=chrome_options)

        # Get the site we want to start scraping
        self.driver.get('http://books.toscrape.com')

        sel = Selector(text=self.driver.page_source)
        books = sel.xpath('//h3/a/@href').extract()
        for book in books:
            url = 'http://books.toscrape.com/' + book
            print(url)

    def parse_book(self, response):
        pass
        ...

    ...
    ...
    parse = parse_book  # alias: Scrapy's default callback name now points at parse_book
