Scrapy: XHR request preview shows data that isn't present in the response

Posted by 6tqwzwtp on 2022-11-09 in Other


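I'm trying to scrape some data from a public website with Scrapy. Thankfully, most of the data shows up in the preview of an XHR request in the browser's developer tools (screenshot not included).

But when I double-click the request to view the actual response, the search_results item contains no data (screenshot not included).

What is going on with this request, and how can I get at the data from Scrapy? This is my current attempt, but it clearly doesn't pick up any data from the response: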

import scrapy
from scrapy import Spider

class Whizzky(Spider):
    name = "whizzky"
    def __init__(self,):
        self.request_url = "https://www.whizzky.net/webapi/get_finder_results.php?cid=31&flavours=&view=rated&price=3&country=&regions="

    def start_requests(self):
        urls = ["https://www.whizzky.net/finder_results.php"]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        yield scrapy.Request(
            self.request_url,
            method='POST',
            callback=self.parse_2)

    def parse_2(self, response):
        info = {}
        info["data"] = response.json()["search_results"]
        yield info


Source: https://stackoverflow.com/questions/73681582/xhr-request-preview-shows-data-that-isnt-present-in-response

1 Answer


i2loujxw1#

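The response itself works fine, and the structure of your code is fine too. The data comes from an API endpoint that returns JSON in reply to a POST request. Your request sends the POST without a body, which is presumably why search_results comes back empty. To pull the data correctly, the request has to carry the form payload in its body parameter, together with a matching content-type header.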
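As a small illustration of what "payload as body" means here, the sketch below builds a form-encoded body with the standard library only. The two field names and values are taken from the body string in the spider that follows; the names `FORM_FIELDS`, `FORM_HEADERS` and `build_form_body` are made up for this example and are not part of Scrapy.

```python
from urllib.parse import urlencode

# Form fields copied from the answer's body string ('maxResults=30&pager=30').
FORM_FIELDS = {"maxResults": "30", "pager": "30"}

# Header telling the server that the body is form-encoded.
FORM_HEADERS = {"content-type": "application/x-www-form-urlencoded"}


def build_form_body(fields):
    """Encode a dict of form fields as an application/x-www-form-urlencoded string.

    Hypothetical helper for illustration; it only wraps urllib.parse.urlencode.
    """
    return urlencode(fields)


body = build_form_body(FORM_FIELDS)
print(body)  # maxResults=30&pager=30
```

The resulting string can be passed as `body=` to `scrapy.Request` along with `FORM_HEADERS`. Alternatively, `scrapy.FormRequest` accepts a `formdata` dict and does this encoding itself, which avoids writing the body string by hand.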

Complete working example:

import scrapy


class TestSpider(scrapy.Spider):
    name = 'test'
    # Form-encoded payload sent as the POST body.
    body = 'maxResults=30&pager=30'

    def start_requests(self):
        api_url = 'https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions='
        yield scrapy.Request(
            url=api_url,
            callback=self.parse,
            body=self.body,
            method="POST",
            headers={
                # Tells the server that the body is form-encoded.
                "content-type": "application/x-www-form-urlencoded",
                "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36",
            },
        )

    def parse(self, response):
        # Each entry in search_results describes one product.
        for card in response.json()['search_results']:
            yield {'Title': card['product_title']}

Output:

{'Title': 'Midleton Very Rare 2002'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Michel Couvreur Special Vatting Peaty '}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Laphroaig 25 Year Old'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'The Macallan Rare Cask Batch No.1 2018 Release'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'William Larue Weller 2017 Release 128.2 Proof'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Glen Moray Rare Vintage 1987 25 Year Old Port Cask Finish Batch 2'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'High West A Midwinter Nights Dram Limited Engagement Act 7 Scene 4'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Laphroaig Extremely Rare 30 Year Old'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Ardbeg Supernova 2019 Committee Release'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Caol Ila Cask Strength Distillery Exclusive 2017 Release'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'The Macallan 1824 Collection Estate Reserve Travel Retail Exclusive'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Ardbeg Supernova 2014 Committee Release'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'The Macallan Edition No.1'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': "Midleton Dair Ghaelach Grinsell's Wood"}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'George T. Stagg Bourbon 2019 Release 116.9 Proof'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Glenmorangie 25 Year Old'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'The Loch Fyne Craigellachie 10 Year Old'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'The Macallan 1997 18 Year Old Sherry Oak Cask Matured'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Lot No.40 Cask Strength Rye 1st Edition 12 Year Old'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Breaker Bourbon Port Barrel Finish Special Edition'}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': "Parker's Heritage Collection Single Barrel Bourbon 11 Year Old"}
2022-09-12 00:25:21 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.whizzky.net/webapi/get_finder_results.php?cid=&flavours=&view=rated&price=3&country=&regions=>
{'Title': 'Tomintoul 27 Year Old'}
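
Since the original symptom was an empty search_results, it may be worth making the parsing step tolerant of that case. The following is a minimal sketch that runs without Scrapy: `extract_titles` is a hypothetical helper, and the sample payload only mimics the two keys used above (`search_results`, `product_title`); the real response may contain other fields.

```python
import json


def extract_titles(payload):
    """Return one {'Title': ...} dict per product, or an empty list if there are none.

    Hypothetical helper: tolerates a missing, null or empty 'search_results'.
    """
    results = payload.get("search_results") or []
    return [
        # Strip stray whitespace; one title in the output above ends with a space.
        {"Title": card["product_title"].strip()}
        for card in results
        if "product_title" in card
    ]


# Assumed payload shape, reduced to the keys the spider reads.
sample = json.loads('{"search_results": [{"product_title": "Laphroaig 25 Year Old"}]}')
print(extract_titles(sample))                  # [{'Title': 'Laphroaig 25 Year Old'}]
print(extract_titles({"search_results": []}))  # []
```

Inside the spider's `parse` method this could be used as `yield from extract_titles(response.json())`, so that a request sent without the form body yields nothing instead of appearing to succeed.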

