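I ran into an unexpected problem while trying to fetch data from an API request. It returns a "500" error along with the error message below. I am trying to scrape this URL, "https://www.machinerytrader.com/listings/for-sale/excavators/1031", but I don't know what I am actually missing.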
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 2)
Here is what I have tried so far:
import scrapy
import json


class ListingSpider(scrapy.Spider):
    name = 'listing'
    allowed_domains = ['www.machinerytrader.com']
    # start_urls = ['https://www.machinerytrader.com/listings/for-sale/excavators/1031']

    def start_requests(self):
        payload = {
            "Category": "1031",
            "sort": "1",
            "page": "2"
        }
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
            "authority": "www.machinerytrader.com",
            "method": "GET",
            "path": "/ajax/listings/ajaxsearch?Category=1031&sort=1&page=2",
            "scheme": "https",
            "accept": "application/json, text/plain, */*",
            "accept-encoding": "gzip, deflate, br",
            "cache-control": "no-cache",
            "content-type": "application/json",
            "cookie": "ASP.NET_SessionId=uircx3p1up0gs3we43zfy3xp; Tracking=SessionStarted=1&UserReferrer=&GUID=75541984850425423381; __RequestVerificationToken=CqqhcuoUxcCh_VvGb2QkZPTMG1sygAxcDjWmGutWxYGvIScO7I1rCwBZabShMlyTl9syCA2; BIGipServerwww.machinery_tradesites_http_pool=545368256.20480.0000; ln_or=eyI0MjU0ODkyIjoiZCJ9; AMP_TOKEN=%24NOT_FOUND; _gid=GA1.2.104780578.1678372555; _fbp=fb.1.1678372557791.1218325782; _hjFirstSeen=1; _hjIncludedInSessionSample_1143836=1; _hjSession_1143836=eyJpZCI6IjU1ZGYyOGJmLWQ4YjktNGU2Mi04NjU2LWUwYmJkYzdiNGMxMSIsImNyZWF0ZWQiOjE2NzgzNzI1NTgzNDcsImluU2FtcGxlIjp0cnVlfQ==; _hjIncludedInPageviewSample=1; _hjAbsoluteSessionInProgress=1; __gads=ID=2a38a4e969861bb1:T=1678372561:S=ALNI_MarB5bgIDdhzpQPECKDmX-70INeJg; __gpi=UID=00000bd5f25a7b4b:T=1678372561:RT=1678372561:S=ALNI_MapbcTx6haLt65wewjezZyeMFVtCw; _hjSessionUser_1143836=eyJpZCI6IjI3ODM2ZDdhLTc0YzUtNTIwMi05YjdhLWYxMmM5YTk4ZGNmNiIsImNyZWF0ZWQiOjE2NzgzNzI1NTgzMzksImV4aXN0aW5nIjp0cnVlfQ==; __atuvc=2%7C10; __atuvs=6409eecd9390ad6d001; Top_PopUp=true; reese84=3:MhdsyFtuLMcDbPfjHYfnUQ==:Xnyj2+4WPTbNOTnv4Aj99+6mLrSjYnrQVoSGqCJEwqmN/gdPfQuCPFYN1/1sInEQHaUvLNdN2VbgdxeC96k6tr1MUSbHd2GxI4AKb1CxnkZfLm63/CXWNqJ/vlS66hOTSsEn+gxPb2l3g2TD3RGi0H4PjyhskjDIE10USkPi3mm83aG/xkAYL4khuWtRDaYzyHjzQ76f9yRr0tNnEEbUPbxZTW7BPXcEF606e6mzq6v5/YEy17JScccw/CCkXb4Uv1tzeNYhkMuFj5V5upY0a2tC/MiJeCACNCYnX9obZhGsfPbL6VUYdJDEhmyR8OBJsHuH4BwOdjnbr7pFG+o4AZKqDHliWKhUnDxGAHIhKwzhhq5TFjeJbqRwSLrMXH54WxZZHcuvRtwr734U2F3Pmf8NqW+zavYdB/aYrk+HpA9LfQQQFBGd/1FNRAM0e8fxZpj5U/DxTKPMdvwK5qBnfzQaTzycDwe80G7QRYX9kf4=:gQlue37nFKz2zVkiSWGW9vURldmXHHEIxHz2yiUrtF8=; _uetsid=b29154b0be8711edabac07f5e20bba65; _uetvid=b2917380be8711edbb2587449262974a; _ga=GA1.2.1777003444.1678372555; UserID=ID=n2K08gLg7XRct%2fxxJeWEnGDwbYpUh6vQ%2fiE1eBN%2f25lkMV4lKXFpeoTT54DrUsz9CriJRnchYL4PfEPzqxaCRA%3d%3d&LV=sHEWMHROf%2fDobQFZDWX1nWtv%2bf2Uk9i6YA9N5Sk0lGE%2fWiDudiekp7MPDIUnH0jGKMx9VZbhLzD4VuT7pKbqepCdPLaN274I; UserSettingsCookie=screenSize=1246|947; _ga_27QWK2FVDW=GS1.1.1678372557.1.1.1678373273.60.0.0",
            "pragma": "no-cache",
            "referer": "https://www.machinerytrader.com/listings/for-sale/excavators/1031",
            "sec-ch-ua-mobile": "?0",
            "sec-ch-ua-platform": "Windows",
            "sec-fetch-dest": "empty",
            "sec-fetch-mode": "cors",
            "sec-fetch-site": "same-origin",
            "x-xsrf-token": "lKwor8adm67mnDJjariTC1-_x2sWvmjxDtVZerZ6p03OwqvVc10YVZUQMmD4-pTv7E2cTSN-8rsTW6ISckmZVgBek66eHw3iFUngI3jYt6h_rwqQ3pI_QxPjYH1us7eHyW27lxFL_-wSS3QC0",
            "sec-ch-ua": '"Google Chrome";v="111", "Not(A:Brand";v="8", "Chromium";v="111"',
        }
        yield scrapy.Request(
            url="https://www.machinerytrader.com/ajax/listings/ajaxsearch?Category=1031&sort=1&page=2",
            method="GET",
            headers=headers,
            body=json.dumps(payload),
            callback=self.parse
        )

    def parse(self, response):
        json_resp = json.loads(response.body)
        products = json_resp['Listings']
        yield {
            'DealerLocation': products['DealerLocation'],
        }
1 Answer
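You need an xsrf-token to make these requests. In this case, you can first send a request to the homepage (https://www.machinerytrader.com/) and extract the token with the selector //input[@name="__XSRF-TOKEN"]/@value. Add this value to the headers of the next request and the request will work. If you want to run or schedule this spider (or several spiders), you could consider estela, a spider management solution.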