Scrapy TypeError: 'Request' object is not subscriptable

zqry0prt  posted on 2022-11-09  in  Other

I get "TypeError: 'Request' object is not subscriptable" when I try to access data passed back from a secondary web request:

import scrapy

class MyItem(scrapy.Item):
   main_url = scrapy.Field()
   addr_name = scrapy.Field()
   addr = scrapy.Field()
   addr_city = scrapy.Field()

class ServiceCanadaSpider(scrapy.Spider):
    name = 'servicecan'
    start_urls = ['http://www.servicecanada.gc.ca/tbsc-fsco/sc-lst.jsp?prov=AB&lang=eng']

    def parse(self, response):
        with open('test', 'w') as f:
            for title in response.xpath('//li/ul/li/a'):
                f.write(title.xpath('text()').extract_first())
                #get url for info page
                url='http://www.servicecanada.gc.ca' + title.xpath('@href').extract_first()
                #parse info page
                item = MyItem()
                request = scrapy.Request(url, callback=self.parse_info_page)
                request.meta['item'] = item

                f.write(',' + url)
                yield request
                f.write(',' + request['addr_name'])
                #f.write(',' + request.addr)
                #f.write(',' + request.addr_city)
                f.write('\n')

    def parse_info_page(self, response):
        item = response.meta['item']
        item['main_url'] = response.url
        if len(response.xpath('//td[@id="offInfo"]/text()')) == 3:
            item['addr_name'] = ''
            item['addr'] = response.xpath('//td[@id="offInfo"]/text()').extract()[0].replace('\n','')
            item['addr_city'] = response.xpath('//td[@id="offInfo"]/text()').extract()[1].replace('\n','')
        else:
            item['addr_name']=response.xpath('//td[@id="offInfo"]/text()').extract()[0].replace('\n','')
            item['addr'] = response.xpath('//td[@id="offInfo"]/text()').extract()[1].replace('\n','')
            item['addr_city'] = response.xpath('//td[@id="offInfo"]/text()').extract()[2].replace('\n','')
        return [item]

When I yield the request, I can see the data in its MyItem object:

{'addr': ' 802 Bow Valley Trail',
 'addr_city': ' Canmore, Alberta',
 'addr_name': ' Canmore Gateway Shops - Building C, Suite 113',
 'main_url': 'http://www.servicecanada.gc.ca/tbsc-fsco/sc-dsp.jsp?rc=4865&lang=eng'}
1wnzp6jl  #1

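The Request class indeed does not support subscripting, that is, it does not support the [] operator. If you want to access the fields of an object attached to a Request instance through its meta attribute, you have to do so explicitly: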

request = scrapy.Request(url, callback=self.parse_info_page)
request.meta['item'] = item

# Item fields are read by key; attribute access on a scrapy.Item raises AttributeError
f.write(',' + request.meta['item']['addr_name'])
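
To make the difference concrete, here is a minimal sketch that runs without Scrapy. `FakeRequest` and this simplified `parse_info_page` are hypothetical stand-ins written for illustration, not Scrapy classes; they only assume that a request carries a `meta` dict and defines no `__getitem__`. The sketch also shows a point worth keeping in mind: the item stored in `meta` holds no fields until the callback has actually run. In Scrapy the callback runs only after the page has been downloaded, so a write placed right after `yield request` in `parse` would most likely still find the item empty.

```python
# Hypothetical stand-ins that mimic the relevant behaviour; not part of Scrapy.

class FakeRequest:
    """Has a `meta` dict but defines no __getitem__, so it is not subscriptable."""

    def __init__(self, url, callback):
        self.url = url
        self.callback = callback
        self.meta = {}


def parse_info_page(request):
    """Plays the role of the callback: fills in the item stored in meta.

    Simplification: a real Scrapy callback receives a response, not a request.
    """
    item = request.meta['item']
    item['addr_name'] = 'Canmore Gateway Shops'
    return item


request = FakeRequest('http://example.com/info', callback=parse_info_page)
request.meta['item'] = {}  # a plain dict stands in for MyItem here

# Subscripting the request object itself fails, as in the question.
try:
    request['addr_name']
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Before the callback has run, the item has no fields yet.
populated_before = 'addr_name' in request.meta['item']

# After the callback has run, the field is reachable through meta, by key.
request.callback(request)
addr_name = request.meta['item']['addr_name']

print(error_message)
print(populated_before, addr_name)
```

Under these assumptions, the place where the fields are reliably available is inside the callback itself, so writing the output line there (or returning the item and letting Scrapy export it) avoids both the TypeError and the empty item.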
