python — I keep getting the error TypeError: unsupported operand type(s) for +=: 'NoneType' and 'str' [closed]

jgzswidk · posted 2024-01-05 in Python

**Closed.** This question is not reproducible or was caused by a typo. It is not currently accepting answers.

This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way unlikely to help future readers.
I made a Scrapy spider that scrapes Yellow Pages for plumbers. The address comes split into two parts, so after I scrape the parts I combine them in the pipeline file. Below is the output for one item from my spider.

```python
{'locality_': 'Manassas, VA 20109',
 'logo': None,
 'name': 'F H Furr Plumbing-Heating & Air Conditioning',
 'number_of_riews': None,
 'payment_mentod': 'check',
 'phone_number': '(571) 234-6893',
 'stars_out_of_five': None,
 'street_adress': '9040 Mike Garcia Dr',
 'website': 'http://www.fhfurr.com'}
```
But my problem is that I get the error `full_address += f'{locailty_} {street_adress}'` → `TypeError: unsupported operand type(s) for +=: 'NoneType' and 'str'`. Why does this happen, given that both the street address and the locality have values?
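The traceback can be reproduced outside Scrapy with a plain dict; the item values shown above are used here only as sample data. The point is that `.get()` on a missing key returns `None`, and `None += str` raises exactly this `TypeError`:

```python
# Minimal reproduction of the error, outside Scrapy: the item has no
# 'full_address' key, so .get() returns None, and None += str fails.
item = {'locality_': 'Manassas, VA 20109', 'street_adress': '9040 Mike Garcia Dr'}
full_address = item.get('full_address')  # None, because the key is missing
try:
    full_address += f"{item['locality_']} {item['street_adress']}"
except TypeError as exc:
    error_message = str(exc)
print(error_message)  # unsupported operand type(s) for +=: 'NoneType' and 'str'
```

So the error is not about `locality_` or `street_adress` at all; it is the left-hand side of the `+=` that is `None`.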
spider.py

```python
import scrapy
from yp_scraper.items import PlubmerInfo


class PlumberScraperSpider(scrapy.Spider):
    name = 'plumber_scraper'
    allowed_domains = ['yellowpages.com']
    start_urls = ['https://www.yellowpages.com/']

    def parse(self, response):
        # below are the area name and state; these values will be plugged
        # into the start url to find the area in which we are searching for plumbers
        area = "fairfax"
        state = "va"  # must be an abbreviation
        # the value below is the number of pages you want to scrape
        numer_of_pages = 10
        page_numebr = 1
        #while numer_of_pages > 0:
        url = f'https://www.yellowpages.com/{area}-{state}/plumbers?page={page_numebr}'
        #page_numebr += 1
        #numer_of_pages -= 1
        print('tesssssssssssssssst')
        yield response.follow(url, callback=self.parse_plumbers)

    def parse_plumbers(self, response):
        print('text2')
        plumber_item = PlubmerInfo()
        print('yes_man')
        plumbers = response.css('div.result')
        for plumber in plumbers:
            starter_indidual_url = plumber.css('a.business-name ::attr(href)').get()
            indidual_url = f'https://www.yellowpages.com{starter_indidual_url}'
            yield response.follow(indidual_url, callback=self.parse_indvidual_plumbers)

    def parse_indvidual_plumbers(self, response):
        plumber_item = PlubmerInfo()
        print(response.xpath('//*[@id="default-ctas"]/a[3]/span/text()').get())
        plumber_item['name'] = response.css('h1.business-name ::text').get()
        plumber_item['phone_number'] = response.css('a.phone ::text').get()
        plumber_item['website'] = response.css('a.website-link ::attr(href)').get()
        plumber_item['genral_info'] = response.css('dd.general-info ::text').get()
        plumber_item['payment_mentod'] = response.css('dd.payment ::text').get()
        plumber_item['stars_out_of_five'] = response.css('div.rating-stars ::attr(class)').get()
        plumber_item['number_of_riews'] = response.css('span.count ::text').get()
        plumber_item['locality_'] = response.xpath('//*[@id="default-ctas"]/a[3]/span/text()').get()
        plumber_item['street_adress'] = response.css('span.address ::text').get()
        #plumber_item['services'] = response.css('div.locality ::text').get()
        plumber_item['email'] = response.css('a.email-business ::attr(href)').get()
        plumber_item['logo'] = response.css('dd.logo ::attr(href)').get()
        yield plumber_item
```

pipelines.py

```python
# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html

# useful for handling different item types with a single interface
from itemadapter import ItemAdapter


class YpScraperPipeline:
    def process_item(self, item, spider):
        adapter = ItemAdapter(item)
        # combines locality and street address
        locailty_ = adapter.get('locality_')
        street_adress = adapter.get('street_adress')
        full_address = adapter.get('full_address')
        if locailty_ is not None:
            for i in street_adress:
                full_address += f'{locailty_} {street_adress}'
        return item
```

nhhxz33t

If 'full_address' is missing from the item, this returns None, which causes the problem:

```python
full_address = adapter.get('full_address')
```
One way to fix this is to tell `.get()` to return an empty string instead of None as the default when 'full_address' is missing:

```python
full_address = adapter.get('full_address', '')
```
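Putting that suggestion into the pipeline might look like the sketch below. `combine_address` is a hypothetical standalone helper (plain dicts stand in for the Scrapy item so it runs anywhere); note that the original `for i in street_adress:` loop iterates over the *characters* of the street address and would append the same string once per character, so it is dropped here as well:

```python
def combine_address(item):
    # Sketch of the fix: default to '' instead of None, and drop the
    # character-by-character loop so the address is appended only once.
    locality = item.get('locality_')
    street = item.get('street_adress')
    full_address = item.get('full_address', '')  # '' when the key is missing
    if locality is not None and street is not None:
        full_address += f'{locality} {street}'
    item['full_address'] = full_address
    return item


item = combine_address({'locality_': 'Manassas, VA 20109',
                        'street_adress': '9040 Mike Garcia Dr'})
print(item['full_address'])  # Manassas, VA 20109 9040 Mike Garcia Dr
```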
