Scrapy: getting an image's size without downloading it

rsl1atfo, posted on 2022-11-09 in Other

I want to get an image's size without downloading it. Is that possible?
Image 1 URL: https://koctas-img.mncdn.com/mnresize/600/600/productimages/1000599303/1000599303_1_MC/8843182866482_1663925809606.jpg

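def parse_product(self, response):
    images = response.css(".swiper-slide::attr(data-large)").getall()
    image1 = images[0]
    # Assumes `from scrapy import Request`. A yielded Request returns no value here;
    # Scrapy passes the response to the callback instead.
    yield Request(image1, method="HEAD", callback=self.callback)

def callback(self, response):
    # Content-Length is the size in bytes (a bytes value, or None if the server omits it)
    yield {"image_size": response.headers.get("Content-Length")}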

k5ifujac1#

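You can use the HEAD method. A HEAD request returns only the response headers, and the Content-Length header gives the file size in bytes without transferring the image body.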

import scrapy

class ExampleSpider(scrapy.Spider):
    name = 'example_spider'

    def start_requests(self):
        images_urls = [
            'http://wallpapercave.com/wp/wp1809904.jpg',
            'https://i2.wp.com/www.otakutale.com/wp-content/uploads/2015/10/One-Punch-Man-Anime-Magazine-Visual-01.jpg',
            'https://thedeadtoons.com/wp-content/uploads/2020/06/One-Punch-Man-Season-3.jpg'
        ]
        for url in images_urls:
            yield scrapy.Request(url=url, method='HEAD')

    def parse(self, response, **kwargs):
        yield {
            'Content-Length': response.headers['Content-Length']
        }

Output:

[scrapy.core.scraper] DEBUG: Scraped from <200 https://thedeadtoons.com/wp-content/uploads/2020/06/One-Punch-Man-Season-3.jpg>
{'Content-Length': b'179681'}
[scrapy.core.scraper] DEBUG: Scraped from <200 https://i2.wp.com/www.otakutale.com/wp-content/uploads/2015/10/One-Punch-Man-Anime-Magazine-Visual-01.jpg>
{'Content-Length': b'1847153'}
[scrapy.core.scraper] DEBUG: Scraped from <200 https://wallpapercave.com/wp/wp1809904.jpg>
{'Content-Length': b'246144'}
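
As the output shows, Scrapy returns header values as bytes (for example b'179681'), and a server is not obliged to send Content-Length in reply to a HEAD request. A minimal sketch of turning the raw value into an integer while tolerating a missing or malformed header might look like the following. The helper name content_length_to_int is hypothetical and not part of Scrapy; it assumes the value was read with response.headers.get('Content-Length'), which gives None when the header is absent.

```python
from typing import Optional, Union


def content_length_to_int(raw: Union[bytes, str, None]) -> Optional[int]:
    """Convert a raw Content-Length header value to an int.

    Hypothetical helper (not a Scrapy API). Returns None when the header
    is missing or is not a plain non-negative decimal number.
    """
    if raw is None:
        return None
    if isinstance(raw, bytes):
        # Header values arrive as bytes; a valid Content-Length is ASCII digits.
        raw = raw.decode("ascii", errors="replace")
    raw = raw.strip()
    if not (raw.isascii() and raw.isdigit()):
        return None
    return int(raw)


if __name__ == "__main__":
    print(content_length_to_int(b"179681"))
    print(content_length_to_int(None))
```

Inside the spider's parse method, this could be used as content_length_to_int(response.headers.get('Content-Length')), so a missing header yields None rather than raising a KeyError.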
