Can't reach the next URL (Scrapy)

Posted by yv5phkfx on 2023-08-05 in Other


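I am trying to write code in Python (3.11.4) using Scrapy (2.9.0) in Visual Studio Code (1.80.1). I know this is not the proper way to structure it. I have another project, and this is the only part of it that I couldn't get working, so I pulled it out into a standalone spider to experiment with it more easily.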

import scrapy
import w3lib.html

class KmjklolSpider(scrapy.Spider):
    name = "kmjklol"
    allowed_domains = ["ensonhaber.com"]
    start_urls = ["https://www.ensonhaber.com/gundem/mehmet-ozhaseki-acikladi-haftaya-istanbulda-calismalara-basliyoruz"]

    def parse(self, response):
        sourcecode = response.xpath("//article")
        for myinfo in sourcecode:
            title = w3lib.html.remove_tags(myinfo.xpath(".//div[@class='article-title']/h1").get().strip())
            desc = w3lib.html.remove_tags(myinfo.xpath(".//div[@class='article-title']/h2[@class='desc']/text()").get().strip())
            text = myinfo.xpath(".//div[@class='article-body']/p[@class='text']/text() | //div[@class='article-body']/p[@class='text']//*/text()").getall()
            yield {"Title: ": title, "Description: ": desc, "Content: ": text}

        nextpage = response.xpath("//div[@class='shotnews mb-30']/a/@href").extract()
        print(nextpage)

        completed_nextpage = response.urljoin(nextpage)

        yield scrapy.Request(completed_nextpage)

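I expect my code to reach the news site and get the title, description, and content of the article, and it does. The part where I'm stuck is that I can't reach the next URL, which is the suggested article shown on the right side of the page. What am I doing wrong?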
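One possible cause, offered as a sketch rather than a confirmed diagnosis: `.extract()` returns a list of every matching `href`, while `response.urljoin()` expects a single string, so the join would likely fail before the `Request` is ever yielded. The standard-library example below reproduces that mismatch with `urllib.parse.urljoin`, which Scrapy's `response.urljoin` wraps. The `hrefs` values are made-up placeholders, and `join_all` and `list_join_fails` are hypothetical helpers, not part of Scrapy.

```python
from urllib.parse import urljoin

# The spider's start URL, used here as the base for relative links.
BASE_URL = (
    "https://www.ensonhaber.com/gundem/"
    "mehmet-ozhaseki-acikladi-haftaya-istanbulda-calismalara-basliyoruz"
)

# Placeholder hrefs (assumption): stand-ins for what .extract() might return.
hrefs = ["/gundem/example-article-1", "/gundem/example-article-2"]


def join_all(base, links):
    """Hypothetical helper: join each href with the base, one string at a time."""
    return [urljoin(base, link) for link in links]


def list_join_fails(base, links):
    """Return True if passing the whole list to urljoin raises TypeError."""
    try:
        urljoin(base, links)
    except TypeError:
        return True
    return False


# Joining per element works; joining the list itself does not.
next_urls = join_all(BASE_URL, hrefs)
```

In the spider, the equivalent change would presumably be to loop over `response.xpath(...).getall()` and yield `response.follow(href, callback=self.parse)` for each `href`, or to use `.get()` if only the first suggested article is wanted.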

scrapy

Source: https://stackoverflow.com/questions/76722260/cant-reach-the-next-url-scrapy