Scrapy: why does scrapy-selenium not return any data?

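I am trying to scrape the "Product details" (a table) and "Please select size" (JavaScript button type) sections. I am using scrapy-selenium to scrape this page. The code below scrapes everything except those two sections. I tested with plain selenium and got the result, but not with scrapy-selenium. I also tried scrapy-splash, but it cannot even render the whole page. I have checked previous questions without finding an answer. What exactly am I doing wrong?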

import time

import scrapy
from scrapy_selenium import SeleniumRequest
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager


class ProductsSpider(scrapy.Spider):
    name = 'products'
    allowed_domains = ['www.breuninger.com']

    def start_requests(self):
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')

        driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
        driver.set_window_size(1920, 1080)

        driver.get("https://www.breuninger.com/de/damen/luxus/bekleidung-jacken-maentel/")
        time.sleep(5)

        # Dismiss the banner overlay before reading the product list
        banner_btn = driver.find_element(By.XPATH, "//div[@class='banner-actions-container']/button")
        banner_btn.click()
        time.sleep(3)

        links = driver.find_elements(By.XPATH, "//suchen-produktliste[@id='produktliste']/section/div/suchen-produkt/div/a")

        # Read the hrefs while the browser is still open, then close it
        hrefs = [link.get_attribute('href') for link in links]
        driver.quit()

        for href in hrefs:
            yield SeleniumRequest(
                url=href,
                callback=self.parse,
                wait_time=1
            )

    def parse(self, response):
        yield {
            'Bold-title': response.xpath("(//span[@itemprop='name'])[1]/text()").get(),
            'Price': response.xpath("//div[@itemprop='offers']/span/text()").get(),
            'Beschreibung': response.xpath("//div[@class='bewerten-textformat--produktdetails-detail']/div/ul/li/text()").getall()
        }

You don't really need the heavy guns of selenium here to get product details such as price, description, and brand.
You can try the following:

import pandas as pd
import requests
from bs4 import BeautifulSoup
from tabulate import tabulate

url = "https://www.breuninger.com/de/damen/luxus/bekleidung-jacken-maentel/"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.61 Safari/537.36",
}

# Fetch the listing page and select every product link
product_links = BeautifulSoup(
    requests.get(url, headers=headers).text,
    "lxml",
).select(".suchen-produkt a")

# Extract brand, name, and price from each product link
products = [
    [
        link.select_one(".suchen-produkt__marke").get_text(),
        link.select_one(".suchen-produkt__name").get_text(),
        link.select_one(".suchen-produkt__preis").get_text(),
    ]
    for link in product_links
]

df = pd.DataFrame(products, columns=["Brand", "Description", "Price"])
df.to_csv("products.csv", index=False)
print(tabulate(df, headers="keys", tablefmt="grid"))

This should give you a table like this (along with a .csv file).

+----+-------------------------+--------------------------------------------------------------+------------------+
|    | Brand                   | Description                                                  | Price            |
+====+=========================+==============================================================+==================+
|  0 | BURBERRY                | Jacke BINHAM                                                 | 1.549,99 €       |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  1 | BURBERRY                | Trenchcoat KENSINGTON                                        | 1.849,99 €       |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  2 | RALPH LAUREN Collection | Blouson mit Schmucksteinen                                   | 2.050 €          |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  3 | BURBERRY                | Trenchcoat KENSINGTON                                        | 1.849,99 €       |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  4 | BURBERRY                | Trenchcoat WATERLOO                                          | 1.889,99 €       |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  5 | BURBERRY                | Trenchcoat ISLINGTON                                         | 1.849,99 €       |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  6 | BURBERRY                | Trenchcoat WATERLOO                                          | 1.889,99 €       |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  7 | MONCLER                 | Daunenweste LIANE                                            | 495 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  8 | BURBERRY                | Trenchcoat                                                   | 1.849,99 €       |
+----+-------------------------+--------------------------------------------------------------+------------------+
|  9 | MONCLER                 | Jacke im Materialmix                                         | 650 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 10 | MONCLER                 | Jacke AGDE                                                   | 695 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 11 | MONCLER                 | Jacke CECILE                                                 | 520 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 12 | MONCLER                 | Jacke TIYA                                                   | 695 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 13 | MONCLER                 | Daunenweste LIANE                                            | 495 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 14 | MONCLER                 | Daunenparka HERMANVILLE                                      | 1.250 €          |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 15 | BURBERRY                | Trenchcoat KENSINGTON                                        | 999,99 €         |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 16 | MONCLER                 | Jacke AGDE                                                   | 695 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 17 | MONCLER                 | Daunenweste ALPISTE                                          | 750 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 18 | MONCLER                 | Regenmantel HIENGU                                           | 735 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 19 | MONCLER                 | Jacke TIYA                                                   | 695 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+
| 20 | MONCLER                 | Jacke HOULGATE                                               | 780 €            |
+----+-------------------------+--------------------------------------------------------------+------------------+

and more ...
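If you want to sort or filter on the Price column, the German-formatted strings need converting to numbers first. Here is a minimal sketch, assuming every cell holds a single price in the form shown in the table above. `parse_price` is a hypothetical helper and not part of the code above.

```python
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Convert a German-formatted price such as '1.549,99 €' to a Decimal."""
    # Drop the currency sign and any surrounding (non-breaking) spaces.
    cleaned = text.replace("€", "").replace("\xa0", " ").strip()
    # German notation: '.' groups thousands, ',' marks the decimals.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return Decimal(cleaned)


print(parse_price("1.549,99 €"))  # 1549.99
print(parse_price("2.050 €"))     # 2050
```

With pandas you could apply it as `df["Price"].map(parse_price)`. Cells that contain more than one price (for example a reduced and an original price) would need extra handling.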

Also, the last XPath does not work on that page, which is why you get an empty list.
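To illustrate how an XPath can silently produce an empty list, here is a small offline sketch. It uses the standard library's `xml.etree.ElementTree` (which supports only a limited XPath subset) as a stand-in for Scrapy selectors. The sample markup and the `details` class name are invented. The real page differs, and I am only assuming that a non-existent path step is a plausible cause in your case.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed sample markup; the real product page differs.
SAMPLE = """
<html><body>
  <div class="details">
    <ul><li>Material: Baumwolle</li><li>Futter: Polyester</li></ul>
  </div>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)


def texts(path):
    """Return the text of every element matched by a (limited) XPath."""
    return [el.text for el in root.findall(path)]


# Path that mirrors the markup: both list items are found.
matching = texts(".//div[@class='details']/ul/li")

# Path that expects an extra <div> level that is not in the markup: no match.
non_matching = texts(".//div[@class='details']/div/ul/li")

print(matching)      # ['Material: Baumwolle', 'Futter: Polyester']
print(non_matching)  # []
```

No error is raised for the second path; it simply matches nothing. With Scrapy you can run the same kind of check interactively in `scrapy shell` against the HTML the spider actually receives.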
