I am trying to get the phone number, but my XPath does not return anything. How can I solve this? This is the page link: https://aaos22.mapyourshow.com/8_0/exhibitor/exhibitor-details.cfm?exhid=999999999999
import scrapy
from scrapy.http import Request
from bs4 import BeautifulSoup
from selenium import webdriver
import time
from scrapy_selenium import SeleniumRequest
import requests
import json
import pandas as pd


class TestSpider(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        yield SeleniumRequest(
            url="https://aaos22.mapyourshow.com/8_0/explore/exhibitor-gallery.cfm?featured=false",
            wait_time=3,
            screenshot=True,
            callback=self.parse,
            dont_filter=True
        )

    def parse(self, response):
        # Collect the exhibitor detail links from the gallery cards
        books = response.xpath("//h3[@class='card-Title\nbreak-word\nf3\nmb1\nmt0']//a//@href").extract()
        for book in books:
            url = response.urljoin(book)
            yield Request(url, callback=self.parse_book)

    def parse_book(self, response):
        # Second <li> with this class is expected to hold the phone number
        phone = response.xpath("//li[@class='dib ml3 mr3'][2]").get()
        print(phone)
2 answers
yruzcnhs1#
If you want to get rid of the index, this is how you can do it:
nxagd54h2#
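Assuming you are getting the specific HTML, you can adjust your XPath: select the <ul> by its class and then its last <li>. Since the number is not wrapped in the <span>, you have to select it as a sibling of the <span>: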