Using Selenium to retrieve data from a webpage - not retrieving all data

Posted by bpsygsoo on 2023-05-29 in Other


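I am trying to retrieve data (coin name, price, market cap, and circulating supply) from coinmarketcap.com, but when I run the code below I only get 11 coin names. I also cannot retrieve the other fields. I have tried several alternatives without success. My goal is to store the data in a DataFrame so that I can analyze it.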

from selenium import webdriver

driver = webdriver.Chrome(r'C:\Users\Ejer\PycharmProjects\pythonProject\chromedriver')
driver.get('https://coinmarketcap.com/')
Crypto = driver.find_elements_by_xpath("//div[contains(concat(' ', normalize-space(@class), ' '), 'sc-16r8icm-0 sc-1teo54s-1 lgwUsc')]")
#price = driver.find_elements_by_xpath('//td[@class="cmc-link"]')
#coincap = driver.find_elements_by_xpath('//td[@class="DAY"]')
CMC_list = []
for c in range(len(Crypto)):
    CMC_list.append(Crypto[c].text)
print(CMC_list)
#driver.get('https://coinmarketcap.com/')
#print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[contains(@class, 'cmc-table')]//tbody//tr//td/a//p[@color='text']")))[:50]])
driver.close()

selenium

Source: https://stackoverflow.com/questions/65279203/using-selenium-to-retrieve-data-from-webpage-not-retrieving-all-data

2 Answers


xxslljrj1#

Try the following line of code to get all the values on the page:

cryptos = [name.text for name in driver.find_elements_by_xpath('//td[3]/a[@class="cmc-link" and starts-with(@href, "/currencies/")]//p[@color="text"]')]
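If the XPath above still returns only the first few names, one possible cause (an assumption, not something confirmed in this thread) is that the page fills in rows only as they scroll into view. The sketch below scrolls one viewport at a time and collects names until two consecutive rounds find nothing new. The helper name `collect_names` and its `max_scrolls` and `pause` parameters are hypothetical; the locator is passed as the plain string "xpath", which is the value of `By.XPATH` in Selenium 4.

```python
import time

# XPath taken from the answer above; it may need updating if the page markup changes.
NAME_XPATH = ('//td[3]/a[@class="cmc-link" and starts-with(@href, "/currencies/")]'
              '//p[@color="text"]')


def collect_names(driver, xpath=NAME_XPATH, max_scrolls=30, pause=0.5):
    """Scroll down step by step and collect unique element texts.

    Hypothetical helper: assumes rows are rendered lazily while scrolling.
    Stops after two consecutive rounds without new names, or after max_scrolls.
    """
    names = []
    idle_rounds = 0
    for _ in range(max_scrolls):
        found_new = False
        # "xpath" is the string value of selenium's By.XPATH
        for element in driver.find_elements("xpath", xpath):
            text = element.text.strip()
            if text and text not in names:
                names.append(text)
                found_new = True
        idle_rounds = 0 if found_new else idle_rounds + 1
        if idle_rounds >= 2:
            break
        # Move down by one viewport height so further rows can render
        driver.execute_script("window.scrollBy(0, window.innerHeight);")
        time.sleep(pause)
    return names
```

With a live browser this would be called as `collect_names(driver)` right after `driver.get('https://coinmarketcap.com/')`. Keeping the driver as a parameter means the same function can be exercised with a stub object when no browser is available.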

2023-05-29

tsm1rwdh2#

Try using BeautifulSoup to scrape the CoinMarketCap data:

import requests
from bs4 import BeautifulSoup

data_list = []      # one list of row dicts per page
crypto_count = 0
for page in range(1, 100):
    url = f'https://coinmarketcap.com/?page={page}'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    # The class string reflects the page markup when this was written and may change.
    table = soup.find('table', {'class': 'sc-beb003d5-3 ieTeVa cmc-table'})
    if table is None:
        break  # table not found, so stop instead of raising AttributeError
    rows = table.find('tbody').find_all('tr')
    crypto_list = []
    for row in rows:
        dic = {}
        cells = row.find_all('td')
        # Only rows with all ten cells carry the full set of fields
        if len(cells) >= 10:
            dic['Name'] = cells[2].text.strip()
            dic['Price'] = cells[3].text.strip().replace(',', '')
            dic['OneH'] = cells[4].text.strip()
            dic['TwentyfourH'] = cells[5].text.strip()
            dic['SevenD'] = cells[6].text.strip()
            dic['MarketCap'] = cells[7].text.strip().replace(',', '')
            dic['Volume'] = cells[8].text.strip().replace(',', '')
            dic['CirculatingSupply'] = cells[9].text.strip().replace(',', '')
            crypto_list.append(dic)
            crypto_count += 1
            if crypto_count == 1000:
                break
    data_list.append(crypto_list)
    # The break above only leaves the row loop, so stop paging here as well
    if crypto_count >= 1000:
        break
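Since the question asks for a DataFrame, the per-page lists in `data_list` still need to be flattened and the price text turned into numbers. The following is a minimal sketch of that step, assuming the cells look like "$1,234.56"; the helper names `to_number`, `flatten_pages`, and `to_dataframe` are hypothetical, and pandas is imported inside `to_dataframe` so the other two helpers work without it.

```python
def to_number(text):
    """Convert text such as '$1,234.56' or '0.35%' to a float.

    Returns None when the cleaned text is not a plain number.
    """
    cleaned = text.replace("$", "").replace(",", "").replace("%", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def flatten_pages(pages):
    """Turn a list of per-page row lists into one flat list of row dicts."""
    return [row for page in pages for row in page]


def to_dataframe(pages, numeric_columns=("Price",)):
    """Build a DataFrame from the scraped pages (requires pandas)."""
    import pandas as pd

    df = pd.DataFrame(flatten_pages(pages))
    for column in numeric_columns:
        if column in df.columns:
            # Values that cannot be parsed become missing values
            df[column] = df[column].map(to_number)
    return df
```

Calling `to_dataframe(data_list)` after the loop above would give one row per coin. Only `Price` is converted by default because the other cells may contain abbreviated or combined text that needs its own cleaning rule.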


2023-05-29
