selenium 使用Selify WebDriver仅获取页面源代码的一部分

gc0ot86w 于 2022-11-10 发布在其他

关注(0)|答案(2)|浏览(128)

我试图获得公告牌100强排行榜的超文本标记语言，但我只得到了大约一半的页面。
我尝试使用以下代码获取页面源代码：

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
s=Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=s)
url = "https://www.billboard.com/charts/hot-100/"
driver.get(url)
driver.implicitly_wait(10)
print(driver.page_source)

但它总是只返回排行榜上第53首歌曲的页面源代码(我尝试增加隐式等待，但没有任何变化)

selenium

来源：https://stackoverflow.com/questions/74348691/getting-only-a-part-of-the-page-source-using-selenium-webdriver

2条答案

按热度按时间

bzzcjhmw1#

不知道为什么你会得到53个元素。使用显式等待，我得到了所有100个元素。
用户webdriverWait()作为显式等待。

driver.get('https://www.billboard.com/charts/hot-100/')
elements=WebDriverWait(driver,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "ul.lrv-a-unstyle-list h3#title-of-a-story")))
print(len(elements))
print([item.text for item in elements])

需要导入下面的库。

from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By

输出：

100
['Anti-Hero', 'Lavender Haze', 'Maroon', 'Snow On The Beach', 'Midnight Rain', 'Bejeweled', 'Question...?', "You're On Your Own, Kid", 'Karma', 'Vigilante Shit', 'Unholy', 'Bad Habit', 'Mastermind', 'Labyrinth', 'Sweet Nothing', 'As It Was', 'I Like You (A Happier Song)', "I Ain't Worried", 'You Proof', "Would've, Could've, Should've", 'Bigger Than The Whole Sky', 'Super Freaky Girl', 'Sunroof', "I'm Good (Blue)", 'Under The Influence', 'The Great War', 'Vegas', 'Something In The Orange', 'Wasted On You', 'Jimmy Cooks', 'Wait For U', 'Paris', 'High Infidelity', 'Tomorrow 2', 'Titi Me Pregunto', 'About Damn Time', 'The Kind Of Love We Make', 'Late Night Talking', 'Cuff It', 'She Had Me At Heads Carolina', 'Glitch', 'Me Porto Bonito', 'Die For You', 'California Breeze', 'Dear Reader', 'Forever', 'Hold Me Closer', 'Just Wanna Rock', '5 Foot 9', 'Unstoppable', 'Thank God', 'Fall In Love', 'Rock And A Hard Place', 'Golden Hour', 'Heyy', 'Half Of Me', 'Until I Found You', 'Victoria’s Secret', 'Son Of A Sinner', "Star Walkin' (League Of Legends Worlds Anthem)", 'Romantic Homicide', 'What My World Spins Around', 'Real Spill', "Don't Come Lookin'", 'Monotonia', 'Stand On It', 'Never Hating', 'Wishful Drinking', 'No Se Va', 'Free Mind', 'Not Finished', 'Music For A Sushi Restaurant', 'Pop Out', 'Staying Alive', 'Poland', 'Whiskey On You', 'Billie Eilish.', 'Betty (Get Money)', '2 Be Loved (Am I Ready)', 'Wait In The Truck', 'All Mine', 'La Bachata', 'Last Last', 'Glimpse Of Us', 'Freestyle', 'Calm Down', 'Gatubela', 'Evergreen', 'Pick Me Up', 'Gotta Move On', 'Perfect Timing', 'Country On', 'Snap', 'She Likes It', 'Made You Look', 'Dark Red', 'From Now On', 'Forget Me', 'Miss You', 'Despecha']

赞(0）回复(0）举报 2022-11-10

zzoitvuj2#

driver.implicitly_wait(10)不是暂停命令！
它对下一行print(driver.page_source)没有影响。
如果您想要等待页面完全加载，您可以等待某些特定的元素可见。为此，使用WebDriverWaitexpected_conditions。或者只添加硬编码暂停time.sleep(10)而不是driver.implicitly_wait(10)

赞(0）回复(0）举报 2022-11-10

我来回答

selenium 使用Selify WebDriver仅获取页面源代码的一部分

2条答案

相关问题

热门标签

最新问答