Why does Selenium's get_attribute() return an empty string when inspecting the page shows the attribute?

Posted by 7gcisfzg on 2022-11-24 in Other

I am trying to get the src attribute from the video tag on this webpage. This shows where I see the video tag when I inspect the page. The XPath of the tag in Safari is //*[@id="player"]/div[2]/div[4]/video
Here is my code:

import os

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

os.environ["SELENIUM_SERVER_JAR"] = "selenium-server-standalone-2.41.0.jar"
browser = webdriver.Safari()
browser.get("https://mplayer.me/default.php?id=MTc3ODc3")
# Wait up to 20 seconds for the <video> element to become visible, then read its src.
video = WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.TAG_NAME, "video")))
print(video.get_attribute("src"))
browser.quit()

Using .text instead of .get_attribute also returns an empty string. I have to use Safari rather than Chrome to get the src link because Chrome uses a blob storage design. When I scrape the page through Chrome, src shows "blob:https://mplayer.me/d420cb30-ed6e-4772-b169-ed33a5d3ee9f" instead of the link I actually want, which is https://wwwx18.gogocdn.stream/videos/hls/6CjH7KUeu18L4Y7ls0ohCw/1668685924/177877/81aa0af3891f4ef11da3f67f0d43ade6/ep.1.1657688313.m3u8
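For readers unfamiliar with the distinction: a blob: URL refers to an object held inside that browser session, so it is generally not usable outside the page, whereas the .m3u8 link is an ordinary network URL. A minimal sketch for telling the two apart, using the two values quoted in the question; is_blob_url is a hypothetical helper name, not part of Selenium:

```python
from urllib.parse import urlparse


def is_blob_url(src: str) -> bool:
    """Return True when src uses the browser-local blob: scheme."""
    return urlparse(src).scheme == "blob"


# Both values are copied from the question text.
chrome_src = "blob:https://mplayer.me/d420cb30-ed6e-4772-b169-ed33a5d3ee9f"
wanted_src = (
    "https://wwwx18.gogocdn.stream/videos/hls/6CjH7KUeu18L4Y7ls0ohCw/"
    "1668685924/177877/81aa0af3891f4ef11da3f67f0d43ade6/ep.1.1657688313.m3u8"
)

print(is_blob_url(chrome_src))
print(is_blob_url(wanted_src))
```

A check like this could guard a scraper so that it falls back to another approach whenever the src it reads is a blob: URL.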

Answer #1 by ui7jx7zq

You can use Desired Capabilities to enable Chrome's performance log and read the link to the m3u8 file from it.
Here is one possible solution:

import json
from selenium import webdriver
from selenium.webdriver import DesiredCapabilities
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
options.add_argument('--headless')
# Ask ChromeDriver to record all performance (network) log entries.
capabilities = DesiredCapabilities.CHROME.copy()
capabilities["goog:loggingPrefs"] = {"performance": "ALL"}
options.add_experimental_option("excludeSwitches", ["enable-automation", "enable-logging"])
service = Service(executable_path="path/to/your/chromedriver.exe")
# desired_capabilities is accepted by the Selenium version listed below; later Selenium 4 releases removed it.
driver = webdriver.Chrome(service=service, options=options, desired_capabilities=capabilities)

driver.get('https://mplayer.me/default.php?id=MTc3ODc3')
logs = driver.get_log('performance')

# Each entry's 'message' field is a JSON string; print request URLs ending in .m3u8.
for log in logs:
    data = json.loads(log['message'])['message']['params'].get('request')
    if data and data['url'].endswith('.m3u8'):
        print(data['url'])

driver.quit()

Output:

https://wwwx18.gogocdn.stream/videos/hls/myv1spZ0483oSfvbo4bcbQ/1668706324/177877/81aa0af3891f4ef11da3f67f0d43ade6/ep.1.1657688313.m3u8

Tested on Windows 10, Python 3.9.10, Selenium 4.5.0.
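The filtering loop in the solution can also be pulled out into a standalone function, which makes it testable without launching a browser. The sketch below assumes the same entry shape that the loop relies on (a dict whose 'message' value is a JSON string containing message → params → request → url). The name extract_m3u8_urls is hypothetical, and matching on the URL path instead of the full URL is an added design choice so that a query string after .m3u8 does not hide a match:

```python
import json
from urllib.parse import urlparse


def extract_m3u8_urls(log_entries):
    """Collect unique request URLs whose path ends in .m3u8, in first-seen order."""
    urls = []
    for entry in log_entries:
        try:
            message = json.loads(entry["message"])["message"]
            url = message["params"]["request"]["url"]
        except (KeyError, TypeError, ValueError):
            # Skip entries that are not request events or are not valid JSON.
            continue
        if not isinstance(url, str):
            continue
        # Compare only the path so that "?token=..." suffixes are ignored.
        if urlparse(url).path.endswith(".m3u8") and url not in urls:
            urls.append(url)
    return urls
```

With the driver from the solution above, it would be called as extract_m3u8_urls(driver.get_log('performance')).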
