[Selenium] I want to save all images from my Pinterest boards

Posted by j1dl9f46 on 2023-02-12 in Other


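I want to save all the images from a Pinterest board. I am having trouble writing the part that, after downloading an image, goes back to the board and moves on to the next image. I would appreciate any help.
Example board: https://www.pinterest.jp/aku_ma/%E3%82%A2%E3%83%8B%E3%83%A1%E3%82%A2%E3%82%A4%E3%82%B3%E3%83%B3/
1. Log in
2. Access the board ← I have got this far.
3. Open the page of an image on the board
4. Press the download button and save to a specified path
5. Return to the board and open the page of the next image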


import os
import selenium
import time
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
url = 'https://www.pinterest.jp/aku_ma/%E3%82%A2%E3%83%8B%E3%83%A1%E3%82%A2%E3%82%A4%E3%82%B3%E3%83%B3/'
profilefolder = '--user-data-dir=' + '/Users/t/Library/Application Support/Google/Chrome/Default'
emailAdress = 'xxxx@gmail.com'
passwordNumber = 'xxxx'
foldername = "/Users/t/Desktop/koreanLikeImages"
speed = 1
options = Options()
# options.add_argument('--headless')
DRIVER_PATH = "./chromedriver"  # path to my ChromeDriver
driver = webdriver.Chrome(options=options)
driver.get(url)
loginButton = driver.find_element(By.CSS_SELECTOR, "div[data-test-id='login-button']")
loginButton.click()  # click the login button
time.sleep(1)
# Enter email and password
email = driver.find_element(By.ID, "email")
email.send_keys(emailAdress)
password = driver.find_element(By.ID, "password")
password.send_keys(passwordNumber)
# Click the red login button
redLoginButton = driver.find_element(By.CLASS_NAME, "SignupButton") 
redLoginButton.click()
time.sleep(3)
driver.get(url)

selenium

Source: https://stackoverflow.com/questions/75368713/seleniumi-want-to-save-all-images-from-my-pinterest-boards

1 Answer


bwitn5fc1#

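Steps 3, 4 and 5 are unnecessary, because the high-resolution links are already loaded in the HTML of the main board page. For example, this is the HTML of one image: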

<img ... srcset="
https://i.pinimg.com/236x/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg 1x,
https://i.pinimg.com/474x/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg 2x, 
https://i.pinimg.com/736x/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg 3x, 
https://i.pinimg.com/originals/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg 4x">

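As you can see, each image has four URLs, each pointing to the same picture at a different resolution, and 4x is the highest. With urllib.request.urlretrieve(url, filename) we can download the file behind a URL, so the high-quality images can be downloaded directly from the board page without opening each pin.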
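The srcset parsing step can be looked at in isolation. The following is a minimal sketch, not part of the original answer: the helper name `largest_srcset_url` is hypothetical, and it assumes the candidates are listed in ascending resolution (as in the example above) and that the URLs themselves contain no commas.

```python
def largest_srcset_url(srcset: str) -> str:
    """Return the URL of the last candidate in a srcset string.

    Assumption: candidates are ordered from lowest to highest
    resolution, so the last one is the largest.
    """
    # split into candidates such as "https://... 4x" and drop empty pieces
    candidates = [c.strip() for c in srcset.split(',') if c.strip()]
    # each candidate is "<url> <descriptor>"; keep only the URL
    return candidates[-1].split()[0]


# the srcset value from the example image above
srcset = """
https://i.pinimg.com/236x/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg 1x,
https://i.pinimg.com/474x/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg 2x,
https://i.pinimg.com/736x/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg 3x,
https://i.pinimg.com/originals/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg 4x
"""

print(largest_srcset_url(srcset))
```

For the example image this selects the `originals` URL, which is the one the download loop should fetch.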

import time
import urllib.request
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.common.by import By

# `driver` is the logged-in webdriver from the question, already on the board page
foldername = 'C://Users//gtu//Desktop//folder//'
urls = []
while True:
    new_images = False
    images = driver.find_elements(By.CSS_SELECTOR, 'img[srcset]')
    for img in images:
        try:
            # [-1] selects the last srcset entry, which is the largest resolution
            url = img.get_attribute('srcset').split(',')[-1].split()[0]
        except StaleElementReferenceException:
            # as you scroll down, old images are removed from the HTML, so this
            # error may be raised; it is not a real problem
            continue

        if url not in urls:
            # scroll the image into view so that new images are loaded
            driver.execute_script('arguments[0].scrollIntoView({block: "center", behavior: "smooth"});', img)
            urls.append(url)
            print(url)
            new_images = True
            file_name = url.split('/')[-1]
            # download the image
            urllib.request.urlretrieve(url, foldername + file_name)
            time.sleep(1)

    # if there are no new images, we have reached the bottom of the page
    if not new_images:
        break
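The loop above builds the target path with `foldername + file_name`, which only works when `foldername` ends with a path separator. As a hedged alternative, the sketch below builds the path with `os.path.join` and takes the file name from the parsed URL. The helper name `target_path` and the folder name `"folder"` are hypothetical and only used for illustration.

```python
import os
from urllib.parse import urlparse


def target_path(url: str, folder: str) -> str:
    """Build the local file path for an image URL (hypothetical helper)."""
    # use only the path part of the URL, so a query string is ignored
    file_name = os.path.basename(urlparse(url).path)
    # os.path.join inserts the separator, so `folder` needs no trailing slash
    return os.path.join(folder, file_name)


example = target_path(
    "https://i.pinimg.com/originals/80/c8/ec/80c8ec56386197561bac4c4e40d331b8.jpg",
    "folder",
)
print(example)
```

In the download loop this could be used as `urllib.request.urlretrieve(url, target_path(url, foldername))`, with `os.makedirs(foldername, exist_ok=True)` called once beforehand so the folder exists.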


2023-02-12
