How to scrape Instagram's followers pop-up with Python Playwright

kx7yvsdv, posted on 2023-11-20 in Python

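I have been using the Playwright library to scrape websites, and so far it has worked great. However, I want to scrape the followers of a specific account, and I cannot manage to scroll through the followers pop-up.
For example, when I use page.mouse.wheel(0, 1000), it tries to scroll the whole Instagram page instead of scrolling inside the pop-up.
I found solutions to this problem, but they all use Selenium, which I am not familiar with. I am new to web scraping, and I find Selenium a bit overwhelming to start with.
So my question is: how can I add some kind of bounding box so that Playwright scrolls only through the followers pop-up?
This is how far I have got using code from Playwright codegen. This is where I am stuck: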

from playwright.sync_api import Playwright, sync_playwright, expect
import time

def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()

    # Open new page
    page = context.new_page()

    # Go to https://www.instagram.com/
    page.goto("https://www.instagram.com/")

    # Click on Username field
    page.locator(
        "[aria-label=\"Phone number\\, username\\, or email\"]").click()

    # Fill with username
    page.locator(
        "[aria-label=\"Phone number\\, username\\, or email\"]").fill("USERNAME")

    # Click on Password field
    page.locator("[aria-label=\"Password\"]").click()

    # Fill with password
    page.locator("[aria-label=\"Password\"]").fill("PASSWORD")

    # Click Log In
    page.locator("button:has-text(\"Log In\")").first.click()
    page.wait_for_url("https://www.instagram.com/accounts/onetap/?next=%2F")

    # Click text=Not Now
    page.locator("text=Not Now").click()
    page.wait_for_url("https://www.instagram.com/")

    # Click text=Not Now
    page.locator("text=Not Now").click()

    page.goto("https://www.instagram.com/instagram/")

    # Click text=542M followers
    page.locator("text=542M followers").click()
    page.wait_for_url("https://www.instagram.com/instagram/followers/")

    # Each wheel call scrolls the whole page, not the pop-up
    page.mouse.wheel(0, 2000)
    time.sleep(4)
    page.mouse.wheel(0, 2000)
    time.sleep(4)
    page.mouse.wheel(0, 2000)


798qvoo8


You can use this example as a starting point for your script:

from playwright.sync_api import Playwright, sync_playwright

def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()

    # Open new page
    page = context.new_page()

    # Go to https://www.instagram.com/
    page.goto("https://www.instagram.com/")

    # Fill with username
    page.get_by_label("Phone number, username, or email").click()
    page.get_by_label("Phone number, username, or email").fill("[email protected]")

    # Fill with password
    page.get_by_label("Password").click()
    page.get_by_label("Password").fill("MyVeryStrongPassword!")

    # Click Log In
    page.get_by_role("button", name="Log in", exact=True).click()
    page.wait_for_url("https://www.instagram.com/accounts/onetap/?next=%2F")

    page.goto("https://www.instagram.com/")

    # Click text=Not Now
    page.get_by_role("button", name="Not Now").click()
    page.wait_for_url("https://www.instagram.com/")

    # put the link of the profile from which you want to get followers
    page.goto("https://www.instagram.com/desired_profile/followers/")

    # Use the while loop where you compare the number of profiles in the DOM
    # with the number of followers indicated in the profile header
    # because this example will only scroll 5 times
    for _ in range(5):
        page.locator('a > div > div > span[dir="auto"]').last.scroll_into_view_if_needed()
        page.wait_for_timeout(5 * 1000)
    page.pause()

if __name__ == "__main__":
    with sync_playwright() as pw:
        run(pw)
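
The comments in the example suggest replacing the fixed five scrolls with a loop that compares the number of loaded profiles with the follower count. Below is a minimal sketch of that idea, not a tested Instagram scraper. The names `scroll_until_loaded`, `scroll_followers`, and `wheel_inside` are hypothetical helpers, the 2-second delay is an arbitrary assumption, and the `div[role="dialog"]` selector for the pop-up is a guess that has not been checked against Instagram's current markup. The last helper shows a commonly used alternative for the original `page.mouse.wheel` approach: hover over the pop-up first so that the wheel event is sent at that position.

```python
from typing import Callable


def scroll_until_loaded(
    count_items: Callable[[], int],
    scroll_once: Callable[[], None],
    target: int,
    max_idle_rounds: int = 3,
) -> int:
    """Scroll until `target` items are loaded or the count stops growing."""
    loaded = count_items()
    idle_rounds = 0
    while loaded < target and idle_rounds < max_idle_rounds:
        scroll_once()
        current = count_items()
        # A round is idle when scrolling produced no additional rows.
        idle_rounds = idle_rounds + 1 if current <= loaded else 0
        loaded = current
    return loaded


def scroll_followers(page, target: int) -> int:
    """Wire the loop to a Playwright page (sync API) using the answer's selector."""
    rows = page.locator('a > div > div > span[dir="auto"]')

    def scroll_once() -> None:
        # Same call as in the answer: bring the last loaded row into view.
        rows.last.scroll_into_view_if_needed()
        page.wait_for_timeout(2 * 1000)  # assumed delay; tune as needed

    return scroll_until_loaded(rows.count, scroll_once, target)


def wheel_inside(page, container_selector: str = 'div[role="dialog"]',
                 delta_y: int = 2000) -> None:
    """Hover over the pop-up, then send the wheel event at that position.

    The default selector is an assumption, not taken from the original post.
    """
    page.locator(container_selector).hover()
    page.mouse.wheel(0, delta_y)
```

In the answer's script, `scroll_followers(page, target)` would replace the `for _ in range(5)` loop, with `target` read from the profile header. The idle-round limit keeps the loop from running forever if the list stops loading before the target is reached.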
