Log in to a page and fetch its content

Posted by ffscu2ro on 2021-09-29 in Java

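For several days I have been trying to use aiohttp to log in to a website and then navigate to the admin area to fetch its content. After I POST with the session, I'm not sure how to get the content of the admin page. I also tried grabbing the cookies from the session, but I'm not sure what to do with them afterwards. That part of the code is commented out because it is not the preferred approach.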

async def do_task(session, credentials):
    try:
        async with session.get(credentials['domain']) as r:
            url = r.url  # follow redirect to login page
            login_data = {"log": credentials['username'], "pwd": credentials['password']}

            # Please help with below
            await session.post(url, json=login_data)
            return await r.get(f'{url}admin').text()

            # Attempt with cookies
            # async with session.post(url, json=login_data) as login:
            #     session.cookie_jar.update_cookies(login.cookies)
            #     return await login.get(f'{url}admin').text()

    except Exception as e:
        print(e)

async def tasks(session, dict_list):
    tasks = []
    for credentials in dict_list:
        task = asyncio.create_task(do_task(session, credentials))
        tasks.append(task)
    results = await asyncio.gather(*tasks)
    return results

async def main(x):
    async with aiohttp.ClientSession() as session:
        data = await tasks(session, x)
        return data

if __name__ == '__main__':
    dict_list = ({
        "username": 'test',
        "domain": 'http://url.com/admin',
        "password": 'enter'
    },
    )

    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy()) #for windows
    results = asyncio.run(main(dict_list))

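The error message I get is: 'ClientResponse' object has no attribute 'get'. I have previously done exactly the same thing with requests using the code below, but I'm trying to use aiohttp to speed things up.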

with requests.Session() as login_request:
    login_data = {"log": x['username'], "pwd": x['password']}
    login_request.post(url, data=login_data)
    source_code = login_request.get(url).content

Answer #1 (dz6r00yl)

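The main problem here is:
r.get(f'{url}/admin').text()
In your case r is a ClientResponse (not a ClientSession); it is the result of async with session.get(credentials['domain']). A ClientResponse has no get method, which is why you see that error. Further requests have to be made on the session.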
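For what it's worth, here is a minimal sketch of the most direct fix, assuming the site keeps the login in ordinary session cookies: make the follow-up request on the session rather than on the response. fetch_admin and fetch_all are hypothetical names; the log/pwd fields and the credentials keys are taken from the question. The sketch opens one session per user so that each login gets its own cookie jar, and it posts with data= (form-encoded, as in the requests version) instead of json=, which is an assumption about what the login form expects.

```python
import asyncio

import aiohttp


async def fetch_admin(credentials: dict) -> str:
    """Log in with one user's credentials and return the admin page body."""
    # One session per user: the session's cookie jar stores whatever the
    # login response sets and sends it back on later requests.
    async with aiohttp.ClientSession() as session:
        async with session.get(credentials["domain"]) as r:
            login_url = r.url  # final URL after any redirect to the login page

        login_data = {"log": credentials["username"], "pwd": credentials["password"]}
        # Assumption: the form expects a form-encoded body (data=), not JSON.
        async with session.post(login_url, data=login_data) as login_resp:
            login_resp.raise_for_status()

        # Call get() on the session, not on the response object.
        async with session.get(credentials["domain"]) as admin_resp:
            admin_resp.raise_for_status()
            return await admin_resp.text()


async def fetch_all(dict_list) -> list:
    # return_exceptions=True keeps one failed login from discarding the rest;
    # failures come back as exception objects in the result list.
    return await asyncio.gather(
        *(fetch_admin(credentials) for credentials in dict_list),
        return_exceptions=True,
    )
```

With the dict_list from the question this would be run as asyncio.run(fetch_all(dict_list)). If the login fails, the second get will most likely just return the login page again, so some site-specific check on the returned text is still needed.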
You can fetch the cookies asynchronously and then use them for asynchronous scraping. It should look something like this:

import asyncio
from http.cookies import SimpleCookie

import aiohttp

async def login(session, domain: str, login_data: dict):
    async with session.post(domain + '/admin', data=login_data) as resp:
        # I don't know how you check for a successful login...
        # data = await resp.json()  (or: data = await resp.text())
        # if data... blablabla
        return [domain, resp.cookies]

async def process_page(session, url: str, cookies: SimpleCookie):
    # per-request cookies are passed with cookies=, not cookie_jar=
    async with session.get(url, cookies=cookies) as resp:
        content = await resp.text()
        # do something + return...
        return content

# example logins = [{'domain': '...', 'login_data': {...}}, ...]

async def main(logins):
    async with aiohttp.ClientSession() as session:
        # get cookies for all users and domains
        cookies = await asyncio.gather(*[
            login(session, l['domain'], l['login_data'])
            for l in logins
        ])

        # processing '/page1' for all domains and users
        return await asyncio.gather(*[
            process_page(session, c[0] + '/page1', c[1])
            for c in cookies
        ])
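One caveat with returning resp.cookies from the login step: as far as I know, it only holds the cookies set by the final response. If the login POST answers with a redirect, the session cookie may sit on an earlier response in resp.history instead. A minimal sketch of merging them, where collect_cookies is a hypothetical helper and not part of aiohttp:

```python
from http.cookies import SimpleCookie


def collect_cookies(resp) -> SimpleCookie:
    """Merge the cookies set by every response in a redirect chain.

    `resp` is expected to look like an aiohttp ClientResponse: it has a
    `.cookies` SimpleCookie and a `.history` sequence of earlier responses.
    """
    merged = SimpleCookie()
    # Oldest response first, so a cookie that is set again later in the
    # chain overrides the earlier value.
    for step in (*resp.history, resp):
        for name, morsel in step.cookies.items():
            merged[name] = morsel
    return merged
```

In the login step one could then return collect_cookies(resp) in place of resp.cookies. Keep in mind that a shared session also stores these cookies in its own jar, so several users on the same domain can overwrite each other there.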
