python-3.x: Unable to query GraphQL using a sha256 hash to scrape property links from a webpage

Posted by 5ssjco0h on 2023-02-06 in Python

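After visiting this website, when I fill in the input box with Sydney CBD, NSW and click the search button, I can see the required results displayed on that site.
I would like to scrape the property links using the requests module. With the attempt below, I can get the property links from the first page.
The problem is that I have hardcoded the value of sha256Hash in params, which is not what I want to do. I don't know whether the ID retrieved by issuing a GET request to the suggestion URL needs to be converted to the sha256Hash.
However, when I do that with the get_hashed_string() function, the value it produces differs from the hardcoded one in params. As a result, the script throws a KeyError at the line container = res.json().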

import requests
import hashlib
from pprint import pprint
from bs4 import BeautifulSoup

url = 'https://suggest.realestate.com.au/consumer-suggest/suggestions'
link = 'https://lexa.realestate.com.au/graphql'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
}
payload = {
    'max': '7',
    'type': 'suburb,region,precinct,state,postcode',
    'src': 'homepage-web',
    'query': 'Sydney CBD, NSW'
}
params = {"operationName":"searchByQuery","variables":{"query":"{\"channel\":\"buy\",\"page\":1,\"pageSize\":25,\"filters\":{\"surroundingSuburbs\":true,\"excludeNoSalePrice\":false,\"ex-under-contract\":false,\"ex-deposit-taken\":false,\"excludeAuctions\":false,\"excludePrivateSales\":false,\"furnished\":false,\"petsAllowed\":false,\"hasScheduledAuction\":false},\"localities\":[{\"searchLocation\":\"sydney cbd, nsw\"}]}","testListings":False,"nullifyOptionals":False},"extensions":{"persistedQuery":{"version":1,"sha256Hash":"ef58e42a4bd826a761f2092d573ee0fb1dac5a70cd0ce71abfffbf349b5b89c1"}}}

def get_hashed_string(keyword):
    hashed_str = hashlib.sha256(keyword.encode('utf-8')).hexdigest()
    return hashed_str

with requests.Session() as s:
    s.headers.update(headers)
    r = s.get(url,params=payload)
    hashed_id = r.json()['_embedded']['suggestions'][0]['id']

    # params['extensions']['persistedQuery']['sha256Hash'] = get_hashed_string(hashed_id)
    
    res = s.post(link,json=params)
    container = res.json()['data']['buySearch']['results']['exact']['items']
    for item in container:
        print(item['listing']['_links']['canonical']['href'])

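If I run the script as is, it works beautifully. When I uncomment the line beginning params['extensions']['persistedQuery'] and run the script again, the script breaks.

How can I generate the value of sha256Hash and use it in the script above?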
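A note on why the uncommented line is likely to break the request: the extensions.persistedQuery shape (version plus sha256Hash) matches the Apollo-style persisted query convention, where the hash is the SHA-256 hex digest of the GraphQL query document text, not of any search term or suggestion ID. That this site follows the convention is an assumption, and the query document below is hypothetical, since the real one is not shown in this thread. A minimal sketch of the difference:

```python
import hashlib

# Hypothetical query document, for illustration only. The real document
# behind "searchByQuery" is not shown anywhere in this thread.
QUERY_DOCUMENT = (
    "query searchByQuery($query: String!) "
    "{ buySearch(query: $query) { __typename } }"
)


def persisted_query_hash(document: str) -> str:
    """Return the SHA-256 hex digest of a query document (Apollo-style convention)."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


def build_extensions(document: str) -> dict:
    """Build the 'extensions' object in the shape used by the question's params."""
    return {
        "persistedQuery": {
            "version": 1,
            "sha256Hash": persisted_query_hash(document),
        }
    }


doc_hash = persisted_query_hash(QUERY_DOCUMENT)
# Placeholder standing in for the ID returned by the suggestion endpoint.
id_hash = persisted_query_hash("some-suggestion-id")

print(len(doc_hash))        # a SHA-256 hex digest is always 64 characters
print(doc_hash == id_hash)  # hashing the ID yields an unrelated value
```

Under that assumption, the hash only changes when the site changes the query document itself, so hashing the suggestion ID sends a hash the server does not recognise, and the response then lacks the data path that the script indexes into.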

Answer 1 (ergxz8rk)

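That's not how GraphQL works. The sha value stays the same across all requests; what you're missing is a valid GraphQL query.
You have to reconstruct that query first and then use the API's pagination. That's the key.
Here's how: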

import json

import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/109.0",
    "Accept": "application/graphql+json, application/json",
    "Content-Type": "application/json",
    "Host": "lexa.realestate.com.au",
    "Referer": "https://www.realestate.com.au/",
}

endpoint = "https://lexa.realestate.com.au/graphql"
graph_query = "{\"channel\":\"buy\",\"page\":page_number,\"pageSize\":25,\"filters\":{\"surroundingSuburbs\":true," \
               "\"excludeNoSalePrice\":false,\"ex-under-contract\":false,\"ex-deposit-taken\":false," \
               "\"excludeAuctions\":false,\"excludePrivateSales\":false,\"furnished\":false,\"petsAllowed\":false," \
               "\"hasScheduledAuction\":false},\"localities\":[{\"searchLocation\":\"sydney cbd, nsw\"}]}"

graph_json = {
  "operationName": "searchByQuery",
  "variables": {
    "query": "",
    "testListings": False,
    "nullifyOptionals": False
  },
  "extensions": {
    "persistedQuery": {
      "version": 1,
      "sha256Hash": "ef58e42a4bd826a761f2092d573ee0fb1dac5a70cd0ce71abfffbf349b5b89c1"
    }
  }
}

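if __name__ == '__main__':
    with requests.Session() as s:
        # Request the first two result pages (25 listings each).
        for page in range(1, 3):
            # Substitute the page number into the JSON-encoded search query.
            graph_json['variables']['query'] = graph_query.replace('page_number', str(page))
            r = s.post(endpoint, headers=headers, data=json.dumps(graph_json))
            r.raise_for_status()  # fail on an HTTP error status before parsing the body
            listing = r.json()['data']['buySearch']['results']['exact']['items']
            for item in listing:
                print(item['listing']['_links']['canonical']['href'])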

This gives you:

https://www.realestate.com.au/property-apartment-nsw-sydney-140558991
https://www.realestate.com.au/property-apartment-nsw-sydney-141380404
https://www.realestate.com.au/property-apartment-nsw-sydney-140310979
https://www.realestate.com.au/property-apartment-nsw-sydney-141259592
https://www.realestate.com.au/property-apartment-nsw-barangaroo-140555291
https://www.realestate.com.au/property-apartment-nsw-sydney-140554403
https://www.realestate.com.au/property-apartment-nsw-millers+point-141245584
https://www.realestate.com.au/property-apartment-nsw-haymarket-139205259
https://www.realestate.com.au/project/hyde-metropolitan-by-deicorp-sydney-600036803
https://www.realestate.com.au/property-apartment-nsw-haymarket-140807411
https://www.realestate.com.au/property-apartment-nsw-sydney-141370756
https://www.realestate.com.au/property-apartment-nsw-sydney-141370364
https://www.realestate.com.au/property-apartment-nsw-haymarket-140425111
https://www.realestate.com.au/project/greenland-centre-sydney-600028910
https://www.realestate.com.au/property-apartment-nsw-sydney-141364136
https://www.realestate.com.au/property-apartment-nsw-sydney-139367203
https://www.realestate.com.au/property-apartment-nsw-sydney-141156696
https://www.realestate.com.au/property-apartment-nsw-sydney-141362880
https://www.realestate.com.au/property-studio-nsw-sydney-141311384
https://www.realestate.com.au/property-apartment-nsw-haymarket-141354876
https://www.realestate.com.au/property-apartment-nsw-the+rocks-140413283
https://www.realestate.com.au/property-apartment-nsw-sydney-141350552
https://www.realestate.com.au/property-apartment-nsw-sydney-140657935
https://www.realestate.com.au/property-apartment-nsw-barangaroo-139149039
https://www.realestate.com.au/property-apartment-nsw-haymarket-141034784
https://www.realestate.com.au/property-apartment-nsw-sydney-141230640
https://www.realestate.com.au/property-apartment-nsw-barangaroo-141340768
https://www.realestate.com.au/property-apartment-nsw-haymarket-141337684
https://www.realestate.com.au/property-unitblock-nsw-millers+point-141337528
https://www.realestate.com.au/property-apartment-nsw-sydney-141028828
https://www.realestate.com.au/property-apartment-nsw-sydney-141223160
https://www.realestate.com.au/property-apartment-nsw-sydney-140643067
https://www.realestate.com.au/property-apartment-nsw-sydney-140768179
https://www.realestate.com.au/property-apartment-nsw-haymarket-139406051
https://www.realestate.com.au/property-apartment-nsw-haymarket-139406047
https://www.realestate.com.au/property-apartment-nsw-sydney-139652067
https://www.realestate.com.au/property-apartment-nsw-sydney-140032667
https://www.realestate.com.au/property-apartment-nsw-sydney-127711002
https://www.realestate.com.au/property-apartment-nsw-sydney-140903924
https://www.realestate.com.au/property-apartment-nsw-walsh+bay-139130519
https://www.realestate.com.au/property-apartment-nsw-sydney-140285823
https://www.realestate.com.au/property-apartment-nsw-sydney-140761223
https://www.realestate.com.au/project/111-castlereagh-sydney-600031082
https://www.realestate.com.au/property-apartment-nsw-sydney-140633099
https://www.realestate.com.au/property-apartment-nsw-haymarket-141102892
https://www.realestate.com.au/property-apartment-nsw-sydney-139522379
https://www.realestate.com.au/property-apartment-nsw-sydney-139521259
https://www.realestate.com.au/property-apartment-nsw-sydney-139521219
https://www.realestate.com.au/property-apartment-nsw-haymarket-140007279
https://www.realestate.com.au/property-apartment-nsw-haymarket-139156515
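
As a possible variation on the pagination approach above, the search query can be built from a dict and serialised with json.dumps instead of substituting a placeholder into a hand-escaped string. The sketch below makes no network calls. The helper names build_payload and extract_links are made up for illustration, and the response shape is assumed from the key path used in the scripts above.

```python
import json

# Hash copied from the scripts above; it is not computed here.
SHA256_HASH = "ef58e42a4bd826a761f2092d573ee0fb1dac5a70cd0ce71abfffbf349b5b89c1"


def build_payload(page: int, location: str = "sydney cbd, nsw", page_size: int = 25) -> dict:
    """Build the POST body for one result page (hypothetical helper)."""
    search = {
        "channel": "buy",
        "page": page,
        "pageSize": page_size,
        "filters": {
            "surroundingSuburbs": True,
            "excludeNoSalePrice": False,
            "ex-under-contract": False,
            "ex-deposit-taken": False,
            "excludeAuctions": False,
            "excludePrivateSales": False,
            "furnished": False,
            "petsAllowed": False,
            "hasScheduledAuction": False,
        },
        "localities": [{"searchLocation": location}],
    }
    return {
        "operationName": "searchByQuery",
        "variables": {
            # The API expects the search as a JSON string, not a nested object.
            "query": json.dumps(search, separators=(",", ":")),
            "testListings": False,
            "nullifyOptionals": False,
        },
        "extensions": {"persistedQuery": {"version": 1, "sha256Hash": SHA256_HASH}},
    }


def extract_links(response_json: dict) -> list:
    """Return listing URLs, or an empty list if the expected path is missing."""
    try:
        items = response_json["data"]["buySearch"]["results"]["exact"]["items"]
    except (KeyError, TypeError):
        # TypeError covers the case where "data" is present but null.
        return []
    return [item["listing"]["_links"]["canonical"]["href"] for item in items]


# Hand-written sample response in the assumed shape, using one URL from the output above.
sample = {"data": {"buySearch": {"results": {"exact": {"items": [
    {"listing": {"_links": {"canonical": {
        "href": "https://www.realestate.com.au/property-apartment-nsw-sydney-140558991"}}}}
]}}}}}

print(extract_links(sample))
print(extract_links({"errors": [{"message": "example error"}]}))
```

Each payload could then be sent with s.post(endpoint, headers=headers, json=build_payload(page)). Returning an empty list when the path is missing avoids the KeyError described in the question, at the cost of hiding the server's error message, so printing the raw response is still useful while debugging.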
