python-3.x: Getting the same response from different HTTP requests to the Steam website

Posted by bbmckpt7 on 2023-05-02 in Python

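This is my first time actively using StackOverflow, so please excuse any mistakes. I am currently writing a Python 3 script that is supposed to scrape the icons, names, and prices from the Steam Community Market. Extracting and formatting the data works as expected. The site uses pagination, so I have to make multiple GET requests to cover all 169 pages. My approach is to use a for loop and insert the loop variable into the URL, because I noticed that the current page number is contained in it.
My problem is that when I run the script and print the arrays that should contain the data, 90% of the data is identical (for example, the content of page 2 is added to the array 7 times).
I am not sure how to fix this and get the correct data from the requests.
I hope this description is clear enough. Thanks in advance for any help.
Here is the source code: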

import requests
from bs4 import BeautifulSoup
import time
import json as json

def main():

    name_arr = []
    img_arr = []
    price_arr = []


    for i in range(1,11): # later change to 169 pages
        
        url = f"https://steamcommunity.com/market/search?q=&category_730_ItemSet%5B%5D=any&category_730_ProPlayer%5B%5D=any&category_730_StickerCapsule%5B%5D=any&category_730_TournamentTeam%5B%5D=any&category_730_Weapon%5B%5D=any&category_730_Exterior%5B%5D=tag_WearCategory2&category_730_Quality%5B%5D=tag_normal&category_730_Quality%5B%5D=tag_unusual&appid=730#p{i}_popular_desc"
        print(url)
        r = requests.get(url)

        print("----------------------------------- on : " + str(i) + "right now")
        print(r.status_code)

        soup = BeautifulSoup(r.content, "html.parser")

        images = soup.find_all("img", class_="market_listing_item_img")
        names = soup.find_all("span", class_="market_listing_item_name")
        prices = soup.find_all("span", class_="sale_price")


        def extract_text(list, list_arr):
            for x in list:
                name_only = x.text.replace("(Field-Tested)", "").strip()
                list_arr.append(name_only)

        def extract_src(list, list_arr):
            for x in list:
                list_arr.append(x["src"])

        extract_text(names, name_arr)
        extract_text(prices,price_arr)
        extract_src(images, img_arr)

        time.sleep(60)


    print(name_arr)
    print(price_arr)
    print(img_arr)

    with open('output.json', 'w') as f:
    # Write the array to file as JSON
        json.dump(name_arr, f)

    # amount = float(dollars.replace("$", "").strip()) 

if __name__ == "__main__":
    main()

Here is the terminal output; note how the names appear multiple times:

❯ python3 webscrape.py

['P90 | Blind Spot', 'SCAR-20 | Cardiac', 'Five-SeveN | Contractor', 'PP-Bizon | Forest Leaves', 'XM1014 | Urban Perforated', 'Sawed-Off | Irradiated Alert', 'SG 553 | Tornado', 'P250 | Mehndi', 'FAMAS | Commemoration', 'XM1014 | Blaze Orange', 'P90 | Blind Spot', 'SCAR-20 | Cardiac', 'Five-SeveN | Contractor', 'PP-Bizon | Forest Leaves', 'XM1014 | Urban Perforated', 'Sawed-Off | Irradiated Alert', 'SG 553 | Tornado', 'P250 | Mehndi', 'FAMAS | Commemoration', 'XM1014 | Blaze Orange', 'P90 | Blind Spot', 'SCAR-20 | Cardiac', 'Five-SeveN | Contractor', 'PP-Bizon | Forest Leaves', 'XM1014 | Urban Perforated', 'Sawed-Off | Irradiated Alert', 'SG 553 | Tornado', 'P250 | Mehndi', 'FAMAS | Commemoration', 'XM1014 | Blaze Orange', 'Sawed-Off | Highwayman', 'Galil AR | Shattered', 'AUG | Torque', 'SG 553 | Tornado', 'Dual Berettas | Briar', 'SG 553 | Wave Spray', 'Five-SeveN | Kami', 'FAMAS | Contrast Spray', 'MAG-7 | Chainmail', 'Sawed-Off | Serenity', 'P90 | Blind Spot', 'SCAR-20 | Cardiac', 'Five-SeveN | Contractor', 'PP-Bizon | Forest Leaves', 'XM1014 | Urban Perforated', 'Sawed-Off | Irradiated Alert', 'SG 553 | Tornado', 'P250 | Mehndi', 'FAMAS | Commemoration', 'XM1014 | Blaze Orange', 'P90 | Blind Spot', 'SCAR-20 | Cardiac', 'Five-SeveN | Contractor', 'PP-Bizon | Forest Leaves', 'XM1014 | Urban Perforated', 'Sawed-Off | Irradiated Alert', 'SG 553 | Tornado', 'P250 | Mehndi', 'FAMAS | Commemoration', 'XM1014 | Blaze Orange', 'P90 | Blind Spot', 'SCAR-20 | Cardiac', 'Five-SeveN | Contractor', 'PP-Bizon | Forest Leaves', 'XM1014 | Urban Perforated', 'Sawed-Off | Irradiated Alert', 'SG 553 | Tornado', 'P250 | Mehndi', 'FAMAS | Commemoration', 'XM1014 | Blaze Orange', 'Sawed-Off | Highwayman', 'Galil AR | Shattered', 'AUG | Torque', 'SG 553 | Tornado', 'Dual Berettas | Briar', 'SG 553 | Wave Spray', 'Five-SeveN | Kami', 'FAMAS | Contrast Spray', 'MAG-7 | Chainmail', 'Sawed-Off | Serenity', 'Sawed-Off | Highwayman', 'Galil AR | Shattered', 'AUG | Torque', 'SG 553 | 
Tornado', 'Dual Berettas | Briar', 'SG 553 | Wave Spray', 'Five-SeveN | Kami', 'FAMAS | Contrast Spray', 'MAG-7 | Chainmail', 'Sawed-Off | Serenity', 'P90 | Blind Spot', 'SCAR-20 | Cardiac', 'Five-SeveN | Contractor', 'PP-Bizon | Forest Leaves', 'XM1014 | Urban Perforated', 'Sawed-Off | Irradiated Alert', 'SG 553 | Tornado', 'P250 | Mehndi', 'FAMAS | Commemoration', 'XM1014 | Blaze Orange']

Answer #1 (c0vxltue)

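The data you see on the page is loaded from a different URL with the help of JavaScript. The page number in your URL sits after the `#`, which is a URL fragment; fragments are handled by the browser and are never sent to the server, so every iteration of your loop requests the same page. You can simulate the JavaScript request with the requests module: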

from time import sleep
import requests
from bs4 import BeautifulSoup

api_url = 'https://steamcommunity.com/market/search/render/'

params = {
    "query": "",
    "start": 0,
    "count": 10,
    "search_descriptions": "0",
    "sort_column": "popular",
    "sort_dir": "desc",
    "appid": "730",
    "category_730_ItemSet[]": "any",
    "category_730_ProPlayer[]": "any",
    "category_730_StickerCapsule[]": "any",
    "category_730_TournamentTeam[]": "any",
    "category_730_Weapon[]": "any",
    "category_730_Exterior[]": "tag_WearCategory2",
    "category_730_Quality[]": ["tag_normal", "tag_unusual"],
}

with requests.Session() as s:
    # Visit the regular search page first; the session keeps any cookies it sets.
    s.get('https://steamcommunity.com/market/search?q=&category_730_ItemSet%5B%5D=any&category_730_ProPlayer%5B%5D=any&category_730_StickerCapsule%5B%5D=any&category_730_TournamentTeam%5B%5D=any&category_730_Weapon%5B%5D=any&category_730_Exterior%5B%5D=tag_WearCategory2&category_730_Quality%5B%5D=tag_normal&category_730_Quality%5B%5D=tag_unusual&appid=730')

    # `start` is the offset of the first result; each request returns `count` (10) items.
    for start in range(0, 100, 10):  # <-- increase number of pages here
        params['start'] = start
        data = s.get(api_url, params=params).json()
        soup = BeautifulSoup(data['results_html'], 'html.parser')

        for item in soup.select('.market_listing_row_link'):
            name = item.select_one('.market_listing_item_name').text.strip()
            qty = item.select_one('.market_listing_num_listings_qty').text.strip()
            price = item.select_one('[data-price]').text.strip()
            print('{:<50} {:<5} {}'.format(name, qty, price))

        sleep(10)

Prints:

Sawed-Off | Highwayman (Field-Tested)              132   $0.92 USD
Galil AR | Shattered (Field-Tested)                98    $5.81 USD
AUG | Torque (Field-Tested)                        120   $7.91 USD
SG 553 | Tornado (Field-Tested)                    91    $8.19 USD
Dual Berettas | Briar (Field-Tested)               103   $2.01 USD
SG 553 | Wave Spray (Field-Tested)                 101   $5.56 USD
Five-SeveN | Kami (Field-Tested)                   136   $1.47 USD
FAMAS | Contrast Spray (Field-Tested)              158   $2.10 USD
MAG-7 | Chainmail (Field-Tested)                   18    $16.70 USD
Sawed-Off | Serenity (Field-Tested)                67    $1.49 USD
P250 | Whiteout (Field-Tested)                     46    $18.72 USD
MP7 | Olive Plaid (Field-Tested)                   95    $1.38 USD
CZ75-Auto | Army Sheen (Field-Tested)              82    $0.98 USD
G3SG1 | Arctic Camo (Field-Tested)                 32    $4.31 USD
M4A4 | Asiimov (Field-Tested)                      57    $237.28 USD
P90 | Fallout Warning (Field-Tested)               66    $7.03 USD
Tec-9 | Remote Control (Field-Tested)              65    $3.64 USD
SSG 08 | Tropical Storm (Field-Tested)             89    $6.00 USD
USP-S | Target Acquired (Field-Tested)             19    $210.02 USD
M4A4 | Radiation Hazard (Field-Tested)             111   $26.00 USD
SSG 08 | Lichen Dashed (Field-Tested)              136   $1.39 USD
M4A1-S | Dark Water (Field-Tested)                 127   $79.48 USD
Nova | Walnut (Field-Tested)                       126   $1.21 USD
M4A4 | Zirka (Field-Tested)                        146   $33.01 USD
P250 | Vino Primo (Field-Tested)                   99    $4.98 USD
MP7 | Skulls (Field-Tested)                        130   $16.23 USD
M249 | Shipping Forecast (Field-Tested)            42    $15.48 USD
Five-SeveN | Nightshade (Field-Tested)             95    $1.27 USD
G3SG1 | Safari Mesh (Field-Tested)                 111   $1.17 USD
Negev | CaliCamo (Field-Tested)                    42    $5.84 USD
AWP | Hyper Beast (Field-Tested)                   142   $42.55 USD
UMP-45 | Crime Scene (Field-Tested)                20    $67.32 USD
★ Moto Gloves | 3rd Commando Company (Field-Tested) 46    $117.73 USD
Desert Eagle | Code Red (Field-Tested)             122   $34.54 USD
Tec-9 | Tornado (Field-Tested)                     79    $1.27 USD
Sawed-Off | Highwayman (Field-Tested)              132   $0.92 USD
P90 | Baroque Red (Field-Tested)                   15    $29.97 USD
UMP-45 | Caramel (Field-Tested)                    125   $8.41 USD
G3SG1 | Murky (Field-Tested)                       90    $0.49 USD
P2000 | Woodsman (Field-Tested)                    27    $7.18 USD

...and so on.
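
To make the two ideas above concrete, here is a minimal sketch using only the standard library. It shows that the question's URLs differ only in their fragment (the part after `#`, which an HTTP client does not send), and how a 1-based page number could be mapped to the `start` offset used in the loop above. The helper names `url_sent_to_server` and `page_to_start` and the shortened `BASE` URL are hypothetical and exist only for illustration; the page size of 10 is taken from the `count` parameter above.

```python
from urllib.parse import urldefrag

# Shortened, illustrative URL (assumption: the real one carries more filters).
BASE = "https://steamcommunity.com/market/search?q=&appid=730"


def url_sent_to_server(url: str) -> str:
    """Return the URL without its fragment, i.e. what is actually requested."""
    return urldefrag(url).url


def page_to_start(page: int, count: int = 10) -> int:
    """Map a 1-based page number to the `start` offset for the render request."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page - 1) * count


if __name__ == "__main__":
    # The ten URLs from the question's loop differ only after the '#'.
    urls = [f"{BASE}#p{i}_popular_desc" for i in range(1, 11)]
    distinct = {url_sent_to_server(u) for u in urls}
    print(len(distinct))  # 1: all ten requests target the same URL

    # Offsets for the first, second, and last of the 169 pages.
    print([page_to_start(p) for p in (1, 2, 169)])  # [0, 10, 1680]
```

With these assumptions, covering all 169 pages would mean letting `start` run over `range(0, 1690, 10)` in the loop above, while keeping a pause between requests.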
