Scrapy web scraping returns an empty list

w46czmvw, posted 2023-03-12 in Other

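Hi all, I am trying to scrape a stock website to get sector-wise stock information from this website.
To check whether the table's data can be fetched, I opened scrapy shell in the terminal.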

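In the terminal, I ran scrapy shell "https://nepsealpha.com/" and then, inside the shell, response.xpath("//table[@id='fixTable']//tbody//tr").
But the output I get is an empty list: []
I think the content is rendered with JavaScript. Can I get it without using Selenium?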

Answer #1, by waxmsbnn

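The data you are looking for comes from an API endpoint, so it is not in the HTML that Scrapy downloads.
You can fetch it directly and then massage it back into table form, or use only part of it.
Here's how: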

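import pandas as pd
import requests

api_endpoint = "https://nepsealpha.com/api/smx9841/dashboard_board"

# CSRF token, presumably copied from a browser session; it may expire and need refreshing.
payload = {
    "_token": "K5fwARzoE7j49mIE5hdeZUqeoYgQGUXnsUeS7SG1"
}

# Headers that make the request look like the page's own XHR call.
headers = {
    "Accept": "application/json",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
}

# POST to the endpoint and fail loudly on an HTTP error.
response = requests.post(api_endpoint, headers=headers, data=payload, timeout=30)
response.raise_for_status()

# Flatten the nested "home_table" records into a DataFrame.
data = response.json()["home_table"]
df = pd.json_normalize(data)
print(df)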

Output:

       id    index_name  ... indexvalue.percent_change indexvalue.turn_over_value
0   41538         NEPSE  ...                      0.74              1204409411.62
1   41541       BANKING  ...                      0.51                181973963.3
2   41548       TRADING  ...                      0.75                  121077893
3   41550        HOTELS  ...                      1.05                  9852264.4
4   41547       DEVBANK  ...                      1.16                 52379584.2
5   41543    HYDROPOWER  ...                      1.67                317296705.3
6   41546       FINANCE  ...                      1.18                 39390111.3
7   41542   NONLIFEINSU  ...                      0.97                 42936919.2
8   41544   MANUFACTURE  ...                     -1.06                126684476.5
9   41549        OTHERS  ...                      0.71                 21071125.9
10  41540  MICROFINANCE  ...                       0.5                171446649.5
11  41545      LIFEINSU  ...                      0.46                 52938619.8
12  41551    INVESTMENT  ...                      1.11                 44639401.2
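
If only part of the data is needed, the normalized frame can be trimmed to a few columns. The following is a minimal sketch that runs offline on a handful of records copied from the output above. The nested `indexvalue` structure is inferred from the dotted column names, and `sector_summary` and the renamed columns are hypothetical names chosen for illustration.

```python
import pandas as pd

# Sample records mirroring a few rows of the output above.
# Assumption: "indexvalue" is a nested object, as the dotted column names suggest.
sample_home_table = [
    {"id": 41538, "index_name": "NEPSE",
     "indexvalue": {"percent_change": 0.74, "turn_over_value": 1204409411.62}},
    {"id": 41541, "index_name": "BANKING",
     "indexvalue": {"percent_change": 0.51, "turn_over_value": 181973963.3}},
    {"id": 41543, "index_name": "HYDROPOWER",
     "indexvalue": {"percent_change": 1.67, "turn_over_value": 317296705.3}},
    {"id": 41544, "index_name": "MANUFACTURE",
     "indexvalue": {"percent_change": -1.06, "turn_over_value": 126684476.5}},
]


def sector_summary(records):
    """Return sector name, percent change, and turnover, sorted by percent change."""
    df = pd.json_normalize(records)
    columns = {
        "index_name": "sector",
        "indexvalue.percent_change": "percent_change",
        "indexvalue.turn_over_value": "turnover",
    }
    out = df[list(columns)].rename(columns=columns)
    # The API may return numbers as strings, so coerce them to numeric types.
    for col in ("percent_change", "turnover"):
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out.sort_values("percent_change", ascending=False).reset_index(drop=True)


summary = sector_summary(sample_home_table)
print(summary)
```

With live data, the same function could be called on `response.json()["home_table"]` instead of the sample list.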
