pandas: How do I scrape a table with a known class name?

Posted by 8xiog9wr 12 months ago in Other

I want to scrape the "Implied Volatility" table from the URL below, ideally into a DataFrame:
https://optioncharts.io/options/AAPL/overview/option-statistics
I have identified the table by its class name, "table border table-sm optioncharts-table-styling table-light":

<table class="table border table-sm optioncharts-table-styling table-light" style="">
            <tbody>
            <tr>
              <th>
                Implied Volatility
                <i class="bi bi-info-circle" rel="tooltip" data-bs-toggle="tooltip" data-bs-placement="right" title="" data-bs-original-title="The average implied volatility of options expring nearest to 30-days." aria-label="The average implied volatility of options expring nearest to 30-days."></i>
              </th>
              <td>17.49%</td>
            </tr>
...

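So my idea was to use soup.find to search by tag and class name. However, neither pandas nor BeautifulSoup can find the table. I tried the code below; it outputs a mix of seemingly random HTML and errors, but no table.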

import requests
import pandas as pd
import json
from pandas.io.json import json_normalize
from bs4 import BeautifulSoup

url = 'https://optioncharts.io/options/AAPL/overview/option-statistics'
res = requests.get(url)
soup = BeautifulSoup(res.content, "lxml")

Answer 1 (e0bqpujr):

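The data you see on the page is loaded by JavaScript from a separate URL, so it is not in the HTML that requests receives from the page itself. To load the data into a pandas DataFrame, you can request that URL directly: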

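import pandas as pd

# Endpoint the page's JavaScript calls to fetch the statistics table
api_url = "https://optioncharts.io/async/options_ticker_info?ticker=AAPL"

# read_html returns a list of DataFrames; take the first table, use its
# first column (the labels) as the index, and drop the index name
df = pd.read_html(api_url)[0].set_index(0).rename_axis(index=None)
print(df)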

Prints:

                                        1
Implied Volatility                 17.49%
Historical Volatility              13.18%
IV Percentile                         12%
IV Rank                             9.07%
IV High                42.79% on 01/03/23
IV Low                 15.42% on 12/15/23
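
The values in this frame are strings such as "17.49%". If you need numbers instead, one possible approach is sketched below. The helper name `percent_to_float` is hypothetical, and the sample frame is copied from the output above so that the sketch runs without a network request.

```python
import pandas as pd

# Sample rows copied from the output above, so the sketch runs offline.
df = pd.DataFrame(
    {1: ["17.49%", "13.18%", "12%", "9.07%"]},
    index=["Implied Volatility", "Historical Volatility", "IV Percentile", "IV Rank"],
)


def percent_to_float(text: str) -> float:
    """Convert a string like '17.49%' to a fraction (hypothetical helper)."""
    return float(text.strip().rstrip("%")) / 100


# Look up one cell by row label and column label, then convert it.
iv = percent_to_float(df.loc["Implied Volatility", 1])
print(iv)
```

Rows such as "IV High" also carry a date ("42.79% on 01/03/23"), so they would need to be split on " on " before this helper is applied.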


To get "Implied Volatility":

print(df.loc["Implied Volatility"])


Prints:

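
As for the original idea of selecting the table by its class: that can work once you have HTML that actually contains the table. The sketch below assumes such HTML is already in hand and uses a trimmed copy of the fragment from the question as a stand-in. It selects by a single distinctive class with a CSS selector rather than by the full class string.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Trimmed copy of the fragment from the question. In practice this would be
# HTML that really contains the table, not the JavaScript-driven page shell.
html = """
<table class="table border table-sm optioncharts-table-styling table-light">
  <tbody>
    <tr><th>Implied Volatility</th><td>17.49%</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# A CSS selector matches individual classes, so one distinctive class is
# enough and the order of the remaining classes does not matter.
table = soup.select_one("table.optioncharts-table-styling")

# Build (label, value) pairs from each row's <th> and <td> cells.
rows = [
    (tr.th.get_text(strip=True), tr.td.get_text(strip=True))
    for tr in table.find_all("tr")
]
stats = pd.DataFrame(rows, columns=["label", "value"]).set_index("label")
print(stats.loc["Implied Volatility", "value"])
```

If `select_one` returns `None`, the table is not in the HTML you parsed, which is what happens with the page URL from the question.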