R: Web scraping an earnings calendar

gojuced7, posted on 2023-01-28 in Other
library(httr)  # provides GET()
library(XML)   # provides htmlParse() and readHTMLTable()

url <- "https://finance.yahoo.com/calendar/earnings?from=2022-12-04&to=2022-12-10&day=2022-12-06"

download_table <- function(url) {
  url_file <- GET(url)
  web_page_parsed <- htmlParse(url_file)
  tables <- readHTMLTable(web_page_parsed)
}

url_file <- GET(url)
web_page_parsed <- htmlParse(url_file)
tables <- readHTMLTable(web_page_parsed)
print(head(tables))

I used this on Yahoo and it works well. But then I tried this:

url <- "https://www.benzinga.com/calendars/earnings"

download_table <- function(url) {
  url_file <- GET(url)
  web_page_parsed <- htmlParse(url_file)
  tables <- readHTMLTable(web_page_parsed)
}

url_file <- GET(url)
web_page_parsed <- htmlParse(url_file)
tables <- readHTMLTable(web_page_parsed)
print(head(tables))
tables$`NULL`

As a result I do not get any table with data, only this:

> print(head(tables))
$`NULL`
  Date time ticker Quarter Prior EPS Est EPS Actual EPS EPS Surprise
1 Date time ticker Quarter Prior EPS Est EPS Actual EPS EPS Surprise
  Prior Rev Est Rev Actual Rev Rev Surprise Get Alert
1 Prior Rev Est Rev Actual Rev Rev Surprise Get Alert

$`NULL`
  V1   V2   V3   V4   V5   V6   V7   V8   V9  V10  V11  V12  V13
1                                                               
2    <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>

> tables$`NULL`
  Date time ticker Quarter Prior EPS Est EPS Actual EPS EPS Surprise
1 Date time ticker Quarter Prior EPS Est EPS Actual EPS EPS Surprise
  Prior Rev Est Rev Actual Rev Rev Surprise Get Alert
1 Prior Rev Est Rev Actual Rev Rev Surprise Get Alert
>

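If I search the page source for, e.g., the ticker symbols, I cannot find them. So I cannot use the rvest package to scrape them.
Does anyone know how to deal with Benzinga?
Thank you and kind regards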
Web scraping the Benzinga earnings calendar with rvest and httr

xxe27gdn

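The data is fetched from an API, which you can see in the Network tab of the browser's developer tools (Inspect element).
The link is as follows:
https://api.benzinga.com/api/v2.1/calendar/earnings?token=1c2735820e984715bc4081264135cb90&parameters[date_from]=2023-01-25&parameters[date_to]=2023-01-25&parameters[tickers]=&pagesize=1000
You can then create a function that changes the dates and filters for the tickers of interest ([tickers]). As a suggestion, I wrote an httr2 function here that takes from_date and to_date as inputs.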

library(tidyverse)
library(httr2)

get_earnings <- function(from_date, to_date) {
  str_c(
    "https://api.benzinga.com/api/v2.1/calendar/earnings?token=1c2735820e984715bc4081264135cb90&parameters[date_from]=",
    from_date,
    "&parameters[date_to]=",
    to_date,
    "&parameters[tickers]=&pagesize=1000"
  ) %>%
    request() %>%
    req_headers(accept = "application/json") %>%
    req_perform() %>%
    resp_body_json(simplifyVector = TRUE) %>%
    pluck("earnings") %>%
    as_tibble() %>%
    type_convert()
}

get_earnings(from_date = "2023-01-01", to_date = "2023-01-25")

# A tibble: 387 × 25
   currency date       date_confirmed   eps eps_est eps_prior eps_surprise eps_surprise_per…
   <chr>    <date>              <int> <dbl>   <dbl>     <dbl>        <dbl>             <dbl>
 1 USD      2023-01-25              1  0.91    0.58      0.57         0.33            0.569 
 2 USD      2023-01-25              1 NA       1.27      1.42        NA              NA     
 3 USD      2023-01-25              1  1       0.97      0.92         0.03            0.0309
 4 USD      2023-01-25              1  1.01    1.13      0.95        -0.12           -0.106 
 5 USD      2023-01-25              1  0.69   NA         0.93        NA              NA     
 6 USD      2023-01-25              1  0.12    0.13      0.16        -0.01           -0.0769
 7 USD      2023-01-25              1  1.5     1.43      1.05         0.07            0.049 
 8 USD      2023-01-25              1  1.1     0.98      0.69         0.12            0.122 
 9 USD      2023-01-25              1  0.02    0.01     -0.65         0.01            1     
10 USD      2023-01-25              1  0.42    0.44      0.5         -0.02           -0.0455
# … with 377 more rows, and 17 more variables: eps_type <chr>, exchange <chr>, id <chr>,
#   importance <int>, name <chr>, notes <chr>, period <chr>, period_year <int>,
#   revenue <dbl>, revenue_est <dbl>, revenue_prior <dbl>, revenue_surprise <dbl>,
#   revenue_surprise_percent <dbl>, revenue_type <chr>, ticker <chr>, time <time>,
#   updated <int>
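
The function above only varies the dates, while the answer also mentions filtering by ticker through `parameters[tickers]`. A minimal base R sketch of a URL builder that covers both could look like the following. The names `build_earnings_url`, `tickers`, and `token` are hypothetical, and passing several tickers as one comma-separated value is an assumption that is not confirmed by the post.

```r
# Build the earnings-calendar API URL for a date range and optional tickers.
# `build_earnings_url`, `tickers`, and `token` are hypothetical names for this sketch.
build_earnings_url <- function(from_date, to_date, tickers = character(), token) {
  # as.Date() raises an error if a string is not a valid date
  from_date <- format(as.Date(from_date), "%Y-%m-%d")
  to_date <- format(as.Date(to_date), "%Y-%m-%d")
  stopifnot(from_date <= to_date)

  paste0(
    "https://api.benzinga.com/api/v2.1/calendar/earnings",
    "?token=", token,
    "&parameters[date_from]=", from_date,
    "&parameters[date_to]=", to_date,
    # Assumption: several tickers are sent as one comma-separated value;
    # an empty vector leaves the filter empty, as in the original link
    "&parameters[tickers]=", paste(tickers, collapse = ","),
    "&pagesize=1000"
  )
}

earnings_url <- build_earnings_url(
  "2023-01-01", "2023-01-25",
  tickers = c("AAPL", "MSFT"),
  token = "YOUR_TOKEN"  # placeholder, supply your own token
)
```

The resulting string can be passed to `request()` in place of the `str_c()` call, with the rest of the pipeline unchanged.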
