r - Scraping historical stock prices with rvest does not work



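I am trying to extract historical stock price information from a web page. However, rvest throws an XML error.

I am new to this. Can anyone help me understand how to get it working?

Here is my R script using rvest: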

share_url <- "https://www.moneycontrol.com/stocks/hist_stock_result.php?ex=B&sc_id=ITC&mycomp=ITC"
share.data <- read_html(share_url)
share_css <- "#mc_mainWrapper > div.PA10 > div.FL > div.PT15 > div.MT12 > table > tbody"
share_table <- share.data %>% read_html() %>% html_nodes("table") %>% html_table()

Error:

Error in UseMethod("read_xml") : no applicable method for 'read_xml' applied to an object of class "c('xml_document', 'xml_node')"
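
The error most likely comes from the pipeline on the last line of the script: share.data is already a parsed document returned by read_html(), so piping it into read_html() a second time fails, because read_xml has no method for xml_document objects. Below is a minimal offline sketch of the single-parse pattern. The HTML string and its values are made-up placeholders standing in for the downloaded page, not real prices.

```r
library(rvest)

# Hypothetical inline HTML used instead of the real page (placeholder values)
html <- "<table>
  <tr><th>Date</th><th>Close</th></tr>
  <tr><td>2020-01-01</td><td>100.5</td></tr>
  <tr><td>2020-01-02</td><td>101.25</td></tr>
</table>"

# Parse the markup exactly once
doc <- read_html(html)

# Work on the parsed document directly; no second read_html() call
tables <- doc %>% html_nodes("table") %>% html_table()
prices <- tables[[1]]
print(prices)
```

This only addresses the read_xml error; it does not by itself guarantee that the page served at that URL contains the date range you want.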

When you change the date on the page, it makes the following request:

POST https://www.moneycontrol.com/stocks/hist_stock_result.php

The body is form-URL-encoded and carries the following data (the from and to dates):

frm_dy = "01",
frm_mth = "01",
frm_yr = "2020",
to_dy = "01",
to_mth = "01",
to_yr = "2021",
hdn = "daily"

The following code uses httr and rvest to fetch the table data:

library(rvest)
library(httr)

# POST the form: the query string selects the stock, the body sets the date range
resp <- POST(
  "https://www.moneycontrol.com/stocks/hist_stock_result.php",
  query = list(
    ex = "B",
    sc_id = "ITC",
    mycomp = "ITC"
  ),
  body = list(
    frm_dy = "01",
    frm_mth = "01",
    frm_yr = "2020",
    to_dy = "01",
    to_mth = "01",
    to_yr = "2021",
    hdn = "daily"
  ),
  encode = "form"
)

# Parse the HTML response and take the third table on the page
price_table <- html_table(content(resp), fill = TRUE)[[3]]
print(price_table)
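
To request a different date range without editing each field by hand, the form body could be generated from two dates. The sketch below assumes the site expects zero-padded day and month strings, as in the values shown above. build_date_body is a hypothetical helper written for illustration, not a function from httr or rvest.

```r
# Hypothetical helper (not part of httr or rvest): builds the form fields
# for the POST body from a start date and an end date.
build_date_body <- function(from, to, frequency = "daily") {
  from <- as.Date(from)
  to <- as.Date(to)
  stopifnot(!is.na(from), !is.na(to), from <= to)
  list(
    frm_dy  = format(from, "%d"),  # zero-padded day, e.g. "01"
    frm_mth = format(from, "%m"),  # zero-padded month, e.g. "01"
    frm_yr  = format(from, "%Y"),  # four-digit year
    to_dy   = format(to, "%d"),
    to_mth  = format(to, "%m"),
    to_yr   = format(to, "%Y"),
    hdn     = frequency
  )
}

body <- build_date_body("2020-01-01", "2021-01-01")
str(body)
```

The resulting list can be passed as the body argument of POST() together with encode = "form".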
