Extracting dynamic content with web scraping



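I know that XMLHTTP only retrieves the initial page source and does not execute any dynamic updates. I don't want to automate IE because it is too slow.

My code is attached below. I want to extract this stock's trading volume on both the BSE and the NSE, but the NSE volume only appears after clicking "View NSE". Extracting the NSE volume fails with this error:

"Object variable not set"

Please help me find a solution. I am new to XHR, JSON, and so on.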

Sub PV_Extract()
    Dim wpage As New MSXML2.ServerXMLHTTP60
    Dim hdoc As New HTMLDocument

    URL = "https://money.rediff.com/companies/Asian-Paints-Ltd/11580001"
    wpage.Open "GET", URL, False
    wpage.send
    While wpage.readyState <> 4
        DoEvents
    Wend
    Set hdoc = New HTMLDocument
    hdoc.body.innerHTML = wpage.responseText
    Set today_tab_bse = hdoc.getElementsByTagName("table")(0).getElementsByTagName("tr")(1)
    Set today_tab_nse = hdoc.getElementsByTagName("table")(1).getElementsByTagName("tr")(1)
    vol_1 = today_tab_bse.getElementsByTagName("td")(0).innerText
    vol_2 = today_tab_nse.getElementsByTagName("td")(0).innerText
End Sub

Always declare all of your variables. You can also grab elements by their ID.

Sub PV_Extract()
    ' Needs references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library".
    Dim wpage As New MSXML2.ServerXMLHTTP60
    Dim url As String
    Dim hdoc As New HTMLDocument
    Dim today_tab_bse As Object
    Dim today_tab_nse As Object
    Dim vol_1 As String
    Dim vol_2 As String

    url = "https://money.rediff.com/companies/Asian-Paints-Ltd/11580001"
    ' The third argument (False) makes the request synchronous,
    ' so no readyState polling loop is needed after send.
    wpage.Open "GET", url, False
    wpage.send

    ' Load the static HTML into a document that can be queried.
    Set hdoc = New HTMLDocument
    hdoc.body.innerHTML = wpage.responseText

    ' Look up each exchange's container by ID rather than by table position.
    Set today_tab_bse = hdoc.getElementByID("for_BSE")
    Set today_tab_nse = hdoc.getElementByID("for_NSE")
    vol_1 = today_tab_bse.getElementsByTagName("td")(0).innerText
    vol_2 = today_tab_nse.getElementsByTagName("td")(0).innerText
    MsgBox "BSE: " & vol_1 & Chr(13) & "NSE: " & vol_2
End Sub
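
On the advice about declaring variables: one way to enforce it is to put Option Explicit at the top of the module, so a misspelled or undeclared name is caught at compile time. A minimal sketch follows; the function name DeclaredUrl is hypothetical and exists only for illustration.

```vba
Option Explicit   ' must appear before any procedure in the module

' Hypothetical demo: every name used below is declared with Dim.
Function DeclaredUrl() As String
    Dim url As String
    url = "https://money.rediff.com/companies/Asian-Paints-Ltd/11580001"
    ' Misspelling the variable here (for example "ur1 = ...") would now stop
    ' compilation with "Variable not defined" instead of silently creating
    ' a new Variant.
    DeclaredUrl = url
End Function
```

Without Option Explicit, the code in the question compiles even though URL, today_tab_bse, today_tab_nse, vol_1 and vol_2 are never declared.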

Edit: getting the values for both BSE and NSE

Sub PV_Extract()
    ' Needs references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library".
    Dim wpage As New MSXML2.ServerXMLHTTP60
    Dim url As String
    Dim hdoc As New HTMLDocument
    Dim vol_1 As String
    Dim vol_2 As String

    url = "https://money.rediff.com/companies/Asian-Paints-Ltd/11580001"
    ' Synchronous GET: send returns once the full response has arrived.
    wpage.Open "GET", url, False
    wpage.send

    Set hdoc = New HTMLDocument
    hdoc.body.innerHTML = wpage.responseText

    ' Read the BSE and NSE values directly from their element IDs.
    vol_1 = hdoc.getElementByID("ltpid").innerText
    vol_2 = hdoc.getElementByID("ltpid_nse").innerText
    MsgBox "BSE: " & vol_1 & Chr(13) & "NSE: " & vol_2
End Sub
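
If one of these IDs is missing from the static HTML, getElementById returns Nothing, and calling .innerText on it raises the same "Object variable not set" error reported in the question. Below is a minimal sketch of a guard, assuming the same Microsoft HTML Object Library reference as above; the helper name TextById is hypothetical.

```vba
' Hypothetical helper: returns the element's text, or an empty string
' when no element with the given ID exists in the document.
Function TextById(ByVal doc As HTMLDocument, ByVal elementId As String) As String
    Dim el As Object
    Set el = doc.getElementById(elementId)
    If el Is Nothing Then
        ' Avoids run-time error 91 when the ID is absent.
        TextById = vbNullString
    Else
        TextById = el.innerText
    End If
End Function
```

With this in place, vol_2 = TextById(hdoc, "ltpid_nse") would yield an empty string instead of stopping the macro when the NSE value is not in the downloaded source.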
