使用宏从excel网页上的超链接导入数据



我想从网站导入一些数据https://www.amfiindia.com/nav-history-download.在这个页面上,有一个链接"下载文本格式的完整NAV报告",它会给我所需的数据。但是这个链接不是静态的,所以我不能在VBA中直接使用它来下载我的数据。那么,如何使用excel从网页上的超链接下载数据呢?

我的方法是首先在一个变量中获取超链接,然后使用该变量来获取数据?

  1. 首先,使用getElementsByTagName函数获取超链接,如下所示
  2. 然后使用它作为URL来获取数据
  3. 但我得到了类型不匹配的错误,同时等同于一个字符串的网站和我的超链接

我不知道href的类型。尝试在显示Variant的监视窗口中查看,尝试了那个仍然错误。请帮我做这件事。



Sub webscraping()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
Dim cellAddress As String
Dim rowNumber As Long
Dim ie As InternetExplorer
Dim ht As HTMLDocument
Dim hr As MSHTML.IHTMLElement
'Dim Hra As MSHTML.IHTMLElement

Set ie = New InternetExplorer
ie.Visible = True
ie.Navigate ("https://www.amfiindia.com/nav-history-download")
Do Until ie.ReadyState >= 4
DoEvents
Loop
Set ht = ie.Document
'MsgBox ht.getElementById("navhistorydownload")
Set hr = ht.getElementsByTagName("a")(18).href

' Website to go to.
website = StrConv(hr, vbUnicode)
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there - probably don't need to change this.
request.Open "GET", website, False
' Get fresh data.
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)
' Put the webpage into an html object to make data references easier.
html.body.innerHTML = response
' Get the price from the specified element on the page.
'price = html.getElementstagName("a").Item(0).innerText
cellAddress = Range("A" & Rows.Count).End(xlUp).Address
rowNumber = Range(cellAddress).Row
ThisWorkbook.Sheets(1).Cells(rowNumber + 1, 1) = response
' MsgBox rowNumber
' MsgBox cellAddress
' Output the price into a message box.
'MsgBox price
End Sub

如果您不知道类型,则可以使用

?typename(ht.getElementsByTagName("a")(18).href) 

在直接窗口中。

它应该是一个字符串并声明为字符串。

我会通过css选择器获取,而不是索引到锚集合中

ht.querySelector(".nav-hist-dwnld a").href

这指定了类名为nav-hist-dwnld的父节点,然后请求第一个子a标记。

这不需要website = StrConv(hr, vbUnicode)。直接使用提取的href

最新更新