在需要对象交互的站点上使用XML HTTP请求



我正在做一个项目,从一些网站上抓取信息。我有一些网站工作没有问题,主要是通过修改URL来通过相关标准或通过发布AJAX请求来处理它们。我是相当新的,所以我正在寻求一些帮助。

我遇到了一个网站,我需要与页面上的对象进行交互,以获得进一步的信息。下面的站点就是一个例子:

示例站点

如果你访问网站,去底部有更多的品牌,点击"查看"将显示更多的产品。它们的HTML仅在单击后返回。

对于我从其他网站获取的信息,我使用了以下方法。是否有一种方法通过XML HTTP方法处理页面后,一个页面对象的行动已经执行?

任何帮助都将非常感激。目前,我认为我将不得不坚持使用Internet Explorer对象来抓取这些网站。

Option Explicit
Public Sub sbKF()
Dim conn As ADODB.Connection
Dim rsIn As ADODB.Recordset
Dim HTMLDoc As HTMLDocument
Dim strUrl As String
Dim strPost As String
Set conn = CurrentProject.Connection
Set rsIn = New ADODB.Recordset
Set HTMLDoc = New MSHTML.HTMLDocument
rsIn.Open pcstrInput, conn, adOpenStatic, adLockReadOnly
rsIn.MoveLast: rsIn.MoveFirst
Do While Not rsIn.EOF
    ' Create the URL and Post submission for input size.
    strUrl = "http://www.[Site].com"
    strPost = "Stage=2&sop=TyreSize&ssq=1&vnp=&vmk=&vch=&vmo=&drd="
    ' Return the Document body results
    HTMLDoc.body.innerHTML = fnPostXmlHttp(strUrl, strPost)
    rsIn.MoveNext
Loop
End Sub
Public Function fnPostXmlHttp(ByVal strUrl As String, ByVal strScript As String)
Dim XMLHttpRequest As Object
Dim strOut As String
Set XMLHttpRequest = CreateObject("MSXML2.XMLHTTP")
XMLHttpRequest.Open "POST", strUrl, False
XMLHttpRequest.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
XMLHttpRequest.send (strScript)
While XMLHttpRequest.ReadyState <> 4
    DoEvents
Wend
fnPostXmlHttp = XMLHttpRequest.responseText
End Function

如果您查看www.blackcircles.com HTML响应,您将看到javascript片段:

...
var newTyresActionUrl;
var lookupAddress;
$(document).ready(function () {
    newTyresActionUrl = new BC.classes.productV6SearchPage('https://www.blackcircles.com/order/tyres',
        {"Error":false,"VariantFitments":[{"Name":"All Season","VariantType":11,"SeasonalType":true,"TruckType":false,"FriendlyName":"allseason","Count":17,
        ...
        "TakeoverCss":"u003clink id="brandtakeover-css" rel=u0027stylesheetu0027 type=u0027text/cssu0027 href=u0027/templates/bcstyles/css/goodyear-effgrip-perf.cssu0027u003e"},
        "Width", 
        "Profile",
        "Rim",
        "Speed",
        "Method",
        true,
        ""
    );
    addToBasket = new BC.classes.addtobasket('https://www.blackcircles.com/order/tyres', "order", '/truck/garages');
    ...

实际上,花括号内的部分表示一个JSON对象,其中包含网页上显示的所有数据。因此,您可以通过Instr()从HTML内容中提取JSON字符串,解析它,转换为数组并输出到工作表,如下面的示例代码所示。进口JSON。

Option Explicit
Sub Test_blackcircles()
    Dim sResp As String
    Dim vJSON As Variant
    Dim sState As String
    Dim i As Long
    Dim vItem
    Dim aData()
    Dim aHeader()
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.blackcircles.com/order/tyres/search?width=205&profile=55&rim=R16&speed=V&vehicle-make=&postcode=&delivery=1&findTyre=", False
        .send
        sResp = .responseText
    End With
    sResp = getFragment(sResp, "new BC.classes.productV6SearchPage", "new BC.classes.addtobasket")
    sResp = getFragment(sResp, "{", "}")
    sResp = "{" & sResp & "}"
    JSON.Parse sResp, vJSON, sState
    i = 1
    With Sheets(1)
        .Cells.Delete
        .Cells.WrapText = False
        For Each vItem In Array( _
                "Manufacturers", _
                "CarManufacturers", _
                "All", _
                "Deals", _
                "Best", _
                "Rest", _
                "SearchParams" _
                )
            .Cells(i, 1).Value = vItem
            JSON.ToArray vJSON(vItem), aData, aHeader
            OutputArray .Cells(i + 2, 1), aHeader
            Output2DArray .Cells(i + 3, 1), aData
            .Columns.AutoFit
            i = i + UBound(aData, 1) + 5
        Next
    End With
End Sub
Sub OutputArray(oDstRng As Range, aCells As Variant)
    With oDstRng
        .Parent.Select
        With .Resize(1, UBound(aCells) - LBound(aCells) + 1)
            .NumberFormat = "@"
            .Value = aCells
        End With
    End With
End Sub
Sub Output2DArray(oDstRng As Range, aCells As Variant)
    With oDstRng
        .Parent.Select
        With .Resize( _
                UBound(aCells, 1) - LBound(aCells, 1) + 1, _
                UBound(aCells, 2) - LBound(aCells, 2) + 1)
            .NumberFormat = "@"
            .Value = aCells
        End With
    End With
End Sub
Function getFragment( _
    sourceText As String, _
    startPattern As String, _
    endPattern As String _
) As String
    Dim startPos
    startPos = InStr(sourceText, startPattern)
    If startPos = 0 Then Exit Function
    Dim partText
    partText = Mid(sourceText, startPos + Len(startPattern))
    Dim endPos
    endPos = InStrRev(partText, endPattern)
    If endPos = 0 Then Exit Function
    getFragment = Left(partText, endPos - 1)
End Function

顺便说一下,类似的方法也应用在其他答案中。

最新更新