我想通过代码获取网站的内部文本。
我已经可以使用下面的代码获取它的内部 html,但我找不到任何在没有网络浏览器的情况下获取 URL 内部文本的代码。
这段代码在网络浏览器中从网站获取文本,但我需要同样的东西,只是没有网络浏览器。
Dim sourceString As String = WebBrowser1.Document.Body.InnerText
with HtmlAgilityPack...
Private Sub ToolStripButton1_Click(sender As Object, e As EventArgs) Handles ToolStripButton1.Click
Dim doc As HtmlAgilityPack.HtmlDocument = New HtmlAgilityPack.HtmlDocument
With New Net.WebClient
doc.LoadHtml(.DownloadString("https://example.com"))
.Dispose()
End With
Debug.Print(doc.DocumentNode.Name)
PrintChildNodes(doc.DocumentNode)
Debug.Print(doc.DocumentNode.Element("html").Element("body").InnerText)
End Sub
Sub PrintChildNodes(Node As HtmlAgilityPack.HtmlNode, Optional Indent As Integer = 1)
For Each Child As HtmlAgilityPack.HtmlNode In Node.ChildNodes
Debug.Print("{0}{1}", String.Empty.PadLeft(Indent, vbTab), Child.Name)
PrintChildNodes(Child, Indent + 1)
Next
End Sub
**取自 **沃尔夫威德
在这个问题中,HTTP GET VB.NET
Try
Dim fr As System.Net.HttpWebRequest
Dim targetURI As New Uri("http://whatever.you.want.to.get/file.html")
fr = DirectCast(HttpWebRequest.Create(targetURI), System.Net.HttpWebRequest)
If (fr.GetResponse().ContentLength > 0) Then
Dim str As New System.IO.StreamReader(fr.GetResponse().GetResponseStream())
Response.Write(str.ReadToEnd())
str.Close();
End If
Catch ex As System.Net.WebException '访问资源时出错,请处理它结束尝试
您将获得 HTML 以及 http 标头。不要认为这本身适用于https
.