VB.NET HttpWebRequest: getting the HTML behind Google redirect links


Imports System.Net
Imports System.IO
Public Class Form1
    ' Downloads the page at the given URL and returns its HTML,
    ' or Nothing if the request fails.
    Public Function GetHTML(ByVal url As Uri) As String
        Dim HTML As String = Nothing
        Try
            Dim Request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
            ' Using blocks dispose the response and reader even if reading throws.
            Using Response As HttpWebResponse = DirectCast(Request.GetResponse(), HttpWebResponse)
                Using Reader As New StreamReader(Response.GetResponseStream())
                    HTML = Reader.ReadToEnd()
                End Using
            End Using
        Catch ex As Exception
            HTML = Nothing
        End Try
        Return HTML
    End Function
    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
        Dim url As New Uri(TextBox1.Text)
        TextBox2.Text = GetHTML(url)
    End Sub
End Class

The above is my code for getting the HTML of a web page. I run into a problem when I enter something like this: http://www.google.com.sg/url?sa=t&rct=j&q=vb.net%20convert%20string%20to%20uri&source=web&cd=1&ved=0CFcQFjAA&url=http%3A%2F%2Fwww.vbforums.com%2Fshowthread.php%3Fp%3D3434187&ei=R0fxT872Cs2HrAesq4m-DQ&usg=AFQjCNGGedjegaM8osT689qWhbqpf6NI7Q

It gives me

   <script>window.googleJavaScriptRedirect=1</script>
    <script>
    var f={};
    f.navigateTo=function(b,a,g){
      if(b!=a&&b.google)
      {
        if(b.google.r)
         {
           b.google.r=0;
           b.location.href=g;
           a.location.replace("about:blank");
         }
      }
      else
      {
        a.location.replace(g);
      }
    };
    f.navigateTo(window.parent,window,"http://www.vbforums.com/showthread.php?p\x3d3434187");
    </script>
    <noscript>
    <META http-equiv="refresh" content="0;URL='http://www.vbforums.com/showthread.php?p=3434187'">
    </noscript>

instead of the HTML of http://www.vbforums.com/showthread.php?p=3434187

How can I make my code follow the redirect and get the HTML?

Extract the URL from the meta tag, then issue a new request. For scraping, I recommend HtmlAgilityPack, which you can download from http://html-agility-pack.net/ or install with NuGet.
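
As a rough illustration of that approach without any extra library, here is a minimal sketch that pulls the target out of the `<META http-equiv="refresh" ...>` tag with a regular expression and requests it again. The names `RedirectHelper`, `ExtractMetaRefreshUrl`, `GetHtmlFollowingMetaRefresh` and the `maxHops` parameter are made up for this example, and the pattern is only an assumption that fits markup shaped like the Google response shown in the question; an HTML parser such as HtmlAgilityPack would be more robust on real pages. `WebUtility.HtmlDecode` assumes .NET Framework 4.0 or later.

```vb
Imports System.IO
Imports System.Net
Imports System.Text.RegularExpressions

Module RedirectHelper
    ' Hypothetical helper: returns the target URL of a meta refresh tag,
    ' or Nothing when the HTML contains no such tag.
    Public Function ExtractMetaRefreshUrl(ByVal html As String) As String
        If String.IsNullOrEmpty(html) Then Return Nothing
        ' Assumes http-equiv comes before content and the content value is
        ' double-quoted, as in the response shown in the question.
        Dim pattern As String = "<meta[^>]*http-equiv\s*=\s*[""']?refresh[""']?[^>]*content\s*=\s*""[^"">]*?url\s*=\s*'?([^'"">]+)"
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase)
        If Not m.Success Then Return Nothing
        ' Attribute values may carry entities such as &amp;
        Return WebUtility.HtmlDecode(m.Groups(1).Value.Trim())
    End Function

    ' Hypothetical helper: downloads url and follows meta refresh redirects,
    ' giving up after maxHops extra requests to avoid redirect loops.
    ' Exceptions are left to the caller, and url is assumed to be http(s).
    Public Function GetHtmlFollowingMetaRefresh(ByVal url As Uri, Optional ByVal maxHops As Integer = 3) As String
        Dim html As String = Nothing
        For hop As Integer = 0 To maxHops
            Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
            Using response As WebResponse = request.GetResponse(),
                  reader As New StreamReader(response.GetResponseStream())
                html = reader.ReadToEnd()
            End Using
            Dim target As String = ExtractMetaRefreshUrl(html)
            If target Is Nothing Then Exit For
            ' Resolve relative targets against the page that was just fetched.
            url = New Uri(url, target)
        Next
        Return html
    End Function
End Module
```

In the form from the question, this would be called as `TextBox2.Text = GetHtmlFollowingMetaRefresh(New Uri(TextBox1.Text))`. `HttpWebRequest` already follows HTTP 3xx redirects by default, but it cannot see redirects done in script or meta tags, which is why the body has to be inspected here.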
