Reading and displaying French characters in a WebBrowser control (C#)



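I am using C# (.NET 2.0) in Visual Studio Community. I have a problem reading data from a web page loaded in a WebBrowser control and then displaying it in the WebBrowser control. The web page uses the "iso-8859-1" character set. I tried using "Document.Encoding", as you can see in the commented-out lines of the code below, but it keeps showing black diamonds with white question marks instead of the French characters. I know this program serves no purpose; it is just an example to show you the problem. What am I doing wrong? I have spent three days looking for a solution.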

private void button1_Click(object sender, EventArgs e)
{
        string myquery = "";
        string html = "";
        string results = "";
        int start;
        int end;
        //string encodage = "";
        webBrowser1.ScriptErrorsSuppressed = true;
        myquery = "http://citoyens.soquij.qc.ca/index.php";
        webBrowser1.Navigate(myquery);
        do {
            Application.DoEvents();
        } while (webBrowser1.ReadyState != WebBrowserReadyState.Complete);
        //encodage = webBrowser1.Document.Encoding;
        //MessageBox.Show(encodage);
        html = webBrowser1.DocumentText;
        start = html.IndexOf("a class=\"petit_logo\"");
        start = html.IndexOf(">", start);
        end = html.IndexOf("<", start);
        results += html.Substring(start, end - start);
        //webBrowser1.Document.Encoding = encodage;
        webBrowser1.DocumentText = results;
        //MessageBox.Show(webBrowser1.Document.Encoding);
}

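The problem here is that the document stream is not being decoded with the required encoding. After reading the document with the required encoding, you may have to use Document.Write, as shown below.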

 WebBrowser webBrowser1 = new WebBrowser();
        string myquery = "";
        string html = "";
        string results = "";
        int start;
        int end;
        //string encodage = "";
        webBrowser1.ScriptErrorsSuppressed = true;
        myquery = "http://citoyens.soquij.qc.ca/index.php";
        webBrowser1.Navigate(myquery);
        do
        {
            Application.DoEvents();
        } while (webBrowser1.ReadyState != WebBrowserReadyState.Complete);
        // For UTF-8 encoding. 
        //StreamReader sr = new StreamReader(webBrowser1.DocumentStream, Encoding.GetEncoding("UTF-8"));
        //string source = sr.ReadToEnd();
        //string encodage = webBrowser1.Document.Encoding;
        //MessageBox.Show(encodage);

        // Similarly, for "iso-8859-1" character set encoding.
        using (StreamReader sr = new StreamReader(webBrowser1.DocumentStream, Encoding.GetEncoding(webBrowser1.Document.Encoding)))
        {
            string source = sr.ReadToEnd();
            webBrowser1.Document.Write(source);
        }
        html = webBrowser1.DocumentText;
        start = html.IndexOf("a class=\"petit_logo\"");
        start = html.IndexOf(">", start);
        end = html.IndexOf("<", start);
        results += html.Substring(start, end - start);
        //webBrowser1.Document.Encoding = encodage;
        webBrowser1.DocumentText = results; // Use the Document.Write(text) method.
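
To illustrate why the black diamonds appear, here is a minimal sketch that works outside the WebBrowser control. It assumes the diamonds are the Unicode replacement character (U+FFFD), which a decoder emits when "iso-8859-1" bytes are read as UTF-8. The class name EncodingDemo, the helper Decode, and the sample text are made up for this example and are not part of the original code.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Reads the whole stream as text using the named character set.
    // (Hypothetical helper, mirroring the StreamReader call used above.)
    public static string Decode(Stream stream, string charsetName)
    {
        using (StreamReader reader = new StreamReader(stream, Encoding.GetEncoding(charsetName)))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Sample French text, written with escapes so the source file's own
        // encoding does not matter: "Société québécoise".
        string original = "Soci\u00E9t\u00E9 qu\u00E9b\u00E9coise";

        // In ISO-8859-1 each accented letter is a single byte (0xE9 for 'é').
        byte[] latin1Bytes = Encoding.GetEncoding("iso-8859-1").GetBytes(original);

        // Decoding those bytes as UTF-8 treats 0xE9 as the start of an invalid
        // multi-byte sequence, so the decoder substitutes U+FFFD.
        string wrong = Decode(new MemoryStream(latin1Bytes), "utf-8");

        // Decoding with the character set the page declares recovers the text.
        string right = Decode(new MemoryStream(latin1Bytes), "iso-8859-1");

        Console.WriteLine("Contains U+FFFD when read as UTF-8: " + (wrong.IndexOf('\uFFFD') >= 0));
        Console.WriteLine("Round-trips when read as iso-8859-1: " + (right == original));
    }
}
```

The point of the sketch is only that the encoding passed to the StreamReader has to match the bytes actually served by the page. Whether webBrowser1.Document.Encoding reports that value correctly for a given site is something to check with the commented-out MessageBox call.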
