Convert to UTF-8



I am using the following method to read a TXT file from an HTTP server.

public static String getHtmlFromUrl(String strUrl, String referer, boolean isMobile) {
    URL url = null;
    BufferedReader reader = null;
    StringBuilder sb = null;
    String returnValue = "";
    try {
        url = new URL(strUrl);
        URLConnection con = url.openConnection();
        // make the request look like it comes from a specific browser
        con.setRequestProperty("User-Agent", userAgent);
        if(isMobile)
            con.setRequestProperty("User-Agent", userAgentMobile);
        con.setRequestProperty("Referer", referer);
        con.setReadTimeout(15000);
        con.connect();
        reader = new BufferedReader(new InputStreamReader(con.getInputStream()));
        sb = new StringBuilder();
        String line = null;
        while((line = reader.readLine()) != null) {
            sb.append(line).append("\n");
        }
        returnValue = sb.toString();
    } catch(Exception e) {
        e.printStackTrace();
    } finally {
        if(reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return returnValue;
}

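I do not have direct access to this file, so I cannot change how it is stored. If I open the URL in a browser, it displays correctly with ISO-8859 or Windows-1252 encoding.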

Android seems to interpret it as UTF-8 by default. So I need a way to convert returnValue (or the StringBuilder sb) from its existing ISO-8859 encoding to UTF-8.

How can I do this?

You have to change this line:

reader = new BufferedReader(new InputStreamReader(con.getInputStream()));

to:

reader = new BufferedReader(new InputStreamReader(con.getInputStream(), "ISO-8859-1"));

Or, since Java 7:

reader = new BufferedReader(new InputStreamReader(con.getInputStream(), StandardCharsets.ISO_8859_1));

Update: changed the charset to ISO-8859-1 instead of UTF-8.
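
To see why the charset argument matters, here is a small self-contained sketch that needs no network access. The class name CharsetDecodeDemo, the helper readAll, and the sample text are made up for illustration; an in-memory ByteArrayInputStream stands in for con.getInputStream(). It decodes the same ISO-8859-1 bytes twice, once with the right charset and once as UTF-8:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDecodeDemo {

    // Hypothetical helper: reads all lines from the stream,
    // decoding the bytes with the charset passed in.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "Grüße" encoded as ISO-8859-1: ü is 0xFC, ß is 0xDF.
        // This stands in for the bytes the server would send.
        byte[] latin1Bytes = {'G', 'r', (byte) 0xFC, (byte) 0xDF, 'e'};

        // Decoded with the charset the bytes were written in.
        String correct = readAll(new ByteArrayInputStream(latin1Bytes), StandardCharsets.ISO_8859_1);

        // Decoded as UTF-8: 0xFC and 0xDF are not valid UTF-8 here,
        // so the decoder substitutes U+FFFD replacement characters.
        String wrong = readAll(new ByteArrayInputStream(latin1Bytes), StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1: " + correct.trim());
        System.out.println("UTF-8:      " + wrong.trim());
    }
}
```

Once the bytes are decoded with the right charset, the resulting String needs no further conversion; Java strings are Unicode internally. A target encoding such as UTF-8 only comes into play when the text is written back out as bytes.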
