I am using the following method to read a TXT file from an HTTP server.
public static String getHtmlFromUrl(String strUrl, String referer, boolean isMobile) {
    URL url = null;
    BufferedReader reader = null;
    StringBuilder sb = null;
    String returnValue = "";
    try {
        url = new URL(strUrl);
        URLConnection con = url.openConnection();
        // present a specific browser's User-Agent to the server
        con.setRequestProperty("User-Agent", userAgent);
        if (isMobile)
            con.setRequestProperty("User-Agent", userAgentMobile);
        con.setRequestProperty("Referer", referer);
        con.setReadTimeout(15000);
        con.connect();
        reader = new BufferedReader(new InputStreamReader(con.getInputStream()));
        sb = new StringBuilder();
        String line = null;
        while ((line = reader.readLine()) != null) {
            sb.append(line).append("\n");
        }
        returnValue = sb.toString();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return returnValue;
}
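I have no direct access to this file, so I cannot change how it is served. If I open the URL in a browser, it is displayed correctly with ISO-8859 or Windows-1252 encoding.
Android seems to interpret it as UTF-8 by default. So I need a way to convert returnValue or the StringBuilder sb from the existing ISO-8859 encoding to UTF-8.
How can I do that?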
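You have to change this line:
reader = new BufferedReader(new InputStreamReader(con.getInputStream()));
to:
reader = new BufferedReader(new InputStreamReader(con.getInputStream(), "ISO-8859-1"));
or, since Java 7:
reader = new BufferedReader(new InputStreamReader(con.getInputStream(), StandardCharsets.ISO_8859_1));
The bytes are decoded into characters when the InputStreamReader reads them, so the charset has to be given there. Once the text is a Java String it no longer needs converting.
Update: changed to ISO_8859_1 instead of UTF-8.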
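To see the effect in isolation, here is a minimal sketch that decodes the same ISO-8859-1 bytes once with the right charset and once as UTF-8. The class name CharsetDecodeSketch and the helper readAll are made up for this illustration and are not part of the question's code. An in-memory byte array stands in for the HTTP response, and the sketch assumes Java 7 or later for StandardCharsets and try-with-resources.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetDecodeSketch {

    // Hypothetical helper: reads the whole stream as text, decoding it with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader and the wrapped stream
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // The three characters U+00E4, U+00F6, U+00FC encoded as ISO-8859-1 (one byte each)
        byte[] latin1 = {(byte) 0xE4, (byte) 0xF6, (byte) 0xFC};

        // Decoded with the charset the server actually used
        String right = readAll(new ByteArrayInputStream(latin1), StandardCharsets.ISO_8859_1);
        // Decoded as UTF-8: the bytes are not valid UTF-8, so the decoder substitutes U+FFFD
        String wrong = readAll(new ByteArrayInputStream(latin1), StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 decode intact: " + right.equals("\u00e4\u00f6\u00fc\n"));
        System.out.println("UTF-8 decode has U+FFFD:  " + wrong.contains("\uFFFD"));
    }
}
```

In the question's method the same idea applies by passing con.getInputStream() and the charset to the InputStreamReader. If the server also serves Windows-1252 characters that ISO-8859-1 lacks (such as the euro sign), Charset.forName("windows-1252") may be the better choice, but that depends on the actual file.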