Parsing a web page with Java

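I want to parse the live rates from this web page, http://www.truefx.com/, in my Java program. In other words, I want the data on the page, which refreshes every second, to stream continuously into my program.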

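If possible, I would like to do this with the standard Java libraries. I know about add-ons like jsoup, and there are probably others, but I don't want to download and install them: the hard drive of the computer I'm using is located in California, and apart from a few core programs (Eclipse is one of them), everything else is wiped when the system restarts each night.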

So if anyone knows of a package in the standard Eclipse download that can do this, please let me know! Thanks.

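OK, so I got it working, but it seems slow. The data on the page changes every second, and I re-fetch the page every second as well (I call Thread.sleep(1000) and then get a fresh copy of the page), yet the value I read only seems to update about once a minute?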

Here is my code (I used what you posted in your answer as my URL reader):

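    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class Reading {

        // Downloads the page at urlString and returns its body as one string.
        public String getPage(String urlString) {
            StringBuilder result = new StringBuilder();
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(new URL(urlString).openStream()))) {
                String str;
                while ((str = in.readLine()) != null) {
                    // str is one line of text; readLine() strips the newline character(s)
                    result.append(str);
                }
            } catch (IOException e) {
                // MalformedURLException is a subclass of IOException, so this covers both
                System.err.println("Could not read " + urlString + ": " + e);
            }
            return result.toString();
        }

        public static void main(String[] args) throws InterruptedException {
            int i = 0;
            Reading r = new Reading();
            while (true) {
                Thread.sleep(1000);
                String page = r.getPage("http://www.fxstreet.com/rates-charts/forex-rates/");
                int index = page.indexOf("last_3212166");
                //System.out.println(i+page);
                i++;
                // Guard against a missing marker or a truncated page before slicing
                if (index >= 0 && index + 20 <= page.length()) {
                    System.out.println(i + "GBP/USD: " + page.substring(index + 14, index + 20));
                } else {
                    System.out.println(i + "GBP/USD: marker not found");
                }
            }
        }
    }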

You don't need any external API. With just java.net.URL (plus the java.io reader classes), you can fetch the page with this function:

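import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

// Place this method inside any class.
public static String getPage(String urlString) {
    StringBuilder result = new StringBuilder();
    // Open the page; try-with-resources closes the reader automatically
    try (BufferedReader in = new BufferedReader(
            new InputStreamReader(new URL(urlString).openStream()))) {
        String str;
        // Read all the text returned by the server
        while ((str = in.readLine()) != null) {
            // str is one line of text; readLine() strips the newline character(s)
            result.append(str);
        }
    } catch (IOException e) {
        // Also covers MalformedURLException; report instead of failing silently
        System.err.println("Could not read " + urlString + ": " + e);
    }
    return result.toString();
}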

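Then use java.util.regex to match the data you want to pull out of the page and parse it into tags. Don't forget to put all of this in a thread, with a while(true) loop and sleep(some_time), to get second-by-second information.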
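
To make the regex-plus-thread idea concrete, here is a rough sketch using only the standard library. It assumes the rate sits in an element whose id is last_3212166, followed directly by the number (for example <span id="last_3212166">1.5432</span>). That markup is a guess based on the offsets used in the question's code, so check the real page source and adjust the pattern. The class name RatePoller and the method extractRate are made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RatePoller {
    // ASSUMPTION: the page contains id="last_3212166" ...>NUMBER.
    // Group 1 captures the number that follows the opening tag.
    private static final Pattern RATE =
            Pattern.compile("id=\"last_3212166\"[^>]*>\\s*([0-9]+(?:\\.[0-9]+)?)");

    /** Returns the first captured rate, or null if the pattern does not match. */
    static String extractRate(String html) {
        Matcher m = RATE.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    /** Downloads the page body as one string (same idea as getPage above). */
    static String getPage(String urlString) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new URL(urlString).openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        final String url = "http://www.fxstreet.com/rates-charts/forex-rates/";
        // Background thread: fetch, extract, print, sleep, repeat.
        Thread poller = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    String rate = extractRate(getPage(url));
                    System.out.println("GBP/USD: " + (rate != null ? rate : "not found"));
                } catch (IOException e) {
                    System.err.println("Fetch failed: " + e.getMessage());
                }
                try {
                    Thread.sleep(1000); // wait one second between polls
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore flag so the loop exits
                }
            }
        });
        poller.start();
    }
}
```

Unlike a fixed substring offset, the pattern still works if the number has more or fewer digits, and extractRate returns null rather than throwing when the marker is missing.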
