Matcher.find() returns false on Android



I am trying to get the softwareVersion from the HTML below:

<div class="title">Current Version</div> <div class="content" itemprop="softwareVersion"> 1.1.3  </div> </div> <div class="meta-info"> <div class="title">Requires Android</div> <div class="content" itemprop="operatingSystems">     2.2 and up   </div> </div>

I used the following code:

String Html = GetHtml("https://play.google.com/store/apps/details?id=" + AppID);
Pattern pattern = Pattern.compile("softwareVersion\">[^<]*</dd");
Matcher matcher = pattern.matcher(Html);
matcher.find();

String GetHtml(String url1) 
    {
        String str = "";
        try 
        {
            URL url = new URL(url1);
            URLConnection spoof = url.openConnection();
            spoof.setRequestProperty("User-Agent",
                    "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; H010818)");
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    spoof.getInputStream()));
            String strLine = "";
            // Loop through every line in the source
            while ((strLine = in.readLine()) != null) 
            {
                str = str + strLine;
            }
        } 
        catch (Exception e) 
        {
        }
        return str;
    }

But matcher.find() always returns false. I think something is wrong with my pattern. Can anyone help? Thanks.

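As others have commented, I would normally use an HTML parser to extract content from HTML. However, since you are only pulling one small piece of information out of a string, I can understand why you would use a regex.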

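What you need is something like the following. The problem with your regular expression is the extra d: the pattern ends in </dd, but the HTML contains </div>, so it can never match. Also, if you put the part you care about in parentheses, you can retrieve it with .group().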

import java.util.regex.*;

public class R {
  public static void main(String[] args) {
     String Html = "<div class=\"title\">Current Version</div> <div class=\"content\" itemprop=\"softwareVersion\"> 1.1.3  </div> </div> <div class=\"meta-info\"> <div class=\"title\">Requires Android</div> <div class=\"content\" itemprop=\"operatingSystems\">     2.2 and up   </div> </div>";
     // The parentheses capture the text between the attribute and the closing tag;
     // the pattern now ends in "</d" instead of the mistyped "</dd".
     Pattern pattern = Pattern.compile("softwareVersion\">([^<]*)</d");
     Matcher matcher = pattern.matcher(Html);
     System.out.println(matcher.find());
     // group(1) still includes the whitespace around the version number.
     System.out.println(matcher.group(1));
  }
}
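
The captured group above still carries the spaces around the version, and calling group(1) when find() returned false throws an IllegalStateException. As a rough sketch of one way to handle both, the variant below moves the whitespace outside the capturing group and returns an Optional. The class name VersionExtractor and the method extractVersion are made up for illustration and are not part of the original code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionExtractor {
    // Hypothetical helper pattern: \s* on both sides keeps the surrounding
    // whitespace outside the capturing group, so group(1) is already trimmed.
    private static final Pattern VERSION =
            Pattern.compile("itemprop=\"softwareVersion\">\\s*([^<]*?)\\s*<");

    public static Optional<String> extractVersion(String html) {
        Matcher matcher = VERSION.matcher(html);
        // group(1) may only be read after find() has returned true.
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<div class=\"content\" itemprop=\"softwareVersion\"> 1.1.3  </div>";
        System.out.println(extractVersion(html).orElse("version not found")); // prints 1.1.3
        System.out.println(extractVersion("").orElse("version not found"));   // prints version not found
    }
}
```

An empty result here can also mean that the downloaded HTML itself was empty, for example if GetHtml swallowed an exception in its empty catch block, so it may be worth checking the string before blaming the pattern.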
