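TagSoup and XPath

I am trying to use TagSoup with XPath (JAXP). I know how to obtain a SAX parser (an XMLReader) from TagSoup, but I have not found how to create a DocumentBuilder that uses that SAX parser. How do I do that?

Thanks.

EDIT: Sorry for being so general, but the Java XML API is such a pain.

EDIT2:

Problem solved: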

public static void main(String[] args) throws XPathExpressionException, IOException,
        SAXNotRecognizedException, SAXNotSupportedException,
        TransformerFactoryConfigurationError, TransformerException {
    XPathFactory xpathFac = XPathFactory.newInstance();
    XPath xpath = xpathFac.newXPath();
    InputStream input = new FileInputStream("/tmp/g.html");
    XMLReader reader = new Parser();
    reader.setFeature(Parser.namespacesFeature, false);
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    DOMResult result = new DOMResult();
    transformer.transform(new SAXSource(reader, new InputSource(input)), result);
    Node htmlNode = result.getNode();
    NodeList nodes = (NodeList) xpath.evaluate("//span", htmlNode, XPathConstants.NODESET);
    System.out.println(nodes.getLength());
}
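
The bridge used above (an identity Transformer that copies a SAXSource into a DOMResult) can be tried without TagSoup on the classpath. The following is only a minimal sketch under stated assumptions: the JDK's default XMLReader stands in for TagSoup's Parser, so the input must be well-formed, and the class name SaxToDomXPath and the helpers toDom and count are hypothetical names introduced for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SaxToDomXPath {

    // Feeds the SAX events of any XMLReader into a DOM tree via an identity transform.
    static Node toDom(XMLReader reader, InputSource in) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        DOMResult result = new DOMResult();
        identity.transform(new SAXSource(reader, in), result);
        return result.getNode();
    }

    // Counts the nodes selected by an XPath 1.0 expression.
    static int count(Node root, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expr, root, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the JDK reader replaces TagSoup here, so the markup must be well-formed.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // The sample markup declares no namespace, so the unprefixed "//span" can match.
        String html = "<html><body><span>a</span><p><span>b</span></p></body></html>";
        Node doc = toDom(reader, new InputSource(new StringReader(html)));
        System.out.println(count(doc, "//span"));
    }
}
```

With TagSoup available, passing `new Parser()` (with the namespaces feature disabled, as above) to toDom in place of the JDK reader should give the same flow for real-world HTML.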

EDIT3:

A link that helped me: http://www.jezuk.co.uk/cgi-bin/view/jez?id=2643

"Java XML API is such a pain"

Indeed it is. Consider moving to XSLT 2.0/XPath 2.0 and using Saxon's s9api interface instead. It looks roughly like this:

// false: do not require a licensed (schema-aware) Saxon edition
Processor proc = new Processor(false);
InputStream input = new FileInputStream("/tmp/g.html");
XMLReader reader = new Parser();
reader.setFeature(Parser.namespacesFeature, false);
// SAXSource takes the XMLReader plus an InputSource wrapping the stream
Source source = new SAXSource(reader, new InputSource(input));
DocumentBuilder builder = proc.newDocumentBuilder();
XdmNode doc = builder.build(source);
XPathCompiler compiler = proc.newXPathCompiler();
XdmValue result = compiler.evaluate("//span", doc);
System.out.println(result.size());
