How do I avoid getting the whitespace and line returns between nodes when using XPath in Java?

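I'm trying to learn how to use XPath in Java, but I've run into a problem. When I use getNodeName and getTextContent, I end up picking up the whitespace and line returns that occur between the nodes. For example, if my XML looks like this: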

<node-i-am-looking-for-in-my-xml>
    <parent-node-01>
        <child-node-01>
            some text
        </child-node-01>
        <child-node-02>
            some more text
        </child-node-02>
        <child-node-03>
            even more text
        </child-node-03>
    </parent-node-01>
    <parent-node-02>
        <child-node-01>
            some text
        </child-node-01>
        <child-node-02>
            some more text
        </child-node-02>
        <child-node-03>
            even more text
        </child-node-03>
    </parent-node-02>
    <parent-node-03>
        <child-node-01>
            some text
        </child-node-01>
        <child-node-02>
            some more text
        </child-node-02>
        <child-node-03>
            even more text
        </child-node-03>
    </parent-node-03>
</node-i-am-looking-for-in-my-xml>

This is what I get when I use getNodeName:

child-node-01
#text
child-node-02
#text
child-node-03
#text

And when I use getTextContent, it looks like this:

some text
some more text
even more text

This is the code I'm using:

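import java.io.File;
import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public static void main(String[] args) throws Exception {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setValidating(false);
    DocumentBuilder db = dbf.newDocumentBuilder();
    // Backslashes must be escaped inside a Java string literal.
    String filename = "C:\\Users\\Me\\file.xml";
    Document doc = db.parse(new FileInputStream(new File(filename)));
    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    String expression;
    Node node;
    NodeList nodeList;
    // Select the parent-node-NN elements under the root element.
    expression = "//node-i-am-looking-for-in-my-xml/*";
    nodeList = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
    System.out.println("nodeList.getLength(): " + nodeList.getLength());
    for (int i = 0; i < nodeList.getLength(); i++) {
        // j starts at 1, which skips the first child (a whitespace-only text node).
        for (int j = 1; j < nodeList.item(i).getChildNodes().getLength(); j++) {
            Node nowNode = nodeList.item(i).getChildNodes().item(j);
            System.out.println(nowNode.getNodeName() + ":" + nowNode.getTextContent());
        }
    }
}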

From searching around on Google, it seems I need to use normalize-space, but I can't figure out how to implement it.

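As you have seen, whitespace in XML text nodes is significant. The text content of child-node-01 (or, more precisely, the content of the text node whose parent is child-node-01) is actually '\n            some text\n        '.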
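To make this concrete, here is a minimal sketch that lists every direct child of a parent element and prints one element's raw text content. The class name WhitespaceNodesDemo, its helper methods, and the shortened sample XML are assumptions made for illustration; they are not part of the question's code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WhitespaceNodesDemo {
    // Hypothetical sample input, much smaller than the XML in the question.
    static final String XML =
          "<parent-node-01>\n"
        + "    <child-node-01>\n"
        + "        some text\n"
        + "    </child-node-01>\n"
        + "</parent-node-01>";

    // Parses an XML string with the default (non-validating) DOM parser.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Lists the names of all direct children, including whitespace-only text nodes.
    static List<String> childNames(Node parent) {
        List<String> names = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Node parent = parse(XML).getDocumentElement();
        // prints [#text, child-node-01, #text]
        System.out.println(childNames(parent));
        // Escape the newlines so the raw text content is visible on one line.
        String raw = parent.getChildNodes().item(1).getTextContent();
        // prints '\n        some text\n    '
        System.out.println("'" + raw.replace("\n", "\\n") + "'");
    }
}
```

The #text entries are the indentation and line breaks between the element tags, which is why they show up when you iterate over getChildNodes().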

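You would only use normalize-space if you needed to deal with this whitespace inside the XPath expression itself, because normalize-space is an XPath function. For example, if you wanted to select all nodes whose text content (with leading and trailing whitespace stripped) is 'some data', you could use the following XPath: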

//*[normalize-space(.) = 'some data']
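As a rough illustration of how such an expression could be evaluated from Java, the sketch below runs a normalize-space predicate and also uses normalize-space as the outermost call to get a cleaned-up string back. The class name NormalizeSpaceDemo, its helpers, and the sample XML are assumptions for illustration, and the predicate matches 'some text' rather than 'some data' so that it hits something in the sample.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NormalizeSpaceDemo {
    // Hypothetical sample input modelled on the question's XML.
    static final String XML =
          "<parent-node-01>\n"
        + "    <child-node-01>\n"
        + "        some text\n"
        + "    </child-node-01>\n"
        + "    <child-node-02>\n"
        + "        some more text\n"
        + "    </child-node-02>\n"
        + "</parent-node-01>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Names of all elements whose whitespace-normalized string value is exactly 'some text'.
    static List<String> namesOfMatches(Document doc) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(
                "//*[normalize-space(.) = 'some text']", doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            names.add(hits.item(i).getNodeName());
        }
        return names;
    }

    // Wraps a path in normalize-space(), so the XPath result is a trimmed string
    // built from the first node the path selects.
    static String normalizedText(Document doc, String path) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("normalize-space(" + path + ")", doc);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println(namesOfMatches(doc));                      // prints [child-node-01]
        System.out.println(normalizedText(doc, "//child-node-02"));   // prints some more text
    }
}
```

Only child-node-01 matches the predicate: the parent element's normalized string value is the text of both children joined together, so it does not equal 'some text'.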

However, by the time you retrieve the text content you have already left the XPath world and are back in Java, so you are probably better off with:

nowNode.getTextContent().trim()
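Putting that together with the loop from the question, one possible sketch is to skip anything that is not an element and trim the text of what remains. The class name TrimmedChildrenDemo, the describeChildren helper, the root element name, and the sample XML are all assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TrimmedChildrenDemo {
    // Hypothetical sample input modelled on the question's XML.
    static final String XML =
          "<root>\n"
        + "    <parent-node-01>\n"
        + "        <child-node-01>\n"
        + "            some text\n"
        + "        </child-node-01>\n"
        + "        <child-node-02>\n"
        + "            some more text\n"
        + "        </child-node-02>\n"
        + "    </parent-node-01>\n"
        + "</root>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // For every node selected by the expression, returns "name:trimmed text"
    // for each child element, ignoring text nodes between the elements.
    static List<String> describeChildren(Document doc, String expression)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList parents = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < parents.getLength(); i++) {
            NodeList children = parents.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip the #text nodes that hold indentation and newlines
                }
                lines.add(child.getNodeName() + ":" + child.getTextContent().trim());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // prints child-node-01:some text, then child-node-02:some more text
        for (String line : describeChildren(parse(XML), "/root/*")) {
            System.out.println(line);
        }
    }
}
```

Note that trim() only removes leading and trailing whitespace; runs of whitespace inside the text are left alone, unlike normalize-space, which also collapses them to single spaces.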
