How can I pretty-print an org.w3c.dom Document that has no schema?

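I feel like I'm going crazy. I want to pretty-print an org.w3c.dom Document that has no schema (in Java). Indentation is not all I need; I also want useless blank lines and whitespace to be ignored. Somehow that does not happen: whenever I parse the XML from a file or write it back to a file, the DOM document contains text nodes that hold nothing but whitespace (newlines, spaces, and so on). Isn't there a simple way to get rid of them without a schema, without iterating over all the nodes and deleting the empty text nodes, and without transforming the XML myself?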

Example: my input file looks like this (but with more blank lines):

<mytag>
<anothertag>content</anothertag>

</mytag>

I want my output file to look like this:

<mytag>
<anothertag>content</anothertag>
</mytag>

Note: I have no schema for the XML (so I am forced to call builder.setValidating(false)), and I do not have the luxury of an internet connection when this code runs.

Thanks!

UPDATE: I found something very close to what I need, and it may help other soldiers in the fight against schema-less XML documents:

org.apache.axis.utils.XMLUtils.normalize(document);

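Its source code is publicly available. Call it after creating the document and before writing the document with a Transformer, and it produces nicely formatted output with no schema validation at all. JB Nizet also gave me a working answer, but I suspect some validation is going on behind the scenes in that code, which would make it differ from my use case. I'll leave the question open for a few days anyway, in case someone has a better solution.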
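For readers who cannot add Axis as a dependency, the same idea can be sketched with JDK classes only: remove the whitespace-only text nodes first, then let a Transformer indent the result. This is a minimal sketch, not the implementation of XMLUtils.normalize; the class name WhitespaceStripper and its method names are made up for illustration. It assumes that whitespace-only text is never significant in the document, so it is not suitable for mixed content or for elements marked xml:space="preserve".

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for illustration.
public class WhitespaceStripper {

    // Parse a string into a DOM document without validation (no schema or DTD needed).
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Remove every text node that contains nothing but whitespace.
    public static void stripWhitespaceNodes(Document document) throws Exception {
        // The XPath result is a snapshot, so removing nodes while looping over it is safe.
        NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.) = '']", document, XPathConstants.NODESET);
        for (int i = 0; i < blanks.getLength(); i++) {
            Node blank = blanks.item(i);
            blank.getParentNode().removeChild(blank);
        }
    }

    // Serialize the document with indentation and without an XML declaration.
    public static String prettyPrint(Document document) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<mytag>\n    <anothertag>content</anothertag>\n\n\n</mytag>");
        stripWhitespaceNodes(document);
        System.out.println(prettyPrint(document));
    }
}
```

The indentation width and line separator are chosen by the Transformer implementation in use, so the exact layout can vary between JDK versions.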

Here is a working example:

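import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class Xml {
    private static final String XML =
            "<mytag>\n" +
            "        <anothertag>content</anothertag>\n" +
            "\n" +
            "\n" +
            "\n" +
            "</mytag>";

    public static void main(String[] args) throws ParserConfigurationException, IOException, SAXException, InstantiationException, IllegalAccessException, ClassNotFoundException {
        // Parse without validation: no schema or DTD is required.
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        documentBuilderFactory.setValidating(false);
        DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
        Document document = documentBuilder.parse(new InputSource(new StringReader(XML)));

        // Print the children of the root element to show the whitespace-only text nodes.
        NodeList childNodes = document.getDocumentElement().getChildNodes();
        for (int i = 0; i < childNodes.getLength(); i++) {
            System.out.println(childNodes.item(i));
        }

        // Serialize with the DOM Load and Save (LS) API, pretty-printing enabled.
        final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
        final LSSerializer writer = impl.createLSSerializer();
        writer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
        System.out.println(writer.writeToString(document));
    }
}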

Output:

[#text: 
]
[anothertag: null]
[#text: 

]
<mytag>
<anothertag>content</anothertag>
</mytag>

So the parser does not validate, it keeps the whitespace-only text nodes, and the serializer still produces the output you would expect.
