How to read and modify fragments of XML using StAX in Java



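My goal is to read objects (featureMember) into a DOM, change them, and write them back into a new XML file. The XML is too big to load into a DOM as a whole. I figured that what I need is StAX and TransformerFactory, but I can't make it work.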

This is what I have done so far:

private void change(File pathIn, File pathOut) {
    try {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLOutputFactory factoryOut = XMLOutputFactory.newInstance();
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer t = tf.newTransformer();
        XMLEventReader in = factory.createXMLEventReader(new FileReader(pathIn));
        XMLEventWriter out = factoryOut.createXMLEventWriter(new FileWriter(pathOut));
        while (in.hasNext()) {
            XMLEvent e = in.nextTag();
            if (e.getEventType() == XMLStreamConstants.START_ELEMENT) {
                if (((StartElement) e).getName().getLocalPart().equals("featureMember")) {
                    DOMResult result = new DOMResult();
                    t.transform(new StAXSource(in), result);
                    Node domNode = result.getNode();
                    System.out.println(domNode);
                }
            }
            out.add(e);
        }
        in.close();
        out.close();
    } catch (FileNotFoundException e1) {
        e1.printStackTrace();
    } catch (IOException e1) {
        e1.printStackTrace();
    } catch (TransformerConfigurationException e1) {
        e1.printStackTrace();
    } catch (XMLStreamException e1) {
        e1.printStackTrace();
    } catch (TransformerException e1) {
        e1.printStackTrace();
    }
}

I get an exception (on t.transform()):

Exception in thread "AWT-EventQueue-0" java.lang.IllegalStateException: StAXSource(XMLEventReader) with XMLEventReader not in XMLStreamConstants.START_DOCUMENT or XMLStreamConstants.START_ELEMENT state

A simplified version of my XML looks like this (it has namespaces):

<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="featureCollection">
    <gml:featureMember>
        </eg:RST>
        <eg:pole>Krakow</eg:pole>
        <eg:localId>id1234</eg:localId>
    </gml:featureMember>
    <gml:featureMember>
        <eg:RST>1002</eg:RST>
        <eg:pole>Rzeszow</eg:pole>
        <eg:localId>id1235</eg:localId>
    </gml:featureMember>
</gml:FeatureCollection>

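I have a list of localIds of the objects (featureMember) that I want to change, together with the corrected RST or pole value (depending on which one the user changed):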

localId(id1234)RST(1001)

localId(id1236)RST(1003)

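The problem you're having is that by the time you create the StAXSource, your START_ELEMENT event has already been consumed by in.nextTag(). The XMLEventReader is therefore probably positioned at a whitespace text event, or at some other event that cannot be the start of an XML document source. You can use the peek() method to look at the next event without consuming it. Make sure there is an event to peek at by calling hasNext() first, though.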
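For the DOM route described in the question, one possible shape is sketched below. It is an illustration under assumptions rather than a drop-in fix: it uses the cursor-style XMLStreamReader instead of the XMLEventReader from the question (checking getEventType() on the cursor plays the role of peek(), since it inspects the current event without advancing), and the class name FeatureMemberDomSketch, the methods forEachFeatureMember and patchRst, and the localId-to-RST map are hypothetical names introduced here. Each featureMember is copied into its own small DOM and handed to a callback, so only one fragment needs to be in memory at a time.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of the question's code.
class FeatureMemberDomSketch {

    // Streams through the input and hands each featureMember to the callback as a small DOM.
    static void forEachFeatureMember(Reader input, Consumer<Element> action)
            throws XMLStreamException, TransformerException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(input);
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        try {
            while (reader.hasNext()) {
                // Inspect the current event without consuming it (the cursor analogue of peek()).
                boolean atMember = reader.getEventType() == XMLStreamConstants.START_ELEMENT
                        && "featureMember".equals(reader.getLocalName());
                if (atMember) {
                    DOMResult result = new DOMResult();
                    identity.transform(new StAXSource(reader), result);
                    action.accept(((Document) result.getNode()).getDocumentElement());
                    // The transformer consumed the fragment; re-check the current event
                    // on the next pass instead of advancing blindly.
                } else {
                    reader.next();
                }
            }
        } finally {
            reader.close();
        }
    }

    // Sets the text of the RST child when the member's localId has an entry in the map.
    static void patchRst(Element member, Map<String, String> rstByLocalId) {
        Element localId = child(member, "localId");
        Element rst = child(member, "RST");
        if (localId == null || rst == null) {
            return;
        }
        String newRst = rstByLocalId.get(localId.getTextContent().trim());
        if (newRst != null) {
            rst.setTextContent(newRst);
        }
    }

    // Finds the first child element with the given local name, ignoring the prefix.
    private static Element child(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                String name = n.getLocalName() != null ? n.getLocalName() : n.getNodeName();
                if (name.equals(localName) || name.endsWith(":" + localName)) {
                    return (Element) n;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<gml:FeatureCollection xmlns:gml='http://www.opengis.net/gml/3.2' xmlns:eg='acme.com'>"
                + "<gml:featureMember><eg:RST/><eg:localId>id1234</eg:localId></gml:featureMember>"
                + "</gml:FeatureCollection>";
        Map<String, String> updates = Collections.singletonMap("id1234", "1001");
        forEachFeatureMember(new StringReader(xml), member -> {
            patchRst(member, updates);
            System.out.println(member.getTextContent());
        });
    }
}
```

Writing each patched fragment back out (for example with another identity Transformer and a DOMSource) is left out to keep the sketch short.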

I'm not 100% sure what you want to accomplish, so here are a few things you could do depending on the scenario.

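EDIT: I just read some of the comments on your question, which make things a bit clearer. The below could still help you achieve the desired result with some adjustments. Note also that Java XSLT processors allow for extension functions and extension elements, which can call into Java code from XSLT stylesheets. This can be a powerful way to extend basic XSLT functionality with external resources, such as database queries.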


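If you want to transform the input XML into a single output XML, you might be better off simply using an XSLT stylesheet transformation. In your code, you create a Transformer without any stylesheet, so it becomes the default "identity transformer", which just copies input to output. Suppose your input XML is as follows: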

<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml/3.2" gml:id="featureCollection" xmlns:eg="acme.com">
    <gml:featureMember>
        <eg:RST/>
        <eg:pole>Krakow</eg:pole>
        <eg:localId>id1234</eg:localId>
    </gml:featureMember>
    <gml:featureMember>
        <eg:RST>1002</eg:RST>
        <eg:pole>Rzeszow</eg:pole>
        <eg:localId>id1235</eg:localId>
    </gml:featureMember>
</gml:FeatureCollection>

I've bound the eg prefix to a dummy namespace, since the declaration was missing from your sample, and fixed the malformed RST element.

The following program runs an XSLT transformation on your input and writes the result to an output file.

package xsltplayground;

import java.io.File;
import java.net.URL;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XSLTplayground {

    public static void main(String[] args) throws Exception {
        URL url = XSLTplayground.class.getResource("sample.xml");
        File input = new File(url.toURI());
        URL url2 = XSLTplayground.class.getResource("stylesheet.xsl");
        File xslt = new File(url2.toURI());
        URL url3 = XSLTplayground.class.getResource(".");
        File output = new File(new File(url3.toURI()), "output.xml");
        change(input, output, xslt);
    }

    private static void change(File pathIn, File pathOut, File xsltFile) {
        try {
            // Creating transformer with XSLT file
            TransformerFactory tf = TransformerFactory.newInstance();
            Source xsltSource = new StreamSource(xsltFile);
            Transformer t = tf.newTransformer(xsltSource);
            // Input source
            Source input = new StreamSource(pathIn);
            // Output target
            Result output = new StreamResult(pathOut);
            // Transforming
            t.transform(input, output);
        } catch (TransformerConfigurationException ex) {
            Logger.getLogger(XSLTplayground.class.getName()).log(Level.SEVERE, null, ex);
        } catch (TransformerException ex) {
            Logger.getLogger(XSLTplayground.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}

Here is a sample stylesheet.xsl file which, for convenience, I've simply placed in the same package as the input XML and the class.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:eg="acme.com">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*" />
        </xsl:copy>
    </xsl:template>
    <xsl:template match="gml:featureMember">
        <gml:member>
            <xsl:apply-templates select="node()|@*" />
        </gml:member>
    </xsl:template>
</xsl:stylesheet>

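The stylesheet above copies everything by default, but when it reaches a <gml:featureMember> element it wraps the contents in a new <gml:member> element instead. This is just a very simple example of what you can do with XSLT.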

The output would be:

<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:eg="acme.com" gml:id="featureCollection">
    <gml:member>
        <eg:RST/>
        <eg:pole>Krakow</eg:pole>
        <eg:localId>id1234</eg:localId>
    </gml:member>
    <gml:member>
        <eg:RST>1002</eg:RST>
        <eg:pole>Rzeszow</eg:pole>
        <eg:localId>id1235</eg:localId>
    </gml:member>
</gml:FeatureCollection>

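Since both input and output are file streams, you never have to build a DOM of the whole document yourself. XSLT in Java is pretty fast and efficient, so this might suffice. Keep in mind, though, that an XSLT 1.0 processor still builds its own internal tree of the input, so memory use grows with the size of the file.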
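Applied to the actual goal in the question (setting a new RST for a given localId), the same XSLT approach might look like the sketch below. It is an illustration, not a tested solution for your data: the class name RstUpdateSketch, the method updateRst, and the stylesheet parameters targetId and newRst are names made up here, and the eg prefix is bound to the same dummy acme.com namespace as above. The parameter is tested inside the template with xsl:choose rather than in the match pattern, because XSLT 1.0 does not allow variable references in match patterns.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; all names are for illustration only.
class RstUpdateSketch {

    // Identity transform plus one template that rewrites eg:RST for the matching localId.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:eg='acme.com'>"
            + "<xsl:param name='targetId'/>"
            + "<xsl:param name='newRst'/>"
            + "<xsl:template match='node()|@*'>"
            + "<xsl:copy><xsl:apply-templates select='node()|@*'/></xsl:copy>"
            + "</xsl:template>"
            + "<xsl:template match='eg:RST'>"
            + "<xsl:copy><xsl:apply-templates select='@*'/>"
            + "<xsl:choose>"
            + "<xsl:when test='../eg:localId = $targetId'><xsl:value-of select='$newRst'/></xsl:when>"
            + "<xsl:otherwise><xsl:apply-templates select='node()'/></xsl:otherwise>"
            + "</xsl:choose></xsl:copy>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Returns a copy of the XML in which the RST of the given localId is replaced.
    static String updateRst(String xml, String targetId, String newRst) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        t.setParameter("targetId", targetId);
        t.setParameter("newRst", newRst);
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<gml:FeatureCollection xmlns:gml='http://www.opengis.net/gml/3.2' xmlns:eg='acme.com'>"
                + "<gml:featureMember><eg:RST/><eg:localId>id1234</eg:localId></gml:featureMember>"
                + "</gml:FeatureCollection>";
        System.out.println(updateRst(xml, "id1234", "1001"));
    }
}
```

Because the stylesheet starts from the identity template, everything except the matched RST elements is copied unchanged. For real files, the string-based sources and results would be swapped for StreamSource and StreamResult objects built from File, as in the program above.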


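Maybe you actually want to split every occurrence of some element into its own output file, with some changes applied to it. Here is an example that uses StAX to split off the <gml:featureMember> elements as separate documents. You can then iterate over the created files and transform them however you want (XSLT would again be a fine choice). Obviously, the error handling would need to be more robust. This is just for demonstration.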

package xsltplayground;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URL;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import javax.xml.transform.stream.StreamSource;

public class XSLTplayground {

    public static void main(String[] args) throws Exception {
        URL url = XSLTplayground.class.getResource("sample.xml");
        File input = new File(url.toURI());
        URL url2 = XSLTplayground.class.getResource("stylesheet.xsl");
        File xslt = new File(url2.toURI());
        URL url3 = XSLTplayground.class.getResource(".");
        File output = new File(url3.toURI());
        change(input, output, xslt);
    }

    private static void change(File pathIn, File directoryOut, File xsltFile) throws InterruptedException {
        try {
            // Creating a StAX event reader from the input
            XMLInputFactory xmlIf = XMLInputFactory.newFactory();
            XMLEventReader reader = xmlIf.createXMLEventReader(new StreamSource(pathIn));
            // Create a StAX output factory
            XMLOutputFactory xmlOf = XMLOutputFactory.newInstance();
            int counter = 1;
            // Keep going until no more events
            while (reader.hasNext()) {
                // Peek into the next event to find out what it is
                XMLEvent next = reader.peek();
                // If it's the start of a featureMember element, commence output
                if (next.isStartElement()
                        && next.asStartElement().getName().getLocalPart().equals("featureMember")) {
                    File output = new File(directoryOut, "output_" + counter + ".xml");
                    try (OutputStream ops = new FileOutputStream(output)) {
                        XMLEventWriter writer = xmlOf.createXMLEventWriter(ops);
                        copy(reader, writer);
                        writer.flush();
                        writer.close();
                    }
                    counter++;
                } else {
                    // Not in a featureMember element: ignore
                    reader.next();
                }
            }
        } catch (XMLStreamException ex) {
            Logger.getLogger(XSLTplayground.class.getName()).log(Level.SEVERE, null, ex);
        } catch (IOException ex) {
            Logger.getLogger(XSLTplayground.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

    private static void copy(XMLEventReader reader, XMLEventWriter writer) throws XMLStreamException {
        // Creating an XMLEventFactory
        XMLEventFactory ef = XMLEventFactory.newFactory();
        // Writing an XML document start
        writer.add(ef.createStartDocument());
        int depth = 0;
        boolean stop = false;
        while (!stop) {
            XMLEvent next = reader.nextEvent();
            writer.add(next);
            if (next.isStartElement()) {
                depth++;
            } else if (next.isEndElement()) {
                depth--;
                if (depth == 0) {
                    writer.add(ef.createEndDocument());
                    stop = true;
                }
            }
        }
    }
}
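The follow-up step mentioned above, iterating over the created files and transforming each one, could be sketched roughly as follows. This is only an outline under assumptions: the class name SplitFilesSketch, the method transformAll and the transformed_ file-name prefix are invented for this example, and the glob output_*.xml simply matches the file names produced by the program above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; all names are for illustration only.
class SplitFilesSketch {

    // Applies the stylesheet to every output_*.xml file in the directory and
    // writes each result next to its input as transformed_<original name>.
    static List<Path> transformAll(Path directory, Path stylesheet)
            throws IOException, TransformerException {
        // Collect the inputs first so that newly written files are never picked up.
        List<Path> inputs = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory, "output_*.xml")) {
            for (Path p : stream) {
                inputs.add(p);
            }
        }
        // A Transformer can be reused for several sequential transformations.
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(stylesheet.toFile()));
        List<Path> written = new ArrayList<>();
        for (Path in : inputs) {
            Path out = in.resolveSibling("transformed_" + in.getFileName());
            try (InputStream is = Files.newInputStream(in);
                 OutputStream os = Files.newOutputStream(out)) {
                transformer.transform(new StreamSource(is), new StreamResult(os));
            }
            written.add(out);
        }
        return written;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java SplitFilesSketch <directory> <stylesheet.xsl>
        for (Path p : transformAll(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println("Wrote " + p);
        }
    }
}
```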
