How do I read UTF-8 data with StAX?



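How can I use StAX to read all the characters in an element's text, including an escaped &? I have no control over the incoming XML file.

A sample XML file: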

<?xml version="1.0" encoding="UTF-8"?>
<Employees>
<Employee id="1">
<age>22</age>
<name>MyName &amp; Team 01/46</name>
<gender>Female</gender>
<role>Java Developer</role>
</Employee>
....
</Employees>

In every attempt so far, only the "MyName " part is read.

Attempt 1:

Path gpxPath = Paths.get( path);
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
XMLStreamReader reader;
reader = xmlInputFactory.createXMLStreamReader( new FileInputStream(gpxPath.toFile()), "UTF-8");
... 
String name = reader.getText();

Attempt 2:

XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
try {
XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader( 
new DataInputStream(new FileInputStream(fileName)), "UTF-8");
... 
name = new String( xmlStreamReader.getTextCharacters());
// or ... 
name = xmlStreamReader.getText();

How can I read the complete name, i.e. "MyName & Team 01/46"?

The solution is to set a property on the XML input factory:

XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
// Coalescing merges adjacent character data into a single CHARACTERS event.
xmlInputFactory.setProperty(XMLInputFactory.IS_COALESCING, true);
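
To put that setting in context: without coalescing, a StAX parser may report the text around an entity such as &amp; as several separate CHARACTERS events, so a single getText() call sees only the first piece. The following is a minimal sketch rather than a definitive implementation; the class name CoalescingDemo and the helper readFirstName are made up for illustration, and the XML is inlined as a string instead of being read from a file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CoalescingDemo {

    // Hypothetical helper: returns the text of the first <name> element, or null.
    static String readFirstName(InputStream in) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(in, "UTF-8");
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "name".equals(reader.getLocalName())) {
                    // Move to the event that follows <name>; with coalescing
                    // enabled this is expected to hold the whole text.
                    if (reader.next() == XMLStreamConstants.CHARACTERS) {
                        return reader.getText();
                    }
                    return ""; // element was empty
                }
            }
            return null; // no <name> element found
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Assumed sample input, trimmed from the XML in the question.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<Employees><Employee id=\"1\">"
                + "<name>MyName &amp; Team 01/46</name>"
                + "</Employee></Employees>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readFirstName(in));
    }
}
```

If changing the factory configuration is not an option, XMLStreamReader.getElementText() is worth a look as an alternative: called on a START_ELEMENT event, it returns the full content of a text-only element as a single string.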
