I'm having a problem using Apache POI to read the contents of some .docx files and display the result as an unformatted preview. I'm using POI version 3.11.
Code:
private static String POI2Text(File file) {
    POITextExtractor extractor = null;
    try {
        extractor = ExtractorFactory.createExtractor(file);
        return extractor.getText();
    } catch (Exception ex) {
        logger.warn("Error:", ex);
    } finally {
        if (extractor != null) {
            try {
                extractor.close();
            } catch (Exception ex) {
                logger.warn("Error:", ex);
            }
        }
    }
    return "";
}
The following exception is thrown in the finally block (at extractor.close()):
org.apache.poi.openxml4j.exceptions.OpenXML4JRuntimeException: Fail to save: an error occurs while saving the package : part
at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:503) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.OPCPackage.save(OPCPackage.java:1425) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.OPCPackage.save(OPCPackage.java:1412) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.ZipPackage.closeImpl(ZipPackage.java:353) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.OPCPackage.close(OPCPackage.java:425) ~[agent.jar:na]
at org.apache.poi.POIXMLTextExtractor.close(POIXMLTextExtractor.java:87) ~[agent.jar:na]
....
Caused by: java.lang.IllegalArgumentException: part
at org.apache.poi.openxml4j.opc.OPCPackage.addPackagePart(OPCPackage.java:873) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:448) ~[agent.jar:na]
... 15 common frames omitted
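Any ideas on how to prevent this exception? The biggest problem is that POI does not release the file handle after the exception is thrown. I need to be able to move or edit the file outside the application.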
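Just a quick follow-up: I was able to work around this by opening the file as a read-only input stream and then passing that stream to the POITextExtractor to extract the data.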
try (InputStream is = Files.newInputStream(path, StandardOpenOption.READ);
     POITextExtractor extractor = ExtractorFactory.createExtractor(is)) {
    return extractor.getText();
} catch (Exception ex) {
    logger.warn("Error in file {}", path, ex);
}
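To check that this pattern really leaves the file free to be moved afterwards, here is a minimal sketch that uses only the JDK and no POI at all. The class name ReadOnlyStreamDemo and the helper readAll are hypothetical; readAll merely stands in for the extractor and reads the stream as UTF-8 text. It illustrates the stream handling only, not how POI behaves internally.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ReadOnlyStreamDemo {

    // Hypothetical stand-in for the extractor: reads the whole file as UTF-8 text
    // through a stream that is opened read-only.
    static String readAll(Path path) throws IOException {
        try (InputStream is = Files.newInputStream(path, StandardOpenOption.READ)) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = is.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } // try-with-resources closes the stream here, even if reading fails
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway file to read from.
        Path source = Files.createTempFile("preview", ".txt");
        Files.write(source, "hello".getBytes(StandardCharsets.UTF_8));

        String text = readAll(source);

        // With the stream closed, moving the file should succeed.
        Path target = source.resolveSibling(source.getFileName().toString() + ".moved");
        Files.move(source, target);
        System.out.println(text + " / moved: " + Files.exists(target));

        Files.delete(target);
    }
}
```

Because the stream is declared in the try-with-resources header, it is closed on every exit path, so the handle does not depend on whether the code reading from it throws.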