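I'm working on a Java program that extracts data from an Excel file using the POI library. At the moment I have to specify the type of every value I extract, but the file contains a large amount of data of different types, so I'd like to know whether there is another way to get all of the data as strings.
Thanks and regards.
package DAO;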
import java.io.FileInputStream;
import java.util.Iterator;
import java.util.Vector;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
public class ReadExcelFile {
public static void main(String[] args) {
String fileName = "C:\\Users\\marrah\\Desktop\\TRIAL FILE1.xls"; // backslashes must be escaped in a Java string literal
Vector dataHolder = ReadCSV(fileName);
printCellData(dataHolder);
}
public static Vector ReadCSV(String fileName) {
Vector cellVectorHolder = new Vector();
try {
FileInputStream myInput = new FileInputStream(fileName);
POIFSFileSystem myFileSystem = new POIFSFileSystem(myInput);
HSSFWorkbook myWorkBook = new HSSFWorkbook(myFileSystem);
HSSFSheet mySheet = myWorkBook.getSheetAt(0);
Iterator rowIter = mySheet.rowIterator();
while (rowIter.hasNext()) {
HSSFRow myRow = (HSSFRow) rowIter.next();
Iterator cellIter = myRow.cellIterator();
Vector cellStoreVector = new Vector();
while (cellIter.hasNext()) {
HSSFCell myCell = (HSSFCell) cellIter.next();
cellStoreVector.addElement(myCell);
}
cellVectorHolder.addElement(cellStoreVector);
}
} catch (Exception e) {
e.printStackTrace();
}
return cellVectorHolder;
}
private static void printCellData(Vector dataHolder) {
for (int i = 0; i < dataHolder.size(); i++) {
Vector cellStoreVector = (Vector) dataHolder.elementAt(i);
for (int j = 0; j < cellStoreVector.size(); j++) {
HSSFCell myCell = (HSSFCell) cellStoreVector.elementAt(j);
Object stringCellValue="";
stringCellValue =cellStoreVector.get(j).toString();
System.out.print(stringCellValue.toString() + "\t"); // tab-separate the cells
}
}
}
}
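I have a unit test in which I use the following code to extract all of the text from an Excel file, without any formatting. For some use cases this may be quicker than iterating over all the elements one by one: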
private POITextExtractor extractText(File file) throws IOException {
InputStream inp = null;
try {
inp = new PushbackInputStream(
new FileInputStream(file), 8);
if(POIFSFileSystem.hasPOIFSHeader(inp)) {
return createExtractor(new POIFSFileSystem(inp));
}
throw new IllegalArgumentException("Your File was neither an OLE2 file, nor an OOXML file");
} finally {
if(inp != null) inp.close();
}
}
private static POITextExtractor createExtractor(POIFSFileSystem fs) throws IOException {
return createExtractor(fs.getRoot(), fs);
}
private static POITextExtractor createExtractor(DirectoryNode poifsDir, POIFSFileSystem fs) throws IOException {
for(Iterator<Entry> entries = poifsDir.getEntries(); entries.hasNext(); ) {
Entry entry = entries.next();
if(entry.getName().equals("Workbook")) {
{
return new ExcelExtractor(poifsDir, fs);
}
}
}
throw new IllegalArgumentException("No supported documents found in the OLE2 stream");
}
private String assertContains(File file, String... contents) throws IOException {
assertTrue(file.exists());
POITextExtractor extractor = extractText(file);
assertNotNull(extractor);
String str = extractor.getText();
for(String s : contents) {
assertTrue("Did expect to find text '" + s + "' in resulting Excel file, but did not find it in str: " + str, str.contains(s));
}
return str;
}
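If the file is known to be an .xls workbook, the header check and directory walk above can probably be skipped. The following is only a minimal sketch under that assumption: it relies on the `ExcelExtractor(HSSFWorkbook)` constructor, and the class name `XlsTextDump` and the method `extractAllText` are made up for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public class XlsTextDump {

    // Hypothetical helper: returns the text of every sheet in an .xls stream.
    // Assumes the stream really contains an OLE2 (.xls) workbook.
    static String extractAllText(InputStream in) throws IOException {
        try (HSSFWorkbook workbook = new HSSFWorkbook(in)) {
            ExcelExtractor extractor = new ExcelExtractor(workbook);
            // Leave sheet names out so only cell contents are returned.
            extractor.setIncludeSheetNames(false);
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to an .xls file.
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(extractAllText(in));
        }
    }
}
```

The result is one large string, so this suits searching or asserting on content rather than processing individual cells.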
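You can create a generic function to call on every cell as you walk through the rows. It checks the cell's data type and then returns the value in whatever format you prefer. So you go through the sheet row by row, and for each cell you call something like this:
private static String getCellvalue(HSSFRow poiRowActual, int intColActual) {
if (poiRowActual != null && poiRowActual.getLastCellNum() >= (short) intColActual) {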
HSSFCell cell = poiRowActual.getCell(intColActual);
if (cell != null) {
if (HSSFCell.CELL_TYPE_STRING == cell.getCellType()) {
return cell.getRichStringCellValue().toString();
} else if (HSSFCell.CELL_TYPE_BOOLEAN == cell.getCellType()) {
return new String( (cell.getBooleanCellValue() == true ? "true" : "false") );
} else if (HSSFCell.CELL_TYPE_BLANK == cell.getCellType()) {
return "";
} else if (HSSFCell.CELL_TYPE_NUMERIC == cell.getCellType()) {
if(HSSFDateUtil.isCellDateFormatted(cell)){
return ( new SimpleDateFormat("dd/MM/yyyy").format(cell.getDateCellValue()) );
}else{
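// BigDecimal.valueOf avoids the long binary expansion that new BigDecimal(double) yields for values such as 0.1
return BigDecimal.valueOf(cell.getNumericCellValue()).stripTrailingZeros().toPlainString();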
}
}
}
}
return null;
}
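As an alternative to the hand-written type switch above, POI ships `org.apache.poi.ss.usermodel.DataFormatter`, whose `formatCellValue(Cell)` returns a cell's content as a string whatever its type, roughly as Excel would display it. The sketch below assumes a POI version that has the `ss.usermodel` interfaces; the class name `CellsAsStrings` and the method `readFirstSheet` are invented for this example. Without a `FormulaEvaluator`, formula cells come back as the formula text rather than the computed result.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

public class CellsAsStrings {

    // Hypothetical helper: reads the first sheet of an .xls stream and
    // returns every defined cell as a string, one inner list per row.
    static List<List<String>> readFirstSheet(InputStream in) throws IOException {
        DataFormatter formatter = new DataFormatter();
        List<List<String>> rows = new ArrayList<>();
        try (Workbook workbook = new HSSFWorkbook(in)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row row : sheet) {
                List<String> values = new ArrayList<>();
                // Iterating a Row visits only cells that exist in the file,
                // so gaps in a row are skipped rather than returned as "".
                for (Cell cell : row) {
                    values.add(formatter.formatCellValue(cell));
                }
                rows.add(values);
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to an .xls file.
        try (InputStream in = new FileInputStream(args[0])) {
            for (List<String> row : readFirstSheet(in)) {
                System.out.println(String.join("\t", row));
            }
        }
    }
}
```

Because the formatter applies each cell's own number format, dates and numbers should come out the way they look in the spreadsheet, so no per-type branching is needed in the calling code.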