How to read CSV columns in any order using CSVParser from apache.commons

I have a csv file containing some data in the following format:

id,first,last,city
1,john,doe,austin
2,jane,mary,seattle

So far, I am reading the csv with the following code:

String path = "./data/data.csv";
Map<Integer, User> map = new HashMap<>();
Reader reader = Files.newBufferedReader(Paths.get(path));
try (CSVParser csvParser = new CSVParser(reader, CSVFormat.DEFAULT)) {
    List<CSVRecord> csvRecords = csvParser.getRecords();
    for (int i = 0; i < csvRecords.size(); i++) {
        if (0 < i) { // skip over header
            CSVRecord csvRecord = csvRecords.get(i);
            // Columns are read by position: id, first, last, city
            User currentUser = new User(
                Double.valueOf(csvRecord.get(0)).intValue(),
                csvRecord.get(1),
                csvRecord.get(2),
                csvRecord.get(3)
            );
            map.put(currentUser.getId(), currentUser);
        }
    }
} catch (IOException e) {
    System.out.println(e);
}

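It gets the correct values, but if the columns were in a different order, say [city, last, id, first], they would be read incorrectly, because the reading is hard-coded to the order [id, first, last, city]. (The User object must also be created with its fields in the exact order id, first, last, city.)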

I know I can use the "withHeader" option, but that also requires me to define the order of the header columns in advance, like this:

String header = "id,first,last,city";
CSVParser csvParser = new CSVParser(reader, CSVFormat.EXCEL.withHeader(header.split(",")));

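I also know there is a built-in function getHeaderNames(), but it only returns the headers after I have already passed them in as strings (hard-coding again). So if I pass in the header string "last,first,id,city", it returns those values in a list.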

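Is there a way to combine these pieces to read the csv regardless of the column order, and still construct my "User" object with the fields passed in order (id, first, last, city)?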

We need to tell the parser to process the header row for us. We specify this as part of the CSVFormat, so we will create a custom format like this:

CSVFormat csvFormat = CSVFormat.RFC4180.withFirstRecordAsHeader();

The question code used DEFAULT, which is itself based on RFC4180. Comparing the two side by side:

DEFAULT                               RFC4180                       Comment
===================================   ===========================   ========================
withDelimiter(',')                    withDelimiter(',')            Same
withQuote('"')                        withQuote('"')                Same
withRecordSeparator("\r\n")           withRecordSeparator("\r\n")   Same
withIgnoreEmptyLines(true)            withIgnoreEmptyLines(false)   Don't ignore blank lines
withAllowDuplicateHeaderNames(true)   -                             Don't allow duplicates
===================================   ===========================   ========================
                                      withFirstRecordAsHeader()     We need this

With this change, we can call get(String name) instead of get(int i):

User currentUser = new User(
    Integer.parseInt(csvRecord.get("id")),
    csvRecord.get("first"),
    csvRecord.get("last"),
    csvRecord.get("city")
);

Note that CSVParser implements Iterable<CSVRecord>, so we can use a for-each loop, which makes the code look like this:

String path = "./data/data.csv";
Map<Integer, User> map = new HashMap<>();
try (CSVParser csvParser = new CSVParser(Files.newBufferedReader(Paths.get(path)),
                                         CSVFormat.RFC4180.withFirstRecordAsHeader())) {
    for (CSVRecord csvRecord : csvParser) {
        // Columns are looked up by header name, so their order in the file does not matter
        User currentUser = new User(
            Integer.parseInt(csvRecord.get("id")),
            csvRecord.get("first"),
            csvRecord.get("last"),
            csvRecord.get("city")
        );
        map.put(currentUser.getId(), currentUser);
    }
}
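
The snippets here rely on a User class that the post never shows. For anyone who wants to compile them, the following is a minimal sketch of what such a class might look like. The field names, the getters other than getId(), and the toString() format are assumptions made for illustration; only the constructor argument order (id, first, last, city) and getId() are implied by the code above.

```java
// Hypothetical User class; the original post does not show its definition.
class User {
    private final int id;
    private final String first;
    private final String last;
    private final String city;

    // Argument order matches the constructor calls in the post: id, first, last, city.
    User(int id, String first, String last, String city) {
        this.id = id;
        this.first = first;
        this.last = last;
        this.city = city;
    }

    int getId() { return id; }
    String getFirst() { return first; }
    String getLast() { return last; }
    String getCity() { return city; }

    @Override
    public String toString() {
        return "User{id=" + id + ", first=" + first + ", last=" + last + ", city=" + city + "}";
    }

    public static void main(String[] args) {
        // Build one user from the first sample row and print it.
        User user = new User(1, "john", "doe", "austin");
        System.out.println(user);
    }
}
```

Because the constructor takes its arguments by position, the caller is responsible for passing the values in this order; looking the values up by header name is what keeps that independent of the file layout.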

This code parses the file correctly even if the column order changes, for example:

last,first,id,city
doe,john,1,austin
mary,jane,2,seattle
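
To see why a name-based lookup is independent of column order, it can help to sketch the idea with the plain JDK. The class below is a hypothetical illustration, not how Apache Commons CSV is actually implemented: it maps each header name to its position and then resolves values through that map. It assumes simple comma-separated fields with no quoting or escaping, which a real CSV parser would handle.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Apache Commons CSV.
class HeaderLookupSketch {

    // Maps each column name in the header line to its position.
    // Assumption: fields are plain comma-separated values without quotes.
    static Map<String, Integer> headerIndex(String headerLine) {
        Map<String, Integer> index = new HashMap<>();
        String[] names = headerLine.split(",");
        for (int i = 0; i < names.length; i++) {
            index.put(names[i].trim(), i);
        }
        return index;
    }

    // Returns the value of the named column in one data line.
    static String get(String dataLine, Map<String, Integer> index, String name) {
        Integer position = index.get(name);
        if (position == null) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        // The -1 limit keeps trailing empty fields instead of dropping them.
        return dataLine.split(",", -1)[position].trim();
    }

    public static void main(String[] args) {
        Map<String, Integer> original = headerIndex("id,first,last,city");
        Map<String, Integer> reordered = headerIndex("last,first,id,city");

        // The same column name resolves to the same value in both layouts.
        System.out.println(get("1,john,doe,austin", original, "first"));
        System.out.println(get("doe,john,1,austin", reordered, "first"));
    }
}
```

The header row is read once, and every later lookup goes through the name-to-position map, so reordering the columns only changes the map and not the calling code.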
