如何使用Java使用csv文件导出巨大数据(近100万数据)



我正试图从PostgreSQL数据库导出数据,因为我必须导出的数据量接近100万。我尝试了各种方法,但没有找到解决方案。即使我使用邮递员来调用API,我已经编写了该API来导出csv,邮递员也会关闭。我正在使用react.js下载,但它已经加载了几个小时。我正在张贴出口的代码

public String populateCsvReport(SearchDto searchDto){
List<DetailRecord> myDetailRecord = itsCustomRepo.getDetail(searchDto);
StringWriter sw = new StringWriter();
try(CSVPrinter csvPrinter = new CSVPrinter(sw,CSVFormat.DEFAULT.withHeader("Supplier Number"
"Supplier name"........
)){
myDetailRecord.forEach(mydetail->{
csvPrinter.printRecord(
mydetail.getSuplNum(),
mydetail.getSuplName(),
......................
)
});
return myDetailRecord;

现在,我也尝试将我的代码更改为

myDetailRecord.forEach(mydetail->{
mydetail.getSuplNum(),
mydetail.getSuplName(),
......................
});
csvPrinter.printRecord(
myDetailRecord
);

但它并没有对我的代码产生影响。

在我的控制器中,我做得像

@Getmapping(path="/get-export-detail/csv"){
public RespnseEntity<String> generateMydetailExport(SearchDto searchDto){
return ResponseEntity.ok()
.header("Content-Disposition","attachment;fileName="+"myDetails.csv")
.contentType(MediaType.parseMediaType("text/csv"))
.body(callingService.populateCSVForDetail(searchDto));

这里我使用react.js代码导出文件

const exportOnClick=()=>{
callingDetailsService.export(param)
.then(response)=>{
let mime = "test/csv";
let fileName = "myDetail.csv";
util.downloadFile(response.data,fileName,mime);

这是我的自定义存储库代码

@Repository
public class ItsCustomRepo{
@PersistanceContext
private EntityManager entityManager;
public List<DetailRecord> getDetail(List<SearchCriteria> params){
List<DetailRecord> listOfDetail = new ArrayList<>();
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<DetailEntity> cQuery = cb.createQuery(DetailEntity.class);
Root<DetailEntity> rootE = cQuery.from(DetailEntity.class);
String sqlQuery = "select ............."
if(params.size()>0){
for(SearchCriteria param:params){
if(param.getValue()!=null && param.getValue()!=""){
if(param.getOperation().equalIgnoreCase(CriteriaOperationEnum.GREATER_THAN_OR_EQUAL.getOperation()){
if(rootE.get(param.getKey()).getJavaType()==LocalDate.class){
}else if(param.getOperation().equalIgnoreCase(CriteriaOperationEnum.LESS_THAN_OR_EQUAL.getOperation()
//some op
}else{ if(param.getOPeration().equalsIgnoreCase(CriteriaOperationEnum.LIKE.getOperation())){
//some op
}
}
Query query = entityManager.createNativeQuery(sqlQuery);
List<Object[]> objectList  = query.getResultList();
int count  = objectList.size();
objectList.forEach(glObject->{
DetailRecord detailRecord = DetailRecord.builder()
.supl_num(glObject[0])
...................
listOfDetail .add(detailRecord);
});
return listOfDetail;

我的代码很简单,因为我不知道它在哪里失败了,我在运行查询时检查数据库的计数,它很快,而且我可以在调试时看到代码顺利地到达控制器,但之后它会挂上几个小时。我试过使用opencsv、apache-poi等。我不明白它在哪里失败了,有人请帮我。

下面是一些用两种方法生成CSV的示例代码。第一个与您的类似——它获取列表中的所有行,然后创建csv。第二种方法更为";流式传输";因为它可以在从数据库中获取行后立即写出这些行。有了1M行,情况就大不相同了。

import org.h2.jdbcx.JdbcDataSource;
import org.springframework.jdbc.core.JdbcTemplate;
import java.util.List;
import java.util.function.Consumer;
public class CsvSample {
static class Player {
int id;
String name;
int teamId;
Player(int id, String name, int temId) {
this.id = id;
this.name = name;
this.teamId = temId;
}
}
interface PlayerRepo {
void save(Player player);
List<Player> findPlayers(int teamId);
int processPlayers(int teamId, Consumer<Player> callback);
}
static class SimplePlayerRepo implements PlayerRepo {
JdbcTemplate jdbc;
SimplePlayerRepo(JdbcTemplate jdbc) {
this.jdbc = jdbc;
this.jdbc.execute("create table if not exists Player(id int primary key, name varchar(30), team int)");
}
@Override
public void save(Player player) {
int n = jdbc.update(
"update Player set name=?, team=? where id=?",
player.name, player.teamId, player.id);
if (n == 0) {
jdbc.update(
"insert into Player(name, team, id) values (?, ?, ?)",
player.name, player.teamId, player.id);
}
}
@Override
public List<Player> findPlayers(int teamId) {
return jdbc.query(
"select id, name, team from Player where team=?",
(rs, n) -> new Player(rs.getInt(1), rs.getString(2), rs.getInt(3)),
teamId);
}
@Override
public int processPlayers(int teamId, Consumer<Player> callback) {
return jdbc.query(
"select id, name, team from Player where team=?",
rs -> {
int n = 0;
while (rs.next()) {
Player p = new Player(rs.getInt(1), rs.getString(2), rs.getInt(3));
callback.accept(p);
}
return n;
},
teamId);
}
}
public static void main(String[] args) throws Exception {
JdbcDataSource dataSource = new JdbcDataSource();
dataSource.setUrl("jdbc:h2:mem:csvsample;DB_CLOSE_DELAY=-1");
PlayerRepo repo = new SimplePlayerRepo(new JdbcTemplate(dataSource));
// add some players
repo.save(new Player(1, "Kobe", 1));
repo.save(new Player(2, "LeBron", 1));
repo.save(new Player(3, "Shaq", 1));
repo.save(new Player(4, "Kareem", 1));
repo.save(new Player(5, "Magic", 1));
repo.save(new Player(6, "Larry", 2));
repo.save(new Player(7, "Jason", 2));
// generate CSV from List
repo.findPlayers(1).forEach(player -> {
System.out.println(player.id + "," + player.name);
});
System.out.println("----");
// generate CSV with callback
repo.processPlayers(1, player -> {
System.out.println(player.id + "," + player.name);
});
}
}

因此,在您的情况下,我将向您的存储库类添加一个方法。它应该包含来自getDetail方法的所有逻辑,直到您到达显示Query query = entityManager.createNativeQuery(sqlQuery);:的行

public int processSearchResults(List<SearchCriteria> params, Consumer<DetailRecord> callback){
// instead of this:
// Query query = entityManager.createNativeQuery(sqlQuery);
Session session = entityManager.unwrap(Session.class);
return session.doWork(new Work() {
@Override
public void execute(Connection connection) throws SQLException {
Statement stmt = connection.createQuery();
ResultSet rs = stmt.executeQuery(sqlQuery);
int n = 0;
while (rs.next()) {
DetailRecord detailRecord = DetailRecord.builder()
.supl_num(rs.getObject(1))
// .....
.build();
callback.accept(detailRecord);
n++;
}
rs.close();
stmt.close();
return n;
}
});
}

然后,你的generateMydetailExport可能看起来像这样:

@Getmapping(path="/get-export-detail/csv", produces="text/csv") {
public void generateMydetailExport(SearchDto searchDto, PrintWriter out) {
CSVPrinter csvPrinter = new CSVPrinter(out, CSVFormat.DEFAULT.withHeader("Supplier Number", /* ... */));
itsCustomRepo.processSearchResults(searchDto, detail -> {
csvPrinter.printRecord(
mydetail.getSuplNum(),
mydetail.getSuplName(),
// .....
);
});
}

最新更新