我使用Scroll API从我们的Elastic Search中获取超过10,000个文档,但是,每当我的代码试图查询超过10k时,我就会得到以下错误:
Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
这是我的代码:
try {
// 1. Build Search Request
final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L));
SearchRequest searchRequest = new SearchRequest(eventId);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(queryBuilder);
searchSourceBuilder.size(limit);
searchSourceBuilder.profile(true); // used to profile the execution of queries and aggregations for a specific search
searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS)); // optional parameter that controls how long the search is allowed to take
if(CollectionUtils.isNotEmpty(sortBy)){
for (int i = 0; i < sortBy.size(); i++) {
String sortByField = sortBy.get(i);
String orderByField = orderBy.get(i < orderBy.size() ? i : orderBy.size() - 1);
SortOrder sortOrder = (orderByField != null && orderByField.trim().equalsIgnoreCase("asc")) ? SortOrder.ASC : SortOrder.DESC;
if(keywordFields.contains(sortByField)) {
sortByField = sortByField + ".keyword";
} else if(rawFields.contains(sortByField)) {
sortByField = sortByField + ".raw";
}
searchSourceBuilder.sort(new FieldSortBuilder(sortByField).order(sortOrder));
}
}
searchSourceBuilder.sort(new FieldSortBuilder("_id").order(SortOrder.ASC));
if (includes != null) {
String[] excludes = {""};
searchSourceBuilder.fetchSource(includes, excludes);
}
if (CollectionUtils.isNotEmpty(aggregations)) {
aggregations.forEach(searchSourceBuilder::aggregation);
}
searchRequest.scroll(scroll);
searchRequest.source(searchSourceBuilder);
SearchResponse resp = null;
try {
resp = client.search(searchRequest, RequestOptions.DEFAULT);
String scrollId = resp.getScrollId();
SearchHit[] searchHits = resp.getHits().getHits();
// Pagination - will continue to call ES until there are no more pages
while(searchHits != null && searchHits.length > 0){
SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
scrollRequest.scroll(scroll);
resp = client.scroll(scrollRequest, RequestOptions.DEFAULT);
scrollId = resp.getScrollId();
searchHits = resp.getHits().getHits();
}
// Clear scroll request to release the search context
ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
clearScrollRequest.addScrollId(scrollId);
client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
} catch (Exception e) {
String msg = "Could not get search result. Exception=" + ExceptionUtilsEx.getExceptionInformation(e);
throw new Exception(msg);
我正在从这个链接实现解决方案:https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-search-scroll.html
谁能告诉我我做错了什么,我需要做什么,以获得超过10000滚动api?
如果您的迭代花费超过5分钟,那么您需要调整滚动时间。修改这一行,以确保滚动上下文不会在1分钟后消失
final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(10L));
并删除这个:
searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS)); // optional parameter that controls how long the search is allowed to take