MongoDB: Aggregation ($sort) on a union of collections is very slow


[
    { "$match": { "$and": [ { ... } ] } },
    // repeat this chunk for each collection
    {
        "$unionWith": {
            "coll": "anotherCollection",
            "pipeline": [
                { "$match": { "$and": [ { ... } ] } }
            ]
        }
    },
    // Then an overall limit/handle pagination for all the unioned results
    // UPDATE: Realised the sort is the culprit
    { "$sort": { "createdAt": -1 } },
    { "$skip": 0 },
    { "$limit": 50 }
]





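IMPORTANT: keep track of the last item you returned instead of skipping a number of elements. If you don't, you lose track of the pagination and may never return some documents, or return some of them twice, which is far worse than being slow.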
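
To make that failure mode concrete, here is a small in-memory sketch (plain arrays, no MongoDB involved). It assumes a new document arrives between two page requests; the helper names `pageBySkip` and `pageAfter` are made up for illustration, and `createdAt` is a plain number rather than a date.

```javascript
// In-memory stand-in for a collection; createdAt is a plain number here.
const docs = [5, 4, 3, 2, 1].map((t) => ({ createdAt: t }));

// Newest-first copy of the data, like { "$sort": { "createdAt": -1 } }.
const newestFirst = (items) =>
  [...items].sort((a, b) => b.createdAt - a.createdAt);

// Offset pagination: the equivalent of $sort + $skip + $limit.
function pageBySkip(items, skip, limit) {
  return newestFirst(items).slice(skip, skip + limit);
}

// Keyset pagination: keep only items older than the last one already returned.
function pageAfter(items, lastCreatedAt, limit) {
  return newestFirst(items)
    .filter((d) => lastCreatedAt === null || d.createdAt < lastCreatedAt)
    .slice(0, limit);
}

const firstPage = pageBySkip(docs, 0, 2); // createdAt 5 and 4
docs.push({ createdAt: 6 }); // a new document arrives between requests

// Skipping 2 now lands on createdAt 4 and 3: document 4 is returned twice.
const secondBySkip = pageBySkip(docs, 2, 2);

// Tracking the last item instead yields createdAt 3 and 2: no repeat.
const lastSeen = firstPage[firstPage.length - 1].createdAt;
const secondByCursor = pageAfter(docs, lastSeen, 2);
```

The skip-based second page repeats a document because the offset shifted under it, while the cursor-based page does not depend on how many documents sit in front of it.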

[
    {
        "$match": {
            "$and": [
                { ... },
                // filters out everything you already saw, so no skip is needed;
                // $lt rather than $gt because the sort below is descending
                { "_id": { "$lt": lastKnownIdOfCollectionA } }
            ]
        }
    },
    { "$sort": { "createdAt": -1 } }, // this sort can use an index
    { "$limit": 50 }, // you may get 0 documents but at most 50; you don't care about the rest
    // repeat this chunk for each collection
    {
        "$unionWith": {
            "coll": "anotherCollection",
            "pipeline": [
                {
                    "$match": {
                        "$and": [
                            { ... },
                            { "_id": { "$lt": lastKnownIdOfCollectionB } } // same early filter, per collection
                        ]
                    }
                },
                { "$sort": { "createdAt": -1 } }, // this sort can use an index
                { "$limit": 50 } // at most 50 from this collection too
            ]
        }
    },
    // At this point you have MAX 100 elements, so no index is needed for this sort :)
    { "$sort": { "createdAt": -1 } },
    { "$skip": 0 },
    { "$limit": 50 }
]
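
If you have more than two collections, the pipeline above can be generated instead of written by hand. The following is only a sketch under a few assumptions: the helper names (`branchStages`, `buildPagePipeline`, `advanceCursors`) and the extra `source` field are mine, not part of the original pipeline, and it uses `$lt` on `_id` because with a newest-first sort the documents you have not seen yet are the older ones. The `source` tag is added with `$set` so that the caller can tell which collection each returned document came from and update the right cursor.

```javascript
// Hypothetical helper: builds the per-collection stages (filter, sort, limit).
// `source` is an added tag field, not something the original pipeline has.
function branchStages(name, baseFilter, lastId, pageSize) {
  const conditions = [baseFilter];
  if (lastId !== undefined) {
    // Descending sort, so unseen documents have a smaller _id than the last one returned.
    conditions.push({ _id: { $lt: lastId } });
  }
  return [
    { $match: { $and: conditions } },
    { $sort: { createdAt: -1 } },
    { $limit: pageSize },
    { $set: { source: name } },
  ];
}

// Builds the full pipeline. Run it on collections[0]; the others are unioned in.
function buildPagePipeline(collections, baseFilter, cursors, pageSize) {
  const [first, ...rest] = collections;
  const pipeline = branchStages(first, baseFilter, cursors[first], pageSize);
  for (const name of rest) {
    pipeline.push({
      $unionWith: {
        coll: name,
        pipeline: branchStages(name, baseFilter, cursors[name], pageSize),
      },
    });
  }
  // At most collections.length * pageSize documents reach this final sort.
  pipeline.push({ $sort: { createdAt: -1 } }, { $limit: pageSize });
  return pipeline;
}

// Records, per collection, the _id of the last (oldest) document in the page.
// Relies on the page being sorted newest first, as the pipeline returns it.
function advanceCursors(cursors, page) {
  const next = { ...cursors };
  for (const doc of page) {
    next[doc.source] = doc._id;
  }
  return next;
}
```

You would pass the result to `aggregate` on the first collection, then feed the returned page into `advanceCursors` and keep the result for the next request. A collection that contributed nothing to the page simply keeps its previous cursor.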

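In this example I filter early on _id, which also embeds the creation timestamp. If your filtering is not related to the creation date, you may need to define the identifier that fits best. Keep in mind that the identifier must be unique, but you can combine several values (e.g. createdAt + randomizedId).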
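
A combined identifier like that could look roughly as follows. This is a sketch, not a drop-in: `olderThan` and `compositeBranch` are hypothetical names, the cursor is assumed to be the `(createdAt, _id)` pair of the last document of the previous page, and `_id` stands in for whatever unique tie-breaker you choose.

```javascript
// Hypothetical helper: builds a $match condition for a descending
// (createdAt, _id) cursor taken from the last document of the previous page.
function olderThan(cursor) {
  return {
    $or: [
      { createdAt: { $lt: cursor.createdAt } },
      // Same timestamp: fall back to the unique tie-breaker.
      { createdAt: cursor.createdAt, _id: { $lt: cursor._id } },
    ],
  };
}

// The sort has to use the same fields, in the same order and direction, as the cursor.
const compositeSort = { createdAt: -1, _id: -1 };

// Per-collection stages; pass null as the cursor for the first page.
function compositeBranch(baseFilter, cursor, pageSize) {
  const conditions = [baseFilter];
  if (cursor !== null) {
    conditions.push(olderThan(cursor));
  }
  return [
    { $match: { $and: conditions } },
    { $sort: compositeSort },
    { $limit: pageSize },
  ];
}
```

For the per-collection sort to stay cheap you would most likely want a compound index on the same fields as `compositeSort`; I have not measured that for this case, so check the query plan on your own data.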