I have a collection with the following data (the collection contains more than 10 million records):
> db.LogBuff.find()
{ "_id" : ObjectId("578899d5d2b76f77d083f16c"), "SUBJECT" : "DD", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f16d"), "SUBJECT" : "AA", "SYS" : "B" }
{ "_id" : ObjectId("578899d5d2b76f77d083f16e"), "SUBJECT" : "BB", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f16f"), "SUBJECT" : "AA", "SYS" : "C" }
{ "_id" : ObjectId("578899d5d2b76f77d083f170"), "SUBJECT" : "BB", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f171"), "SUBJECT" : "BB", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f172"), "SUBJECT" : "CC", "SYS" : "B" }
{ "_id" : ObjectId("578899d5d2b76f77d083f173"), "SUBJECT" : "AA", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f174"), "SUBJECT" : "CC", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f175"), "SUBJECT" : "DD", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f176"), "SUBJECT" : "AA", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f177"), "SUBJECT" : "BB", "SYS" : "C" }
{ "_id" : ObjectId("578899d5d2b76f77d083f178"), "SUBJECT" : "CC", "SYS" : "D" }
{ "_id" : ObjectId("578899d5d2b76f77d083f179"), "SUBJECT" : "DD", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f17a"), "SUBJECT" : "AA", "SYS" : "B" }
{ "_id" : ObjectId("578899d5d2b76f77d083f17b"), "SUBJECT" : "BB", "SYS" : "B" }
{ "_id" : ObjectId("578899d5d2b76f77d083f17c"), "SUBJECT" : "AA", "SYS" : "A" }
{ "_id" : ObjectId("578899d5d2b76f77d083f17d"), "SUBJECT" : "CC", "SYS" : "C" }
I want to get output like this:
{ "_id" : { "SUBJECT" : "CC", "SYS" : "C" }, "COUNT" : 1 }
{ "_id" : { "SUBJECT" : "DD", "SYS" : "A" }, "COUNT" : 3 }
{ "_id" : { "SUBJECT" : "AA", "SYS" : "B" }, "COUNT" : 2 }
{ "_id" : { "SUBJECT" : "AA", "SYS" : "C" }, "COUNT" : 1 }
{ "_id" : { "SUBJECT" : "CC", "SYS" : "B" }, "COUNT" : 1 }
{ "_id" : { "SUBJECT" : "BB", "SYS" : "A" }, "COUNT" : 3 }
{ "_id" : { "SUBJECT" : "BB", "SYS" : "C" }, "COUNT" : 1 }
{ "_id" : { "SUBJECT" : "AA", "SYS" : "A" }, "COUNT" : 3 }
{ "_id" : { "SUBJECT" : "CC", "SYS" : "A" }, "COUNT" : 1 }
{ "_id" : { "SUBJECT" : "CC", "SYS" : "D" }, "COUNT" : 1 }
{ "_id" : { "SUBJECT" : "BB", "SYS" : "B" }, "COUNT" : 1 }
Here is my code:
db.LogBuff.mapReduce(
    function () {
        emit({ SUBJECT: this.SUBJECT, SYS: this.SYS }, this.SYS);
    },
    function (key, values) {
        return $count:1  // <- stuck here: not valid JavaScript, unsure what to return
    }
)
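One way the reduce step could be completed is to emit 1 per document and add the emitted values together. The sketch below exercises that idea in plain JavaScript, outside MongoDB. `mapFn` and `reduceFn` have the shape `mapReduce` expects; `runLocally`, `groups`, the local `emit`, and the `sample` array are hypothetical stand-ins for illustration only, not MongoDB APIs.

```javascript
// Same shape as the two functions passed to db.LogBuff.mapReduce.
function mapFn() {
  // Emit the constant 1 per document instead of this.SYS.
  emit({ SUBJECT: this.SUBJECT, SYS: this.SYS }, 1);
}

function reduceFn(key, values) {
  // Add the values rather than counting them, so the result stays correct
  // if reduce is called again on already-reduced partial sums.
  return values.reduce(function (sum, v) { return sum + v; }, 0);
}

// Hypothetical local harness (not a MongoDB API): groups emitted values by key.
let groups = {};
function emit(key, value) {
  const id = JSON.stringify(key);
  if (!groups[id]) groups[id] = { key: key, values: [] };
  groups[id].values.push(value);
}

function runLocally(docs) {
  groups = {};
  docs.forEach(function (doc) { mapFn.call(doc); });
  return Object.keys(groups).map(function (id) {
    const g = groups[id];
    // Like MongoDB, skip reduce for keys that received a single value.
    const value = g.values.length === 1 ? g.values[0] : reduceFn(g.key, g.values);
    return { _id: g.key, value: value };
  });
}

// A few documents in the same shape as the collection above.
const sample = [
  { SUBJECT: "DD", SYS: "A" },
  { SUBJECT: "AA", SYS: "B" },
  { SUBJECT: "DD", SYS: "A" },
  { SUBJECT: "DD", SYS: "A" },
  { SUBJECT: "AA", SYS: "B" }
];
const result = runLocally(sample);
console.log(JSON.stringify(result));
```

In the mongo shell, the same two functions would be passed as the first two arguments of db.LogBuff.mapReduce, together with an `out` option as the third argument. Note that mapReduce names the result field `value` rather than COUNT.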
Because of the limitation described below, I cannot use the aggregation approach. I tried the following aggregation code:
db.LogBuff.aggregate([ {"$group" : {_id:{SUBJECT:"$SUBJECT",SYS:"$SYS"},COUNT:{$sum:1}}}, {$sort:{_id:1}},])
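This works for a limited number of records, but on a large number of records it returns the error below (note: I am not the root user, so I cannot change the configuration):
assert: command failed: { "ok" : 0, "errmsg" : "Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.", "code" : 16819 } : aggregate failed
_getErrorWithCode@src/mongo/shell/utils.js:25:13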
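Try passing the allowDiskUse option, which lets the sort spill to temporary files on disk instead of aborting at the in-memory limit: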
db.LogBuff.aggregate([ {"$group" : {_id:{SUBJECT:"$SUBJECT",SYS:"$SYS"},COUNT:{$sum:1}}}, {$sort:{_id:1}}], {allowDiskUse: true})