MongoDB - Error "Too many results for query, truncating output" with $geoNear



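I am running a $geoNear query on my sharded cluster (6 nodes with 3 replica sets, each having 2 shardsvr and 1 arbiter). I expect the query to return 1.1 million documents, but I only receive ~130.xxx documents. I use the Java driver to issue the query and process the data (for now, I just count the returned documents). I am using MongoDB 3.2.9 and the latest Java driver.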

The mongod log shows the following error, which is caused by the output document being larger than 16MB:

2016-10-10T12:00:22.933+0200 W COMMAND  [conn22] Too many geoNear results for query { location: { $nearSphere: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx] }, $maxDistance: 3900.0 } }, truncating output.
2016-10-10T12:00:22.951+0200 I COMMAND  [conn22] command mydb.data command: geoNear { geoNear: "data", near: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx ] }, 
    num: 50000000, maxDistance: 3900.0, query: {}, spherical: true, distanceMultiplier: 1.0, includeLocs: true } keyUpdates:0 writeConflicts:0 numYields:890 reslen:16777310 
    locks:{ Global: { acquireCount: { r: 1784 } }, Database: { acquireCount: { r: 892 } }, Collection: { acquireCount: { r: 892 } } } protocol:op_query 589ms
2016-10-10T12:00:23.183+0200 I COMMAND  [conn22] getmore mydb.data query: { aggregate: "data", pipeline: [ { $geoNear: { near: { type: "Point", coordinates: [ 10.xxxx, 52.xxxxx ] }, 
    distanceField: "dist.calculated", limit: 50000000, maxDistance: 3900.0, query: {}, spherical: true, distanceMultiplier: 1.0, includeLocs: "dist.location" } }, { $project: { _id: false, 
    dist: { calculated: true } } } ], fromRouter: true, cursor: { batchSize: 0 } } cursorid:170255616227 ntoreturn:0 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:43558 
    reslen:1568108 locks:{ Global: { acquireCount: { r: 1786 } }, Database: { acquireCount: { r: 893 } }, Collection: { acquireCount: { r: 893 } } } 820ms

Query:

db.data.aggregate([
   {
      $geoNear:{
         near:{
            type:"Point",
            coordinates:[
               10.xxxx,
               52.xxxxx
            ]
         },
         distanceField:"dist.calculated",
         maxDistance:3900,
         num:50000000,
         includeLocs:"dist.location",
         spherical:true
      }
   }
])

Note that I issued the query both with and without the num parameter; both attempts failed with the error shown above.

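I expected the query to return the data in chunks once it exceeds the document size limit (16 MB). What am I missing? How can I retrieve all the data?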

Edit: When I add a group stage, the query also fails, with the same error in the mongod log:

db.data.aggregate([
   {
      $geoNear:{
         near:{
            type:"Point",
            coordinates:[
               10.xxxx,
               52.xxxxxx
            ]
         },
         distanceField:"dist.calculated",
         maxDistance:3900,
         includeLocs:"dist.location",
         num:2000000,
         spherical:true
      }
   },
   {
      $group:{
         _id:"$root_document"
      }
   }
])

In the meantime, Lungang Fang of the MongoDB staff answered my question on the MongoDB user group. Here is his reply:

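Currently, the "geoNear" aggregation stage is limited to returning results that fit within the 16MB BSON size limit. This is related to an issue in earlier versions of MongoDB (see https://jira.mongodb.org/browse/SERVER-13486). Your query hits this issue because "geoNear" returns a single document (containing an array of result documents), and unfortunately the "allowDiskUse" aggregation pipeline option does not help in this case.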

There are two options to consider:

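If you don't need all the results, you can limit the size of the "geoNear" aggregation result using the num, limit, or maxDistance options.

If you need all the results, you can use the find() operator, which is not limited to the maximum BSON size because it returns a cursor.

Below is a test I did on MongoDB 3.2.10 for your reference.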

Create a "2dsphere" index for the specified collection:

 db.coll.createIndex({location: '2dsphere'})

Create and insert several large documents:

 var padding = ''; for (var j = 0; j < 15; j++) { for (var i = 1024*128; i > 0; --i) { padding += '12345678'; } }
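As a quick check of why this test triggers the limit, the size of the padding string built by the loop above can be worked out without a server. The sketch below is plain JavaScript with no MongoDB calls; the constant names are illustrative only, and it assumes one byte per character, which holds for these ASCII digits in UTF-8.

```javascript
// Size arithmetic for the padding string (runs without a server).
var CHUNK = '12345678';      // 8 ASCII characters = 8 bytes in UTF-8
var OUTER = 15;              // outer loop iterations
var INNER = 1024 * 128;      // inner loop iterations

var paddingBytes = OUTER * INNER * CHUNK.length;
var bsonLimitBytes = 16 * 1024 * 1024;   // 16MB BSON document limit

console.log(paddingBytes);                       // 15728640, i.e. 15 MiB
console.log(paddingBytes < bsonLimitBytes);      // true: one document fits
console.log(2 * paddingBytes > bsonLimitBytes);  // true: two do not fit together
```

Each test document is therefore just under the limit on its own, so any "geoNear" result that has to hold two or more of them in one result document exceeds 16MB.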

 db.coll.insert({location:{type:"Point", coordinates:[-73.861, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.862, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.863, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.864, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.865, 40.73]}, padding:padding})
 db.coll.insert({location:{type:"Point", coordinates:[-73.866, 40.73]}, padding:padding})

Query using "geoNear"; the server log shows "Too many geoNear results …, truncating output":
 db.coll.aggregate(
     [
         {
             $geoNear:{
                 near:{type:"Point", coordinates:[-73.86, 40.73]},
                 distanceField:"dist.calculated",
                 maxDistance:150000000,
                 spherical:true
             }
         },
         {$project: {location:1}}
     ]
 )

Query using "find"; all expected documents are returned:
 // This and following "var" are necessary to avoid the screen being flushed by padding string.
 var cursor = db.coll.find (
     {
         location: {
             $near: {
                 $geometry:{type:"Point", coordinates:[-73.86, 40.73]},
                 $maxDistance:150000  // $near takes $maxDistance (meters for GeoJSON points)
             }
         }
     }
 )
 // It is necessary to iterate through the cursor. Otherwise, the query is not actually executed.
 var x = cursor.next()
 x._id
 var x = cursor.next()
 x._id
 ... 

Regards, Lungang
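Applied to the query from the question, the find() approach from the reply might look like the sketch below. It is only an illustration under stated assumptions: buildNearFilter and countAll are hypothetical helper names, the coordinates in the usage comment are placeholders because the question elides the real ones, and it assumes a deployment where find() with $nearSphere is available. As far as I know, on sharded collections that requires MongoDB 4.0 or later, so it may not apply to the 3.2.9 cluster in the question.

```javascript
// Build a find() filter that mirrors the $geoNear stage from the question.
// lng, lat and maxDistanceMeters are supplied by the caller.
function buildNearFilter(lng, lat, maxDistanceMeters) {
  return {
    location: {
      $nearSphere: {
        $geometry: { type: 'Point', coordinates: [lng, lat] },
        $maxDistance: maxDistanceMeters   // meters for GeoJSON points
      }
    }
  };
}

// Count results by walking a cursor one document at a time, so the
// results never have to fit into a single 16MB result document.
// Works with any object exposing hasNext() and next(), such as a
// mongo shell cursor.
function countAll(cursor) {
  var n = 0;
  while (cursor.hasNext()) {
    cursor.next();
    n += 1;
  }
  return n;
}

// Hypothetical mongo shell usage (placeholder coordinates):
//   countAll(db.data.find(buildNearFilter(10.0, 52.0, 3900)))
```

Unlike the $geoNear stage, this filter does not add a computed distance field to each document; results are still returned nearest first.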
