pymongo: bson.errors.InvalidBSON: 'utf8' codec can't decode byte 0xc0



I have a script that copies data from one collection to another. Sometimes the script stops with the following error:

Traceback (most recent call last):
  File "request_archive.py", line 80, in <module>
    ret = archive_requests(dbType)
  File "request_archive.py", line 41, in archive_requests
    for doc in reportCursor : 
  File "/usr/local/lib64/python2.7/site-packages/pymongo/cursor.py", line 1176, in next
    if len(self.__data) or self._refresh():
  File "/usr/local/lib64/python2.7/site-packages/pymongo/cursor.py", line 1087, in _refresh
    self.__send_message(q)
  File "/usr/local/lib64/python2.7/site-packages/pymongo/cursor.py", line 970, in __send_message
    codec_options=self.__codec_options)
  File "/usr/local/lib64/python2.7/site-packages/pymongo/cursor.py", line 1057, in _unpack_response
    return response.unpack_response(cursor_id, codec_options)
  File "/usr/local/lib64/python2.7/site-packages/pymongo/message.py", line 945, in unpack_response
    return bson.decode_all(self.documents, codec_options)
bson.errors.InvalidBSON: 'utf8' codec can't decode byte 0xc0 in position 2: invalid start byte

When I look for the document that causes the problem, it usually looks like this in the mongo shell:

{
    "_id" : ObjectId("38636f733444373635323637"),
    "mobiles" : "..��..��..��..��..��..��..��..��..��..��etc/passwd",
    "requestDate" : ISODate("2018-03-19T09:32:45.000Z"),
    "isCopied" : NumberLong(0)
}
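
The decoding failure can be reproduced without MongoDB. The sketch below assumes the undecodable bytes in "mobiles" are 0xC0 0xAF (an overlong encoding of "/" often seen in path traversal probes); the traceback only confirms a 0xC0 byte at position 2, so the exact byte sequence is an assumption.

```python
# Assumed raw bytes of the "mobiles" value, shortened to two repetitions.
# Only the 0xC0 byte at position 2 is confirmed by the traceback.
raw = b"..\xc0\xaf..\xc0\xafetc/passwd"

error = None
try:
    raw.decode("utf-8")  # strict decoding is the default
except UnicodeDecodeError as exc:
    error = exc  # keep a reference; Python 3 clears `exc` after the block
    print(exc)

# The two lenient error handlers that Python's codecs provide:
ignored = raw.decode("utf-8", "ignore")    # drops the invalid bytes
replaced = raw.decode("utf-8", "replace")  # substitutes U+FFFD for each one

print(repr(ignored))
print(repr(replaced))
```

With "ignore" the invalid bytes disappear silently, so the stored value and the decoded value no longer match; "replace" keeps a visible marker for every byte that could not be decoded.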

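How can I handle this?

I also can't wrap it in a try/except, because the error is raised on the line that iterates over the cursor. I found some answers on SO, but they did not work.

I am using Python 2.7 and pymongo v3.6.0.

Edit 1:

This is how I copy the data: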

findData = {'isCopied' : 0 , 'requestDate' : { '$lte' : today } }
collection1Cursor = collection1.find(findData)
for doc in collection1Cursor : # getting error in this line
    updateArr.append(doc['_id'])
    doc.pop('isCopied', None)
    dataArr.append(doc)
collection2.insert(dataArr,continue_on_error=True)
collection1.update_many({'_id' : {'$in' : updateArr}},{'$set' : {'isCopied' : 1}})

Pass this option to MongoClient, either in the connection string or as a keyword argument, as shown below.

It worked for me.

?unicode_decode_error_handler=ignore
host='',unicode_decode_error_handler='ignore'
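
As a minimal sketch of the keyword-argument form, the helper below builds a client that skips undecodable bytes instead of raising InvalidBSON. The function name make_client and the localhost address are hypothetical; unicode_decode_error_handler is the MongoClient option named above, and it also accepts 'replace' and the default 'strict'.

```python
def make_client(host, **kwargs):
    """Return a MongoClient that ignores invalid UTF-8 bytes in strings.

    `make_client` is a hypothetical helper name used for illustration.
    """
    # Imported inside the function so this sketch only needs pymongo
    # when it is actually called.
    from pymongo import MongoClient

    return MongoClient(
        host,
        unicode_decode_error_handler="ignore",  # or "replace" to keep a marker
        **kwargs
    )


if __name__ == "__main__":
    # The address is an assumption; substitute your own host or URI.
    client = make_client("localhost")
    print(client.codec_options.unicode_decode_error_handler)
```

Note that 'ignore' changes what the copy script writes to the second collection: the copied string loses the invalid bytes. If that matters, 'replace' at least leaves a visible U+FFFD character in their place.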
