MongoDB insert_many: how to ignore duplicate inserts inside a session transaction



I was previously using PyMongo without any transactions or sessions, and I successfully inserted documents like this:

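import pymongo.errors

try:
    _ = db[collection].insert_many(dataset, ordered=False)
except pymongo.errors.BulkWriteError as e:
    # Keep only the write errors that are not duplicate-key errors (code 11000).
    err = [x for x in e.details['writeErrors'] if x['code'] != 11000]
    if len(err) > 0:
        raise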

The code above successfully ignores duplicate key errors, which is exactly what I want.
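The filtering step in that snippet can be pulled out into a small helper. This is only a sketch: the name non_duplicate_errors and the sample dictionary are made up for illustration, and the input is assumed to have the same shape as BulkWriteError.details (a dict with a 'writeErrors' list whose entries carry a 'code').

```python
DUPLICATE_KEY = 11000  # MongoDB error code for a duplicate key


def non_duplicate_errors(details):
    """Return the write errors that are not duplicate-key errors.

    `details` is assumed to look like BulkWriteError.details: a dict with
    a 'writeErrors' list of dicts, each carrying a numeric 'code'.
    """
    return [err for err in details.get("writeErrors", [])
            if err.get("code") != DUPLICATE_KEY]


# Hypothetical sample shaped like BulkWriteError.details (illustration only).
sample = {"writeErrors": [{"index": 0, "code": 11000},
                          {"index": 2, "code": 121}]}
remaining = non_duplicate_errors(sample)
```

In the except block, re-raising only when non_duplicate_errors(e.details) is non-empty gives the same behaviour as the inline list comprehension.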

Now I have upgraded to MongoDB 4.0 and am trying the new transactions API, so I attempted the same thing inside a session:

def do_insert(db, dataset, session):
    try:
        _ = db[collection].insert_many(dataset, ordered=False, session=session)
    except pymongo.errors.DuplicateKeyError as e:
        pass

However, the operation also produces an OperationFailure error, and I get something like this:

ERROR: test_insert_duplicate_categories (__main__.TestDefaultAnnotations)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/xargon/Dropbox/infermatica/code/alchera/altrack/altrack/tests/test_mongodb_default.py", line 152, in test_insert_duplicate_categories
insert_dataset(db, ds)
File "/Users/xargon/Dropbox/infermatica/code/alchera/altrack/altrack/data/default.py", line 269, in insert_dataset
session.commit_transaction()
File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/client_session.py", line 393, in commit_transaction
self._finish_transaction_with_retry("commitTransaction")
File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/client_session.py", line 457, in _finish_transaction_with_retry
return self._finish_transaction(command_name)
File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/client_session.py", line 452, in _finish_transaction
parse_write_concern_error=True)
File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/database.py", line 514, in _command
client=self.__client)
File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/pool.py", line 579, in command
unacknowledged=unacknowledged)
File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/network.py", line 150, in command
parse_write_concern_error=parse_write_concern_error)
File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/helpers.py", line 155, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
pymongo.errors.OperationFailure: Transaction 1 has been aborted.

The calling code is:

with db.client.start_session() as session:
    try:
        session.start_transaction()
        do_insert(db, dataset, session)
        session.commit_transaction()
    except Exception as e:
        session.abort_transaction()
        raise

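How can I ignore this duplicate key error in a transaction setting? The problem is that even though I swallow the duplicate key exception, the transaction now seems to be left in an inconsistent state, so the commit throws that exception.

My use case is that users may try to insert duplicates, and if a record already exists the database should silently ignore the insert.

How can I ignore this duplicate key error in a transaction setting?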

As you have found, in the current stable version of MongoDB (v4.0) a DuplicateKey error aborts the transaction.

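This is because the duplicate key is only checked after the data has been written, and the convention is then to abort the WiredTiger storage engine transaction. The same applies to any index constraint, which leads to the same problem, for example when inserting an invalid location format into a geospatial index.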
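Since the duplicate key error cannot be swallowed once it has happened inside the transaction, one possible workaround is to avoid triggering it in the first place: look up which keys already exist and insert only the missing documents. The following is a rough sketch, not a definitive solution. The function name insert_missing is hypothetical, and it assumes every document carries an explicit, hashable '_id' and that '_id' is the only unique constraint involved.

```python
def insert_missing(collection, dataset, session=None):
    """Insert only the documents whose _id is not yet in the collection.

    Assumption: each document has an explicit, hashable '_id', and '_id'
    is the only unique index that matters. Returns the number inserted.
    """
    # Drop duplicates inside the dataset itself, keeping the first occurrence.
    unique = {}
    for doc in dataset:
        unique.setdefault(doc["_id"], doc)

    # Ask the server which keys already exist. Passing the same session
    # makes the read part of the transaction when one is active.
    cursor = collection.find({"_id": {"$in": list(unique)}},
                             {"_id": 1}, session=session)
    existing = {doc["_id"] for doc in cursor}

    new_docs = [doc for key, doc in unique.items() if key not in existing]
    if new_docs:  # insert_many rejects an empty list
        collection.insert_many(new_docs, ordered=False, session=session)
    return len(new_docs)
```

In the calling code from the question, this would stand in for do_insert, for example insert_missing(db[collection], dataset, session). A concurrent writer can still insert one of the same keys between the lookup and the insert, in which case the transaction is aborted as described above and the whole transaction has to be retried.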

I have upgraded to MongoDB 4.0 and am trying the new transactions API, so I attempted the same thing inside a session.

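Note that upgrading to MongoDB 4.0 does not mean you also have to change your code to use multi-document transactions. They are only needed when the use case requires atomicity for updates to multiple documents, or consistency between reads of multiple documents.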
