I have a sharded cluster made up of the following 3 shards:
- bp-rs0
- bp-rs1
- bp-rs3
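I want to remove one shard, bp-rs3.
I ran db.adminCommand( { removeShard: "bp-rs3" } ) and got the typical acknowledgement I expected.
It said I needed to drop or move a database that was no longer needed, so I dropped it. I'm not sure whether that is what caused my problem.
For several hours now, running db.adminCommand( { removeShard: "bp-rs3" } ) has returned exactly the following draining message: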
{
"msg" : "draining ongoing",
"state" : "ongoing",
"remaining" : {
"chunks" : 334,
"dbs" : 0
},
"note" : "you need to drop or movePrimary these databases",
"dbsToMove" : [ ],
"ok" : 1,
"operationTime" : Timestamp(1629235413, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1629235413, 2),
"signature" : {
"hash" : BinData(0,"IkfHFSkxh7gQheeWlXsI/tTjU1U="),
"keyId" : 6978594490403520515
}
}
}
Note the 334 remaining chunks. That number has not changed for a long time.
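To make "has not changed for a long time" concrete, here is a minimal sketch of how one might compare successive removeShard responses. The function name isDrainingStalled and the minSamples threshold are hypothetical, not part of MongoDB; the only fields it relies on are "state" and "remaining.chunks" as they appear in the output above.

```javascript
// Hypothetical helper (not a MongoDB API): decide whether draining looks
// stalled, given removeShard responses collected over time, oldest first.
// Each response would come from db.adminCommand({ removeShard: "bp-rs3" }).
function isDrainingStalled(responses, minSamples = 3) {
  if (responses.length < minSamples) return false; // not enough evidence yet
  const recent = responses.slice(-minSamples);
  // Only "ongoing" responses count; a finished drain is not a stalled one.
  if (!recent.every((r) => r.state === "ongoing")) return false;
  const first = recent[0].remaining.chunks;
  const last = recent[recent.length - 1].remaining.chunks;
  // Stalled if the chunk count did not go down across the window.
  return last >= first;
}

// Sample data mirroring the output above: 334 chunks on every check.
const samples = [
  { state: "ongoing", remaining: { chunks: 334, dbs: 0 } },
  { state: "ongoing", remaining: { chunks: 334, dbs: 0 } },
  { state: "ongoing", remaining: { chunks: 334, dbs: 0 } },
];
console.log(isDrainingStalled(samples)); // true
```

How far apart the samples should be taken is a judgment call; chunk migrations can be slow, so a few minutes between checks is probably more meaningful than a few seconds.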
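That in itself would not be too much of a problem, but my most heavily used collection can no longer be queried, which means the application it serves is unavailable.
When I try to query my only sharded collection, I get the following error: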
{
"message" : "Encountered non-retryable error during query :: caused by :: Could not find host matching read preference { mode: 'primary' } for set bp-rs1",
"ok" : 0,
"code" : 133,
"codeName" : "FailedToSatisfyReadPreference",
"operationTime" : "Timestamp(1629232940, 1)",
"$clusterTime" : {
"clusterTime" : "Timestamp(1629232944, 2)",
"signature" : {
"hash" : "IlYQ/HU+EWYsm8CL2xtCziX6xtY=",
"keyId" : "6978594490403520515"
}
},
"name" : "MongoError"
}
I don't know why bp-rs1 would be affected. bp-rs0 is the primary shard.
sh.status() returns the following:
--- Sharding Status ---
sharding version: {
"_id" : NumberInt(1),
"minCompatibleVersion" : NumberInt(5),
"currentVersion" : NumberInt(6),
"clusterId" : ObjectId("602d2def7771e35f1961e454")
}
shards:
{ "_id" : "bp-rs0", "host" : "bp-rs0/xxx:27020,xxx:27020", "state" : NumberInt(1) }
{ "_id" : "bp-rs1", "host" : "bp-rs1/xxx:27020", "state" : NumberInt(1) }
{ "_id" : "bp-rs3", "host" : "bp-rs3/xxx:27020", "state" : NumberInt(1), "draining" : true }
active mongoses:
"4.0.3" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: yes
Failed balancer rounds in last 5 attempts: 5
Last reported error: Could not find host matching read preference { mode: "primary" } for set bp-rs1
Time of Reported error: Tue Aug 17 2021 23:09:45 GMT+0100 (British Summer Time)
Migration Results for the last 24 hours:
241 : Success
1 : Failed with error 'aborted', from bp-rs3 to bp-rs1
databases:
{ "_id" : "xxx", "primary" : "bp-rs0", "partitioned" : true, "version" : { "uuid" : UUID("c6301dba-1f34-4043-be6f-1e99dc9a8fb9"), "lastMod" : NumberInt(1) } }
xxx.listings
shard key: { "meta.canonical" : 1 }
unique: false
balancing: true
chunks:
bp-rs0 696
bp-rs1 695
bp-rs3 334
too many chunks to print, use verbose if you want to force print
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : NumberInt(1) }
unique: false
balancing: true
chunks:
bp-rs0 1
{ "_id" : MinKey } -->> { "_id" : MaxKey } on : bp-rs0 Timestamp(1, 0)
Is there anything I can do? Either to roll back and start over, or just to get everything working again?
Thanks in advance
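I connected to bp-rs2 and found that the service had crashed for some reason. I started it again, and the migration completed as I expected.
I don't know the exact cause, but it may have been because I dropped the database while the shard was draining.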
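For anyone hitting the same "Could not find host matching read preference { mode: 'primary' }" error, a rough sketch of how one might check whether the replica set named in the error actually has a primary is shown below. It assumes you can connect to a member of that set and pass it the document returned by rs.status(); the function name describeReplicaSet and the host names in the sample are hypothetical. If the connection itself is refused, that alone suggests the mongod process is down, which is what happened here.

```javascript
// Hypothetical helper (not a MongoDB API): summarise an rs.status() document.
// It reads only the "set" field and each member's "name", "health" and
// "stateStr" fields.
function describeReplicaSet(rsStatus) {
  const members = rsStatus.members || [];
  const primary = members.find((m) => m.stateStr === "PRIMARY");
  const unhealthy = members.filter((m) => m.health !== 1).map((m) => m.name);
  return {
    set: rsStatus.set,
    primary: primary ? primary.name : null, // null means no primary is visible
    unhealthy,
  };
}

// Made-up sample with placeholder host names: one secondary, one unreachable
// member, and therefore no primary.
const exampleStatus = {
  set: "bp-rs1",
  members: [
    { name: "host-a:27020", health: 1, stateStr: "SECONDARY" },
    { name: "host-b:27020", health: 0, stateStr: "(not reachable/healthy)" },
  ],
};
const summary = describeReplicaSet(exampleStatus);
if (summary.primary === null) {
  console.log("No primary in " + summary.set + "; unhealthy: " + summary.unhealthy.join(", "));
}
```

In the mongo shell the call would look like describeReplicaSet(rs.status()) once the function has been pasted in. A null primary matches the balancer's "Last reported error" in the sh.status() output, since chunk migrations to a shard need that shard's primary to be reachable.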