Handling a Downed Shard Node in a MongoDB Sharded Cluster

Copyright notice: this is an original article by the author; reproductions must include a link to the original. https://blog.csdn.net/weixin_39032311/article/details/92805072

Approach: replace the downed shard node with a new shard node

1. Create and start a new shard node (assume the new shard node listens on port 27028)
2. On mongos, remove the configuration of the original shard node

mongos> use admin
mongos> db.runCommand({removeShard:"10.201.81.105:27018"})

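The command may return a response like the following, in which draining cannot finish because a database still has this shard as its primary: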

mongos> db.runCommand({removeShard:"10.201.81.105:27018"})
{
    "msg" : "draining ongoing",
    "state" : "ongoing",
    "remaining" : {
        "chunks" : NumberLong(0),
        "dbs" : NumberLong(1)
    },
    "note" : "you need to drop or movePrimary these databases",
    "dbsToMove" : [
        "mgotest2"
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1560822195, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1560822195, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
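The reply above can also be inspected programmatically to decide what to do next. The following is a minimal sketch in plain JavaScript (runnable with Node.js). `nextRemoveShardAction` is a hypothetical helper name, not part of MongoDB, and it assumes the reply carries the `ok`, `state` and `dbsToMove` fields seen in the transcript.

```javascript
// Hypothetical helper (not a MongoDB API): decide the next step from a
// removeShard reply shaped like the one in the transcript above.
function nextRemoveShardAction(reply) {
  if (reply.ok !== 1) {
    return { action: "error", detail: reply.errmsg || "unknown error" };
  }
  if (reply.state === "completed") {
    return { action: "done" };
  }
  // Databases that still have the draining shard as their primary shard.
  const dbs = reply.dbsToMove || [];
  if (dbs.length > 0) {
    return { action: "movePrimary", databases: dbs };
  }
  // Draining is still in progress and nothing needs manual intervention yet.
  return { action: "wait" };
}

// Example input: the relevant fields of the reply shown above.
const sampleRemoveShardReply = {
  msg: "draining ongoing",
  state: "ongoing",
  dbsToMove: ["mgotest2"],
  ok: 1,
};
console.log(JSON.stringify(nextRemoveShardAction(sampleRemoveShardReply)));
// prints: {"action":"movePrimary","databases":["mgotest2"]}
```

For the reply in this article the helper reports that mgotest2 has to be moved or dropped before the shard can be removed.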

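Note: if the shard being removed is the primary shard (the "home base") of a database, that database must be moved manually with the movePrimary command, or dropped. In the example above, the response indicates that the shard at 10.201.81.105:27018 is the primary shard of the mgotest2 database. This can be confirmed by looking at config.databases: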

mongos> use config
switched to db config
mongos> db.databases.find()
{ "_id" : "mgotest2", "primary" : "shardsvr2", "partitioned" : false, "version" : { "uuid" : UUID("360e1a26-b6f9-44a5-90ea-a148bf854e59"), "lastMod" : 1 } }
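When a cluster holds many databases, the same lookup can be expressed as a small filter. This is a sketch only: `databasesWithPrimaryOn` is a hypothetical helper, and the second sample document ("otherdb") is made up for illustration.

```javascript
// Hypothetical helper: list the databases whose primary shard is `shardName`,
// given documents shaped like those stored in config.databases.
function databasesWithPrimaryOn(databaseDocs, shardName) {
  return databaseDocs
    .filter((doc) => doc.primary === shardName)
    .map((doc) => doc._id);
}

// Sample data: mgotest2 comes from the transcript; "otherdb" is invented.
const sampleDatabases = [
  { _id: "mgotest2", primary: "shardsvr2", partitioned: false },
  { _id: "otherdb", primary: "shardsvr1", partitioned: false },
];
console.log(databasesWithPrimaryOn(sampleDatabases, "shardsvr2")); // [ 'mgotest2' ]
```

In the shell, the input array could be obtained with `db.getSiblingDB("config").databases.find().toArray()`.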

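movePrimary only works while the primary member of the source shard, 10.201.81.105:27018, is running normally; otherwise it fails with the following error: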

mongos> use admin
switched to db admin
mongos> db.runCommand({"moveprimary":"mgotest2","to":"10.201.81.218:27018"})
{
    "ok" : 0,
    "errmsg" : "Could not find host matching read preference { mode: \"primary\" } for set shardsvr2",
    "code" : 133,
    "codeName" : "FailedToSatisfyReadPreference",
    "operationTime" : Timestamp(1560823671, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1560823671, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
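A script that automates this step could recognize the failure above before falling back to the manual procedure. The sketch below assumes the reply shape shown in the transcript; `isShardPrimaryUnreachable` is a hypothetical name, not a MongoDB function.

```javascript
// Error code taken from the transcript above (codeName FailedToSatisfyReadPreference).
const FAILED_TO_SATISFY_READ_PREFERENCE = 133;

// Hypothetical helper: true when a command reply failed because no member
// matching the requested read preference (here: the shard's primary) was found.
function isShardPrimaryUnreachable(reply) {
  return (
    reply.ok === 0 &&
    (reply.code === FAILED_TO_SATISFY_READ_PREFERENCE ||
      reply.codeName === "FailedToSatisfyReadPreference")
  );
}

// Example input: the relevant fields of the failed movePrimary reply.
const sampleMovePrimaryReply = {
  ok: 0,
  errmsg:
    'Could not find host matching read preference { mode: "primary" } for set shardsvr2',
  code: 133,
  codeName: "FailedToSatisfyReadPreference",
};
console.log(isShardPrimaryUnreachable(sampleMovePrimaryReply)); // true
```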

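If the primary 10.201.81.105:27018 cannot be brought back up, remove the database's entry from config.databases directly, as shown below. This deletes only the cluster metadata for mgotest2; data that was stored solely on the downed shard is not recovered by this step.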

mongos> use config
switched to db config
mongos> db.shards.find()
{ "_id" : "shardsvr1", "host" : "shardsvr1/10.201.81.218:27018", "state" : 1 }
{ "_id" : "shardsvr2", "host" : "shardsvr2/10.201.81.105:27018", "state" : 1, "draining" : true }
mongos> db.databases.find()
{ "_id" : "mgotest2", "primary" : "shardsvr2", "partitioned" : false, "version" : { "uuid" : UUID("360e1a26-b6f9-44a5-90ea-a148bf854e59"), "lastMod" : 1 } }
mongos> db.databases.remove({"_id" : "mgotest2"})
WriteResult({ "nRemoved" : 1 })

After these steps, removeShard can remove the original shard node's configuration normally.
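One way to check whether a removal is still pending is to look for shards marked as draining in config.shards. The following is a minimal sketch over documents shaped like the db.shards.find() output in this article; `drainingShards` is a hypothetical helper name.

```javascript
// Hypothetical helper: return the _id of every shard still marked as draining,
// given documents shaped like those stored in config.shards.
function drainingShards(shardDocs) {
  return shardDocs
    .filter((shard) => shard.draining === true)
    .map((shard) => shard._id);
}

// Sample data copied from the db.shards.find() output in this article.
const sampleShards = [
  { _id: "shardsvr1", host: "shardsvr1/10.201.81.218:27018", state: 1 },
  { _id: "shardsvr2", host: "shardsvr2/10.201.81.105:27018", state: 1, draining: true },
];
console.log(drainingShards(sampleShards)); // [ 'shardsvr2' ]
```

An empty result means no shard is currently draining.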

3. Add the configuration of the new shard node

mongos> sh.addShard("shardsvr1/10.201.81.105:27028")
{
    "shardAdded" : "shardsvr1",
    "ok" : 1,
    "operationTime" : Timestamp(1560825183, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1560825183, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
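The argument to sh.addShard has the form replicaSetName/host:port, with further members separated by commas. A small helper can build that string and reject empty input; `shardConnectionString` is a hypothetical name used only for this sketch.

```javascript
// Hypothetical helper: build the "replSetName/host1:port1,host2:port2" string
// that sh.addShard expects for a replica-set shard.
function shardConnectionString(replSetName, hosts) {
  if (typeof replSetName !== "string" || replSetName.length === 0) {
    throw new Error("replica set name is required");
  }
  if (!Array.isArray(hosts) || hosts.length === 0) {
    throw new Error("at least one host:port is required");
  }
  return replSetName + "/" + hosts.join(",");
}

console.log(shardConnectionString("shardsvr1", ["10.201.81.105:27028"]));
// prints: shardsvr1/10.201.81.105:27028
```

The resulting string is what was passed to sh.addShard in the transcript above.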
