Chapter 7: Managing and Maintaining Replica Sets (Read/Write Splitting, Failover, Adding and Removing Nodes)


 
I. Read/Write Splitting


1. Log in to the primary:
./mongo 192.168.56.87:27017 
Insert a document:
testrs:PRIMARY> db.person.insert({"name":"zw","sex":"M","age":19})

testrs:PRIMARY> db.person.find()  -- query on the primary: the document is returned
{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }

2. Log in to the secondary:
./mongo 192.168.56.88:27017 

Query the data in the person collection:

testrs:SECONDARY> db.person.find()
Error: error: { "$err" : "not master and slaveOk=false", "code" : 13435 }
The query fails: by default a secondary does not serve reads.



3. Allow the secondary to serve reads:
testrs:SECONDARY> db.getMongo().setSlaveOk()


testrs:SECONDARY> db.person.find()  -- the secondary can now be queried
{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }

Note that a secondary can only be queried; it does not accept inserts:
testrs:SECONDARY>  db.person.insert({"name":"zw1","sex":"M","age":22})
WriteResult({ "writeError" : { "code" : undefined, "errmsg" : "not master" } }) -- the insert is rejected
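
The behaviour seen in this session can be summarised in a small routing sketch. This is only an illustration, not MongoDB driver code: `canServe`, `route` and the `slaveOk` field are hypothetical names, and the node list is hard-coded sample data using this chapter's addresses.

```javascript
// Hypothetical helper: may this node serve the operation?
function canServe(op, node) {
  if (op === "write") {
    // Only the primary accepts writes ("not master" otherwise).
    return node.stateStr === "PRIMARY";
  }
  if (op === "read") {
    // The primary always serves reads; a secondary only after setSlaveOk().
    return node.stateStr === "PRIMARY" ||
      (node.stateStr === "SECONDARY" && node.slaveOk === true);
  }
  throw new Error("unknown operation: " + op);
}

// Pick a node for the operation, preferring a readable secondary for reads.
function route(op, nodes) {
  const candidates = nodes.filter((n) => canServe(op, n));
  if (candidates.length === 0) return null;
  if (op === "read") {
    const secondary = candidates.find((n) => n.stateStr === "SECONDARY");
    if (secondary) return secondary.name;
  }
  return candidates[0].name;
}

// Sample data (assumed): one primary, one secondary with reads enabled.
const sampleNodes = [
  { name: "192.168.56.87:27017", stateStr: "PRIMARY", slaveOk: false },
  { name: "192.168.56.88:27017", stateStr: "SECONDARY", slaveOk: true },
];
console.log(route("write", sampleNodes)); // 192.168.56.87:27017
console.log(route("read", sampleNodes));  // 192.168.56.88:27017
```

If `slaveOk` is false on the secondary, `route("read", ...)` falls back to the primary, which matches the error shown in step 2.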

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
II. Failover

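Replica Sets improve on traditional Master-Slave replication by failing over automatically: if the primary of a Replica Set goes down,
the remaining members automatically elect a new primary from among themselves, provided a majority of voting members can still reach each other.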
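
The majority rule can be sketched as follows. This is a simplified model, not MongoDB's election code: `canElectPrimary` is a hypothetical name, each member is assumed to hold exactly one vote, and the member objects only borrow the `health` and `stateStr` field names that `rs.status()` prints.

```javascript
// Sketch: can the members that are still up elect a primary?
function canElectPrimary(members) {
  const healthy = members.filter((m) => m.health === 1);
  // A strict majority of all configured members must be reachable.
  const hasMajority = healthy.length > members.length / 2;
  // An arbiter votes but holds no data, so it can never become primary.
  const hasCandidate = healthy.some((m) => m.stateStr !== "ARBITER");
  return hasMajority && hasCandidate;
}

// Sample data (assumed): the primary is down, a secondary and an arbiter remain.
const afterKill = [
  { name: "192.168.56.87:27017", health: 0, stateStr: "(not reachable/healthy)" },
  { name: "192.168.56.88:27017", health: 1, stateStr: "SECONDARY" },
  { name: "192.168.56.89:27017", health: 1, stateStr: "ARBITER" },
];
console.log(canElectPrimary(afterKill)); // true: 2 of 3 members are up
```

With only two members configured and one of them down, the function returns false, which is why a third voting member is needed for automatic failover.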

For example, kill the primary's process:
[root@node1 ~]# ps -ef|grep mongo
root       10905      1  1 11:17 ?        00:00:50 ./mongod -f master.conf
root     10960  2563  0 11:20 pts/2    00:00:08 ./mongo 192.168.56.87:27017
root     11596 11570  0 12:02 pts/3    00:00:00 grep mongo

[root@node1 ~]# kill -9 10905


[root@node1 bin]# ./mongo 192.168.56.87:27017    -- the old primary can no longer be reached
MongoDB shell version: 3.0.2
connecting to: 192.168.56.87:27017/test
2015-05-08T12:09:51.202+0800 W NETWORK  Failed to connect to 192.168.56.87:27017, reason: errno:111 Connection refused
2015-05-08T12:09:51.203+0800 E QUERY    Error: couldn't connect to server 192.168.56.87:27017 (192.168.56.87), connection attempt failed
    at connect (src/mongo/shell/mongo.js:181:14)
    at (connect):1:6 at src/mongo/shell/mongo.js:181
exception: connect failed

[root@node1 bin]# ./mongo 192.168.56.88:27017   -- connecting to the standby node succeeds
MongoDB shell version: 3.0.2
connecting to: 192.168.56.88:27017/test
Server has startup warnings:
2015-05-08T11:18:09.719+0800 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2015-05-08T11:18:09.719+0800 I CONTROL  [initandlisten] 


testrs:PRIMARY> rs.status()   -- check the health of the set from the standby node
{
     "set" : "testrs",
     "date" : ISODate("2015-05-08T04:11:22.699Z"),
     "myState" : 1,
     "members" : [
          {
               "_id" : 0,
               "name" : "192.168.56.87:27017",
               "health" : 0,
               "state" : 8,
               "stateStr" : "(not reachable/healthy)",   -- the old primary is now reported as unhealthy
               "uptime" : 0,
               "optime" : Timestamp(0, 0),
               "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
               "lastHeartbeat" : ISODate("2015-05-08T04:11:21.636Z"),
               "lastHeartbeatRecv" : ISODate("2015-05-08T04:08:13.900Z"),
               "pingMs" : 0,
               "lastHeartbeatMessage" : "Failed attempt to connect to 192.168.56.87:27017; couldn't connect to server 192.168.56.87:27017 (192.168.56.87), connection attempt failed",
               "configVersion" : -1
          },
          {
               "_id" : 1,
               "name" : "192.168.56.88:27017",
               "health" : 1,
               "state" : 1,
               "stateStr" : "PRIMARY",   -- the former secondary has become the primary
               "uptime" : 3193,
               "optime" : Timestamp(1431057178, 1),
               "optimeDate" : ISODate("2015-05-08T03:52:58Z"),
               "electionTime" : Timestamp(1431058095, 1),
               "electionDate" : ISODate("2015-05-08T04:08:15Z"),
               "configVersion" : 1,
               "self" : true
          },
          {
               "_id" : 2,
               "name" : "192.168.56.89:27017",
               "health" : 1,
               "state" : 7,
               "stateStr" : "ARBITER",
               "uptime" : 462,
               "lastHeartbeat" : ISODate("2015-05-08T04:11:20.930Z"),
               "lastHeartbeatRecv" : ISODate("2015-05-08T04:11:20.724Z"),
               "pingMs" : 0,
               "configVersion" : 1
          }
     ],
     "ok" : 1
}
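
Reading the full `rs.status()` document by eye gets tedious. The sketch below condenses it to one line per member; `summarize` and `findPrimary` are hypothetical helper names, and the sample object is a trimmed copy of the output above rather than a live result.

```javascript
// Sketch: one summary line per member of an rs.status()-style document.
function summarize(status) {
  return status.members.map((m) => {
    const health = m.health === 1 ? "up" : "down";
    return m.name + "  " + m.stateStr + "  " + health;
  });
}

// Return the name of the member whose state is 1 (PRIMARY), or null.
function findPrimary(status) {
  const primary = status.members.find((m) => m.state === 1);
  return primary ? primary.name : null;
}

// Trimmed sample taken from the output shown above.
const sampleStatus = {
  set: "testrs",
  members: [
    { name: "192.168.56.87:27017", health: 0, state: 8, stateStr: "(not reachable/healthy)" },
    { name: "192.168.56.88:27017", health: 1, state: 1, stateStr: "PRIMARY" },
    { name: "192.168.56.89:27017", health: 1, state: 7, stateStr: "ARBITER" },
  ],
};
summarize(sampleStatus).forEach((line) => console.log(line));
console.log(findPrimary(sampleStatus)); // 192.168.56.88:27017
```

Because the functions only read plain fields, they should also accept the object returned by `rs.status()` when pasted into the mongo shell, although that has not been verified here.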



Now check whether the node that took over can handle queries and inserts normally.

testrs:PRIMARY> db.person.find()   -- queries work
{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }
{ "_id" : ObjectId("554c331a478a8bbe95a474da"), "name" : "zt", "sex" : "M", "age" : 20 }

testrs:PRIMARY> db.person.insert({"name":"zw2","sex":"M","age":22})    -- the insert succeeds as well
WriteResult({ "nInserted" : 1 })
testrs:PRIMARY> db.person.find()
{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }
{ "_id" : ObjectId("554c331a478a8bbe95a474da"), "name" : "zt", "sex" : "M", "age" : 20 }
{ "_id" : ObjectId("554c38d124f4e026b67bae3e"), "name" : "zw2", "sex" : "M", "age" : 22 }

Now start the original primary again and see what happens.

[root@node1 bin]# ./mongod -f master.conf    -- start the original primary, i.e. the instance that was just killed

note: noprealloc may hurt performance in many applications
about to fork child process, waiting until server is ready for connections.
forked process: 11754
child process started successfully, parent exiting

[root@node1 bin]# ./mongo 192.168.56.87:27017
MongoDB shell version: 3.0.2
connecting to: 192.168.56.87:27017/test
Server has startup warnings:
2015-05-08T12:19:23.600+0800 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2015-05-08T12:19:23.600+0800 I CONTROL  [initandlisten] 

testrs:PRIMARY> db.person.find()  -- the data inserted on the standby during the outage has been synced over, and this node has automatically become the primary again
testrs:PRIMARY> use db 
switched to db db
testrs:PRIMARY> db.person.find()
{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }
{ "_id" : ObjectId("554c331a478a8bbe95a474da"), "name" : "zt", "sex" : "M", "age" : 20 }
{ "_id" : ObjectId("554c38d124f4e026b67bae3e"), "name" : "zw2", "sex" : "M", "age" : 22 }

The standby node (192.168.56.88) has meanwhile switched automatically from primary back to secondary:

testrs:SECONDARY> db.person.find()  -- the secondary can no longer be queried
Error: error: { "$err" : "not master and slaveOk=false", "code" : 13435 }

Run the following command again to re-enable queries on the secondary:
testrs:SECONDARY>  db.getMongo().setSlaveOk()

testrs:SECONDARY> db.person.find()   -- OK, the secondary can be queried again
{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }
{ "_id" : ObjectId("554c331a478a8bbe95a474da"), "name" : "zt", "sex" : "M", "age" : 20 }
{ "_id" : ObjectId("554c38d124f4e026b67bae3e"), "name" : "zw2", "sex" : "M", "age" : 22 }


-------------------------------------------------------------------------------------------------------------------------------------------------------------------
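III. Adding and Removing Nodes
MongoDB Replica Sets provide not only high availability but also a way to spread load. Adding and removing Replica Set nodes is very common in practice: for example, when read pressure surges
and the three-node environment can no longer keep up, more nodes are added to share the load; when the application load is low, nodes can be removed to cut hardware costs.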

1. Adding a node:
   There are two ways to add a node: one uses the oplog alone, the other uses a database snapshot (--fastsync) together with the oplog.

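    1.1 Adding a node via the oplog

    ① Configure and start the new node

1. Set up the MongoDB installation directory
      Extract the archive under /apps and rename the directory to mongo, so that the path is /apps/mongo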
 [root@node1 /]# mkdir apps
 [root@node1 apps]# chmod -R 755 /apps
 [root@node1 apps]# ll
 total 49308
 -rwxr-xr-x 1 root root 50432804 May  3 12:48 mongodb-linux-x86_64-rhel55-3.0.2.gz

 1.1 Extract the archive
 mongodb-linux-x86_64-rhel55-3.0.2.gz:
[root@node1 apps]# tar -zxvf mongodb-linux-x86_64-rhel55-3.0.2.gz
mongodb-linux-x86_64-rhel55-3.0.2/README
mongodb-linux-x86_64-rhel55-3.0.2/THIRD-PARTY-NOTICES
mongodb-linux-x86_64-rhel55-3.0.2/GNU-AGPL-3.0
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongodump
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongorestore
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongoexport
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongoimport
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongostat
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongotop
mongodb-linux-x86_64-rhel55-3.0.2/bin/bsondump
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongofiles
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongooplog
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongoperf
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongod
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongos
mongodb-linux-x86_64-rhel55-3.0.2/bin/mongo

1.2 Rename the extracted
mongodb-linux-x86_64-rhel55-3.0.2 directory to mongo

 [root@node1 apps]# mv mongodb-linux-x86_64-rhel55-3.0.2 mongo

[root@node1 apps]# ll
total 49312
drwxr-xr-x 3 root root     4096 May  3 12:50 mongo
       
  1.3 Create the data directories
 
  mkdir -p /mongodb/data/slaver2         # data directory for the new secondary
  mkdir -p /mongodb/log/
  touch /mongodb/log/slaver2.log
  touch  /mongodb/slaver2.pid
  chmod -R 755 /mongodb

        1.4 Create the configuration file

# slaver2.conf
dbpath=/mongodb/data/slaver2
logpath=/mongodb/log/slaver2.log
pidfilepath=/mongodb/slaver2.pid
directoryperdb=true
logappend=true
replSet=testrs
bind_ip=192.168.56.90
port=27017
oplogSize=10000
fork=true
noprealloc=true

1.5 Start mongodb
           [root@node1 bin]# ./mongod -f slaver2.conf 
  note: noprealloc may hurt performance in many applications
  about to fork child process, waiting until server is ready for connections.
  forked process: 3154
  child process started successfully, parent exiting
      
      ② Add the new node to the existing Replica Set

        testrs:PRIMARY> rs.add("192.168.56.90:27017")
        { "ok" : 1 }
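
What `rs.add()` changes in the Replica Set configuration can be modelled roughly as below. This is a simplified sketch, not the shell helper's source: `addMember` is a hypothetical name, the starting config is assumed sample data, and the step that submits the new config to the primary is left out.

```javascript
// Sketch: append a member with the next free _id and bump the config version.
function addMember(config, host) {
  const maxId = config.members.reduce((max, m) => Math.max(max, m._id), -1);
  return {
    _id: config._id,
    version: config.version + 1,
    members: config.members.concat([{ _id: maxId + 1, host: host }]),
  };
}

// Assumed starting point: the three-member set from this chapter at version 1.
const before = {
  _id: "testrs",
  version: 1,
  members: [
    { _id: 0, host: "192.168.56.87:27017" },
    { _id: 1, host: "192.168.56.88:27017" },
    { _id: 2, host: "192.168.56.89:27017", arbiterOnly: true },
  ],
};
const after = addMember(before, "192.168.56.90:27017");
console.log(after.version);               // 2
console.log(after.members[3]._id);        // 3
console.log(before.members.length);       // 3 (the input is not modified)
```

The new member receiving `_id` 3 and the version moving to 2 line up with the `"_id" : 3` and `"configVersion" : 2` fields in the `rs.status()` output of the next step.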

      ③ Check the Replica Set status; it shows clearly how the new node is being brought in
   
      testrs:PRIMARY>rs.status()     -- note: run this step on the primary
       {
               "_id" : 3,
               "name" : "192.168.56.90:27017",
               "health" : 1,
               "state" : 5,
               "stateStr" : "STARTUP2",
               "uptime" : 19,
               "optime" : Timestamp(0, 0),
               "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
               "lastHeartbeat" : ISODate("2015-05-08T05:15:59.147Z"),
               "lastHeartbeatRecv" : ISODate("2015-05-08T05:15:59.139Z"),
               "pingMs" : 0,
               "configVersion" : 2
          }

      Watch how the errmsg status of the new member progresses:
          errmsg: still initializing    (initializing)
          errmsg: initial sync need a member to be primary or secondary to do our initial sync    (data sync in progress)
          errmsg: initial sync done    (initial sync complete)

        ④ On the new node, verify that the data has been synced over

           ./mongo 192.168.56.90:27017    -- log in to the newly added node

          testrs:SECONDARY> db.person.find()   -- the query fails
          Error: error: { "$err" : "not master and slaveOk=false", "code" : 13435 }

testrs:SECONDARY> use admin
switched to db admin
testrs:SECONDARY>   db.getMongo().setSlaveOk() -- allow queries on the secondary
testrs:SECONDARY> use db
 testrs:SECONDARY> db.person.find()    -- the data has been synced successfully
{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }
{ "_id" : ObjectId("554c331a478a8bbe95a474da"), "name" : "zt", "sex" : "M", "age" : 20 }
{ "_id" : ObjectId("554c38d124f4e026b67bae3e"), "name" : "zw2", "sex" : "M", "age" : 22 }

   At this point the node has been added successfully!
 

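1.2 Adding a node via a database snapshot (--fastsync) and the oplog

Adding a node directly from the oplog is simple and needs little manual intervention. However, the oplog is a capped collection that is written in a circular fashion, so adding a node from the oplog alone can lead to inconsistent data, because the entries the new node needs may already have been overwritten. In that case a database snapshot (--fastsync) can be combined with the oplog. The procedure: first take the physical data files of one Replica Set member as the initial data, then replay the oplog to catch up on everything written since, until the data is consistent.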
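
The reasoning about the capped oplog can be made concrete with a toy model. This is an illustration only: `CappedOplog` is a hypothetical class, timestamps are plain integers, and the capacity is counted in entries rather than in megabytes as the real `oplogSize` is.

```javascript
// Sketch: a capped oplog modelled as a fixed-size ring of operation timestamps.
class CappedOplog {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = [];
  }
  append(ts) {
    this.entries.push(ts);
    // Once the log is full, the oldest entry is overwritten.
    if (this.entries.length > this.capacity) this.entries.shift();
  }
  oldest() {
    return this.entries.length ? this.entries[0] : null;
  }
  // A node whose data is current up to lastAppliedTs can catch up only if
  // that point is still covered by the oplog.
  canCatchUpFrom(lastAppliedTs) {
    const oldest = this.oldest();
    return oldest !== null && lastAppliedTs >= oldest;
  }
  // Operations the node still has to replay.
  opsAfter(lastAppliedTs) {
    return this.entries.filter((ts) => ts > lastAppliedTs);
  }
}

const oplog = new CappedOplog(5);
for (let ts = 1; ts <= 10; ts++) oplog.append(ts); // entries 1..5 get overwritten

console.log(oplog.oldest());          // 6
console.log(oplog.canCatchUpFrom(7)); // true: a recent snapshot only replays 8, 9, 10
console.log(oplog.opsAfter(7));       // [ 8, 9, 10 ]
console.log(oplog.canCatchUpFrom(3)); // false: operations 4 and 5 are gone
```

A snapshot shortens the gap between the new node's data and the head of the oplog, which is what keeps the node inside the window that the oplog still covers.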
 

  ① Create the data directories
 
  mkdir -p /mongodb/data/slaver2         # data directory for the new secondary
  mkdir -p /mongodb/log/
  touch /mongodb/log/slaver2.log
  touch  /mongodb/slaver2.pid
  chmod -R 755 /mongodb

 ② Take the physical data files of one Replica Set member as the initial data
     [root@node1 /]# scp  -r /mongodb/data/master [email protected]:/mongodb/data
      [email protected]'s password: 

 ② After the physical files have been copied, insert a new record into db.person

testrs:PRIMARY> db.person.insert({"name":"tt","sex":"ry","age":99})
WriteResult({ "nInserted" : 1 })
testrs:PRIMARY> db.person.find()
{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }
{ "_id" : ObjectId("554c331a478a8bbe95a474da"), "name" : "zt", "sex" : "M", "age" : 20 }
{ "_id" : ObjectId("554c38d124f4e026b67bae3e"), "name" : "zw2", "sex" : "M", "age" : 22 }
{ "_id" : ObjectId("554c591a214c8f7373ace372"), "name" : "tt", "sex" : "ry", "age" : 99 }


③ Configure the service

# slaver2.conf
dbpath=/mongodb/data/slaver2
logpath=/mongodb/log/slaver2.log
pidfilepath=/mongodb/slaver2.pid
directoryperdb=true
logappend=true
replSet=testrs
bind_ip=192.168.56.90
port=27017
oplogSize=10000
fork=true
noprealloc=true
fastsync=true

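fastsync=true tells mongod that its data directory was seeded from a copy of another member's data files, so it skips the full initial sync and only catches up from the oplog. Writes on the primary should be paused while the files are copied so that the copy is consistent.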

④ Start the newly added node:

[root@node1 bin]#   ./mongod -f slaver2.conf
note: noprealloc may hurt performance in many applications
about to fork child process, waiting until server is ready for connections.
forked process: 11770
child process started successfully, parent exiting


⑤ Add the node:    rs.add("192.168.56.90:27017")

  testrs:PRIMARY> rs.add("192.168.56.90:27017")
{ "ok" : 1 }

testrs:PRIMARY> rs.status()  -- check the status of the newly added member
{
     "set" : "testrs",
     "date" : ISODate("2015-05-08T06:45:23.550Z"),
     "myState" : 1,
     "members" : [
          {
               "_id" : 0,
               "name" : "192.168.56.87:27017",
               "health" : 1,
               "state" : 1,
               "stateStr" : "PRIMARY",
               "uptime" : 8761,
               "optime" : Timestamp(1431067518, 1),
               "optimeDate" : ISODate("2015-05-08T06:45:18Z"),
               "electionTime" : Timestamp(1431058769, 1),
               "electionDate" : ISODate("2015-05-08T04:19:29Z"),
               "configVersion" : 4,
               "self" : true
          },
          {
               "_id" : 1,
               "name" : "192.168.56.88:27017",
               "health" : 1,
               "state" : 2,
               "stateStr" : "SECONDARY",
               "uptime" : 8759,
               "optime" : Timestamp(1431067518, 1),
               "optimeDate" : ISODate("2015-05-08T06:45:18Z"),
               "lastHeartbeat" : ISODate("2015-05-08T06:45:22.146Z"),
               "lastHeartbeatRecv" : ISODate("2015-05-08T06:45:22.809Z"),
               "pingMs" : 0,
               "configVersion" : 4
          },
          {
               "_id" : 2,
               "name" : "192.168.56.89:27017",
               "health" : 1,
               "state" : 7,
               "stateStr" : "ARBITER",
               "uptime" : 8759,
               "lastHeartbeat" : ISODate("2015-05-08T06:45:22.146Z"),
               "lastHeartbeatRecv" : ISODate("2015-05-08T06:45:22.682Z"),
               "pingMs" : 0,
               "configVersion" : 4
          },
          {
               "_id" : 3,
               "name" : "192.168.56.90:27017",  -- the new member has been added
               "health" : 1,
               "state" : 2,
               "stateStr" : "SECONDARY",
               "uptime" : 3,
               "optime" : Timestamp(1431067518, 1),
               "optimeDate" : ISODate("2015-05-08T06:45:18Z"),
               "lastHeartbeat" : ISODate("2015-05-08T06:45:22.448Z"),
               "lastHeartbeatRecv" : ISODate("2015-05-08T06:45:22.485Z"),
               "pingMs" : 0,
               "syncingTo" : "192.168.56.88:27017",
               "configVersion" : 4
          }
     ],
     "ok" : 1
}

   ⑥ Verify that the data has been synced

       ./mongo 192.168.56.90:27017   -- log in to the newly added node

        testrs:SECONDARY> db.person.find()  -- the query fails
        Error: error: { "$err" : "not master and slaveOk=false", "code" : 13435 }

 
testrs:SECONDARY>  db.getMongo().setSlaveOk() -- allow queries on the secondary
testrs:SECONDARY> use db
 testrs:SECONDARY> db.person.find()   -- the data has been synced successfully

{ "_id" : ObjectId("554c2f77478a8bbe95a474d9"), "name" : "zw", "sex" : "M", "age" : 19 }
{ "_id" : ObjectId("554c331a478a8bbe95a474da"), "name" : "zt", "sex" : "M", "age" : 20 }
{ "_id" : ObjectId("554c38d124f4e026b67bae3e"), "name" : "zw2", "sex" : "M", "age" : 22 }
{ "_id" : ObjectId("554c591a214c8f7373ace372"), "name" : "tt", "sex" : "ry", "age" : 99 }  -- this record, inserted after the file copy, has been synced as well

  At this point the node has been added successfully via a database snapshot (--fastsync) and the oplog!
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 IV. Removing a Node
   1. The set currently has four nodes; remove the node with IP=192.168.56.90
    testrs:PRIMARY> rs.remove("192.168.56.90:27017")   -- remove the node
{ "ok" : 1 }
testrs:PRIMARY> rs.status()
{
     "set" : "testrs",
     "date" : ISODate("2015-05-08T05:41:40.247Z"),
     "myState" : 1,
     "members" : [
          {
               "_id" : 0,
               "name" : "192.168.56.87:27017",
               "health" : 1,
               "state" : 1,
               "stateStr" : "PRIMARY",
               "uptime" : 4938,
               "optime" : Timestamp(1431063686, 1),
               "optimeDate" : ISODate("2015-05-08T05:41:26Z"),
               "electionTime" : Timestamp(1431058769, 1),
               "electionDate" : ISODate("2015-05-08T04:19:29Z"),
               "configVersion" : 3,
               "self" : true
          },
          {
               "_id" : 1,
               "name" : "192.168.56.88:27017",
               "health" : 1,
               "state" : 2,
               "stateStr" : "SECONDARY",
               "uptime" : 4936,
               "optime" : Timestamp(1431063686, 1),
               "optimeDate" : ISODate("2015-05-08T05:41:26Z"),
               "lastHeartbeat" : ISODate("2015-05-08T05:41:38.988Z"),
               "lastHeartbeatRecv" : ISODate("2015-05-08T05:41:39.522Z"),
               "pingMs" : 0,
               "lastHeartbeatMessage" : "could not find member to sync from",
               "configVersion" : 3
          },
          {
               "_id" : 2,
               "name" : "192.168.56.89:27017",
               "health" : 1,
               "state" : 7,
               "stateStr" : "ARBITER",
               "uptime" : 4936,
               "lastHeartbeat" : ISODate("2015-05-08T05:41:38.988Z"),
               "lastHeartbeatRecv" : ISODate("2015-05-08T05:41:39.810Z"),
               "pingMs" : 0,
               "configVersion" : 3
          }
     ],
     "ok" : 1
}


As the output shows, the entry for the node with IP=192.168.56.90 has been removed.
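
The configuration change behind `rs.remove()` can be modelled in the same simplified way. `removeMember` is a hypothetical name and the four-member config is assumed sample data; the real helper additionally submits the new config to the primary.

```javascript
// Sketch: drop the member with the given host and bump the config version.
function removeMember(config, host) {
  const remaining = config.members.filter((m) => m.host !== host);
  if (remaining.length === config.members.length) {
    throw new Error("could not find " + host + " in the member list");
  }
  return { _id: config._id, version: config.version + 1, members: remaining };
}

// Assumed starting point: the four-member set at config version 2.
const fourNodes = {
  _id: "testrs",
  version: 2,
  members: [
    { _id: 0, host: "192.168.56.87:27017" },
    { _id: 1, host: "192.168.56.88:27017" },
    { _id: 2, host: "192.168.56.89:27017", arbiterOnly: true },
    { _id: 3, host: "192.168.56.90:27017" },
  ],
};
const threeNodes = removeMember(fourNodes, "192.168.56.90:27017");
console.log(threeNodes.version);                    // 3
console.log(threeNodes.members.map((m) => m.host)); // the three remaining hosts
```

The version moving from 2 to 3 corresponds to the `"configVersion" : 3` fields in the status output above.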

