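How MongoDB sharding works
Sharding means splitting the data and spreading it across different machines. Sharding is analogous to RAID 0, while replication is analogous to RAID 1.
A MongoDB replica set differs from the familiar master-slave setup: with master-slave, all service stops once the master goes down, whereas a replica set elects a new primary automatically.
A sharded cluster consists of three main components: mongos, config server, and shard.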
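1) mongos (the routing process; applications connect to mongos, which then queries the relevant shard)
mongos is the entry point for every request to the cluster and coordinates all of them, so the application does not need a router of its own. mongos itself acts as the request dispatcher and forwards each data request
to the corresponding shard server. Production deployments usually run several mongos instances as entry points, so that the failure of one does not leave every MongoDB request unserviceable.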
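2) config server (the routing-table service; each one holds the routing information for every chunk)
As the name suggests, the config servers store the configuration for all cluster metadata (routing and shards). mongos does not persist shard or routing information itself; it only caches it in memory, while the config servers actually store
this data. mongos loads the configuration from the config servers on first start and after every restart. When the configuration later changes, every mongos updates its cached state so that
it keeps routing accurately. Production deployments usually run several config servers, because the shard-routing metadata they hold must not be lost. Even if one of them goes down, as long as others survive,
the MongoDB cluster stays up.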
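3) shard (a data-storage partition; each shard can be a replica set)
This is the shard proper. As the figure shows, one collection, Collection1, holds 1 TB of data on a single machine, which is too much load. Spread across 4 machines, each machine holds 256 GB, which relieves the pressure concentrated on that one
machine. In practice, the 4 shards in the figure above are an incomplete architecture without replica sets: if one shard goes down, a quarter of the data is lost. A highly available sharded architecture therefore
builds a replica set for every shard to keep the shard reliable. Production environments typically use 2 data-bearing members plus 1 arbiter.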
Enough theory; on to the setup.
1. Pull the configuration files from GitHub: git clone git@github.com:herrywen-nanj/mongodb.git
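2. The startup order is config server --> mongos --> shard
3. Remove the contents of dbPath: rm -rf configserver/dbPath
4. Start each mongod process with its configuration file, then connect to it and configure the replica set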
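Start configserver:
mongod -f mongdb.conf
Start configserver2:
mongod -f mongdb.conf
Configure the replica set:
# Connect to the config server
mongo --port 27018
# Initialize the replica set
>>>rs.initiate()
# Add the secondary member
>>>rs.add("worker2:27019")
# Check the replica set status
>>>rs.status()
Output: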
MongoDB Enterprise config-rs:PRIMARY> rs.status()
{
"set" : "config-rs", # the replica set has been configured successfully
"date" : ISODate("2019-11-23T04:56:35.588Z"),
"myState" : 1,
"term" : NumberLong(1),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"configsvr" : true,
"heartbeatIntervalMillis" : NumberLong(2000),
"majorityVoteCount" : 2,
"writeMajorityCount" : 2,
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1574484982, 1),
"t" : NumberLong(1)
},
"lastCommittedWallTime" : ISODate("2019-11-23T04:56:22.464Z"),
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1574484982, 1),
"t" : NumberLong(1)
},
"readConcernMajorityWallTime" : ISODate("2019-11-23T04:56:22.464Z"),
"appliedOpTime" : {
"ts" : Timestamp(1574484982, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1574484982, 1),
"t" : NumberLong(1)
},
"lastAppliedWallTime" : ISODate("2019-11-23T04:56:22.464Z"),
"lastDurableWallTime" : ISODate("2019-11-23T04:56:22.464Z")
},
"lastStableRecoveryTimestamp" : Timestamp(1574484952, 30),
"lastStableCheckpointTimestamp" : Timestamp(1574484952, 30),
"electionCandidateMetrics" : {
"lastElectionReason" : "electionTimeout",
"lastElectionDate" : ISODate("2019-11-23T04:55:51.134Z"),
"termAtElection" : NumberLong(1),
"lastCommittedOpTimeAtElection" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"lastSeenOpTimeAtElection" : {
"ts" : Timestamp(1574484951, 1),
"t" : NumberLong(-1)
},
"numVotesNeeded" : 1,
"priorityAtElection" : 1,
"electionTimeoutMillis" : NumberLong(10000),
"newTermStartDate" : ISODate("2019-11-23T04:55:52.141Z"),
"wMajorityWriteAvailabilityDate" : ISODate("2019-11-23T04:55:52.266Z")
},
"members" : [
{
"_id" : 0,
"name" : "worker2:27018",
"ip" : "192.168.255.134",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 722,
"optime" : {
"ts" : Timestamp(1574484982, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2019-11-23T04:56:22Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "could not find member to sync from",
"electionTime" : Timestamp(1574484951, 2),
"electionDate" : ISODate("2019-11-23T04:55:51Z"),
"configVersion" : 2,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "worker2:27019",
"ip" : "192.168.255.134",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 13,
"optime" : {
"ts" : Timestamp(1574484982, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1574484982, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2019-11-23T04:56:22Z"),
"optimeDurableDate" : ISODate("2019-11-23T04:56:22Z"),
"lastHeartbeat" : ISODate("2019-11-23T04:56:34.705Z"),
"lastHeartbeatRecv" : ISODate("2019-11-23T04:56:35.176Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"configVersion" : 2
}
],
"ok" : 1,
"$gleStats" : {
"lastOpTime" : {
"ts" : Timestamp(1574484982, 1),
"t" : NumberLong(1)
},
"electionId" : ObjectId("7fffffff0000000000000001")
},
"lastCommittedOpTime" : Timestamp(1574484982, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1574484982, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1574484982, 1)
}
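The "majorityVoteCount" : 2 in the output above deserves a second look: with only two voting members, both are needed to form a majority, so losing either node leaves the set without a primary. The sketch below only works through that arithmetic; the function names are made up for illustration and are not part of the MongoDB shell.

```javascript
// Votes needed for a majority in a replica set with `votingMembers` voters.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can fail while a majority is still reachable.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

// Compare the two-member set built above with a three-member set.
for (const n of [2, 3]) {
  console.log(n + " members: majority=" + majority(n) +
              ", tolerated failures=" + faultTolerance(n));
}
```

By this count a two-member set tolerates no failures, while a third voting member raises the tolerance to one, which is probably why three members are the usual minimum.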
5. Delete the dbPath directory and start the mongos routing process
cd luyou
mongos -f mongdb.conf
lsof -i:40000
6. Enter each shard node directory, start the process, and add the members to the replica set
cd shared && rm -rf dbpath && mongod -f mongdb.conf
lsof -i:27020 && lsof -i:27021 && lsof -i:27022
mongo --port 27020
>>>rs.initiate()
>>>rs.add("worker2:27021")
>>>rs.add("worker2:27022")
>>>rs.status()
7. On the router, add the shard replica set to the cluster: sh.addShard("shard-rs/worker2:27020,worker2:27021,worker2:27022")
Add another shard:
cd shard4
sh.addShard("worker2:27023")
Check the resulting status:
MongoDB Enterprise mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5dd8bbd8bbb4a8ac81b4b0b6")
}
shards:
{ "_id" : "shard-rs", "host" : "shard-rs/worker2:27020,worker2:27021,worker2:27022", "state" : 1 }
{ "_id" : "shard0001", "host" : "worker2:27023", "state" : 1 }
active mongoses:
"4.2.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
8. Shard key operations
a. Enable sharding for the herrywen database
sh.enableSharding("herrywen")
Check the resulting status:
MongoDB Enterprise mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5dd8bbd8bbb4a8ac81b4b0b6")
}
shards:
{ "_id" : "shard-rs", "host" : "shard-rs/worker2:27020,worker2:27021,worker2:27022", "state" : 1 }
active mongoses:
"4.2.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
{ "_id" : "herrywen", "primary" : "shard-rs", "partitioned" : true, "version" : { "uuid" : UUID("56cf9d23-2f3a-4b53-8b5d-512f1f9e00c6"), "lastMod" : 1 } }
b. Shard a specific collection, using _id as the shard key: sh.shardCollection("herrywen.collections_1",{"_id":1})
Check the result:
MongoDB Enterprise mongos> sh.status();
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5dd8bbd8bbb4a8ac81b4b0b6")
}
shards:
{ "_id" : "shard-rs", "host" : "shard-rs/worker2:27020,worker2:27021,worker2:27022", "state" : 1 }
{ "_id" : "shard0001", "host" : "worker2:27023", "state" : 1 }
active mongoses:
"4.2.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Collections with active migrations:
herrywen.collections_1 started at Sat Nov 23 2019 14:15:13 GMT+0800 (CST)
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
2 : Success
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard-rs 1
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard-rs Timestamp(1, 0)
{ "_id" : "herrywen", "primary" : "shard-rs", "partitioned" : true, "version" : { "uuid" : UUID("56cf9d23-2f3a-4b53-8b5d-512f1f9e00c6"), "lastMod" : 1 } }
herrywen.collections_1
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard-rs 3
shard0001 4
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : 2 } on : shard0001 Timestamp(2, 0)
{ "_id" : 2 } -->> { "_id" : 28340 } on : shard-rs Timestamp(3, 1)
{ "_id" : 28340 } -->> { "_id" : 42509 } on : shard-rs Timestamp(2, 2)
{ "_id" : 42509 } -->> { "_id" : 61031 } on : shard-rs Timestamp(2, 3)
{ "_id" : 61031 } -->> { "_id" : 75200 } on : shard0001 Timestamp(3, 2)
{ "_id" : 75200 } -->> { "_id" : 94169 } on : shard0001 Timestamp(3, 3)
{ "_id" : 94169 } -->> { "_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(3, 4)
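To make the chunk list above easier to read, here is a minimal sketch of the lookup a router performs conceptually: given an _id, find the chunk whose range contains it and return the owning shard. The chunk boundaries are copied from the sh.status() output; the findShard function and the use of -Infinity / Infinity for $minKey / $maxKey are illustrative assumptions, not mongos internals.

```javascript
// Chunk table copied from the sh.status() output above.
// Each chunk covers [min, max): min is inclusive, max is exclusive.
const chunks = [
  { min: -Infinity, max: 2,        shard: "shard0001" },
  { min: 2,         max: 28340,    shard: "shard-rs"  },
  { min: 28340,     max: 42509,    shard: "shard-rs"  },
  { min: 42509,     max: 61031,    shard: "shard-rs"  },
  { min: 61031,     max: 75200,    shard: "shard0001" },
  { min: 75200,     max: 94169,    shard: "shard0001" },
  { min: 94169,     max: Infinity, shard: "shard0001" },
];

// Binary search over the sorted chunk table (hypothetical helper).
function findShard(table, id) {
  let lo = 0;
  let hi = table.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const c = table[mid];
    if (id < c.min) hi = mid - 1;
    else if (id >= c.max) lo = mid + 1;
    else return c.shard;
  }
  throw new Error("no chunk covers key " + id);
}

console.log(findShard(chunks, 50000)); // falls in [42509, 61031)
console.log(findShard(chunks, 80000)); // falls in [75200, 94169)
```

Because _id here increases monotonically and the key is ranged, every new insert lands in the last chunk (the one ending at $maxKey) until that chunk is split and moved.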
9. Testing
a. # Change the chunk size to 1 MB (the default is 64 MB)
use config;
db.settings.find()
db.settings.save({_id:"chunksize",value:1})
MongoDB Enterprise mongos> db.settings.find()
{ "_id" : "chunksize", "value" : 1 }
b. Check the current number of documents in the collection
MongoDB Enterprise mongos> use herrywen;
switched to db herrywen
MongoDB Enterprise mongos> db.collections_1.count()
0
c. To see sharding take effect, write 100,000 documents into the herrywen.collections_1 collection
mongo --port 40000
MongoDB Enterprise mongos> for(var i=1;i<=100000;i++){
... db.collections_1.insert({"_id":i,"name":"copy"+i});
... }
Open another terminal and check that the writes are in progress:
[root@worker2 shard3]# mongo --port 40000
MongoDB shell version v4.2.1
connecting to: mongodb://127.0.0.1:40000/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("1101172d-02b7-4881-bc3a-1a360390db39") }
MongoDB server version: 4.2.1
Server has startup warnings:
2019-11-23T13:04:19.245+0800 I CONTROL [main]
2019-11-23T13:04:19.245+0800 I CONTROL [main] ** WARNING: Access control is not enabled for the database.
2019-11-23T13:04:19.245+0800 I CONTROL [main] ** Read and write access to data and configuration is unrestricted.
2019-11-23T13:04:19.245+0800 I CONTROL [main] ** WARNING: You are running this process as the root user, which is not recommended.
2019-11-23T13:04:19.246+0800 I CONTROL [main]
MongoDB Enterprise mongos> use herrywen;
switched to db herrywen
MongoDB Enterprise mongos> db.collections_1.count();
41561
MongoDB Enterprise mongos> db.collections_1.count();
42971
MongoDB Enterprise mongos> db.collections_1.count();
43516
MongoDB Enterprise mongos> db.collections_1.count();
43776
MongoDB Enterprise mongos> db.collections_1.count();
44055
MongoDB Enterprise mongos> db.collections_1.count();
44291
MongoDB Enterprise mongos> db.collections_1.count();
44541
MongoDB Enterprise mongos> db.collections_1.count();
44775
MongoDB Enterprise mongos> db.collections_1.count();
45012
MongoDB Enterprise mongos> db.collections_1.count();
45257
MongoDB Enterprise mongos> db.collections_1.count();
45470
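The insert loop in this step sends one document per call, which is why the count climbs slowly. A possible alternative is to group the same documents into batches and hand each batch to insertMany. The sketch below only builds the batches; the makeBatches name and the batch size of 1000 are assumptions chosen for illustration.

```javascript
// Generate the same documents as the loop above, grouped into batches.
function* makeBatches(total, batchSize) {
  let batch = [];
  for (let i = 1; i <= total; i++) {
    batch.push({ _id: i, name: "copy" + i });
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // final partial batch, if any
}

// In the mongo shell, each batch could then be inserted in one call, e.g.:
//   for (const b of makeBatches(100000, 1000)) db.collections_1.insertMany(b);
const batches = Array.from(makeBatches(2500, 1000));
console.log(batches.map((b) => b.length)); // sizes of the generated batches
```

Whether batching helps in a given setup depends on the shell version and the cluster, so treat it as something to try rather than a guaranteed speed-up.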
d. Confirm that the data has also been written to the other shard
MongoDB Enterprise mongos> sh.status();
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5dd8bbd8bbb4a8ac81b4b0b6")
}
shards:
{ "_id" : "shard-rs", "host" : "shard-rs/worker2:27020,worker2:27021,worker2:27022", "state" : 1 }
{ "_id" : "shard0001", "host" : "worker2:27023", "state" : 1 }
active mongoses:
"4.2.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Collections with active migrations:
herrywen.collections_1 started at Sat Nov 23 2019 14:15:13 GMT+0800 (CST)
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
2 : Success
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard-rs 1
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard-rs Timestamp(1, 0)
{ "_id" : "herrywen", "primary" : "shard-rs", "partitioned" : true, "version" : { "uuid" : UUID("56cf9d23-2f3a-4b53-8b5d-512f1f9e00c6"), "lastMod" : 1 } }
herrywen.collections_1
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard-rs 3
shard0001 4
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : 2 } on : shard0001 Timestamp(2, 0)
{ "_id" : 2 } -->> { "_id" : 28340 } on : shard-rs Timestamp(3, 1)
{ "_id" : 28340 } -->> { "_id" : 42509 } on : shard-rs Timestamp(2, 2)
{ "_id" : 42509 } -->> { "_id" : 61031 } on : shard-rs Timestamp(2, 3)
{ "_id" : 61031 } -->> { "_id" : 75200 } on : shard0001 Timestamp(3, 2)
{ "_id" : 75200 } -->> { "_id" : 94169 } on : shard0001 Timestamp(3, 3)
{ "_id" : 94169 } -->> { "_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(3, 4)
10. Application code needs little change: connect as you would to an ordinary MongoDB database, pointing the connection at the mongos port, 40000
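As a rough illustration of that last step, the sketch below builds the connection string an application might use. The standard connection string format accepts a comma-separated list of hosts, so several mongos routers can be listed. worker2:40000 is the mongos from this walkthrough; worker3:40000 and the mongosUri helper are hypothetical.

```javascript
// Build a MongoDB connection string that lists one or more mongos routers.
function mongosUri(hosts, dbName) {
  if (hosts.length === 0) {
    throw new Error("at least one mongos host is required");
  }
  return "mongodb://" + hosts.join(",") + "/" + encodeURIComponent(dbName);
}

// Single router, as deployed in this walkthrough.
console.log(mongosUri(["worker2:40000"], "herrywen"));
// Two routers; worker3 is a hypothetical second mongos host.
console.log(mongosUri(["worker2:40000", "worker3:40000"], "herrywen"));
```

The resulting string is what would be passed to the driver's client constructor; nothing shard-specific appears in it, which matches the point that the application treats mongos like an ordinary MongoDB server.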