副本集解决主节点发生故障导致数据丢失或不可用的问题,但遇到需要存储海量数据的情况时,副本集机制就束手无策了。副本集中的一台机器可能不足以存储数据,或者说集群不足以提供可接受的读写吞吐量。这就需要用到 MongoDB 的分片(Sharding)技术,这也是 MongoDB 的另外一种集群部署模式。
分片是指将数据拆分并分散存放在不同机器上的过程。有时也用分区来表示这个概念。将数据分散到不同的机器上,不需要功能强大的大型计算机就可以存储更多的数据,处理更大的负载。
MongoDB 支持自动分片,可以使数据库架构对应用程序不可见,简化系统管理。对应用程序而言,就如同始终在使用一个单机的 MongoDB 服务器一样。
MongoDB 的分片机制允许创建一个包含许多台机器的集群,将数据子集分散在集群中,每个分片维护着一个数据集合的子集。与副本集相比,使用集群架构可以使应用程序具有更强大的数据处理能力。
上图中主要有如下所述三个主要组件:
用于存储实际的数据块,实际生产环境中一个shard server角色可由几台机器组个一个relica set承担,防止主机单点故障
mongod实例,存储了整个 ClusterMetadata,其中包括 chunk信息。4.0版本以后要求最低3个副本集
前端路由,客户端由此接入,且让整个集群看上去像单一数据库,前端应用可以透明使用。
准备三台机器,这里用的三台虚拟机
系统: CentOS-7.2
软件: MongDB-4.2.7
role/port | 192.168.0.41 | 192.168.0.42 | 192.168.0.43 |
---|---|---|---|
config | 30000 | 30000 | 30000 |
mongos | 30010 | 30010 | 30010 |
shard1 | 27001 | 27001 | 27001 |
shard2 | 27002 | 27002 | 27002 |
shard3 | 27003 | 27003 | 27003 |
官网新版本配置文件说明:
https://docs.mongodb.com/manual/reference/configuration-options/
[root@localhost /]# mkdir -p /usr/local/mongodb/data/config
[root@localhost /]# mkdir -p /usr/local/mongodb/log/config
[root@localhost /]# touch /usr/local/mongodb/log/config.log
[root@localhost /]# cat /usr/local/mongodb/conf/config.cfg
systemLog:
destination: file
logAppend: true
path: /usr/local/mongodb/log/config.log
storage:
dbPath: /usr/local/mongodb/data/config
journal:
enabled: true
# engine:
# wiredTiger:
processManagement:
fork: true # fork and run in background
#pidFilePath: /var/run/mongodb/config1.pid # location of pidfile
timeZoneInfo: /usr/share/zoneinfo
# network interfaces
net:
port: 30000
bindIp: 0.0.0.0
#security:
# authorization: enabled
# keyFile: /usr/local/mongodb/conf/mongodb.keyfile
#operationProfiling:
replication:
replSetName: configs
sharding:
clusterRole: configsvr
## Enterprise-Only Options
#auditLog:
#snmp:
[root@localhost mongodb]# mongod -f /usr/local/mongodb/conf/config.cfg
about to fork child process, waiting until server is ready for connections.
forked process: 223
child process started successfully, parent exiting
ps: 30000端口成功监听
另外2台的操作同上
任意一台机器上操作
[root@localhost mongodb]# mongo --port 30000
> rs.initiate(
{
_id: "configs",
configsvr: true,
members: [
{ _id : 0, host : "192.168.0.41:30000" },
{ _id : 1, host : "192.168.0.42:30000" },
{ _id : 2, host : "192.168.0.43:30000" },
]
}
)
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1591233702, 1),
"electionId" : ObjectId("000000000000000000000000")
},
"lastCommittedOpTime" : Timestamp(0, 0),
"$clusterTime" : {
"clusterTime" : Timestamp(1591233702, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1591233702, 1)
}
ps: 初始化id必须要和config.cfg内定义的replSetName相同才能成功
rs.status() 查看集群状态
[root@localhost /]# mkdir -p /usr/local/mongodb/data/shard{1..3}
[root@localhost /]# mkdir -p /usr/local/mongodb/log/shard{1..3}
[root@localhost /]# touch /usr/local/mongodb/log/shard{1..3}/shard.log
[root@localhost /]# cat /usr/local/mongodb/conf/shard1.cfg
systemLog:
destination: file
logAppend: true
path: /usr/local/mongodb/log/shard1/shard.log
storage:
dbPath: /usr/local/mongodb/data/shard1
journal:
enabled: true
# engine:
# wiredTiger:
processManagement:
fork: true # fork and run in background
#pidFilePath: /var/run/mongodb/shard1.pid # location of pidfile
timeZoneInfo: /usr/share/zoneinfo
net:
port: 27001
bindIp: 0.0.0.0
#security:
# authorization: enabled
# keyFile: /usr/local/mongodb/conf/mongodb.keyfile
#operationProfiling:
replication:
replSetName: shard1
sharding:
clusterRole: shardsvr
## Enterprise-Only Options
#auditLog:
#snmp:
[root@localhost /]# cat /usr/local/mongodb/conf/shard2.cfg
systemLog:
destination: file
logAppend: true
path: /usr/local/mongodb/log/shard2/shard.log
storage:
dbPath: /usr/local/mongodb/data/shard2
journal:
enabled: true
# engine:
# wiredTiger:
processManagement:
fork: true # fork and run in background
#pidFilePath: /var/run/mongodb/shard1.pid # location of pidfile
timeZoneInfo: /usr/share/zoneinfo
net:
port: 27002
bindIp: 0.0.0.0
#security:
#operationProfiling:
replication:
replSetName: shard2
sharding:
clusterRole: shardsvr
## Enterprise-Only Options
#auditLog:
#snmp:
shard3.cfg参考shard2.cfg
[root@localhost mongodb]# mongod -f /usr/local/mongodb/conf/shard1.cfg
[root@localhost mongodb]# mongod -f /usr/local/mongodb/conf/shard2.cfg
[root@localhost mongodb]# mongod -f /usr/local/mongodb/conf/shard3.cfg
另外2台的操作同上
任意一台机器上执行
[root@localhost mongodb]# mongo --port 27001
> rs.initiate(
{
_id: "shard1",
members: [
{ _id : 0, host : "192.168.0.41:27001" },
{ _id : 1, host : "192.168.0.42:27001" },
{ _id : 2, host : "192.168.0.43:27001" },
]
}
)
[root@localhost mongodb]# mongo --port 27002
> rs.initiate(
{
_id: "shard2",
members: [
{ _id : 0, host : "192.168.0.41:27002" },
{ _id : 1, host : "192.168.0.42:27002" },
{ _id : 2, host : "192.168.0.43:27002" },
]
}
)
[root@localhost mongodb]# mongo --port 27003
> rs.initiate(
{
_id: "shard3",
members: [
{ _id : 0, host : "192.168.0.41:27003" },
{ _id : 1, host : "192.168.0.42:27003" },
{ _id : 2, host : "192.168.0.43:27003" },
]
}
)
目前已经创建了shard1,shard2,shard3三个具有三副本的复制集,后续还会创建分片集群
monogs作为集群访问的入口,是不存储数据的,只需要创建日志文件。每个服务器上启动一个mongos,提高可用性。
[root@localhost mongodb]# touch /usr/local/mongodb/log/mongos.log
[root@localhost /]# cat /usr/local/mongodb/conf/mongos.cfg
systemLog:
destination: file
logAppend: true
path: /usr/local/mongodb/log/mongos.log
processManagement:
fork: true # fork and run in background
#pidFilePath: /var/run/mongodb/mongos1.pid # location of pidfile
timeZoneInfo: /usr/share/zoneinfo
net:
port: 30010
bindIp: 0.0.0.0
#security:
#operationProfiling:
#eplication:
sharding:
# 配置哪些ip 需要添加
#configDB: configs/192.168.0.153:30000,192.168.0.216:30000,192.168.0.218:30000
configDB: configs/192.168.0.41:30000,192.168.0.42:30000,192.168.0.43:30000
# clusterRole: configsvr
## Enterprise-Only Options
#auditLog:
#snmp:
ps: configDB指向三个config服务的地址和端口
[root@localhost mongodb]# mongos -f /usr/local/mongodb/conf/mongos.cfg
ps:mongos启动不是使用mongod
每台机器都要启动一个mongos服务
[root@localhost mongodb]# openssl rand 700 -base64 > /usr/local/mongodb/conf/mongodb.keyfile
[root@localhost mongodb]# chmod 600 /usr/local/mongodb/conf/mongodb.keyfile
[root@localhost mongodb]# scp /usr/local/mongodb/conf/mongodb.keyfile [email protected]:/usr/local/mongodb/conf
[root@localhost mongodb]# scp /usr/local/mongodb/conf/mongodb.keyfile [email protected]:/usr/local/mongodb/conf
config.cfg,shard.cfg,mongos.cfg三类文件新增以下内容
security:
keyFile: /usr/local/mongodb/conf/mongodb.keyfile
mongos> use admin
switched to db admin
mongos> db.createUser({user:"mongouser",pwd:"mongopwd",roles:["root"]})
mongos> db.auth("mongouser","mongopwd")
mongos> use config
switched to db config
mongos> db.createUser({user:"mongouser",pwd:"mongopwd",roles:["root"]})
mongos> db.auth("mongouser","mongopwd")
security:
authorization: enabled
show users // 查看当前库下的用户
db.dropUser(‘testadmin’) // 删除用户
db.updateUser(‘admin’, {pwd: ‘654321’}) // 修改用户密码
db.auth(‘admin’, ‘654321’) // 密码认证
configs:SECONDARY> rs.status()
{
"operationTime" : Timestamp(1591253807, 2),
"ok" : 0,
"errmsg" : "command replSetGetStatus requires authentication",
"code" : 13,
"codeName" : "Unauthorized",
"lastCommittedOpTime" : Timestamp(1591253807, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1591253807, 2),
"signature" : {
"hash" : BinData(0,"QCmeAVDoq7jZzwHfPYNn/mU4eS8="),
"keyId" : NumberLong("6834296761922617360")
}
}
}
configs:SECONDARY> use admin
switched to db admin
configs:SECONDARY> db.auth('mongouser','mongopwd')
1
任意一台上连接
[root@localhost mongodb]# mongo --port 30010
mongos> use admin
switched to db admin
mongos> sh.addShard("shard1/192.168.0.41:27001,192.168.0.42:27001,192.168.0.43:27001")
{
"shardAdded" : "shard1",
"ok" : 1,
"operationTime" : Timestamp(1591241739, 4),
"$clusterTime" : {
"clusterTime" : Timestamp(1591241739, 4),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.addShard("shard2/192.168.0.41:27002,192.168.0.42:27002,192.168.0.43:27002")
{
"shardAdded" : "shard2",
"ok" : 1,
"operationTime" : Timestamp(1591241810, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1591241810, 2),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.addShard("shard3/192.168.0.41:27003,192.168.0.42:27003,192.168.0.43:27003")
{
"shardAdded" : "shard3",
"ok" : 1,
"operationTime" : Timestamp(1591241833, 6),
"$clusterTime" : {
"clusterTime" : Timestamp(1591241833, 6),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
查看分片状态
sh.status()
移除分片
db.runCommand({removeShard : “shard1”})
mongos> use admin
mongos> sh.enableSharding("notice_set")
mongos> sh.shardCollection("notice_set.notice",{"msgid": "hashed"})
mongos> use notice_set
mongos> for (i = 1; i <= 10; i++) db.notice.insert({"msgid":i})
可以查看每个分片的数据量等于10
选任一mongos直接连接,如果该mongos不可用,服务也无法继续使用
mongos --host --port
mongodb://[username:password@]host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]]