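The configuration order is: start the config server and the routing server (mongos) first, and only then add the shard servers. My own habit has always been to start the shard servers first, ha!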
Scaling is a key feature of MongoDB. Although manual sharding is possible with most databases, MongoDB also supports autosharding. This 15-minute, high-speed post provides a detailed overview of autosharding in MongoDB and, specifically, how to create shards that support it.
The process of splitting up data and storing portions of it on different machines is called sharding. By splitting up data across machines, it becomes possible to store more data and handle much more load without requiring large or powerful machines, e.g., machines with powerful CPUs and/or massive amounts of RAM.
Two types of sharding can occur: manual sharding and autosharding.
In manual sharding, the application code manages storing different data on different servers and querying the appropriate server to get it back. Manual sharding can be done with virtually any database software package.
In MongoDB autosharding, some of the administrative overhead required in manual sharding is eliminated. The cluster of database servers, or shards, handles splitting up and rebalancing the data automatically.
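To make the idea of automatic rebalancing concrete, here is a toy sketch. It is not MongoDB's actual balancer; the function name rebalance, the shard names, and the "move one unit at a time from the fullest shard to the emptiest" rule are all assumptions chosen for illustration.

```javascript
// Toy rebalancer: NOT MongoDB's real algorithm, just an illustration.
// chunkCounts maps a (hypothetical) shard name to how many units of data it holds.
function rebalance(chunkCounts) {
  const counts = { ...chunkCounts }; // work on a copy
  const moves = [];
  const names = Object.keys(counts);
  if (names.length === 0) return { counts, moves };

  while (true) {
    // Find the fullest and the emptiest shard.
    let most = names[0];
    let least = names[0];
    for (const name of names) {
      if (counts[name] > counts[most]) most = name;
      if (counts[name] < counts[least]) least = name;
    }
    // Stop once the shards are within one unit of each other.
    if (counts[most] - counts[least] <= 1) break;
    counts[most] -= 1;
    counts[least] += 1;
    moves.push({ from: most, to: least });
  }
  return { counts, moves };
}

// Two empty shards join a cluster whose data all sits on one shard.
const result = rebalance({ shard1: 6, shard2: 0, shard3: 0 });
console.log(result.counts, result.moves.length);
```

In manual sharding, the application would have to plan and carry out each of these moves itself.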
Autosharding
The basic concept behind MongoDB’s sharding is to break up collections into smaller chunks of documents. These chunks can be distributed across shards so that each shard is responsible for a subset of the total dataset.
As an example, consider the following. When you set up sharding you choose a key from a collection and use that key’s values to split up the data. This key is called the shard key.
Suppose we had a collection of contacts. If we chose “lastName” as our shard key, one shard could hold documents where “lastName” starts with A-F, the next shard could hold last names from G-P, and the final shard could hold last names Q-Z. As you add or remove shards, MongoDB would rebalance this data so that each shard was getting a balanced amount of traffic and a practical amount of data.
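The contacts example can be sketched in a few lines of JavaScript. This is only an illustration of range-based placement: the names ranges and shardForLastName and the shard labels are hypothetical, and a real cluster tracks and adjusts its ranges itself.

```javascript
// Hypothetical ranges mirroring the example above: A-F, G-P, Q-Z.
const ranges = [
  { shard: "shard1", min: "A", max: "F" },
  { shard: "shard2", min: "G", max: "P" },
  { shard: "shard3", min: "Q", max: "Z" },
];

// Pick the shard whose range contains the first letter of the last name.
// Returns null when the name is empty or does not start with A-Z.
function shardForLastName(lastName) {
  const initial = lastName.charAt(0).toUpperCase();
  const match = ranges.find((r) => initial >= r.min && initial <= r.max);
  return match ? match.shard : null;
}

console.log(shardForLastName("Adams"));  // falls in A-F
console.log(shardForLastName("Miller")); // falls in G-P
console.log(shardForLastName("Zhang"));  // falls in Q-Z
```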
So when should you decide to start sharding? Consider the following reasons:
- You’ve run out of disk space on your current machine.
- You want to write data faster than a single mongod can handle.
- You want to keep a larger portion of data in memory to improve performance.
Setting up Sharding
Three different components are involved in sharding:
shard
A shard is a container that holds a subset of a collection’s data. A shard is either a single mongod server (for development/testing), or a replica set (for production).
mongos
This is the router process. It routes requests and aggregates responses. It doesn’t store any data or configuration information, although it does cache information from the config servers.
config server
Config servers store the configuration of the cluster, for example, which data is located on which shard. mongos uses this information to determine request routing.
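How the three components cooperate can be sketched as follows. This is a simplified model, not mongos internals: chunkTable stands in for the metadata held by the config servers, and shardsForQuery and mergeResults are hypothetical names for the "route requests" and "aggregate responses" halves of the router's job.

```javascript
// Toy metadata standing in for what the config servers store.
// Shard names and ranges are hypothetical.
const chunkTable = [
  { min: "A", max: "F", shard: "shardA" },
  { min: "G", max: "P", shard: "shardB" },
  { min: "Q", max: "Z", shard: "shardC" },
];

// Decide which shards a query must visit.
function shardsForQuery(query) {
  if (typeof query.lastName === "string" && query.lastName.length > 0) {
    // The query contains the shard key, so a single shard can answer it.
    const initial = query.lastName[0].toUpperCase();
    const owner = chunkTable.find((c) => initial >= c.min && initial <= c.max);
    return owner ? [owner.shard] : [];
  }
  // No shard key in the query: every shard has to be asked.
  return chunkTable.map((c) => c.shard);
}

// Combine the per-shard result arrays into one response.
function mergeResults(resultsByShard) {
  return Object.values(resultsByShard).flat();
}

console.log(shardsForQuery({ lastName: "Smith" }));
console.log(shardsForQuery({ city: "Austin" }));
```

The sketch also hints at why the choice of shard key matters: queries that include it touch one shard, while all others fan out to the whole cluster.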
Starting the Servers
First we need to start up our config server and mongos. The config server has to come first because mongos uses it to get its configuration.
$ mkdir -p ~/dbs/config
$ ./mongod --dbpath ~/dbs/config --port 20000
Now we start a mongos process for an application to connect to. Routing servers do not even need a data directory, but they need to know the location of the config server.
$ ./mongos --port 30000 --configdb localhost:20000
Shard administration is always done through a mongos.
Adding a Shard
Start a normal mongod instance (or replica set), since this is what a shard naturally is:
$ mkdir -p ~/dbs/shard1
$ ./mongod --dbpath ~/dbs/shard1 --port 10000
Now connect to the mongos process started earlier and add the shard to the cluster.
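First, start up a shell connected to the mongos process as follows:
$ ./mongo localhost:30000/admin
Now add this shard with the addshard database command:
> db.runCommand({"addshard" : "localhost:10000", "allowLocal" : true})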
The “allowLocal” key is necessary only if you are running the shard on localhost. It lets MongoDB know that you’re in development and know what you are doing.
Sharding Data
In order to allow MongoDB to distribute data, you have to explicitly turn sharding on at both the database and collection levels. For example, the following enables sharding for the database acme:
> db.runCommand({"enablesharding" : "acme"})
Once you’ve enabled sharding on the acme database, a collection is sharded by running the shardcollection command as follows:
> db.runCommand({"shardcollection" : "acme.products", "key" : {"_id" : 1}})
Now the collection will be sharded by the “_id” key. When data is added to acme.products, it will be automatically distributed across the shards based on the values of “_id”.
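The two-step order (database first, then collection) can be captured in a small helper. This is a sketch only: shardingCommands is a hypothetical function, and it assumes the legacy command names enablesharding and shardcollection used in this post.

```javascript
// Build the two admin commands in the order they must be run:
// sharding is enabled on the database first, then on the collection.
// The command name is placed first in each object because the server
// reads the first key as the command to execute.
function shardingCommands(database, collection, shardKey) {
  return [
    { enablesharding: database },
    { shardcollection: `${database}.${collection}`, key: shardKey },
  ];
}

const commands = shardingCommands("acme", "products", { _id: 1 });
console.log(JSON.stringify(commands));

// In a mongo shell connected to mongos (admin database), each command
// document would then be passed to db.runCommand, for example:
//   commands.forEach((c) => printjson(db.runCommand(c)));
```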
I hope this post enlightens you on the possibilities that MongoDB’s auto-sharding feature provides for ease of scaling.
Source: http://blog.brianbuikema.com/2011/01/mongodb-sharding-a-detailed-overview-and-15-minute-high-speed-read/