ZooKeeper Installation

Source: https://ccp.cloudera.com/display/CDHDOC/ZooKeeper+Installation#ZooKeeperInstallation-InstallingtheZooKeeperServerPackage

 

 

Apache ZooKeeper is a highly reliable and available service that provides coordination between distributed processes.

For More Information
From the Apache ZooKeeper site:

"ZooKeeper is a high-performance coordination service for distributed applications. It exposes common services — such as naming, configuration management, synchronization, and group services - in a simple interface so you don't have to write them from scratch. You can use it off-the-shelf to implement consensus, group management, leader election, and presence protocols. And you can build on it for your own, specific needs."


To learn more about Apache ZooKeeper, visit http://zookeeper.apache.org/.

Upgrading ZooKeeper to the Latest CDH3 Release

Note
To see which version of ZooKeeper is shipping in the latest CDH3 release, check the Version and Packaging Information. For important information on new and changed components, see the Release Notes.

Cloudera recommends that you use a rolling upgrade process to upgrade ZooKeeper: that is, upgrade one server in the ZooKeeper ensemble at a time. This means bringing down each server in turn, upgrading the software, then restarting the server. The server will automatically rejoin the quorum, update its internal state with the current ZooKeeper leader, and begin serving client sessions.

This method allows you to upgrade ZooKeeper without any interruption in service, lets you monitor the ensemble as the upgrade progresses, and lets you roll back if you run into problems.

The instructions that follow assume that you are upgrading ZooKeeper as part of an upgrade to the latest CDH3 release, and have already performed the steps under Upgrading CDH3.

Performing a ZooKeeper Rolling Upgrade

Follow these steps to perform a rolling upgrade.

Step 1: Stop the ZooKeeper Server on the First Node

To stop the ZooKeeper server:

$ sudo /sbin/service hadoop-zookeeper-server stop

or

$ sudo /sbin/service hadoop-zookeeper stop

depending on the platform and release.
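
Because the service name differs between platforms and releases, a small helper can pick whichever init script is present. This is only a sketch: the function name zk_service_name is hypothetical, and it assumes the init scripts live in /etc/init.d, which the text above does not state.

```shell
# Print the name of the ZooKeeper init script found in the given directory.
# Assumption: init scripts live in /etc/init.d unless another directory is passed.
zk_service_name() {
    init_dir="${1:-/etc/init.d}"
    for name in hadoop-zookeeper-server hadoop-zookeeper; do
        if [ -x "$init_dir/$name" ]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    # Neither script was found.
    return 1
}

# Usage (not run here):
#   svc=$(zk_service_name) && sudo /sbin/service "$svc" stop
```

The helper returns a non-zero status when neither script exists, so the stop command in the usage line is skipped rather than run against a wrong name.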

Warning
You must shut down the ZooKeeper server. If ZooKeeper is running during the upgrade, the new version will not work correctly.

Step 2: Install the ZooKeeper Base Package on the First Node

See Installing the ZooKeeper Base Package.

Step 3: Install the ZooKeeper Server Package on the First Node

See Installing the ZooKeeper Server Package.

Note
Do not try to start the server yet.

Step 4: Re-enable the Server

Because of a packaging problem in earlier releases, you need to re-enable the server manually after upgrading ZooKeeper from CDH3 Update 1 or earlier to the latest CDH3 release:

$ sudo /sbin/chkconfig --add hadoop-zookeeper-server

Step 5: Restart the Server

See Installing the ZooKeeper Server Package for instructions on starting the server.

The upgrade is now complete on this server and you can proceed to the next.

Step 6: Upgrade the Remaining Nodes

Repeat Steps 1-5 above on each of the remaining nodes.

The ZooKeeper upgrade is now complete.
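
The per-node sequence in Steps 1-5 can be written out as a dry-run plan before touching any server. The sketch below only prints commands and runs nothing. It assumes Red Hat-compatible nodes (yum) and the hadoop-zookeeper-server service name; the function name and the host names zk1 and zk2 are hypothetical.

```shell
# Print the command sequence for a rolling upgrade, one node at a time (dry run only).
# Assumptions: yum-based nodes and the hadoop-zookeeper-server service name.
zk_rolling_upgrade_plan() {
    for host in "$@"; do
        # Step 1: stop the server on this node.
        printf '%s: sudo /sbin/service hadoop-zookeeper-server stop\n' "$host"
        # Steps 2 and 3: install the base and server packages.
        printf '%s: sudo yum install hadoop-zookeeper\n' "$host"
        printf '%s: sudo yum install hadoop-zookeeper-server\n' "$host"
        # Step 4: re-enable the server.
        printf '%s: sudo /sbin/chkconfig --add hadoop-zookeeper-server\n' "$host"
        # Step 5: restart the server before moving to the next node.
        printf '%s: sudo /sbin/service hadoop-zookeeper-server start\n' "$host"
    done
}

# Usage (not run here):
#   zk_rolling_upgrade_plan zk1 zk2
```

Printing the plan first makes it easy to confirm that each node is fully restarted before the next one is stopped, which is what keeps a majority of the ensemble available throughout.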

Installing the ZooKeeper Packages

There are two ZooKeeper server packages:

  • The hadoop-zookeeper base package provides the basic libraries and scripts that are necessary to run ZooKeeper servers and clients. The documentation is also included in this package.
  • The hadoop-zookeeper-server package contains the init.d scripts necessary to run ZooKeeper as a daemon process. Because hadoop-zookeeper-server depends on hadoop-zookeeper, installing the server package automatically installs the base package.

Important
If you have not already done so, install Cloudera's yum, zypper/YaST or apt repository before using the following commands to install ZooKeeper. For instructions, see CDH3 Installation.

Installing the ZooKeeper Base Package

To install ZooKeeper on Ubuntu and other Debian systems:

$ sudo apt-get install hadoop-zookeeper


To install ZooKeeper on Red Hat-compatible systems:

$ sudo yum install hadoop-zookeeper


To install ZooKeeper on SUSE systems:

$ sudo zypper install hadoop-zookeeper
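
The three commands above differ only in the package manager. As a rough sketch, a script that has to work across platforms could detect which one is on the PATH and print the matching command. The function name zk_install_cmd is hypothetical, and the check order (apt-get, yum, zypper) is an arbitrary choice.

```shell
# Print the install command for a package, using whichever package manager is on PATH.
zk_install_cmd() {
    pkg="$1"
    if command -v apt-get >/dev/null 2>&1; then
        printf 'sudo apt-get install %s\n' "$pkg"
    elif command -v yum >/dev/null 2>&1; then
        printf 'sudo yum install %s\n' "$pkg"
    elif command -v zypper >/dev/null 2>&1; then
        printf 'sudo zypper install %s\n' "$pkg"
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}

# Usage (not run here):
#   zk_install_cmd hadoop-zookeeper
```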

Installing the ZooKeeper Server Package and Starting ZooKeeper on a Single Server

The instructions provided here deploy a single ZooKeeper server in "standalone" mode. This is appropriate for evaluation, testing and development purposes, but may not provide sufficient reliability for a production application. See Installing ZooKeeper in a Production Environment for more information.

To install a ZooKeeper server on Ubuntu and other Debian systems:

$ sudo apt-get install hadoop-zookeeper-server


To install a ZooKeeper server on Red Hat-compatible systems:

$ sudo yum install hadoop-zookeeper-server


To install a ZooKeeper server on SUSE systems:

$ sudo zypper install hadoop-zookeeper-server

To start ZooKeeper

Note
ZooKeeper may start automatically on installation on Ubuntu and other Debian systems.

Use the following command to start ZooKeeper:

$ sudo /sbin/service hadoop-zookeeper-server start
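
After starting the server, one way to confirm that it is answering is ZooKeeper's "ruok" four-letter command, to which a healthy server replies "imok". This check comes from upstream ZooKeeper rather than from the instructions above. The sketch assumes that nc is installed and that the server listens on client port 2181; the function name zk_is_ok is hypothetical.

```shell
# Return 0 if the server at host:port answers "imok" to the "ruok" command.
# Assumptions: nc is installed; the client port is 2181 unless overridden.
zk_is_ok() {
    host="${1:-localhost}"
    port="${2:-2181}"
    # An unreachable server or a missing nc leaves the reply empty.
    reply=$(printf 'ruok' | nc -w 2 "$host" "$port" 2>/dev/null) || reply=""
    [ "$reply" = "imok" ]
}

# Usage (not run here):
#   zk_is_ok localhost 2181 && echo "ZooKeeper is answering"
```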

Installing ZooKeeper in a Production Environment

For use in a production environment, you should deploy ZooKeeper as an ensemble with an odd number of nodes. As long as a majority of the servers in the ensemble are available, the ZooKeeper service will be available. The minimum recommended ensemble size is three ZooKeeper servers, and it is recommended that each server run on a separate machine.
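
The majority rule explains the preference for odd ensemble sizes. The following sketch (function names are hypothetical) computes the majority size and the number of failures an ensemble can absorb while still keeping a majority.

```shell
# Smallest number of servers that forms a majority of an ensemble of size n.
zk_quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority remains available.
zk_tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Usage (not run here):
#   for n in 3 4 5; do echo "$n servers tolerate $(zk_tolerated_failures "$n") failure(s)"; done
```

By this arithmetic an ensemble of three tolerates one failure, an ensemble of four still tolerates only one, and an ensemble of five tolerates two, so adding a single server to an odd-sized ensemble does not improve availability.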

ZooKeeper deployment on multiple servers requires a bit of additional configuration. The configuration file (zoo.cfg) on each server must include a list of all servers in the ensemble, and each server must also have a myid file in its data directory (by default /var/zookeeper) that identifies it as one of the servers in the ensemble.
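
As an illustration of those two pieces, the sketch below generates the server list lines for zoo.cfg and writes a myid file. It assumes the conventional ZooKeeper peer and election ports 2888 and 3888 and server IDs numbered from 1. The function names and the host names zk1, zk2 and zk3 are hypothetical.

```shell
# Print the server list lines for zoo.cfg, numbering the hosts from 1.
# Assumption: the conventional peer and election ports 2888 and 3888.
zk_server_lines() {
    id=1
    for host in "$@"; do
        printf 'server.%d=%s:2888:3888\n' "$id" "$host"
        id=$((id + 1))
    done
}

# Write this server's ID into the myid file in the data directory.
zk_write_myid() {
    printf '%s\n' "$1" > "${2:-/var/zookeeper}/myid"
}

# Usage (not run here):
#   zk_server_lines zk1 zk2 zk3      # append the output to zoo.cfg on every server
#   zk_write_myid 2                  # run on the host listed as server.2
```

The number in each server's myid file has to match the N of its own server.N line, and the same server list goes into zoo.cfg on every member of the ensemble.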

For instructions describing how to set up a multi-server deployment, see Installing a Multi-Server Setup.

Setting Up a Supervisory Process for the ZooKeeper Server

The ZooKeeper server is designed to be both highly reliable and highly available. This means that:

  • If a ZooKeeper server encounters an error it cannot recover from, it will "fail fast" (the process will exit immediately).
  • When the server shuts down, the ensemble remains active and continues serving requests.
  • Once restarted, the server rejoins the ensemble without any further manual intervention.

Cloudera recommends that you fully automate this process by configuring a supervisory service to manage each server, and restart the ZooKeeper server process automatically if it fails. See the ZooKeeper Administrator's Guide for more information.
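
A dedicated supervisor is the better tool for this job, but the restart-on-failure idea can be sketched in a few lines of shell. The function name zk_ensure_running is hypothetical, and the usage line assumes that the init script implements a status action, which is not confirmed above.

```shell
# Run the check command; if it fails, run the restart command.
# Both commands are passed as single strings and executed with sh -c.
zk_ensure_running() {
    check_cmd="$1"
    restart_cmd="$2"
    if sh -c "$check_cmd" >/dev/null 2>&1; then
        echo "running"
    else
        echo "restarting"
        sh -c "$restart_cmd"
    fi
}

# Usage (not run here), for example from a root cron job:
#   zk_ensure_running "/sbin/service hadoop-zookeeper-server status" \
#                     "/sbin/service hadoop-zookeeper-server start"
```

Because a failed server exits immediately and rejoins the ensemble on restart, a periodic check like this is enough to illustrate the behavior, although it reacts more slowly than a real supervisory service.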

Maintaining a ZooKeeper Server

The ZooKeeper server continually saves znode snapshot files and, optionally, transactional logs in a Data Directory to enable you to recover data. It's a good idea to back up the ZooKeeper Data Directory periodically. Although ZooKeeper is highly reliable because a persistent copy is replicated on each server, recovering from backups may be necessary if a catastrophic failure or user error occurs.
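
A periodic backup can be as simple as archiving the Data Directory into a timestamped tarball. The sketch below assumes that a plain tar archive suits your backup process; the function name zk_backup and the destination path in the usage line are hypothetical.

```shell
# Archive a data directory into a timestamped tarball and print the archive path.
zk_backup() {
    data_dir="$1"
    dest_dir="$2"
    archive="$dest_dir/zookeeper-$(date +%Y%m%d-%H%M%S).tar.gz"
    # Store paths relative to the parent so the archive unpacks into one directory.
    tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || return 1
    printf '%s\n' "$archive"
}

# Usage (not run here; the destination directory is a placeholder):
#   zk_backup /var/zookeeper /path/to/backups
```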

The ZooKeeper server does not remove the snapshots and log files, so they will accumulate over time. You will need to clean up this directory occasionally, based on your backup schedules and processes. To automate the cleanup, a zkCleanup.sh script is provided in the bin directory of the hadoop-zookeeper base package. Modify this script as necessary for your situation. In general, you want to run this as a cron task based on your backup schedule.

The data directory is specified by the dataDir parameter in the ZooKeeper configuration file, and the data log directory is specified by the dataLogDir parameter.
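
Backup and cleanup scripts can read these two settings from the configuration file instead of hard-coding the paths. The sketch below assumes the file uses plain key=value lines; the function name zk_cfg_value and the configuration path in the usage lines are hypothetical.

```shell
# Print the value of a key=value setting from a ZooKeeper configuration file.
# If the key appears more than once, the last occurrence is printed.
zk_cfg_value() {
    key="$1"
    cfg="$2"
    sed -n "s/^[[:space:]]*$key[[:space:]]*=[[:space:]]*//p" "$cfg" | tail -n 1
}

# Usage (not run here; the path is a placeholder):
#   zk_cfg_value dataDir /path/to/zoo.cfg
#   zk_cfg_value dataLogDir /path/to/zoo.cfg
```

The pattern requires the key to be followed directly by the equals sign (allowing whitespace), so a lookup of dataDir does not pick up the dataLogDir line.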

For more information, see Ongoing Data Directory Cleanup.

Viewing the ZooKeeper Documentation

For additional ZooKeeper documentation, see http://archive.cloudera.com/cdh/3/zookeeper/.

 
