Ganglia meeting Hadoop

1. Introduction

Ganglia is a monitoring system for grids and clusters. Depending on the version, it can be integrated with Hadoop. Ganglia consists of the following components:
  • gmond
The Ganglia Monitoring Daemon (gmond) runs on each node in the cluster and collects statistics from the node it runs on as well as from other nodes in the cluster. Normally it is a multicast system in which each gmond node receives data from its peers. However, since Amazon EC2 does not support multicast at this time, you must set up the Ganglia Monitoring Daemons in unicast mode, where each node in a cluster is configured to send its data to one pre-designated node.
  • gmetad
The Ganglia Meta Daemon (gmetad) runs once per grid and collects data from the Ganglia Monitoring Daemons, polling one gmond per cluster. It stores the data it collects on the file system. We'll configure only one grid and therefore one gmetad, running on the same node where the PHP Web Front End is installed.
  • Web Front End
A PHP application that reads the data and provides a UI to visualize it over time as graphs. It requires RRDtool.


2. Cluster Configuration
  • Ubuntu 9.04
  • Ganglia Monitoring Core 3.0.7
  • Hadoop 0.20.2
Note: mind the software versions. Out of the box, the GangliaContext shipped with Hadoop 0.20.2 is incompatible with Ganglia 3.1.x, which is why Ganglia 3.0.7 is used here.


3. Installing gmond and gmetad

Installing dependencies:
$ sudo apt-get install build-essential librrd2-dev libapr1-dev libconfuse-dev libexpat1-dev python-dev

Creating a user called 'ganglia' and extracting the package:
$ sudo adduser --disabled-login --no-create-home ganglia
$ sudo tar -xzvf ganglia-3.0.7.tar.gz -C /opt

Changing owner to 'ganglia':
$ sudo chown -R ganglia:ganglia /opt/ganglia-3.0.7

Installing gmond and gmetad:
(Assumes the installation directory is /opt/ganglia-3.0.7 and the configuration directory is /etc)
$ cd /opt/ganglia-3.0.7
$ sudo ./configure --with-gmetad
$ sudo make && sudo make install

Installing the Web front end:
$ sudo apt-get install rrdtool
$ sudo apt-get install apache2 php5-mysql libapache2-mod-php5
$ sudo cp -r /opt/ganglia-3.0.7/web /var/www && sudo mv /var/www/web /var/www/ganglia

3.1 Running gmond

Generate a configuration file:
$ sudo sh -c 'gmond --default_config > /etc/gmond.conf'

Edit /etc/gmond.conf changing the following lines:
globals {
 user = ganglia
}

cluster {
 name = "<cluster_name>"
 owner = "<owner_name>"
 latlong = "unspecified"
 url = "unspecified"
}

(Disable multicast and define the host where nodes in the cluster send data)
udp_send_channel {
 #mcast_join = 239.2.11.71
 host = <hostname>
 port = 8649
 ttl = 1
}

udp_recv_channel {
 #mcast_join = 239.2.11.71
 port = 8649
 #bind = 239.2.11.71
}
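
Since unicast mode depends on every mcast_join line being commented out, it can help to check the edited file before starting the daemon. The following is only a minimal sketch: the function name has_active_multicast is made up for this example, and it assumes the configuration lives in /etc/gmond.conf as above.

```shell
# Hypothetical helper (not part of Ganglia): prints any uncommented
# mcast_join line with its line number and returns 0 if one was found,
# 1 if the file looks unicast-only.
has_active_multicast() {
    conf="${1:-/etc/gmond.conf}"
    grep -En '^[[:space:]]*mcast_join[[:space:]]*=' "$conf"
}
```

For example, `has_active_multicast /etc/gmond.conf || echo "unicast only"` prints "unicast only" when no active mcast_join line remains.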

Run gmond as root:
$ sudo gmond

Check the daemon with ps (sample output; the first column shows the user gmond is running as):
$ ps aux | grep gmond
nobody   24069 3.1 0.7  4304  1872 ? Ss    15:45   0:00 gmond
rhodesmi 24071 0.0 0.2  3004   756 pts/0 R+   15:45   0:00 grep gmond

Connect to the gmond port with telnet to check that everything is all right:
$ telnet localhost 8649

Note: If XML lines appear in your terminal, everything is working fine.
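
The XML dump can be long, so a small filter that prints only the reporting hosts may be easier to read. This is a rough sketch rather than an official tool: the function name list_gmond_hosts is invented here, and it assumes gmond writes each <HOST ...> tag on its own line with NAME as the first attribute.

```shell
# Hypothetical helper: reads gmond XML on stdin and prints one host name
# per line. Assumes each <HOST NAME="..." ...> tag sits on its own line
# and that NAME is the first attribute.
list_gmond_hosts() {
    sed -n 's/.*<HOST NAME="\([^"]*\)".*/\1/p'
}
```

Assuming netcat is installed, it could be used as `nc localhost 8649 | list_gmond_hosts`. In unicast mode, the pre-designated node should list every node in the cluster.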


3.2 Running gmetad

Like gmond, gmetad also has a configuration file. Copy it from the source tree to the configuration directory:
$ sudo cp /opt/ganglia-3.0.7/gmetad/gmetad.conf /etc/

Afterwards, add the following lines to /etc/gmetad.conf:
setuid_username "ganglia"
data_source "<master>" <hostname>
gridname "<cluster_name>"

Now, create a directory to store the RRD files:
$ sudo mkdir -p /var/lib/ganglia/rrds/
$ sudo chown -R ganglia:ganglia /var/lib/ganglia/rrds/

Run gmetad in debug mode to check that everything is all right:
$ sudo gmetad -d 1

Open the front end to test the services:
http://<hostname>/ganglia/

If everything is working fine, kill the gmetad process (debug mode) and start it as a daemon:
$ sudo gmetad
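
Besides the front end, another way to confirm that gmetad is collecting data is to look for RRD files under the directory created earlier. The sketch below is an illustration only: count_rrds is a made-up name, and the default path assumes the /var/lib/ganglia/rrds/ directory used above.

```shell
# Hypothetical helper: counts the .rrd files below the given directory
# (default: the rrds directory created in this section).
count_rrds() {
    dir="${1:-/var/lib/ganglia/rrds}"
    find "$dir" -type f -name '*.rrd' | wc -l | tr -d ' '
}
```

If `count_rrds` still prints 0 a few minutes after gmetad has started, check the data_source line and the ownership of the directory.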

4. Configuring Hadoop


Add the Ganglia context to $HADOOP_HOME/conf/hadoop-metrics.properties as follows:

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=<hostname>:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=<hostname>:8649

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=<hostname>:8649

rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.period=10
rpc.servers=<hostname>:8649
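
Because the four contexts differ only in their prefix, the same settings can be generated with a loop instead of typed by hand. This is just a sketch under assumptions: the function name ganglia_metrics_properties is invented for this example, and the port defaults to 8649 as in the gmond configuration above.

```shell
# Hypothetical helper: prints the GangliaContext settings for the dfs,
# mapred, jvm and rpc contexts, pointing at the given gmond host and port.
ganglia_metrics_properties() {
    host="$1"
    port="${2:-8649}"
    for ctx in dfs mapred jvm rpc; do
        printf '%s.class=org.apache.hadoop.metrics.ganglia.GangliaContext\n' "$ctx"
        printf '%s.period=10\n' "$ctx"
        printf '%s.servers=%s:%s\n\n' "$ctx" "$host" "$port"
    done
}
```

Run `ganglia_metrics_properties <hostname>` and review the output before copying it into hadoop-metrics.properties. Back up the existing file first, since it may contain other settings.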


Note: restart Hadoop so that the new metrics configuration takes effect. Afterwards, restart the gmond and gmetad daemons.

