nagios + Ganglia installation notes


yum install -y expat expat-devel pcre pcre-devel apr-devel apr-util check-devel cairo-devel pango-devel libxml2-devel rpm-build glib2-devel dbus-devel freetype-devel fontconfig-devel gcc-c++ python-devel libXrender-devel
wget http://mirror.bit.edu.cn/apache/apr/apr-1.4.6.tar.gz
tar zxf apr-1.4.6.tar.gz
cd apr-1.4.6
./configure && make && make install
cd ..
wget http://download.savannah.gnu.org/releases/confuse/confuse-2.7.tar.gz
tar zxf confuse-2.7.tar.gz
cd confuse-2.7
./configure CFLAGS=-fPIC --disable-nls && make && make install

Register the library location:
vi /etc/ld.so.conf.d/libconfuse.conf, add the line /usr/local/lib, then run /sbin/ldconfig -v
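
The edit above can also be scripted instead of done in vi. A minimal sketch, assuming a POSIX shell; add_lib_path is a made-up helper name for illustration, and it only appends the directory when the file does not already list it:

```shell
# add_lib_path FILE DIR: append DIR to FILE unless an identical line exists.
# (add_lib_path is a hypothetical helper, not part of any package.)
add_lib_path() {
    conf_file=$1
    lib_dir=$2
    grep -qxF "$lib_dir" "$conf_file" 2>/dev/null || printf '%s\n' "$lib_dir" >> "$conf_file"
}

# Intended use, as root:
#   add_lib_path /etc/ld.so.conf.d/libconfuse.conf /usr/local/lib && /sbin/ldconfig
```

Running it twice leaves a single entry in the file, so it is safe to keep in a setup script.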

cd ..
wget http://downloads.sourceforge.net/project/ganglia/ganglia%20monitoring%20core/3.3.1/ganglia-3.3.1.tar.gz
tar zxf ganglia-3.3.1.tar.gz
cd ganglia-3.3.1
#server
./configure --prefix=/opt/modules/ganglia --with-static-modules --enable-gexec --enable-status --with-gmetad --with-python=/usr --with-libexpat=/usr --with-libconfuse=/usr/local --with-libpcre=/usr/local --with-librrd=/usr/local/lib --sysconfdir=/etc/ganglia
make && make install


cp gmond/gmond.init /etc/rc.d/init.d/gmond
cp gmetad/gmetad.init /etc/rc.d/init.d/gmetad
chkconfig --add gmond && chkconfig gmond on
chkconfig --add gmetad && chkconfig gmetad on
service gmetad start

Installing the Ganglia web front end:
mkdir -p /usr/local/apache2/htdocs/ganglia
cp -r web/* /usr/local/apache2/htdocs/ganglia

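The Ganglia wiki points out specifically that the web front end needs rrdtool and gmetad's rrds/ directory; without both, no graphs can be drawn. By default gmetad stores its RRD files in /var/lib/ganglia/rrds: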
mkdir -p /var/lib/ganglia/rrds
chown nobody:nobody /var/lib/ganglia/rrds 
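
Since the front end depends on both pieces, a quick check before opening the web page can save some guessing. A minimal sketch, assuming a POSIX shell; check_web_prereqs is a hypothetical helper name, and the two paths are passed in rather than assumed:

```shell
# check_web_prereqs RRD_DIR RRDTOOL_BIN
# Prints what is missing and returns non-zero if anything is.
# (check_web_prereqs is a hypothetical helper for illustration.)
check_web_prereqs() {
    rrd_dir=$1
    rrdtool_bin=$2
    status=0
    [ -d "$rrd_dir" ] || { echo "missing rrds directory: $rrd_dir"; status=1; }
    [ -x "$rrdtool_bin" ] || { echo "rrdtool not executable: $rrdtool_bin"; status=1; }
    return $status
}

# Intended use with the paths from these notes:
#   check_web_prereqs /var/lib/ganglia/rrds /usr/local/bin/rrdtool
```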


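Basic Ganglia configuration:
1) Generate the default gmond configuration file:
    gmond -t | tee /etc/ganglia/gmond.conf
2) In the server-side configuration file gmetad.conf, the main setting is data_source. It sets the address and port of the monitoring server to poll; more than one server can be listed:
    data_source "hadoop" 10 192.168.1.185
    # gridname is the display name shown in the web front end; choose any name.
    gridname "hadoop cluster status"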
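3) The configuration file gmond.conf on the monitored nodes:
gmond.conf has several sections: globals, cluster, udp_send_channel, udp_recv_channel, and so on. To get Ganglia running in a simple setup, two changes are enough: one in the cluster section and one in the tcp_accept_channel section.
First, name your cluster:
Give the cluster a name that matches the data_source name in gmetad.conf. Mine is: name = "hadoop"
Then change the tcp_accept_channel section as follows: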
tcp_accept_channel {
    port = 8649
    acl {
        default = "deny"
        access {
            ip = 192.168.1.185    /* address of the monitoring server */
            mask = 32
            action = "allow"
        }
    }
}
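Distribute this gmond.conf to every monitored node, then restart gmetad and gmond on the monitoring server and gmond on each node; the nodes are then monitored.
4) Edit the web front end configuration file /var/www/html/conf.php to set the directory where gmetad stores its RRD files and the location of rrdtool: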
$gmetad_root = "/var/lib/ganglia";
$rrds = "$gmetad_root/rrds";  
define("RRDTOOL", "/usr/local/bin/rrdtool");
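
Step 3 depends on the cluster name in gmond.conf matching a data_source name in gmetad.conf, and a mismatch is easy to miss. The check can be sketched in shell as follows, assuming both files are laid out like the samples in these notes (one setting per line, names in double quotes, no regex metacharacters in the name). Both function names are hypothetical:

```shell
# gmond_cluster_name FILE: print the name set inside the cluster { } section.
# (hypothetical helper; assumes the layout used in these notes)
gmond_cluster_name() {
    awk '
        /^[ \t]*cluster[ \t]*\{/ { in_cluster = 1; next }
        in_cluster && /\}/       { in_cluster = 0 }
        in_cluster && /^[ \t]*name[ \t]*=/ {
            if (match($0, /"[^"]*"/)) print substr($0, RSTART + 1, RLENGTH - 2)
            exit
        }
    ' "$1"
}

# has_data_source FILE NAME: succeed if FILE has a data_source called NAME.
has_data_source() {
    grep -Eq "^[[:space:]]*data_source[[:space:]]+\"$2\"" "$1"
}

# Intended use:
#   name=$(gmond_cluster_name /etc/ganglia/gmond.conf)
#   has_data_source /etc/ganglia/gmetad.conf "$name" || echo "no data_source named $name"
```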

Client installation

yum install -y expat expat-devel pcre pcre-devel
wget http://mirror.bit.edu.cn/apache/apr/apr-1.4.6.tar.gz
tar zxf apr-1.4.6.tar.gz
cd apr-1.4.6
./configure && make && make install
cd ..
wget http://download.savannah.gnu.org/releases/confuse/confuse-2.7.tar.gz
tar zxf confuse-2.7.tar.gz
cd confuse-2.7
./configure CFLAGS=-fPIC --disable-nls && make && make install
cd ..
wget http://downloads.sourceforge.net/project/ganglia/ganglia%20monitoring%20core/3.3.1/ganglia-3.3.1.tar.gz
tar zxf ganglia-3.3.1.tar.gz
cd ganglia-3.3.1
#client
./configure --prefix=/opt/modules/ganglia --enable-gexec --enable-status --with-python=/usr --with-libapr=/usr/local/apr/bin/apr-1-config --with-libconfuse=/usr/local --with-libexpat=/usr --with-libpcre=/usr
make && make install
cd gmond

Deploying the cluster in groups

Grouping in Ganglia is simple: it is done by port, with each group configured to listen on a different port.
gmetad
data_source "Namenode" 192.168.1.128:8653
data_source "Datanode" 192.168.1.127:8649
data_source "Portal" 192.168.1.143:8650
data_source "Collector" 192.168.1.135:8651
data_source "DB" 192.168.1.151:8652

gridname "Hadoop"
rrd_rootdir "/opt/modules/ganglia/html/rrds"
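# rrd_rootdir sets the path where the RRD data files are saved; the web interface reads from it. The path is fixed, so it is best placed under the web directory and given the correct permissions.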
case_sensitive_hostnames 0
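
Because the grouping relies on every group having its own port, a port reused by accident would mix two groups together. A minimal sketch of a check, assuming data_source names contain no spaces and every address is written as host:port as above; duplicate_ports is a hypothetical helper name:

```shell
# duplicate_ports FILE: print each port used by more than one data_source line.
# Several hosts of the same group sharing a port on one line are not reported.
# (duplicate_ports is a hypothetical helper for illustration.)
duplicate_ports() {
    awk '
        $1 == "data_source" {
            split("", seen)                     # reset the per-line set
            for (i = 3; i <= NF; i++) {
                if (split($i, parts, ":") == 2 && !(parts[2] in seen)) {
                    seen[parts[2]] = 1
                    lines[parts[2]]++           # count lines, not hosts
                }
            }
        }
        END { for (p in lines) if (lines[p] > 1) print p }
    ' "$1"
}

# Intended use:
#   duplicate_ports /etc/ganglia/gmetad.conf
```

No output means every group has a port of its own.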

Client configuration

gmond
cluster {
    name = "Portal"
    #Corresponds to Portal in gmetad; the name must match exactly.
    owner = "unspecified"
    latlong = "unspecified"
    url = "unspecified"
}
/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
    #bind_hostname = yes # Highly recommended, soon to be default.
                         # This option tells gmond to use a source address
                         # that resolves to the machine's hostname.  Without
                         # this, the metrics may appear to come from any
                         # interface and the DNS names associated with
                         # those IPs will be used to create the RRDs.
    mcast_join = 192.168.1.185
    port = 8650
    #The port assigned to Portal in gmetad.
    ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
    mcast_join = 192.168.1.185
    port = 8650
    bind = 192.168.1.185
}

/* You can specify as many tcp_accept_channels as you like to share
     an xml description of the state of the cluster */
tcp_accept_channel {
    port = 8650
}

The port settings above are the Portal group's port. gmetad.conf assigns port 8650 to the Portal group, so the udp and tcp ports in gmond must also be set to 8650.

The gmond configuration of a member of another group makes this easier to see:
cluster {
    name = "DB"
    owner = "unspecified"
    latlong = "unspecified"
    url = "unspecified"
}

/* The host section describes attributes of the host, like the location */
host {
    location = "unspecified"
}

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
    #bind_hostname = yes # Highly recommended, soon to be default.
                         # This option tells gmond to use a source address
                         # that resolves to the machine's hostname.  Without
                         # this, the metrics may appear to come from any
                         # interface and the DNS names associated with
                         # those IPs will be used to create the RRDs.
    mcast_join = 192.168.1.185
    port = 8652
    ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
    mcast_join = 192.168.1.185
    port = 8652
    bind = 192.168.1.185
}

/* You can specify as many tcp_accept_channels as you like to share
     an xml description of the state of the cluster */
tcp_accept_channel {
    port = 8652
}
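
The two group configurations differ only in the cluster name and the port, so one file can be derived from the other. A minimal sketch with sed, assuming the layout shown above (one setting per line, the cluster section closed by the first following line that contains a closing brace); set_gmond_group is a hypothetical helper name, and it writes the result to standard output rather than changing the file:

```shell
# set_gmond_group FILE NAME PORT
# Print FILE with the cluster name set to NAME and every "port = ..." line
# set to PORT. Commented-out lines are left alone.
# (set_gmond_group is a hypothetical helper for illustration.)
set_gmond_group() {
    conf=$1
    name=$2
    port=$3
    sed -e "/^[[:space:]]*cluster[[:space:]]*{/,/}/ s/^\([[:space:]]*name[[:space:]]*=\).*/\1 \"$name\"/" \
        -e "s/^\([[:space:]]*port[[:space:]]*=\).*/\1 $port/" \
        "$conf"
}

# Intended use: derive the DB group's file from the Portal one.
#   set_gmond_group gmond-portal.conf DB 8652 > gmond-db.conf
```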
