Extending Ganglia with Python Modules

                                                                                  -- Authors: Terry, Schubert

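1.      Introduction to Ganglia

Ganglia is an open source monitoring project started at UC Berkeley and designed to scale to thousands of nodes. Every machine runs a daemon named gmond, which collects metrics (processor speed, memory usage, and so on) from the operating system and sends them to a designated host. The host that receives all the metrics can display them and can pass a condensed form of them up a hierarchy. This hierarchical design is what lets Ganglia scale so well. gmond adds very little system load, so it can run on every machine in a cluster without affecting user performance.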

 

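There are times when all this data collection affects node performance. Network jitter occurs when a large number of small messages arrive at the same time. We found that keeping the node clocks in lockstep avoids this problem.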

 

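2.      Extending Ganglia

A basic Ganglia installation already provides a lot of useful information. Ganglia's plug-in mechanism offers two ways to add more:

  • By adding in-band plug-ins.
  • By adding out-of-band spoofing from some other source.

For installing and starting Ganglia, see the references at the end of this document. This document focuses on the in-band approach, implemented as a Python plug-in.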

 

3.      System Preparation

Test environment:

  • Machine:
      • Model: DELL OPTIPLEX 755
      • Operating system: Linux 2.6.18-164.15.1.el5.centos.plus #1 SMP Wed Mar 17 19:54:20 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
      • Memory: 2 GB

  • Ganglia deployment:
      • Ganglia root directory: /usr/local/ganglia/ ($GANGLIA_ROOT)
      • Ganglia configuration directory: /etc/ganglia/ ($GANGLIA_CONF)
      • Ganglia RRDTool directory: /var/lib/ganglia/rrds/ ($GANGLIA_RRDS)
      • Ganglia html directory: /var/www/html/ganglia/ ($GANGLIA_WEB)

For convenience, the $GANGLIA_*** variables are used below to refer to these directories.

 

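After a successful installation, $GANGLIA_ROOT/lib64/ganglia/modpython.so should exist. This shared library implements Ganglia's Python extension support; if it is missing, Python modules cannot be loaded.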
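The presence of the shared library can also be checked from Python before going further. This is a minimal sketch: the helper name find_modpython is hypothetical, the default root is the $GANGLIA_ROOT used in this document, and the lib64 sub-path may differ on other installations.

```python
import os


def find_modpython(ganglia_root="/usr/local/ganglia"):
    """Return the expected modpython.so path and whether the file exists."""
    # Assumption: the library lives under lib64/ganglia, as in this document.
    path = os.path.join(ganglia_root, "lib64", "ganglia", "modpython.so")
    return path, os.path.isfile(path)


if __name__ == "__main__":
    path, found = find_modpython()
    print("%s: %s" % (path, "found" if found else "missing"))
```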

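4.      Implementing the Python Extension

1)     Example description

As an example we implement a module named random_module that provides two metrics, random1 and random2. Both take values in [RandomMin, RandomMax], and the two are complementary, that is:

random2 = RandomMin + RandomMax - random1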
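As a worked instance of this relation, the short sketch below mirrors a value inside the range. The function name complement is hypothetical, and the bounds 0 and 10 are chosen only for illustration.

```python
def complement(random1, random_min, random_max):
    """Mirror random1 inside [random_min, random_max]."""
    return random_min + random_max - random1


# The mirrored value always stays inside the same range.
for r1 in (0, 3, 10):
    r2 = complement(r1, 0, 10)
    assert 0 <= r2 <= 10
    print("random1=%d random2=%d" % (r1, r2))
```

For example, with bounds 0 and 10 a random1 of 3 yields a random2 of 7.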

 

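2)     Work required

Implementing the module involves the following steps:

  • Edit the configuration files to register the extension module
  • Write the Python code of the extension module
  • Add a report graph for the extension module (optional)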

 

3)     Editing the configuration files

  • The gmond.conf file

Edit $GANGLIA_CONF/gmond.conf as follows:

 

*********************** Start ****************************
modules {
      …..
  module {
    name = "sys_module"
    path = "modsys.so"
  }

  /* Add the main Python module */
  module {
    name = "python_module"

    /* Shared library; the full path is $GANGLIA_ROOT/lib64/ganglia/modpython.so */
    path = "modpython.so"

    /* Directory holding the Python extension module code; create it if it does not exist */
    params = "/etc/ganglia/python_modules/"
  }
}

include ('/etc/ganglia/conf.d/*.conf')

/* /etc/ganglia/conf.d/ holds the configuration files of the Python extension
   modules; create it if it does not exist. On startup gmond loads all of the
   configuration files and the Python module code. */
include ('/etc/ganglia/conf.d/*.pyconf')
….
*********************** End ****************************

 

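From here on, $GANGLIA_PY_CODE denotes the directory holding the Python extension module code (/etc/ganglia/python_modules/) and $GANGLIA_PY_CONF the directory holding the module configuration files (/etc/ganglia/conf.d/).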

 

  • The random_module.pyconf file

vi $GANGLIA_PY_CONF/random_module.pyconf

The contents of random_module.pyconf are as follows:

*********************** Start ****************************
modules {
  module {
    /* Module name $PY_MODULE; the Python file is $GANGLIA_PY_CODE/$PY_MODULE.py */
    name = "random_module"
    language = "python"

    /* Parameter list. All parameters are passed as one dict (map) to the
       metric_init(params) function of the Python script. In this example
       metric_init is called with params={"RandomMax":"10","RandomMin":"0"} */
    param RandomMax {
        value = 10
    }
    param RandomMin {
        value = 0
    }
  }
}

/* Metrics to collect. One module may provide any number of metrics.
   In this example we collect random1 and random2.
   The title is used as the title of the metric graph. */
collection_group {
  /* Reporting period. Available options:
     collect_once - Specifies that the group holds static metrics (collected once at startup)
     collect_every - Collection interval (only valid for non-static)
     time_threshold - Max data send interval
  */
  collect_every = 10  /* collect every 10 s */

  time_threshold = 50
  metric {
    name = "random1"  /* Metric name (see "gmond -m") */
    title = "test random1"
    value_threshold = 50  /* Metric variance threshold (send if exceeded) */
  }
  metric {
    name = "random2"
    title = "test random2"
    value_threshold = 50
  }
}
*********************** End ****************************
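The interaction of time_threshold and value_threshold can be pictured with a small model. This is a simplified sketch, not gmond's actual implementation: the function name should_send and its arguments are hypothetical, and it assumes a sample is sent when the value has changed by more than value_threshold since the last send, or when time_threshold seconds have passed.

```python
def should_send(value, last_sent_value, seconds_since_send,
                value_threshold=50, time_threshold=50):
    """Simplified model of when a collected sample would be sent."""
    changed_enough = abs(value - last_sent_value) > value_threshold
    waited_too_long = seconds_since_send >= time_threshold
    return changed_enough or waited_too_long


# Values in [0, 10] can never differ by more than 50, so in this model
# only the time threshold triggers a send with the settings shown above.
print(should_send(10, 0, 10))   # change of 10 after 10 s
print(should_send(10, 0, 50))   # change of 10 after 50 s
```

Under this model, a lower value_threshold would be needed for changes in random1 or random2 to trigger an immediate send.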

 

4)     Writing the module code

When extending Ganglia with a Python module, the script mainly has to implement these functions:

  • metric_init(params):
      • Called once at module initialization time
      • Must return a metric description dictionary or a list of dictionaries

        Metric definition data dictionary:

        d = {'name' : '<metric name>',
             'call_back' : <handler function>,
             'time_max' : int(<maximum seconds between collections>),
             'value_type' : '<string | uint | float | double>',
             'units' : '<units>',
             'slope' : '<zero | positive | negative | both>',
             'format' : '<format string>',
             'description' : '<description>'}

      • Any other module initialization can also take place here

  • metric_handler(): a module may have multiple handlers
      • Metric gathering handler
      • Must return a single data value of the same type as specified in the metric_init() function

  • metric_cleanup()
      • Called once at module termination time
      • Does not return a value
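Before a module is loaded into gmond, this contract can be exercised with plain Python. The sketch below is an illustration rather than part of Ganglia: check_descriptors and example_handler are hypothetical names, the metric_init shown is a stand-in with a single constant metric, and it assumes each handler is called with the metric name as its only argument.

```python
def check_descriptors(descriptors):
    """Call each handler once and return {metric name: value}."""
    if isinstance(descriptors, dict):      # a single dictionary is allowed
        descriptors = [descriptors]
    results = {}
    for d in descriptors:
        handler = d["call_back"]
        if not callable(handler):
            raise TypeError("call_back for %r is not callable" % d["name"])
        results[d["name"]] = handler(d["name"])
    return results


def example_handler(name):
    """Hypothetical handler that always reports 42."""
    return 42


def metric_init(params):
    """Hypothetical module exposing one constant metric."""
    return {"name": "example", "call_back": example_handler,
            "time_max": 90, "value_type": "uint", "units": "count",
            "slope": "both", "format": "%u",
            "description": "constant example metric"}


print(check_descriptors(metric_init({})))
```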

vi $GANGLIA_PY_CODE/random_module.py

The code of random_module.py is as follows:

*********************** Start ****************************

 

import random

# Defaults, used when gmond passes no parameters
random_max = 100
random_min = 0
v = 0  # last value produced by random1_handler


def random1_handler(name):
    """Return a new random integer in [random_min, random_max]."""
    global v
    v = random.randint(random_min, random_max)
    return v


def random2_handler(name):
    """Return the complement of the last random1 value."""
    return random_min + random_max - v


def metric_init(params):
    """Read the bounds from params and return the metric descriptors."""
    global random_max, random_min, v
    if params:
        # "in" replaces dict.has_key(), which does not exist in Python 3
        if "RandomMin" in params:
            random_min = int(params["RandomMin"])
        if "RandomMax" in params:
            random_max = int(params["RandomMax"])
    v = random_min  # keeps random2 in range before random1 has run

    d1 = {'name': 'random1', 'call_back': random1_handler,
          'value_type': 'uint', 'units': 'usage',
          'slope': 'both', 'format': '%u',
          'description': 'test random plugin',
          'groups': 'random'}
    d2 = {'name': 'random2', 'call_back': random2_handler,
          'value_type': 'uint', 'units': 'usage',
          'slope': 'both', 'format': '%u',
          'description': 'test subs plugin',
          'groups': 'random'}
    return [d1, d2]


def metric_cleanup():
    pass


if __name__ == '__main__':
    # Stand-alone test outside gmond
    descriptors = metric_init(None)
    for d in descriptors:
        print("value for %s is %d" % (d['name'], d['call_back'](d['name'])))

*********************** End ****************************
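Because gmond hands every param value to metric_init as a string, the conversion step is worth isolating. The following sketch shows one way to do it; read_bounds is a hypothetical helper, and swapping inverted bounds is an added assumption rather than something the module above does.

```python
def read_bounds(params, default_min=0, default_max=100):
    """Return (random_min, random_max) as ints from a gmond params dict."""
    params = params or {}
    low = int(params.get("RandomMin", default_min))
    high = int(params.get("RandomMax", default_max))
    if low > high:                 # assumption: tolerate swapped bounds
        low, high = high, low
    return low, high


print(read_bounds({"RandomMax": "10", "RandomMin": "0"}))
print(read_bounds(None))
```

Keeping the parsing in one function makes it easy to test outside gmond, where params is usually None.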

 

 

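After completing the steps above, restart gmond. The graphs for the new random1 and random2 metrics then appear in the host view of the web interface.

[Screenshot: random1 and random2 metric graphs in the host view]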

 

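5)     Adding a report graph

The extension above adds custom metrics, each displayed in its own graph. To make troubleshooting easier it is often useful to combine several metrics in one graph, which requires adding a report graph. This only involves changes to the Ganglia web code. Here we use the random module as the example and draw random1 and random2 in the same graph.

For the layout of the $GANGLIA_WEB directory and a description of the relevant files, see Appendix 1.

The work required:

  • Edit $GANGLIA_WEB/conf.php and add the name of the report graph to display, $GRAPH_NAME; in this example it is "random".
  • Write the file $GANGLIA_WEB/graph.d/{$GRAPH_NAME}_report.php and implement the function:

    function graph_{$GRAPH_NAME}_report ( &$rrdtool_graph )

  • Restart the httpd service.

a)     Editing $GANGLIA_WEB/conf.php

vi $GANGLIA_WEB/conf.php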

************************* Start **********************
#
# Colors for the load ranks.
#
$load_colors = array(
   "100+" => "ff634f",
   "75-100" =>"ffa15e",
   "50-75" => "ffde5e",
   "25-50" => "caff98",
   "0-25" => "e2ecff",
   "down" => "515151"
);

#
# Colors for our custom metrics; they can also be defined in random_report.php
# Colors for the random report graph
#
$random1_color = "0000FF";
$random2_color = "FF0000";

#
# Default metric
#
$default_metric = "load_one";

#
# Optional summary graphs
#
#$optional_graphs = array('packet');
# Name of the report graph to add
$optional_graphs = array('random');
************************ End *************************

 

b)     Writing $GANGLIA_WEB/graph.d/random_report.php

The report files already present in $GANGLIA_WEB/graph.d/ can serve as references.

vi $GANGLIA_WEB/graph.d/random_report.php

************************ Start  *************************

 

************************ End *************************

 

The following notes are excerpted from "Custom Graphs in Ganglia 3.1.x":



  • Set values for the following hash keys:

$rrdtool_graph['title']

This will be used as the "title" of the graph.

$rrdtool_graph['vertical_label']

This will set the label for the Y-axis on the chart (there is no corresponding X-axis key, since it is always "time").

$rrdtool_graph['series']

Commands to actually generate the rrdtool graph are here. This is a string variable, so care must be taken to properly format and space each command. Developers may find it useful to create a temporary array and push() commands into it, then implode() them into a single string.

  • Set other variables as desired. Setting the "upper-limit" and "lower-limit" keys can be useful to clamp a chart to a fixed range in the Y-axis. For example, if you are monitoring a percentage, and always want the low and high values to be 0 and 100, respectively. This can also be used to ignore values outside the norm that would otherwise cause rrdtool to choose an inappropriate range; basically, cheap spike removal.
  • The most difficult part of generating the graph is properly setting the $rrdtool_graph['series'] value. It is suggested that you experiment first on the command line, using rrdtool directly, then convert that into a set of PHP statements. The PHP code can be as simple or complicated as required.
  • Many variables are pre-defined and available for use in custom graphs. Users are encouraged to make use of these, although changing the values inside the report PHP file is not recommended. Variables are imported into the scope of the PHP file using the global PHP function (yes, it's ugly, we know). A list of the more commonly used variables is:

$context   (e.g. "host", "cluster", "meta", etc)

$cpu_*_color   (see list in conf.php)

$hostname   (set to the current hostname, as known to gmetad)

$load_one_color

$range   (time range of the graph, usually "hour", "week", "month", or "year")

$load_colors   (assoc. array that stores colors for the CPU report. Valid keys are: "down", "0-25", "25-50", "50-75", "75-100", "100+")

$rrd_dir   (Appropriate filesystem directory for the RRD file in question. Different contexts (host/cluster/meta) will be handled correctly. Use this instead of hardcoding paths to RRD files.)

$mem_*_color   (similar to $cpu_*_color, above)

$size   (Current size of the graph, usually "small", "medium", "large", etc)

$strip_domainname   (should the "shortname" of the host be used, instead of the FQDN?)

  • Penultimately, any value in the $rrdtool_graph['extras'] key will be passed, verbatim, to rrdtool after all other keys, but before various data definition, calculation, graphing and printing elements. This key is essentially a way for the developer to add any other rrdtool options that are desired, and make a last-ditch effort to override other settings.
  • And lastly, the $rrdtool_graph variable should be the return value of the function; Ganglia will take care of the rest!
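The advice about collecting commands in a temporary array and joining them can be tried out before writing any PHP. The sketch below does the equivalent in Python; the report file itself must still be written in PHP. Everything specific in it is an assumption for illustration: the function name build_series, the RRD file names random1.rrd and random2.rrd, the data-source name sum, the sample directory, and the two colors.

```python
import os


def build_series(rrd_dir, colors):
    """Assemble rrdtool DEF/LINE2 arguments into one space-separated string."""
    metrics = ("random1", "random2")
    parts = []
    for metric in metrics:
        rrd_file = os.path.join(rrd_dir, metric + ".rrd")
        # DEF:<vname>=<rrd file>:<ds-name>:<CF> reads one data source
        parts.append("DEF:'%s'='%s':'sum':AVERAGE" % (metric, rrd_file))
    for metric in metrics:
        # LINE2:<vname>#<color>:<legend> draws a 2-pixel-wide line
        parts.append("LINE2:'%s'#%s:'%s'" % (metric, colors[metric], metric))
    return " ".join(parts)   # counterpart of PHP implode(" ", $parts)


# Hypothetical RRD directory for one host.
print(build_series("/var/lib/ganglia/rrds/my_cluster/my_host",
                   {"random1": "0000FF", "random2": "FF0000"}))
```

The printed string can be pasted after an rrdtool graph command line for experimentation, then carried over into $rrdtool_graph['series'].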

 



  • Keys present in the $rrdtool_graph associative array.

A list of keys in $rrdtool_graph that are used:

    $series          (string: holds the meat of the rrdgraph definition. REQUIRED!)
                     // See reference 6 (an introduction to the RRD database and RRDTool)

    $title           (string: title of the report. REQUIRED!)

    $vertical_label  (label for Y-Axis. REQUIRED!)

    $start           (string: Start time of the graph, can usually be
                      left alone)
    $end             (string: End time of the graph, also can usually be
                      left alone)

    $width           (strings: Width and height of *graph*, the actual image
    $height           will be slightly larger due to text elements
                      and padding. These are normally set
                      automatically, depending on the graph size
                      chosen from the web UI)

    $upper-limit     (strings: Maximum and minimum Y-value for the graph.
    $lower-limit      RRDTool normally will auto-scale the Y min
                      and max to fit the data. You may override
                      this by setting these variables to specific
                      limits. The default value is a null string,
                      which will force the auto-scale behavior)

    $color           (array: Sets one or more chart colors. Usually used
                      for setting the background color of the chart.
                      Valid array keys are BACK, CANVAS, SHADEA,
                      SHADEB, FONT, FRAME and ARROW. Usually,
                      only BACK is set, and only rarely at that.)

    $extras          (Any other custom rrdtool commands can be added to this
                      variable. For example, setting a different --base
                      value or use a --logarithmic scale)


After the steps above, restart httpd:

service httpd restart

 

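The random report graph now appears in the cluster view.

[Screenshot: random report graph in the cluster view]

It does not appear in the host view, however, even though $GANGLIA_WEB/graph.d/random_report.php handles both the cluster view and the host view.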

 

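6)     Custom Ganglia web template

Tracing the web code shows that neither $GANGLIA_WEB/host_view.php nor $GANGLIA_WEB/templates/default/host_view.tpl contains code to display custom report graphs. To leave the default template untouched, we create a new template directory, $GANGLIA_WEB/templates/onest, and copy the contents of default into it.

a)     Editing $GANGLIA_WEB/host_view.php

Add the following code near the end of the file, just before the $tpl->printToScreen(); statement: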

// Add the custom report graphs
if (!isset($optional_graphs))
        $optional_graphs = array();
foreach ($optional_graphs as $g) {
        $tpl->newBlock('optional_graphs');
        $tpl->assign('name',$g);
        $tpl->assign("cluster_url", $cluster_url);
        $tpl->assign("graphargs", "h=$hostname&$get_metric_string&st=$cluster[LOCALTIME]");
        $tpl->gotoBlock('_ROOT');
}

This code works by filling in the optional_graphs block of $GANGLIA_WEB/templates/onest/host_view.tpl.

 

b)     Editing $GANGLIA_WEB/templates/onest/host_view.tpl

**************** Start ****************************
…….
<IMG ALT="{cluster_url} NETWORK"
   SRC="./graph.php?g=network_report&z=medium&c={cluster_url}&{graphargs}">
<IMG ALT="{cluster_url} PACKETS"
   SRC="./graph.php?g=packet_report&z=medium&c={cluster_url}&{graphargs}">
<!-- START BLOCK : optional_graphs -->
<IMG ALT="{cluster_url} {name}"
    SRC="./graph.php?g={name}_report&z=medium&c={cluster_url}&{graphargs}">
<!-- END BLOCK : optional_graphs -->
…..
************************* End ****************************

These changes add the network and packet report graphs, as well as the custom report graphs, to the host view.
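To see what the optional_graphs block expands to, the placeholder substitution can be imitated in Python. This only illustrates the expansion and is not TemplatePower itself; render_optional_graphs and the sample cluster and host values are hypothetical.

```python
def render_optional_graphs(optional_graphs, cluster_url, graphargs):
    """Return one graph.php image URL per optional graph name."""
    template = ("./graph.php?g={name}_report&z=medium"
                "&c={cluster_url}&{graphargs}")
    return [template.format(name=name, cluster_url=cluster_url,
                            graphargs=graphargs)
            for name in optional_graphs]


# One URL is produced for every entry in $optional_graphs.
for url in render_optional_graphs(["random"], "my_cluster", "h=node1&st=0"):
    print(url)
```

Each URL requests graph.php with g=<name>_report, which is why the report file and its function must follow the {$GRAPH_NAME}_report naming convention.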

 

c)      Editing $GANGLIA_WEB/conf.php

Set the $template_name parameter so that it points to our template directory:

**************** Start ****************************
# $Id: conf.php.in 1688 2008-08-15 12:34:40Z carenas $
#
# Gmetad-webfrontend version. Used to check for updates.
#
include_once "./version.php";

#
# The name of the directory in "./templates" which contains the
# templates that you want to use. Templates are like a skin for the
# site that can alter its look and feel.
#
$template_name = "onest";
…..
************************* End ****************************

 

d)     Restarting the httpd service

Run service httpd restart. The random report graph for the node now appears in the host view.

[Screenshot: random report graph in the host view]

 

e)     Customizing the display

To change the look of the Ganglia interface, edit the corresponding template files in the $GANGLIA_WEB/templates/$template_name directory.

 

 

 

Appendix 1

$GANGLIA_WEB
|-- AUTHORS
|-- COPYING
|-- Makefile.am
|-- auth.php
|-- class.TemplatePower.inc.php
|-- cluster_legend.html
|-- cluster_view.php           // cluster view
|-- conf.php
|-- conf.php.in                // initial template for the configuration file
|-- footer.php                 // footer
|-- functions.php
|-- ganglia.php
|-- get_context.php            // determines the view type
|-- get_ganglia.php
|-- graph.d                    // graph scripts for metrics and report graphs
|   |-- cpu_report.php
|   |-- load_report.php
|   |-- mem_report.php
|   |-- metric.php
|   |-- network_report.php
|   |-- packet_report.php
|   |-- random_report.php
|   `-- sample_report.php
|-- graph.php                  // entry point that invokes the graph scripts
|-- grid_tree.php
|-- header.php
|-- host_view.php              // host view
|-- index.php
|-- meta_view.php
|-- node_legend.html
|-- physical_view.php
|-- pie.php
|-- private_clusters
|-- show_node.php
|-- styles.css
|-- templates                  // templates; edit these to customize the display
|   |-- default                // default template directory
|   |   |-- cluster_extra.tpl  // extra content for the cluster view
|   |   |-- cluster_view.tpl   // cluster view template
|   |   |-- footer.tpl
|   |   |-- grid_tree.tpl
|   |   |-- header-nobanner.tpl
|   |   |-- header.tpl
|   |   |-- host_extra.tpl     // extra content for the host view
|   |   |-- host_view.tpl      // host view template
|   |   |-- images
|   |   |   |-- cluster_0-24.jpg
|   |   |   |-- cluster_25-49.jpg
|   |   |   |-- cluster_50-74.jpg
|   |   |   |-- cluster_75-100.jpg
|   |   |   |-- cluster_overloaded.jpg
|   |   |   |-- cluster_private.jpg
|   |   |   |-- grid_0-24.jpg
|   |   |   |-- grid_25-49.jpg
|   |   |   |-- grid_50-74.jpg
|   |   |   |-- grid_75-100.jpg
|   |   |   |-- grid_overloaded.jpg
|   |   |   |-- grid_private.jpg
|   |   |   |-- logo.jpg
|   |   |   |-- node_0-24.jpg
|   |   |   |-- node_25-49.jpg
|   |   |   |-- node_50-74.jpg
|   |   |   |-- node_75-100.jpg
|   |   |   |-- node_dead.jpg
|   |   |   `-- node_overloaded.jpg
|   |   |-- meta_view.tpl
|   |   |-- node_extra.tpl
|   |   |-- physical_view.tpl
|   |   `-- show_node.tpl
|   `-- onest                  // custom template directory, selected through $template_name in conf.php
|       |-- cluster_extra.tpl
|       |-- cluster_view.tpl
|       |-- footer.tpl
|       |-- grid_tree.tpl
|       |-- header-nobanner.tpl
|       |-- header.tpl
|       |-- host_extra.tpl
|       |-- host_view.tpl
|       |-- images
|       |   |-- cluster_0-24.jpg
|       |   |-- cluster_25-49.jpg
|       |   |-- cluster_50-74.jpg
|       |   |-- cluster_75-100.jpg
|       |   |-- cluster_overloaded.jpg
|       |   |-- cluster_private.jpg
|       |   |-- grid_0-24.jpg
|       |   |-- grid_25-49.jpg
|       |   |-- grid_50-74.jpg
|       |   |-- grid_75-100.jpg
|       |   |-- grid_overloaded.jpg
|       |   |-- grid_private.jpg
|       |   |-- logo.jpg
|       |   |-- node_0-24.jpg
|       |   |-- node_25-49.jpg
|       |   |-- node_50-74.jpg
|       |   |-- node_75-100.jpg
|       |   |-- node_dead.jpg
|       |   `-- node_overloaded.jpg
|       |-- meta_view.tpl
|       |-- node_extra.tpl
|       |-- physical_view.tpl
|       `-- show_node.tpl
|-- version.php
`-- version.php.in

 

 

References:

1)        Custom Graphs in Ganglia 3.1.x
http://sourceforge.net/apps/trac/ganglia/wiki/Custom_graphs

2)        Ganglia Monitoring Tool
http://www.slideshare.net/sudhirpg/ganglia-monitoring-tool

3)        "Developing Custom Modules for ganglia 3.1.1" (in Chinese)
http://yaoweibin2008.blog.163.com/blog/static/11031392009085410345/

4)        "Ganglia and Nagios, Part 1: Monitoring Enterprise Clusters with Ganglia" (in Chinese)
http://www.ibm.com/developerworks/cn/linux/l-ganglia-nagios-1/

5)        PHP syntax reference
http://www.w3school.com.cn/php/php_looping.asp

6)        "Introduction to the RRD Database and RRDTool" (in Chinese)
http://linux.chinaunix.net/salon/200712/files/RRD_RRDTool_xa.pdf

 
