Standalone startup
java -javaagent:./jmx_prometheus_javaagent-0.13.0.jar=8080:config.yaml -jar yourJar.jar
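Starting with a component
To start the exporter together with a component, add the following option to the component's launch command: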
-javaagent:./jmx_prometheus_javaagent-0.13.0.jar=8080:config.yaml
For example, to monitor Hadoop, add the following to hadoop-env.sh in the Hadoop configuration:
if [[ $HADOOP_NAMENODE_OPTS != *jmx_prometheus_javaagent* ]];then
export HADOOP_NAMENODE_OPTS="-javaagent:/opt/apps/exporters/jmx_prometheus_javaagent-0.15.1-SNAPSHOT.jar=30001:/opt/apps/exporters/hadoop/namenode.yaml $HADOOP_NAMENODE_OPTS"
fi
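The option string has the form `-javaagent:<jar>=<port>:<config>`, and the shell guard prepends it only when the agent is not already present, so that evaluating the script repeatedly does not add the agent twice. A minimal Python sketch of the same logic follows; the helper names `javaagent_option` and `prepend_once` are hypothetical and not part of jmx_exporter or Hadoop.

```python
def javaagent_option(jar: str, port: int, config: str) -> str:
    """Build the JVM option in the form -javaagent:<jar>=<port>:<config>."""
    return f"-javaagent:{jar}={port}:{config}"


def prepend_once(opts: str, option: str, marker: str = "jmx_prometheus_javaagent") -> str:
    """Prepend option to opts unless opts already mentions the agent.

    This mirrors the substring check done by the shell guard.
    """
    if marker in opts:
        return opts
    return f"{option} {opts}".strip()


option = javaagent_option(
    "/opt/apps/exporters/jmx_prometheus_javaagent-0.15.1-SNAPSHOT.jar",
    30001,
    "/opt/apps/exporters/hadoop/namenode.yaml",
)
opts = prepend_once("-Xmx1000m", option)
# Applying the guard a second time leaves the options unchanged.
opts_again = prepend_once(opts, option)
print(opts)
```

The check is a plain substring test, just like the `*jmx_prometheus_javaagent*` glob in the shell version, so it would also skip the option if any other agent with that name in its path were already configured.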
After startup, the metrics are served over HTTP on the configured port (30001 in the example above).
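The official documentation describes the configuration file. Taking /opt/apps/exporters/hadoop/namenode.yaml from the configuration above as an example, the meaning of each parameter is explained in the comments below: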
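# Delay before the jmx_exporter starts serving; no data is returned before it has elapsed
startDelaySeconds: 0
# IP and port of the monitored component's JMX endpoint; if not configured, the local JVM is used by default
# Note: when starting together with the component, it is recommended to leave this unset (unless you have changed the JMX port yourself)
hostPort: 127.0.0.1:1234
# Username and password for JMX remote authentication
username: someuser
password: somepwd
# Full JMX URL; not needed if the three items above are already configured
jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:1234/jmxrmi
# Whether to use SSL-encrypted communication; to use it, the following options must also be added to the monitored component's launch command:
# -Djavax.net.ssl.keyStore=/home/user/.keystore -Djavax.net.ssl.keyStorePassword=changeit -Djavax.net.ssl.trustStore=/home/user/.truststore -Djavax.net.ssl.trustStorePassword=changeit
ssl: false
# Whether to convert metric names to lowercase automatically; not converted by default
lowercaseOutputName: false
# Whether to convert metric label names to lowercase automatically; not converted by default
lowercaseOutputLabelNames: false
# Whitelist: only beans in the whitelist are queried. Other beans are not queried and are not matched by the rules; if empty, all beans are matched; JMX ObjectName wildcard patterns are supported
whitelistObjectNames: ["org.apache.cassandra.metrics:*"]
# Blacklist: beans in the blacklist are not queried and are not matched by the rules; JMX ObjectName wildcard patterns are supported
blacklistObjectNames: ["org.apache.cassandra.metrics:type=ColumnFamily,*"]
# Rules for converting beans into Prometheus metrics; several rules can be configured, and each rule can produce one or more metrics
rules:
  # Regular expression to match; implemented with Java's Pattern and Matcher, in full-match mode
  - pattern: 'org.apache.cassandra.metrics<type=(\w+), name=(\w+)><>Value: (\d+)'
    # Name of the Prometheus metric that a matched bean is converted to
    name: cassandra_$1_$2
    # Convert the attribute name to snake case, i.e. JavaAgent becomes java_agent; not converted by default
    attrNameSnakeCase: false
    # Value of the metric
    value: $3
    # Factor by which the metric value is multiplied, typically for unit conversion; Prometheus metrics normally use base units, so use for example 1000 to scale up or 0.001 to scale down, depending on the actual conversion needed
    valueFactor: 0.001
    # Labels of the metric
    labels:
      "tag1": $2
      "tag2": "some tag value"
    # Help text describing the metric
    help: "Cassandra metric $1 $2"
    # Data type of the metric: Counter, GAUGE, Histogram or Summary; defaults to Untyped when omitted
    type: GAUGE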
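To make the roles of `pattern`, `name`, `value`, `valueFactor` and `labels` concrete, here is a minimal Python sketch of how one rule could turn one input line into a metric. It uses the Cassandra rule in the form the official README gives it, with two capture groups for `type` and `name`. The input line is a made-up sample in the exporter's documented format `domain<prop1=value1, prop2=value2><key1, key2>attrName: value`. The functions `apply_rule`, `substitute` and `to_snake_case` are hypothetical simplifications, not the exporter's actual code, and the full match follows the comment on `pattern` above rather than a verified reading of the source.

```python
import re


def to_snake_case(attr: str) -> str:
    """Simplified stand-in for attrNameSnakeCase: JavaAgent -> java_agent."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", attr).lower()


def substitute(template: str, groups: tuple) -> str:
    """Replace $1, $2, ... in a template with the regex capture groups."""
    return re.sub(r"\$(\d+)", lambda m: groups[int(m.group(1)) - 1], template)


def apply_rule(rule: dict, line: str):
    """Return (name, labels, value) if the pattern matches the whole line, else None."""
    match = re.fullmatch(rule["pattern"], line)
    if match is None:
        return None
    groups = match.groups()
    name = substitute(rule["name"], groups)
    labels = {k: substitute(v, groups) for k, v in rule.get("labels", {}).items()}
    # valueFactor scales the captured value, e.g. milliseconds to seconds.
    value = float(substitute(rule["value"], groups)) * rule.get("valueFactor", 1.0)
    return name, labels, value


rule = {
    "pattern": r"org.apache.cassandra.metrics<type=(\w+), name=(\w+)><>Value: (\d+)",
    "name": "cassandra_$1_$2",
    "value": "$3",
    "valueFactor": 0.001,
    "labels": {"tag1": "$2", "tag2": "some tag value"},
}
# Hypothetical input line; the bean properties and the value are invented.
line = "org.apache.cassandra.metrics<type=Cache, name=Hits><>Value: 1234"
name, labels, value = apply_rule(rule, line)
print(name, labels, value)
print(to_snake_case("JavaAgent"))
```

Labels without a `$n` placeholder, such as `tag2`, pass through unchanged, which is why a constant label value can sit next to captured ones in the same rule.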
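Most articles currently available online are translated from the official site, without the writers having actually tried the configuration in practice. The author tested every configuration item and studied the source code; the configuration samples below are explained in order to deepen the reader's understanding.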
Sample JMX bean:
{
"beans": [{
"name" : "java.lang:type=MemoryPool,name=PS Old Gen",
"modelerType" : "sun.management.MemoryPoolImpl",
"Valid" : true,
"CollectionUsage" : {
"committed" : 932184064,
"init" : 1431830528,
"max" : 21474836480,
"used" : 22340736
}}]
}
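Parsing rule configuration:
For multi-level attribute structures, add the attributes level by level inside the second <>. Note that the contents of each <> must be in the correct order, that every comma (,) between properties must be followed by a space, and that the final colon (:) must also be followed by a space.
- pattern: 'java.lang(\w+): (.*)'
  name: Hadoop_DataNode_metrics
  value: $2
  help: "DataNodeStatus mem pool metric"
  type: COUNTER
  labels:
    "version": "$2"
    "atrrname": "$1"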
Generated metrics:
# HELP Hadoop_DataNode_metrics_total DataNodeStatus mem pool metric
# TYPE Hadoop_DataNode_metrics_total counter
Hadoop_DataNode_metrics{atrrname="committed",version="932184064",} 9.32184064E8
Hadoop_DataNode_metrics{atrrname="init",version="1431830528",} 1.431830528E9
Hadoop_DataNode_metrics{atrrname="max",version="21474836480",} 2.147483648E10
Hadoop_DataNode_metrics{atrrname="used",version="22340736",} 2.2340736E7
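The spacing requirements follow from the string each rule is matched against, which the official README documents as `domain<prop1=value1, prop2=value2><key1, key2>attrName: value`. The Python sketch below flattens the `CollectionUsage` composite from the sample bean into such lines and matches them. The `flatten` helper is hypothetical, and the full pattern shown is an assumption built from that documented format; it is not necessarily the exact pattern the author used.

```python
import re


def flatten(domain, properties, attribute, composite):
    """Yield one match line per key of a composite attribute."""
    # Properties are joined with ", " (comma plus space), as the rules expect.
    props = ", ".join(f"{k}={v}" for k, v in properties.items())
    for key, value in composite.items():
        # The attribute name and its value are separated by ": " (colon plus space).
        yield f"{domain}<{props}><{attribute}>{key}: {value}"


bean_properties = {"type": "MemoryPool", "name": "PS Old Gen"}
collection_usage = {
    "committed": 932184064,
    "init": 1431830528,
    "max": 21474836480,
    "used": 22340736,
}

# Assumed pattern: bean properties first, then the attribute path, then key and value.
pattern = r"java.lang<type=MemoryPool, name=PS Old Gen><CollectionUsage>(\w+): (.*)"

samples = []
for line in flatten("java.lang", bean_properties, "CollectionUsage", collection_usage):
    match = re.fullmatch(pattern, line)
    if match:
        attr, raw = match.groups()
        samples.append(({"atrrname": attr, "version": raw}, float(raw)))

for labels, value in samples:
    print("Hadoop_DataNode_metrics", labels, value)
```

Dropping the space after a comma or after the colon in the pattern makes every line fail to match, which is the behaviour the note on spacing warns about.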
Sample JMX bean:
{
"beans" : [ {
"name" : "java.lang:type=Runtime",
"modelerType" : "sun.management.RuntimeImpl",
"BootClassPath" : "/usr/java/jdk1.8.0_162/jre/lib/resources.jar:/usr/java/jdk1.8.0_162/jre/lib/rt.jar:/usr/java/jdk1.8.0_162/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_162/jre/lib/jsse.jar:/usr/java/jdk1.8.0_162/jre/lib/jce.jar:/usr/java/jdk1.8.0_162/jre/lib/charsets.jar:/usr/java/jdk1.8.0_162/jre/lib/jfr.jar:/usr/java/jdk1.8.0_162/jre/classes",
"LibraryPath" : "/opt/apps/hadoop_everdc/lib/native",
"Uptime" : 24540387,
"VmName" : "Java HotSpot(TM) 64-Bit Server VM",
"VmVendor" : "Oracle Corporation",
"VmVersion" : "25.162-b12",
"BootClassPathSupported" : true,
"InputArguments" : [ "-Dproc_datanode", "-Xmx1000m", "-Djava.net.preferIPv4Stack=true", "-Dhadoop.log.dir=/data/hadoop/log//hdfs", "-Dhadoop.log.file=hadoop.log", "-Dhadoop.home.dir=/opt/apps/hadoop_everdc", "-Dhadoop.id.str=hdfs", "-Dhadoop.root.logger=INFO,console", "-Djava.library.path=/opt/apps/hadoop_everdc/lib/native", "-Dhadoop.policy.file=hadoop-policy.xml", "-Djava.net.preferIPv4Stack=true", "-Djava.net.preferIPv4Stack=true", "-Djava.net.preferIPv4Stack=true", "-Dhadoop.log.dir=/data/hadoop/log//hdfs", "-Dhadoop.log.file=hadoop-hdfs-datanode-ambari58.log", "-Dhadoop.home.dir=/opt/apps/hadoop_everdc", "-Dhadoop.id.str=hdfs", "-Dhadoop.root.logger=INFO,RFA", "-Djava.library.path=/opt/apps/hadoop_everdc/lib/native", "-Dhadoop.policy.file=hadoop-policy.xml", "-Djava.net.preferIPv4Stack=true", "-Dhadoop.security.logger=ERROR,RFAS", "-Xmx30720m", "-Dhadoop.security.logger=ERROR,RFAS", "-Xmx30720m", "-javaagent:/opt/apps/exporters/jmx_prometheus_javaagent-0.15.1-SNAPSHOT.jar=30003:/opt/apps/exporters/hadoop/datanode.yaml", "-Dhadoop.security.logger=ERROR,RFAS", "-Xmx30720m", "-Dhadoop.security.logger=INFO,RFAS" ],
"ManagementSpecVersion" : "1.2",
"SpecName" : "Java Virtual Machine Specification",
"SpecVendor" : "Oracle Corporation",
"SpecVersion" : "1.8",
"SystemProperties" : [ {
"key" : "awt.toolkit",
"value" : "sun.awt.X11.XToolkit"
}, {
"key" : "file.encoding.pkg",
"value" : "sun.io"
}, {
"key" : "java.specification.version",
"value" : "1.8"
}, {
"key" : "sun.jnu.encoding",
"value" : "UTF-8"
}],
"StartTime" : 1625627195308,
"Name" : "41805@ambari58"}
]
}
Parsing rule configuration:
- pattern: 'java.lang<>SystemProperties: (.*)'
  name: runtime_info_system_properties_toolkit
  value: 1
  labels:
    "clz": "$1"
Generated metrics:
# HELP runtime_info_system_properties_toolkit java.util.Map (java.lang<>SystemProperties)
# TYPE runtime_info_system_properties_toolkit untyped
runtime_info_system_properties_toolkit{clz="sun.awt.X11.XToolkit",} 1.0
Sample JMX bean:
{
"beans": [{
"name": "Hadoop:service=DataNode,name=DataNodeInfo",
"modelerType": "org.apache.hadoop.hdfs.server.datanode.DataNode",
"Version": "2.6.0-cdh5.14.2",
"XceiverCount": 26,
"DatanodeNetworkCounts": [
{
"key": "/10.193.40.4",
"value": [
{
"key": "networkErrors",
"value": 27
}
]
},
{
"key": "/10.193.40.3",
"value": [
{
"key": "networkErrors",
"value": 2545
}
]
},
{
"key": "/10.193.40.5",
"value": [
{
"key": "networkErrors",
"value": 33
}
]
}
],
"RpcPort": "50020",
"HttpPort": null,
"NamenodeAddresses": "{\"yh-shhd-cdh05\":\"BP-1654582017-10.193.40.10-1585051030504\",\"yh-shhd-cdh02\":\"BP-1654582017-10.193.40.10-1585051030504\"}",
"VolumeInfo": "{\"/mnt/disk3/dfs/dn/current\":{\"usedSpace\":2123832329532,\"freeSpace\":5403410074903,\"reservedSpace\":10737418240,\"reservedSpaceForRBW\":263679405},\"/mnt/disk2/dfs/dn/current\":{\"usedSpace\":2141850634176,\"freeSpace\":5385258770496,\"reservedSpace\":10737418240,\"reservedSpaceForRBW\":396679168},\"/mnt/disk0/dfs/dn/current\":{\"usedSpace\":2185856311266,\"freeSpace\":5341126906910,\"reservedSpace\":10737418240,\"reservedSpaceForRBW\":522865664},\"/mnt/disk1/dfs/dn/current\":{\"usedSpace\":2132307337216,\"freeSpace\":5394803232768,\"reservedSpace\":10737418240,\"reservedSpaceForRBW\":395513856}}",
"ClusterId": "cluster7",
"DiskBalancerStatus": ""
}]
}
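Parsing rule configuration:
Note that when a complex data model uses the same attribute name at several levels (in this example, key appears at two levels), the second key must be written with one trailing underscore (key_) in the matching rule, the third with two (key__), and so on: the key at level n takes n-1 underscores. (This is a real pitfall, and the official documentation does not mention it at all.)
In addition, almost all multi-level complex data models exposed over JMX take the key/value form shown in the example above.
rules:
  - pattern: 'Hadoop<>DatanodeNetworkCounts: (.*)'
    name: hadoop_datanode_network_errors
    value: $2
    help: "DataNode networkErrors every host"
    type: COUNTER
    labels:
      "host": "$1"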
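In the example above, if DatanodeNetworkCounts were not at the first level of the bean's attributes but at the third, with the first two levels being a:{b:DatanodeNetworkCounts:....}, only the pattern would need to change, to: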
HadoopDatanodeNetworkCounts: (.*)
Generated metrics:
# HELP hadoop_datanode_network_errors DataNode networkErrors every host
# TYPE hadoop_datanode_network_errors gauge
hadoop_datanode_network_errors{host="/10.193.40.4",} 27.0
hadoop_datanode_network_errors{host="/10.193.40.3",} 2545.0
hadoop_datanode_network_errors{host="/10.193.40.5",} 33.0
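The underscore rule for repeated attribute names in nested data (the n-th occurrence of a name takes n-1 trailing underscores) can be sketched in a few lines of Python. `suffix_repeated` is a hypothetical helper that only illustrates the naming rule as stated in the text; it is not taken from the exporter's source.

```python
def suffix_repeated(names):
    """Append n-1 underscores to the n-th occurrence of each name."""
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)  # how many times this name appeared before
        result.append(name + "_" * count)
        seen[name] = count + 1
    return result


# Three nesting levels that all use "key" as the attribute name.
renamed = suffix_repeated(["key", "key", "key"])
print(renamed)  # ['key', 'key_', 'key__']
```

For the DataNodeInfo bean, where key appears at two levels, this gives key for the outer level (the host) and key_ for the inner level (networkErrors).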
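With the official jmx_exporter, at most one jmx_exporter can run per JVM, so when a machine has a single JDK installed and several components use it, multiple jmx_exporter instances cannot be used to monitor all of the components. To address this, the author modified the official exporter so that one JVM can use multiple jmx_exporter instances; see: https://gitee.com/keamspring/jmx_exporter/tree/multiplue_component_assist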