Hadoop Monitoring: JMX

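Preface
I have recently been looking into load analysis for a Hadoop cluster. A classmate recommended JMX, the monitoring endpoint that ships with Hadoop, so I read up on it and wrote this summary.
Getting to know JMX
1. Start with the official API documentation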

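  • The class JMXJsonServlet extends javax.servlet.http.HttpServlet.
  • It exposes JMX through web pages only.
    For example, with a Hadoop cluster running (master, slave1, slave2):
    Visit master:50070 (port 50070) to view the NameNode.
    To see the NameNode's monitoring data, go directly to master:50070/jmx
    The page then shows the content of the JMX Beans as a JSON object.
    The original documentation says: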
    This servlet generally will be placed under the /jmx URL for each HttpServer. It provides read only access to JMX metrics. The optional qry parameter may be used to query only a subset of the JMX Beans. This query functionality is provided through the MBeanServer.queryNames(ObjectName, javax.management.QueryExp) method.

    For example http://…/jmx?qry=Hadoop:* will return all hadoop metrics exposed through JMX.

  • The qry parameter can be used to filter what is returned.

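  • If the qry parameter is malformed, the servlet answers with 400 BAD REQUEST. A successful response is a JSON object of the following form: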
  {
    "beans" : [
      {
        "name":"bean-name"
        ...
      }
    ]
  }
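  • The servlet converts the JMX Beans to JSON. Each attribute of a bean becomes a member of a JSON object. If the attribute is a boolean, a number, a string, or an array, it is converted to its JSON equivalent. If the value is composite data (CompositeData), it becomes a JSON object of key/value pairs; if it is tabular data (TabularData), it becomes an array containing all of its elements. Any other object is output as a string. The name and modelerType of every bean are also returned.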
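
As a rough illustration of the points above, the following Python sketch builds a /jmx URL with an optional qry filter and reads the beans array out of a response body. The host master and port 50070 are taken from the example cluster; the helper names build_jmx_url, parse_beans and fetch_beans are made up for this sketch, and fetch_beans assumes the web port is reachable without authentication.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_jmx_url(host, port, qry=None):
    """Return the /jmx URL, adding the qry filter when one is given."""
    url = f"http://{host}:{port}/jmx"
    if qry:
        # safe=":*=," keeps ObjectName patterns such as Hadoop:* readable.
        url += "?" + urlencode({"qry": qry}, safe=":*=,")
    return url


def parse_beans(text):
    """Parse a /jmx response body and return its list of bean dicts."""
    return json.loads(text).get("beans", [])


def fetch_beans(host, port, qry=None, timeout=5):
    """Fetch and parse beans from a live server (needs network access)."""
    with urlopen(build_jmx_url(host, port, qry), timeout=timeout) as resp:
        return parse_beans(resp.read().decode("utf-8"))
```

For instance, fetch_beans("master", 50070, "Hadoop:*") would request only the Hadoop metrics, while leaving qry out returns every bean the server exposes.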

Example:
URL: 192.168.84.141:50070/jmx
The page shows:

{
  "beans" : [ {
    "name" : "Hadoop:service=NameNode,name=JvmMetrics",
    "modelerType" : "JvmMetrics",
    "tag.Context" : "jvm",
    "tag.ProcessName" : "NameNode",
    "tag.SessionId" : null,
    "tag.Hostname" : "master",
    "MemNonHeapUsedM" : 46.013527,
    "MemNonHeapCommittedM" : 47.210938,
    "MemNonHeapMaxM" : -9.536743E-7,
    "MemHeapUsedM" : 34.044724,
    "MemHeapCommittedM" : 61.88672,
    "MemHeapMaxM" : 966.6875,
    "MemMaxM" : 966.6875,
    "GcCountCopy" : 51,
    "GcTimeMillisCopy" : 357,
    "GcCountMarkSweepCompact" : 3,
    "GcTimeMillisMarkSweepCompact" : 316,
    "GcCount" : 54,
    "GcTimeMillis" : 673,
    "GcNumWarnThresholdExceeded" : 0,
    "GcNumInfoThresholdExceeded" : 0,
    "GcTotalExtraSleepTime" : 5121,
    "ThreadsNew" : 0,
    "ThreadsRunnable" : 8,
    "ThreadsBlocked" : 0,
    "ThreadsWaiting" : 4,
    "ThreadsTimedWaiting" : 24,
    "ThreadsTerminated" : 0,
    "LogFatal" : 0,
    "LogError" : 0,
    "LogWarn" : 207,
    "LogInfo" : 349
  }, {
    "name" : "java.lang:type=MemoryPool,name=Survivor Space",
    "modelerType" : "sun.management.MemoryPoolImpl",
    "Valid" : true,
    "Usage" : {
      "committed" : 2228224,
      "init" : 524288,
      "max" : 34930688,
      "used" : 28208
    },
    "PeakUsage" : {
      "committed" : 2228224,
      "init" : 524288,
      "max" : 34930688,
      "used" : 1638400
    },
    ..... // omitted
    }
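
For load analysis, a response like the one above still has to be reduced to a few numbers. The sketch below is one possible way to do that in Python; it embeds a trimmed copy of the JvmMetrics bean from the sample output, and the helper names find_bean and heap_usage_ratio are hypothetical.

```python
import json

# Trimmed copy of the JvmMetrics bean shown in the sample output.
SAMPLE = """
{
  "beans": [
    {
      "name": "Hadoop:service=NameNode,name=JvmMetrics",
      "modelerType": "JvmMetrics",
      "MemHeapUsedM": 34.044724,
      "MemHeapMaxM": 966.6875,
      "GcCount": 54,
      "GcTimeMillis": 673
    }
  ]
}
"""


def find_bean(beans, name):
    """Return the first bean whose "name" matches exactly, else None."""
    for bean in beans:
        if bean.get("name") == name:
            return bean
    return None


def heap_usage_ratio(bean):
    """Used heap as a fraction of the maximum heap (both values are in MB)."""
    return bean["MemHeapUsedM"] / bean["MemHeapMaxM"]


beans = json.loads(SAMPLE)["beans"]
jvm = find_bean(beans, "Hadoop:service=NameNode,name=JvmMetrics")
print(f"heap used: {heap_usage_ratio(jvm):.1%}")
print(f"average GC time: {jvm['GcTimeMillis'] / jvm['GcCount']:.1f} ms")
```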

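2. What I found on other sites
Besides the qry parameter described in the official documentation, two other parameters are commonly used: callback and get.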

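  • callback
    callback is for requests that need JSONP. JSONP (JSON with Padding) is a usage pattern for JSON that works around the cross-origin restrictions of mainstream browsers.
    Add the callback parameter to the URL: http://192.168.84.141:8088/jmx?callback=hadoop
    The value of callback is the name of the JavaScript function that a JSONP response is wrapped in; it is not a user name.
    The returned content is shown below. (I did not notice any difference from the response without callback; with JSONP one would normally expect the JSON to be wrapped as hadoop(...).)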
{
  "beans" : [ {
    "name" : "java.lang:type=MemoryPool,name=Survivor Space",
    "modelerType" : "sun.management.MemoryPoolImpl",
    "Valid" : true,
    "Usage" : {
      "committed" : 1114112,
      "init" : 524288,
      "max" : 34930688,
      "used" : 161480
    },
    "PeakUsage" : {
      "committed" : 1114112,
      "init" : 524288,
      "max" : 34930688,
      "used" : 1108168
    },
    "MemoryManagerNames" : [ "MarkSweepCompact", "Copy" ],
    "UsageThresholdSupported" : false,
    "CollectionUsageThreshold" : 0,
    "CollectionUsageThresholdExceeded" : false,
    "CollectionUsageThresholdCount" : 0,
    "CollectionUsage" : {
      "committed" : 1114112,
      "init" : 524288,
      "max" : 34930688,
      "used" : 161480
    },
    "CollectionUsageThresholdSupported" : true,
    "Name" : "Survivor Space",
    "Type" : "HEAP",
    "ObjectName" : "java.lang:type=MemoryPool,name=Survivor Space"
  }, {
    "name" : "Hadoop:service=ResourceManager,name=RpcActivityForPort8031",
    "modelerType" : "RpcActivityForPort8031",
    "tag.port" : "8031",
    "tag.Context" : "rpc",
    "tag.NumOpenConnectionsPerUser" : "{\"hadoop\":3}",
    "tag.Hostname" : "master",
    "ReceivedBytes" : 2897837,
    "SentBytes" : 694106,
    "RpcQueueTimeNumOps" : 16941,
    "RpcQueueTimeAvgTime" : 0.21456848772763262,
    "RpcProcessingTimeNumOps" : 16941,
    "RpcProcessingTimeAvgTime" : 0.10979150171549222,
    "RpcAuthenticationFailures" : 0,
    "RpcAuthenticationSuccesses" : 0,
    "RpcAuthorizationFailures" : 0,
    "RpcAuthorizationSuccesses" : 7,
    "RpcSlowCalls" : 0,
    "RpcClientBackoff" : 0,
    "NumOpenConnections" : 3,
    "CallQueueLength" : 0,
    "NumDroppedConnections" : 0
  }, {
  // omitted...
  }
  }
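  • get
    To fetch the value of a single JMX attribute rather than a whole pile of information, use the get parameter. Its value must have the form MXBeanName::AttributeName. For example, to get the tag.Hostname attribute of a queue, the request I tried was: http://192.168.84.141:8088/jmx?qry=hadoop:service=ResourceManager,name=QueueMetrics,q0=root,user=hadoop::tag.Hostname
    When I tested it, the result was an empty JSON object: {}
    This parameter needs more investigation; it may have been removed in a newer version. Note, though, that the URL above passes the value through qry rather than get, and uses the lowercase domain hadoop: whereas the bean names in the output above start with Hadoop:, so either of these could be the cause of the empty result.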
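
To make the two parameters more concrete, here is a small Python sketch. unwrap_jsonp strips a wrapper of the form name(...); when one is present, and build_get_url builds a request that passes MXBeanName::AttributeName through get. Both helper names are hypothetical, the wrapper format is assumed from the general JSONP convention rather than checked against a particular Hadoop version, and the bean name in the usage note is illustrative.

```python
import json
import re
from urllib.parse import quote

# Matches name( ... ) with an optional trailing semicolon.
_JSONP = re.compile(r"^\s*([A-Za-z_$][\w$.]*)\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def unwrap_jsonp(text):
    """Return (callback_name, parsed_json); the name is None for plain JSON."""
    match = _JSONP.match(text)
    if match:
        return match.group(1), json.loads(match.group(2))
    return None, json.loads(text)


def build_get_url(host, port, bean_name, attribute):
    """Build a /jmx URL using get=MXBeanName::AttributeName."""
    value = quote(f"{bean_name}::{attribute}", safe=":=,")
    return f"http://{host}:{port}/jmx?get={value}"
```

A call such as build_get_url("master", 8088, "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root", "tag.Hostname") yields a URL that can be compared against the qry-based one tried above.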

------- Updates to follow -------

Reference: https://www.iteblog.com/archives/1694.html
