Learning Storm Chapter 6 Notes

Main components of a YARN cluster

  1. ResourceManager (RM): The entry point for applications on a YARN cluster and the master process of the cluster, responsible for managing cluster resources and for scheduling the jobs submitted to the cluster. The scheduling policy is pluggable, so users can customize it if they want to support new types of applications.
  2. NodeManager: An agent process deployed on every node of the cluster, working in concert with the RM. It interacts with the RM to report node status and receive job requests; it is also responsible for managing the lifecycle of containers on its node and for reporting node changes to the RM.
  3. ApplicationMaster: Once a job has been scheduled by the RM, the RM no longer tracks the job's status and progress; that responsibility falls to a per-application ApplicationMaster. This lets the RM support completely different kinds of applications on the cluster without having to worry about their internal communication or application logic.

Specify the YARN framework in hadoop/conf/mapred-site.xml:

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

Configure yarn-site.xml:

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<!-- Minimum amount of memory allocated for containers in MBs.-->
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<!--Total memory that can be allocated to containers in MBs. -->
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<!-- Ratio of virtual memory to physical memory allowed when setting memory limits for containers. If you don't have enough RAM, increase this value. -->
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>8</value>
</property>

Start the YARN daemons:

start-yarn.sh

starting yarn daemons
starting resourcemanager, logging to /home/anand/opt/hadoop-2.2.0/logs/yarn-anand-resourcemanager-localhost.localdomain.out
localhost: starting nodemanager, logging to /home/anand/opt/hadoop-2.2.0/logs/yarn-anand-nodemanager-localhost.localdomain.out
jps
We will get the following output:
50275 NameNode
50547 SecondaryNameNode
50394 DataNode
51091 Jps
50813 NodeManager
50716 ResourceManager
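To confirm that both YARN daemons actually came up, it is enough to filter the jps output for their names. A minimal sketch, run here against the sample output captured above since process IDs differ per machine; on a live system replace the string with the real jps output:

```shell
# Sample jps output from the listing above; on a live system use: JPS_OUTPUT=$(jps)
JPS_OUTPUT="50275 NameNode
50547 SecondaryNameNode
50394 DataNode
51091 Jps
50813 NodeManager
50716 ResourceManager"

# Count how many of the two YARN daemon names are present.
DAEMONS=$(echo "$JPS_OUTPUT" | grep -cE 'ResourceManager|NodeManager')
echo "YARN daemons running: $DAEMONS"
```

If the count is not 2, check the daemon logs referenced by start-yarn.sh before continuing.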

ResourceManager Web UI

Check the status of YARN in a browser at:
http://localhost:8088/
You can also interact with YARN using the yarn command. To list the applications running on YARN:

yarn application -list

Integrating Storm and Hadoop

The Storm-YARN project, developed by Yahoo, deploys Storm on a YARN cluster.

Setting up Storm-YARN

1. Clone the storm-yarn repository

cd /usr/local/cloud/
git clone https://github.com/yahoo/storm-yarn.git
cd storm-yarn

2. Build storm-yarn

mvn package

3. Copy storm.zip from storm-yarn/lib to HDFS

hdfs dfs -mkdir -p /lib/storm/0.9.0-wip21
hdfs dfs -put lib/storm.zip /lib/storm/0.9.0-wip21/storm.zip

The exact version in your case might differ from 0.9.0-wip21.
4. Create a directory to hold the Storm configuration

mkdir -p /usr/local/cloud/storm-data
cp lib/storm.zip /usr/local/cloud/storm-data/
cd /usr/local/cloud/storm-data/
unzip storm.zip

5. Configure storm.yaml
/usr/local/cloud/storm-data/storm-0.9.0-wip21/conf/storm.yaml

storm.zookeeper.servers:
    - "localhost"
nimbus.host: "localhost"
master.initial-num-supervisors: 2
master.container.size-mb: 128
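Since storm.yaml typos are a common source of launch failures, a quick way to double-check the values is to pull them back out of the file. A small sketch against a copy of the configuration above; the /tmp path is just for illustration:

```shell
# Write a copy of the storm.yaml shown above (illustrative path).
cat > /tmp/storm.yaml.sample <<'EOF'
storm.zookeeper.servers:
    - "localhost"
nimbus.host: "localhost"
master.initial-num-supervisors: 2
master.container.size-mb: 128
EOF

# Extract two of the settings with sed.
SUPERVISORS=$(sed -n 's/^master.initial-num-supervisors: //p' /tmp/storm.yaml.sample)
CONTAINER_MB=$(sed -n 's/^master.container.size-mb: //p' /tmp/storm.yaml.sample)
echo "initial supervisors: $SUPERVISORS, container size: $CONTAINER_MB MB"
```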

6. Set the environment variables

vi ~/.bashrc
export PATH=$PATH:/usr/local/cloud/storm-data/storm-0.9.0-wip21/bin:/usr/local/cloud/storm-yarn/bin
source ~/.bashrc

8. Start ZooKeeper

zookeeper-3.4.5/bin/zkServer.sh start

9. Launch Storm-YARN

storm-yarn launch /usr/local/cloud/storm-data/storm-0.9.0-wip21/conf/storm.yaml

14/04/15 10:14:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/04/15 10:14:49 INFO yarn.StormOnYarn: Copy App Master jar from local filesystem and add to local environment
... ...
14/04/15 10:14:51 INFO impl.YarnClientImpl: Submitted application application_1397537047058_0001 to ResourceManager at /0.0.0.0:8032
application_1397537047058_0001

The Storm-YARN application has been submitted with the application_1397537047058_0001 application ID.
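The application ID printed on the last line is needed again in step 13, so it is handy to capture it programmatically. A sketch that extracts it from the launch output shown above; on a live cluster you would pipe the real output of storm-yarn launch instead of the captured sample:

```shell
# Captured tail of the storm-yarn launch output from above.
LAUNCH_LOG='14/04/15 10:14:51 INFO impl.YarnClientImpl: Submitted application application_1397537047058_0001 to ResourceManager at /0.0.0.0:8032
application_1397537047058_0001'

# Pull out the first application ID (format: application_<clusterTimestamp>_<sequence>).
APP_ID=$(echo "$LAUNCH_LOG" | grep -oE 'application_[0-9]+_[0-9]+' | head -n 1)
echo "submitted as: $APP_ID"
```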

10. Check the application status

yarn application -list

11. Check it through the ResourceManager web UI
12. Nimbus is now also running; its web UI is available at http://localhost:7070
13. Fetch the Storm configuration needed to submit topologies to the Storm cluster running on YARN

mkdir ~/.storm
storm-yarn getStormConfig --appId application_1397537047058_0001 --output ~/.storm/storm.yaml
We will get the following output:

14/04/15 10:32:01 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/04/15 10:32:02 INFO yarn.StormOnYarn: application report for application_1397537047058_0001 :localhost.localdomain:9000
14/04/15 10:32:02 INFO yarn.StormOnYarn: Attaching to localhost.localdomain:9000 to talk to app master application_1397537047058_0001
14/04/15 10:32:02 INFO yarn.StormMasterCommand: storm.yaml downloaded into /home/anand/.storm/storm.yaml

Make sure you use the correct application ID, the one obtained in step 9.

Deploying storm-starter on Storm-YARN

1. Clone the storm-starter project

git clone https://github.com/nathanmarz/storm-starter
cd storm-starter

2. Package the project

mvn package -DskipTests

3. Deploy the word-count topology

storm jar target/storm-starter-0.0.1-SNAPSHOT.jar storm.starter.WordCountTopology word-count-topology

The following information is displayed:

545 [main] INFO backtype.storm.StormSubmitter - Jar not uploaded to master yet. Submitting jar...
558 [main] INFO backtype.storm.StormSubmitter - Uploading topology jar target/storm-starter-0.0.1-SNAPSHOT.jar to assigned location: storm-local/nimbus/inbox/stormjar-9ab704ff-29f3-4b9d-b0ac-e9e41d4399dd.jar
609 [main] INFO backtype.storm.StormSubmitter - Successfully uploaded topology jar to assigned location: storm-local/nimbus/inbox/stormjar-9ab704ff-29f3-4b9d-b0ac-e9e41d4399dd.jar
609 [main] INFO backtype.storm.StormSubmitter - Submitting topology word-count-topology in distributed mode with conf {"topology.workers":3,"topology.debug":true}
937 [main] INFO backtype.storm.StormSubmitter - Finished submitting topology: word-count-topology
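To double-check that Nimbus accepted the topology under the expected name, the final StormSubmitter log line can be parsed. A sketch against a captured sample line carrying the name passed on the command line; on a live run you would grep the real command output:

```shell
# Sample of the final submit log line (topology name as passed on the command line).
SUBMIT_LINE='937 [main] INFO backtype.storm.StormSubmitter - Finished submitting topology: word-count-topology'

# Strip everything up to and including the marker text to get the name.
TOPOLOGY=${SUBMIT_LINE##*Finished submitting topology: }
echo "submitted topology: $TOPOLOGY"
```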

5. Get the list of commands for interacting with topologies running on Storm-YARN

storm-yarn help
