###step1: Build on the pseudo-distributed installation
###step2: Plan the machines and services
hadoop-senior01 hadoop-senior02 hadoop-senior03
HDFS
NameNode
DataNode DataNode DataNode
SecondaryNameNode
YARN
ResourceManager
NodeManager NodeManager NodeManager
MapReduce
JobHistoryServer
Delete the directory:
[root@hadoop-senior01 opt]# rm -rf app
Give the xiangkun user ownership of the app directory:
[root@hadoop-senior01 opt]# chown -R xiangkun:xiangkun /opt/app/
###step3: Edit the configuration files to assign services to nodes
Configuration files:
* hdfs
    * hadoop-env.sh
    * core-site.xml
    * hdfs-site.xml
    * slaves
* yarn
    * yarn-env.sh
    * yarn-site.xml
    * slaves
* mapreduce
    * mapred-env.sh
    * mapred-site.xml
####hadoop-env.sh
####core-site.xml
####hdfs-site.xml
####slaves
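A minimal sketch of the slaves file, assuming (as in the step2 plan) that all three hosts run a DataNode and a NodeManager. The file then lists the three hostnames, one per line. The helper name `write_slaves` and the `SLAVES_FILE` variable are hypothetical; the hostnames are the ones used with ssh-copy-id in step4.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write one worker hostname per line to the given file.
write_slaves() {
  local target="$1"
  shift
  printf '%s\n' "$@" > "$target"
}

# Defaults to a scratch file so the sketch is safe to run anywhere.
# Set SLAVES_FILE=etc/hadoop/slaves inside the Hadoop directory to apply it.
SLAVES_FILE="${SLAVES_FILE:-$(mktemp)}"

write_slaves "$SLAVES_FILE" \
  hadoop-senior01.xiangkun \
  hadoop-senior02.xiangkun \
  hadoop-senior03.xiangkun

cat "$SLAVES_FILE"
```

Both the HDFS and YARN start scripts read this same file, which is why slaves appears under both groups in the list above.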
####yarn-env.sh
####yarn-site.xml
####mapred-env.sh
####mapred-site.xml
###step4: Distribute the Hadoop installation to every node
####The doc directory is of little use and takes up space, so remove it (rm -rf ./doc/)
[xiangkun@hadoop-senior01 hadoop-2.5.0]$ cd share
[xiangkun@hadoop-senior01 share]$ ls
doc hadoop
[xiangkun@hadoop-senior01 share]$ df -lh
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_xiangkunqin-lv_root 18G 13G 3.6G 79% /
tmpfs 940M 80K 940M 1% /dev/shm
/dev/sda1 485M 39M 421M 9% /boot
[xiangkun@hadoop-senior01 share]$ rm -rf ./doc/
[xiangkun@hadoop-senior01 share]$ df -lh
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_xiangkunqin-lv_root 18G 12G 5.1G 69% /
tmpfs 940M 80K 940M 1% /dev/shm
/dev/sda1 485M 39M 421M 9% /boot
####Distribution uses scp, which runs over SSH, so set up passwordless SSH login first
[xiangkun@hadoop-senior01 ~]$ cd .ssh
[xiangkun@hadoop-senior01 .ssh]$ ls
[xiangkun@hadoop-senior01 .ssh]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/xiangkun/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/xiangkun/.ssh/id_rsa.
Your public key has been saved in /home/xiangkun/.ssh/id_rsa.pub.
The key fingerprint is:
de:76:78:38:e6:84:74:f1:2c:f8:da:50:9a:be:a4:33 xiangkun@hadoop-senior01.xiangkun
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| . |
| . + |
| S + o |
| o O + |
| B X o |
| E+ O + |
| .o+.o |
+-----------------+
[xiangkun@hadoop-senior01 .ssh]$
[xiangkun@hadoop-senior01 .ssh]$ ll
total 8
-rw-------. 1 xiangkun xiangkun 1675 Jul  4 14:05 id_rsa
-rw-r--r--. 1 xiangkun xiangkun  415 Jul  4 14:05 id_rsa.pub
[xiangkun@hadoop-senior01 .ssh]$ hostname
hadoop-senior01.xiangkun
[xiangkun@hadoop-senior01 .ssh]$ ssh-copy-id hadoop-senior01.xiangkun
The authenticity of host 'hadoop-senior01.xiangkun (192.168.111.106)' can't be established.
RSA key fingerprint is da:12:42:76:de:23:3a:01:48:18:cd:9e:60:d6:83:b8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop-senior01.xiangkun,192.168.111.106' (RSA) to the list of known hosts.
xiangkun@hadoop-senior01.xiangkun's password:
Now try logging into the machine, with "ssh 'hadoop-senior01.xiangkun'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[xiangkun@hadoop-senior01 .ssh]$ ll
total 16
-rw-------. 1 xiangkun xiangkun  415 Jul  4 14:08 authorized_keys
-rw-------. 1 xiangkun xiangkun 1675 Jul  4 14:05 id_rsa
-rw-r--r--. 1 xiangkun xiangkun  415 Jul  4 14:05 id_rsa.pub
-rw-r--r--. 1 xiangkun xiangkun  422 Jul  4 14:08 known_hosts
####Configure passwordless login on each of the three machines
First machine:
[xiangkun@hadoop-senior01 .ssh]$ ssh-copy-id hadoop-senior01.xiangkun
[xiangkun@hadoop-senior01 .ssh]$ ssh-copy-id hadoop-senior02.xiangkun
[xiangkun@hadoop-senior01 .ssh]$ ssh-copy-id hadoop-senior03.xiangkun
Second machine:
Same as above, run from hadoop-senior02.
Third machine:
Same as above, run from hadoop-senior03.
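The three ssh-copy-id calls can be wrapped in a loop and reused on each machine. This is only a sketch: the `run` helper, the `DRY_RUN` switch, and the function name `copy_key_to_all` are hypothetical, and it assumes a key pair already exists in ~/.ssh (generated with ssh-keygen as shown earlier). With the default DRY_RUN=1 it only prints the commands.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually executes it.
DRY_RUN="${DRY_RUN:-1}"
HOSTS="hadoop-senior01.xiangkun hadoop-senior02.xiangkun hadoop-senior03.xiangkun"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Copy the current user's public key to every host, including this one.
copy_key_to_all() {
  local host
  for host in $HOSTS; do
    run ssh-copy-id "$host"
  done
}

copy_key_to_all
```

Run it once per machine as the xiangkun user. Each ssh-copy-id call still prompts for that host's password the first time.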
####Distribute Hadoop to the other two machines
[xiangkun@hadoop-senior01 app]$ scp -r ./hadoop-2.5.0/ xiangkun@hadoop-senior02.xiangkun:/opt/app/
Repeat the same command for hadoop-senior03.xiangkun.
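A sketch of the distribution as a loop, assuming the installation lives in /opt/app/hadoop-2.5.0 on hadoop-senior01 and that the other two hosts are hadoop-senior02.xiangkun and hadoop-senior03.xiangkun. The `run` helper, `DRY_RUN`, and `distribute_hadoop` are hypothetical names. Note that scp requires the host and the remote path to be joined as host:path with no space after the colon.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually executes it.
DRY_RUN="${DRY_RUN:-1}"
SRC_DIR="/opt/app/hadoop-2.5.0"
DEST_DIR="/opt/app/"
TARGETS="hadoop-senior02.xiangkun hadoop-senior03.xiangkun"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Copy the whole installation directory to each remote host.
distribute_hadoop() {
  local host
  for host in $TARGETS; do
    run scp -r "$SRC_DIR" "xiangkun@${host}:${DEST_DIR}"
  done
}

distribute_hadoop
```

/opt/app/ must already exist on the target hosts and be owned by xiangkun, as set up with chown in step2.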
###step5: Start the services on each node, following the official cluster setup guide
Start:
sbin/start-dfs.sh
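A sketch of the full start sequence. In the Hadoop 2.x layout the start scripts live under sbin/, not bin/. Each command is meant to be run from the Hadoop directory on the host that the step2 plan assigns to that service. The `run` helper, `DRY_RUN`, and `start_cluster` are hypothetical names, and with the default DRY_RUN=1 the commands are only printed.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

start_cluster() {
  # On the NameNode host: starts NameNode, DataNodes, SecondaryNameNode.
  run sbin/start-dfs.sh
  # On the ResourceManager host: starts ResourceManager and NodeManagers.
  run sbin/start-yarn.sh
  # On the JobHistoryServer host.
  run sbin/mr-jobhistory-daemon.sh start historyserver
}

start_cluster
```

Afterwards, running jps on each host should list the daemons planned for that host.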
###step6: Test HDFS, YARN, and MapReduce; monitor the cluster through the Web UIs
###step7: Configure passwordless SSH login on the master nodes
###step8: Cluster benchmarking (required in a real environment; a common interview question)
** Basic tests
&Services start, are available, and can run a simple application
&HDFS create and delete operations succeed
Read and write operations
bin/hdfs dfs -mkdir -p /user/xiangkun/tmp/conf
bin/hdfs dfs -put etc/hadoop/*-site.xml /user/xiangkun/tmp/conf
bin/hdfs dfs -text /user/xiangkun/tmp/conf/core-site.xml
&yarn
run jar
&mapreduce
bin/yarn jar share/hadoop/mapreduce/hadoop*example*.jar wordcount /user/xiangkun/mapreduce/wordcount/input /user/xiangkun/mapreduce/wordcount/output
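The MapReduce check can be sketched end to end: upload an input file, run wordcount, and read the result. The assumptions here are the examples jar name for Hadoop 2.5.0 and the choice of core-site.xml as sample input. `run`, `DRY_RUN`, and `run_wordcount` are hypothetical names. The output directory must not exist before the job starts, or the job fails.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually executes it.
DRY_RUN="${DRY_RUN:-1}"
BASE="/user/xiangkun/mapreduce/wordcount"
# Assumed jar name for the 2.5.0 release.
EXAMPLES_JAR="share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run_wordcount() {
  # Prepare the input directory and upload a small sample file.
  run bin/hdfs dfs -mkdir -p "$BASE/input"
  run bin/hdfs dfs -put etc/hadoop/core-site.xml "$BASE/input"
  # Submit the job to YARN.
  run bin/yarn jar "$EXAMPLES_JAR" wordcount "$BASE/input" "$BASE/output"
  # Print the reducer output; the glob is expanded by HDFS, not by the shell.
  run bin/hdfs dfs -text "$BASE/output/part*"
}

run_wordcount
```

If the job finishes and the last command prints word counts, HDFS, YARN, and MapReduce are all working together.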
**Benchmark tests
Measure the performance of the cluster
&hdfs: write data, read data
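One common way to measure HDFS write and read throughput is the TestDFSIO program shipped in the jobclient tests jar. The notes do not name a tool, so this is an assumption, as are the jar name for 2.5.0, the option spelling, and the file count and size values, which are purely illustrative. Check the program's own usage output before relying on the flags. `run`, `DRY_RUN`, and `run_dfsio` are hypothetical names.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually executes it.
DRY_RUN="${DRY_RUN:-1}"
# Assumed jar name for the 2.5.0 release.
TESTS_JAR="share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.5.0-tests.jar"
NR_FILES=10        # illustrative value
FILE_SIZE="100MB"  # illustrative value

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run_dfsio() {
  # Write test first: the read test reads the files written here.
  run bin/yarn jar "$TESTS_JAR" TestDFSIO -write -nrFiles "$NR_FILES" -size "$FILE_SIZE"
  run bin/yarn jar "$TESTS_JAR" TestDFSIO -read -nrFiles "$NR_FILES" -size "$FILE_SIZE"
  # Remove the benchmark data from HDFS afterwards.
  run bin/yarn jar "$TESTS_JAR" TestDFSIO -clean
}

run_dfsio
```

The write pass has to come before the read pass, and the final clean pass keeps benchmark data from filling the cluster.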
**Cluster monitoring
&cloudera
&Cloudera Manager
Deploy and install clusters
Monitor clusters
Synchronize configuration across the cluster
Alerting