Single-Node Installation
Basic software needed for Hadoop development
VMware
Set up an Ubuntu 12 virtual machine in VMware:
Enable the root user:
sudo -s
sudo passwd root
For details see:
http://blog.csdn.net/flash8627/article/details/44729077
Install vsftpd:
root@ubuntu:/usr/lib/java# apt-get install vsftpd
Back up vsftpd.conf, then edit it to allow login with local system accounts:
root@ubuntu:/usr/lib/java# cp /etc/vsftpd.conf /etc/vsftpd.conf.bak
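The exact directives aren't shown in the original post; a minimal sketch of the vsftpd.conf settings that enable local-account FTP login (these are standard vsftpd directives, but defaults vary by version):
# /etc/vsftpd.conf
local_enable=YES     # allow local system accounts to log in
write_enable=YES     # allow uploads (needed to push the JDK and Hadoop tarballs)
Restart the service afterwards: service vsftpd restart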
There are plenty of detailed guides online, so I won't go into more detail here.
Java 1.7
Upload the JDK to the server, extract it, and set the environment variables as follows:
root@ubuntu:/usr/lib/java# tar -zxvf jdk-7u80-linux-x64.tar.gz
root@ubuntu:/usr/lib/java# mv jdk1.7.0_80 /usr/lib/java/jdk1.7
root@ubuntu:/usr/lib/java# vim /root/.bashrc
export JAVA_HOME=/usr/lib/java/jdk1.7
export JRE_HOME=${JAVA_HOME}/jre
export CLASS_PATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:/usr/local/hadoop/hadoop-2.6.0/bin:$PATH
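Reload the file and confirm Java resolves (a quick sanity check, not in the original post):
root@ubuntu:/usr/lib/java# source /root/.bashrc
root@ubuntu:/usr/lib/java# java -version
# expect the first line: java version "1.7.0_80"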
Install SSH
Set up passwordless SSH login:
root@ubuntu:/usr/lib/java# ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
d3:bb:1e:df:10:09:ed:62:78:43:66:9f:8f:6a:b0:e7 root@ubuntu
The key's randomart image is:
+--[ RSA 2048]----+
| |
| . |
| = . |
| * + o |
| S * * |
| .+ + + |
| oo o . |
| . o= o |
| =E . . |
+-----------------+
root@ubuntu:/usr/lib/java# ls /root/.ssh/
id_rsa id_rsa.pub
root@ubuntu:/usr/lib/java# cat /root/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
root@ubuntu:/usr/lib/java# ls /root/.ssh/
authorized_keys id_rsa id_rsa.pub
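Verify that passwordless login works (not shown in the original; the first connection will ask you to accept the host key):
root@ubuntu:/usr/lib/java# ssh localhost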
Install rsync
root@ubuntu:/usr/lib/java# apt-get install rsync
Hadoop 2.6
Extract Hadoop (into /usr/local/hadoop, where the rest of this guide expects it):
tar -zxvf /home/ftp/hadoop-2.6.0.tar.gz -C /usr/local/hadoop
Configure hadoop-env.sh
cd /usr/local/hadoop/hadoop-2.6.0/etc/hadoop/
vim hadoop-env.sh
# export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/lib/java/jdk1.7
Configure the Hadoop environment variables in ~/.bashrc:
cat ~/.bashrc
export JAVA_HOME=/usr/lib/java/jdk1.7
export JRE_HOME=${JAVA_HOME}/jre
export CLASS_PATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:/usr/local/hadoop/hadoop-2.6.0/bin:$PATH
Reload the file (source ~/.bashrc) and verify the environment: hadoop version
Run the WordCount example
mkdir input
root@ubuntu:/usr/local/hadoop/hadoop-2.6.0# cp README.txt input
root@ubuntu:/usr/local/hadoop/hadoop-2.6.0# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount input output
root@ubuntu:/usr/local/hadoop/hadoop-2.6.0# cat output/*
The pseudo-distributed setup involves the following configuration files, all under /usr/local/hadoop/hadoop-2.6.0/etc/hadoop:
core-site.xml: the HDFS address and port.
hdfs-site.xml: the replication factor.
mapred-site.xml: the MapReduce framework settings (in Hadoop 1.x this held the JobTracker address and port; in 2.x it points MapReduce at YARN).
vim core-site.xml
vim hdfs-site.xml
root@ubuntu:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# cp mapred-site.xml.template mapred-site.xml
root@ubuntu:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# vim mapred-site.xml
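The original post does not reproduce the file contents. A minimal sketch for a single-node (pseudo-distributed) setup; the values below are common-default assumptions, so adjust host, port, and paths to your environment:
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
If MapReduce jobs fail to run on YARN, yarn-site.xml usually also needs yarn.nodemanager.aux-services set to mapreduce_shuffle.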
Next, format the namenode:
hadoop namenode -format
If you re-format later, type Y (uppercase) at the prompt to complete the process.
Start Hadoop with start-all.sh:
root@ubuntu:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# ../../sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-ubuntu.out
localhost: starting datanode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is 81:a2:0b:4d:95:43:c7:3f:84:f1:a4:d4:24:30:53:bf.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-ubuntu.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-resourcemanager-ubuntu.out
localhost: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-ubuntu.out
Check the running Hadoop processes with jps:
root@ubuntu:/usr/local/hadoop/hadoop-2.6.0/etc/hadoop# jps
4300 NodeManager
4085 ResourceManager
4510 Jps
3951 SecondaryNameNode
3652 DataNode
3443 NameNode
Cluster monitoring web UI:
http://localhost:50070/dfshealth.jsp
or the newer UI: http://192.168.222.143:50070/dfshealth.html#tab-overview
Create a directory on HDFS:
hadoop fs -mkdir /input
Upload files:
hadoop fs -copyFromLocal /usr/local/hadoop/hadoop-2.6.0/etc/hadoop/* /input
This completes the pseudo-distributed setup.
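To confirm the upload succeeded (a check not shown in the original post):
hadoop fs -ls /input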
If you run into problems, feel free to ask in the QQ group [大数据交流 208881891].
Cluster Installation
1. Set the hostname in /etc/hostname and map hostnames to IPs in /etc/hosts
Set the hostname: /etc/hostname
Configure the mapping: /etc/hosts
192.168.222.143 Master
192.168.222.144 Slave1
192.168.222.145 Slave2
Set up passwordless SSH: ssh-keygen -t rsa -P ""
Copy the public key to each slave: scp id_rsa.pub Slave1:/root/.ssh/Master.pub
Then on each slave, append the copied key: cat /root/.ssh/Master.pub >> /root/.ssh/authorized_keys
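Alternatively, ssh-copy-id performs the copy-and-append in one step (assuming password login to the slaves is still enabled at this point):
ssh-copy-id root@Slave1
ssh-copy-id root@Slave2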
Update the Hadoop configuration:
Change the earlier localhost entries to Master.
The files are as follows:
core-site.xml
hdfs-site.xml
mapred-site.xml
slaves:
Master
Slave1
Slave2
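The XML contents were lost in the original post's formatting. A minimal sketch for this three-node layout; the values are assumptions consistent with the hostnames above, so adjust them to your environment:
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master:9000</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>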
Copy Java and Hadoop to the remote nodes:
root@Master:/usr/lib/java#
scp -r jdk1.7 Slave1:/usr/lib/java/
scp -r hadoop-2.6.0 Slave1:/usr/local/hadoop/
After copying, update the environment configuration on each slave:
export JAVA_HOME=/usr/lib/java/jdk1.7
export JRE_HOME=${JAVA_HOME}/jre
export CLASS_PATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:/usr/local/hadoop/hadoop-2.6.0/bin:$PATH
Before formatting, clean out the old HDFS name and data directories and the tmp directory on every node (a cleanup sketch follows).
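A sketch of the cleanup, assuming everything lives under the default hadoop.tmp.dir (/tmp/hadoop-root when running as root; this is an assumption, so match your actual dfs.namenode.name.dir and dfs.datanode.data.dir settings):
rm -rf /tmp/hadoop-root   # run on Master, Slave1, and Slave2 before re-formatting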
Format the cluster: hadoop namenode -format
Start the cluster:
root@Master:/usr/local/hadoop/hadoop-2.6.0/sbin# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [Master]
The authenticity of host 'master (192.168.222.143)' can't be established.
ECDSA key fingerprint is 81:a2:0b:4d:95:43:c7:3f:84:f1:a4:d4:24:30:53:bf.
Are you sure you want to continue connecting (yes/no)? yes
Master: Warning: Permanently added 'master,192.168.222.143' (ECDSA) to the list of known hosts.
Master: starting namenode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-Master.out
Master: starting datanode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-Master.out
Slave2: starting datanode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-Slave2.out
Slave1: starting datanode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-Slave1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-Master.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-resourcemanager-Master.out
Slave1: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-Slave1.out
Master: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-Master.out
Slave2: starting nodemanager, logging to /usr/local/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-Slave2.out
root@Master:/usr/local/hadoop/hadoop-2.6.0/sbin# jps
2912 DataNode
3182 SecondaryNameNode
3557 NodeManager
3855 Jps
3342 ResourceManager
2699 NameNode
root@Master:/usr/local/hadoop/hadoop-2.6.0/sbin# hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Configured Capacity: 56254304256 (52.39 GB)
Present Capacity: 48346591232 (45.03 GB)
DFS Remaining: 48346517504 (45.03 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 192.168.222.143:50010 (Master)
Hostname: Master
Decommission Status : Normal
Configured Capacity: 18751434752 (17.46 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2651889664 (2.47 GB)
DFS Remaining: 16099520512 (14.99 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.86%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Jun 11 10:51:41 CST 2016
Name: 192.168.222.144:50010 (Slave1)
Hostname: Slave1
Decommission Status : Normal
Configured Capacity: 18751434752 (17.46 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2653249536 (2.47 GB)
DFS Remaining: 16098160640 (14.99 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.85%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Jun 11 10:51:41 CST 2016
Name: 192.168.222.145:50010 (Slave2)
Hostname: Slave2
Decommission Status : Normal
Configured Capacity: 18751434752 (17.46 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 2602573824 (2.42 GB)
DFS Remaining: 16148836352 (15.04 GB)
DFS Used%: 0.00%
DFS Remaining%: 86.12%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Jun 11 10:51:42 CST 2016
root@Master:/usr/local/hadoop/hadoop-2.6.0/sbin# ./stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
Stopping namenodes on [Master]
Master: stopping namenode
Master: stopping datanode
Slave1: stopping datanode
Slave2: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
stopping yarn daemons
stopping resourcemanager
Slave1: stopping nodemanager
Master: stopping nodemanager
Slave2: stopping nodemanager
Slave1: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
Slave2: nodemanager did not stop gracefully after 5 seconds: killing with kill -9
no proxyserver to stop
Next post: building a Spark cluster on top of this setup.
Any questions are welcome in the QQ group: 大数据交流 208881891