Installing the JDK
1. Download JDK 1.6 or later and install it under /usr:
chmod u+x jdk-6u26-linux-i586.bin
./jdk-6u26-linux-i586.bin
2. Configure the environment variables:
vi /etc/profile
Locate the following code:
for i in /etc/profile.d/*.sh ;do
if [ -r "$i" ]; then
. $i
fi
done
Append the following after it:
# java config
JAVA_HOME=/usr/jdk1.6.0_26
export JAVA_HOME
PATH=$PATH:$JAVA_HOME/bin
export PATH
CLASSPATH=.:$JAVA_HOME/lib
export CLASSPATH
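To apply the change without logging out and back in, re-source the profile and check the variable (standard shell usage, nothing JDK-specific):
source /etc/profile
echo $JAVA_HOME    --- should print /usr/jdk1.6.0_26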
3. Configure symbolic links:
--- remove the old links
cd /usr/bin
rm -rf java
rm -rf javac
--- create the new links
ln -s /usr/jdk1.6.0_26/bin/java java
ln -s /usr/jdk1.6.0_26/bin/javac javac
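A quick check that the links now point at the new JDK:
ls -l /usr/bin/java /usr/bin/javac
Both links should point into /usr/jdk1.6.0_26/bin.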
4. Test that the installation succeeded by checking that version 1.6 is displayed:
[root@localhost jdk1.6.0_26]# java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)
Creating a User
For ease of administration, it is best to create a dedicated hadoop user as the account the environment runs under.
groupadd hadoop    --- create the hadoop group
useradd -g hadoop hadoop    --- create the hadoop user and add it to the hadoop group
passwd hadoop    --- set the password
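A quick sanity check of the new account (the uid/gid values below are illustrative and will vary by system):
id hadoop
uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)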
To set up SSH, hadoop needs to be added to the wheel group:
usermod -G wheel hadoop    --- -G adds a supplementary group; -g would replace hadoop's primary group
There should be other ways to let the hadoop group use ssh; this has not been investigated yet.
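One untested possibility, sketched here as an assumption rather than a recipe: sshd can restrict logins by group on its own via AllowGroups in /etc/ssh/sshd_config, which would avoid touching wheel at all:
# /etc/ssh/sshd_config (hypothetical snippet; restart sshd after editing)
AllowGroups wheel hadoop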
Configuring SSH
As the hadoop user:
[hadoop@localhost ~]$ ssh-keygen -t rsa
[hadoop@localhost ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Test:
ssh localhost
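If ssh localhost still prompts for a password, file permissions are the usual culprit, since sshd ignores a group- or world-writable authorized_keys. A common fix:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys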
For pseudo-distributed mode on a single machine, the steps above are enough. For a cluster, copy id_rsa.pub to each slave node and import it into that node's authorized keys, as sketched below.
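A sketch of that copy step for a hypothetical slave node named slave1 (ssh-copy-id appends the key in one step; the scp variant does the same by hand):
[hadoop@localhost ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave1
or:
[hadoop@localhost ~]$ scp ~/.ssh/id_rsa.pub hadoop@slave1:/tmp/master.pub
[hadoop@localhost ~]$ ssh hadoop@slave1 'cat /tmp/master.pub >> ~/.ssh/authorized_keys'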
Installing Hadoop
1. Installation files
Download a Hadoop package from the official site (http://hadoop.apache.org/); version 0.20.203 is used here.
Upload it to the hadoop user's home directory, /home/hadoop, and unpack it:
[hadoop@localhost ~]$ tar -zvxf hadoop-0.20.203.0rc1.tar.gz
2. Configure the environment variables:
[hadoop@localhost ~]$ vi /etc/profile
Add the following below the Java configuration:
export HADOOP_HOME=/home/hadoop/hadoop-0.20.203.0
export PATH=$PATH:$HADOOP_HOME/bin
Remember to reload the configuration!
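As in the JDK step, re-source the profile and confirm that the hadoop script is now on the PATH:
source /etc/profile
which hadoop    --- should print /home/hadoop/hadoop-0.20.203.0/bin/hadoop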
3. Edit the Hadoop configuration file:
[hadoop@localhost conf]$ vi /home/hadoop/hadoop-0.20.203.0/conf/hadoop-env.sh
Change the JAVA_HOME setting:
#export JAVA_HOME=/usr/lib/j2sdk1.5-sun
export JAVA_HOME=/usr/jdk1.6.0_26
4. Check the installation:
[hadoop@localhost ~]$ hadoop version
Hadoop 0.20.203.0
Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333
Compiled by oom on Wed May 4 07:57:50 PDT 2011
5. Configure the pseudo-distributed mode configuration files
[hadoop@localhost conf]$ vi core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>

[hadoop@localhost conf]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

[hadoop@localhost conf]$ vi mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
See also: http://hadoop.apache.org/common/docs/current/single_node_setup.html
The configuration files can also be placed in any directory; just pass the --config option when starting the daemons, as sketched below.
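For example, with a hypothetical alternate directory /home/hadoop/alt-conf (the start scripts pick up --config through bin/hadoop-config.sh, which they source):
start-dfs.sh --config /home/hadoop/alt-conf
start-mapred.sh --config /home/hadoop/alt-conf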
Running Hadoop
1. Format the HDFS filesystem
[hadoop@localhost bin]$ hadoop namenode -format
The execution log is shown below; the runtime parameters are visible, and the meaning of each field deserves further study:
[hadoop@localhost bin]$ hadoop namenode -format
11/08/13 12:52:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
************************************************************/
11/08/13 12:52:57 INFO util.GSet: VM type = 32-bit
11/08/13 12:52:57 INFO util.GSet: 2% max memory = 19.33375 MB
11/08/13 12:52:57 INFO util.GSet: capacity = 2^22 = 4194304 entries
11/08/13 12:52:57 INFO util.GSet: recommended=4194304, actual=4194304
11/08/13 12:52:58 INFO namenode.FSNamesystem: fsOwner=hadoop
11/08/13 12:52:59 INFO namenode.FSNamesystem: supergroup=supergroup
11/08/13 12:52:59 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/08/13 12:52:59 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
11/08/13 12:52:59 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/08/13 12:52:59 INFO namenode.NameNode: Caching file names occuring more than 10 times
11/08/13 12:52:59 INFO common.Storage: Image file of size 112 saved in 0 seconds.
11/08/13 12:52:59 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
11/08/13 12:52:59 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
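Note from the log that the image landed under /tmp/hadoop-hadoop/dfs/name. /tmp is often cleared at reboot, so for anything beyond a throwaway test it is worth pointing hadoop.tmp.dir at a persistent location in core-site.xml; the path below is just an example:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/tmp</value>
</property>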
2. Start the daemons
[hadoop@localhost bin]$ start-dfs.sh
starting namenode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out
[hadoop@localhost bin]$ start-mapred.sh
starting jobtracker, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-tasktracker-localhost.localdomain.out
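To confirm that all five daemons came up, jps from the JDK lists the running Java processes (the PIDs below are illustrative):
[hadoop@localhost bin]$ jps
12305 NameNode
12416 DataNode
12528 SecondaryNameNode
12637 JobTracker
12749 TaskTracker
12853 Jps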
3. Stop the daemons
[hadoop@localhost bin]$ stop-dfs.sh
[hadoop@localhost bin]$ stop-mapred.sh
4. Monitoring pages
http://192.168.128.133:50070/dfshealth.jsp
· NameNode - http://localhost:50070/
· JobTracker - http://localhost:50030/
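As a final smoke test, a few filesystem-shell commands against the running HDFS (the directory name input is arbitrary):
[hadoop@localhost ~]$ hadoop fs -mkdir input
[hadoop@localhost ~]$ hadoop fs -put hadoop-0.20.203.0/conf/core-site.xml input
[hadoop@localhost ~]$ hadoop fs -ls input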