Hadoop Installation Summary (Linux)

Installing the JDK

1. Download JDK 1.6 or later and install it under /usr:

chmod u+x jdk-6u26-linux-i586.bin

./jdk-6u26-linux-i586.bin

2. Configure the environment variables:

vi /etc/profile

Find the following code:

for i in /etc/profile.d/*.sh ;do

if [ -r "$i" ]; then

. $i

fi

done

After that block, add:

# Java config

JAVA_HOME=/usr/jdk1.6.0_26

export JAVA_HOME

PATH=$PATH:$JAVA_HOME/bin

export PATH

CLASSPATH=.:$JAVA_HOME/lib

export CLASSPATH
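The new variables only apply to future logins; to make them take effect in the current shell, reload the profile and sanity-check, for example:

source /etc/profile

echo $JAVA_HOME     --- should print /usr/jdk1.6.0_26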

3. Configure symbolic links:

--- remove the old links

cd /usr/bin

rm -rf java

rm -rf javac

--- create the new links

ln -s /usr/jdk1.6.0_26/bin/java java

ln -s /usr/jdk1.6.0_26/bin/javac javac
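A quick way to confirm the links point where they should (a minimal check, assuming the paths above):

ls -l /usr/bin/java /usr/bin/javac     --- both should point into /usr/jdk1.6.0_26/bin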

4. Verify the installation: the reported version should be 1.6.

[root@localhost jdk1.6.0_26]# java -version

java version "1.6.0_26"

Java(TM) SE Runtime Environment (build 1.6.0_26-b03)

Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)

Creating a User

For ease of administration, it is best to create a dedicated hadoop user to serve as the runtime environment.

groupadd hadoop                          --- create the hadoop group

useradd -g hadoop hadoop                 --- create the hadoop user and add it to the hadoop group

passwd hadoop                            --- set its password

To set up SSH, the hadoop user needs to be added to the wheel group:

usermod -G wheel hadoop

There should be other ways to grant the hadoop group SSH access; I have not looked into them yet.
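One possible alternative, noted here only as an untested assumption: whitelist login groups directly in the SSH daemon's config. Be aware that once AllowGroups is present, only the listed groups may log in at all:

vi /etc/ssh/sshd_config

AllowGroups wheel hadoop     --- hypothetical line; permits SSH logins only for these groups

service sshd restart     --- restart the daemon so the change takes effect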

Configuring SSH

As the hadoop user:

[hadoop@localhost ~]$ ssh-keygen -t rsa

[hadoop@localhost ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
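If ssh localhost still asks for a password afterwards, a frequent cause is overly permissive file modes, which sshd rejects; tightening them is a safe extra step (not part of the original notes):

chmod 700 ~/.ssh

chmod 600 ~/.ssh/authorized_keys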

Test:

ssh localhost

For a pseudo-distributed setup on a single machine, the steps above are enough. To set up a cluster, id_rsa.pub must be copied to each slave node and imported into its authorized keys, as sketched below.
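A minimal sketch of that distribution, assuming a slave host named slave1 (hypothetical) and the same hadoop user on every node:

scp ~/.ssh/id_rsa.pub hadoop@slave1:/tmp/master_rsa.pub     --- push the master's public key

ssh hadoop@slave1 "cat /tmp/master_rsa.pub >> ~/.ssh/authorized_keys"     --- import it on the slave

On systems that ship it, ssh-copy-id hadoop@slave1 performs both steps in one command.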

Installing Hadoop

1. Installation files

Download the Hadoop package from the official website (http://hadoop.apache.org/); version 0.20.203 is used here.

Upload it to the hadoop user's home directory, /home/hadoop, and extract it:

[hadoop@localhost ~]$ tar -zvxf hadoop-0.20.203.0rc1.tar.gz

2. Configure the environment variables:

[hadoop@localhost ~]$ vi /etc/profile

Below the Java configuration, add the following:

export HADOOP_HOME=/home/hadoop/hadoop-0.20.203.0

export PATH=$PATH:$HADOOP_HOME/bin

Remember to reload the configuration (source /etc/profile) so the changes take effect!

3. Edit the Hadoop configuration file:

[hadoop@localhost conf]$ vi /home/hadoop/hadoop-0.20.203.0/conf/hadoop-env.sh

Change the JAVA_HOME setting:

#export JAVA_HOME=/usr/lib/j2sdk1.5-sun

export JAVA_HOME=/usr/jdk1.6.0_26

4. Verify the installation:

[hadoop@localhost ~]$ hadoop version

Hadoop 0.20.203.0

Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333

Compiled by oom on Wed May  4 07:57:50 PDT 2011

5. Configure the pseudo-distributed mode configuration files:

[hadoop@localhost conf]$ vi core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>

[hadoop@localhost conf]$ vi hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

[hadoop@localhost conf]$ vi mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>

See also: http://hadoop.apache.org/common/docs/current/single_node_setup.html

The configuration files can also be placed in any directory; just pass the --config option when starting the daemons, as in the sketch below.
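For example, assuming the files were copied to /home/hadoop/conf (a hypothetical directory):

start-dfs.sh --config /home/hadoop/conf

start-mapred.sh --config /home/hadoop/conf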

Running Hadoop

1. Format the HDFS filesystem

[hadoop@localhost bin]$ hadoop namenode -format

The execution log follows; the runtime parameters are visible in it, and the meaning of each field deserves closer study later:

[hadoop@localhost bin]$ hadoop namenode -format

11/08/13 12:52:56 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = localhost.localdomain/127.0.0.1

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 0.20.203.0

STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011

************************************************************/

11/08/13 12:52:57 INFO util.GSet: VM type       = 32-bit

11/08/13 12:52:57 INFO util.GSet: 2% max memory = 19.33375 MB

11/08/13 12:52:57 INFO util.GSet: capacity      = 2^22 = 4194304 entries

11/08/13 12:52:57 INFO util.GSet: recommended=4194304, actual=4194304

11/08/13 12:52:58 INFO namenode.FSNamesystem: fsOwner=hadoop

11/08/13 12:52:59 INFO namenode.FSNamesystem: supergroup=supergroup

11/08/13 12:52:59 INFO namenode.FSNamesystem: isPermissionEnabled=true

11/08/13 12:52:59 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100

11/08/13 12:52:59 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)

11/08/13 12:52:59 INFO namenode.NameNode: Caching file names occuring more than 10 times

11/08/13 12:52:59 INFO common.Storage: Image file of size 112 saved in 0 seconds.

11/08/13 12:52:59 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.

11/08/13 12:52:59 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1

************************************************************/
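One detail worth noting in this log: the storage directory defaults to /tmp/hadoop-hadoop/dfs/name, derived from hadoop.tmp.dir. Since /tmp may be wiped on reboot, relocating it is worth considering; a sketch of the extra core-site.xml property, using /home/hadoop/tmp as an assumed path:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/tmp</value>
</property>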

2. Start the daemons

[hadoop@localhost bin]$ start-dfs.sh

starting namenode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-namenode-localhost.localdomain.out

localhost: starting datanode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-datanode-localhost.localdomain.out

localhost: starting secondarynamenode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out

[hadoop@localhost bin]$ start-mapred.sh

starting jobtracker, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-jobtracker-localhost.localdomain.out

localhost: starting tasktracker, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-tasktracker-localhost.localdomain.out
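To confirm that all five daemons came up, the JDK's jps tool lists the running JVM processes (the PIDs below are illustrative):

[hadoop@localhost bin]$ jps

3012 NameNode
3115 DataNode
3220 SecondaryNameNode
3310 JobTracker
3405 TaskTracker
3500 Jps

As a further smoke test, hadoop fs -ls / should return without errors.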

3. Stop the daemons

[hadoop@localhost bin]$ stop-dfs.sh

[hadoop@localhost bin]$ stop-mapred.sh

4. Web monitoring interfaces

http://192.168.128.133:50070/dfshealth.jsp

·        NameNode - http://localhost:50070/

·        JobTracker - http://localhost:50030/
