hadoop_2.6.5 Test Cluster Installation

Installing a hadoop 2.6.5 cluster:

1. Cluster layout:

JacK6: NameNode, ResourceManager

JacK7: SecondaryNameNode, DataNode, NodeManager

JacK8: DataNode, NodeManager

2. Configure passwordless SSH login

1. Disable SELinux

su root

setenforce 0

vi /etc/selinux/config

SELINUX=disabled
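Verify: getenforce should now print Permissive (and Disabled after a reboot, once the config change above is in place).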

2. Set up the SSH keys: run the following on JacK6, JacK7, and JacK8 (each node also needs passwordless access to itself; pssh is worth looking into)

ssh-keygen -t rsa -P ''

ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@JacK7

ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@JacK8

ssh JacK7
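A quick loop can confirm passwordless access from the current node (a sketch; assumes the three hostnames resolve on every host):

# each hop should print the remote hostname without a password prompt
for h in JacK6 JacK7 JacK8; do ssh "$h" hostname; done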

3. System configuration:

1. Disable the firewall

service iptables stop

service iptables status

chkconfig iptables off

2. Disable transparent huge pages (THP)

Check: cat /sys/kernel/mm/redhat_transparent_hugepage/defrag

[always] madvise never — the bracketed entry is the active setting; [always] means THP defrag is enabled

Disable: echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
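These echo commands do not survive a reboot; one common way to persist them on RHEL 6 is to append them to /etc/rc.local (a sketch, not part of the original steps):

cat >> /etc/rc.local <<'EOF'
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
EOF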

3. Tune vm.swappiness

The Linux kernel parameter vm.swappiness ranges from 0 to 100 and controls how early the system starts swapping pages between physical memory and swap space. Roughly speaking, on a system with 64 GB of RAM and vm.swappiness=60, swapping can begin once memory usage reaches about 64 * (1 - 0.6) = 25.6 GB, which inevitably hurts performance. Cloudera therefore recommends lowering this value to somewhere between 1 and 10.

Check: cat /proc/sys/vm/swappiness

Change:

Temporary: sysctl -w vm.swappiness=10

Permanent:

echo "vm.swappiness=10" >> /etc/sysctl.conf

4. Raise the limits on open files and processes (the two limits.d files at the end still need investigation)

Check: ulimit -a

Raise the maximum number of open files: vi /etc/security/limits.conf

* soft nofile 65535

* hard nofile 65535

* soft nproc 65535

* hard nproc 65535

hadoop soft nofile 10240

hadoop hard nofile 10240

hadoop soft nproc 10240

hadoop hard nproc 10240

The changes take effect after a re-login or reboot. The other two files:

Append at the end of /etc/security/limits.d/90-nproc.conf:

* soft nproc 204800

* hard nproc 204800

Append at the end of /etc/security/limits.d/def.conf:

* soft nofile 204800

* hard nofile 204800
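After re-logging in as the hadoop user, the new limits can be sanity-checked:

su - hadoop -c 'ulimit -n; ulimit -u' # open files, then max user processes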

5. Disable IPv6: to revisit later

vi /etc/sysconfig/network
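Deferred in these notes; for reference, a commonly cited RHEL 6 setting in this file is the line below (an assumption to verify, not from the original steps):

NETWORKING_IPV6=no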

6. Disable file access-time (atime) updates: to revisit later; a sketch follows.
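The usual approach is the noatime mount option in /etc/fstab for the data disks; a sketch where the device and mount point are placeholders:

/dev/sdb1 /data ext4 defaults,noatime 0 0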

4. Set up a local yum repository: deferred.

5. NTP configuration: deferred.

6. Install Java

7. Install hadoop

1. mkdir -p /data/hadoop/Hadoop_2.6.5

tar -xvf /data/tar/hadoop-2.6.5.tar.gz -C /data/hadoop/Hadoop_2.6.5/ --strip-components=1 # strip the top-level hadoop-2.6.5/ dir so the tree lands directly under HADOOP_HOME

tar -xvf hadoop-native-64-2.6.0.tar -C /data/hadoop/Hadoop_2.6.5/lib/native

vi ~/.bash_profile

#Hadoop_2.6.5

export HADOOP_HOME=/data/hadoop/Hadoop_2.6.5

export HADOOP_PREFIX=$HADOOP_HOME

export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}

export HADOOP_COMMON_HOME=${HADOOP_PREFIX}

export HADOOP_HDFS_HOME=${HADOOP_PREFIX}

export YARN_HOME=${HADOOP_PREFIX}

# Native Path

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native"

export PATH=$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$JAVA_HOME/bin:$PATH

scp ~/.bash_profile JacK7:~/

scp ~/.bash_profile JacK8:~/
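Then reload the profile on each node so the new variables take effect:

source ~/.bash_profile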

2. Edit the configuration files:

cd /data/hadoop/Hadoop_2.6.5/etc/hadoop

1.vi hadoop-env.sh

# explicitly set JAVA_HOME

export JAVA_HOME=/usr/software/java_1.8

# explicitly set the log directory; the default is the logs folder under the install directory

export HADOOP_LOG_DIR=/data/tmp_data/hadoop_data/logs

2.vi yarn-env.sh

export JAVA_HOME=/usr/software/java_1.8

#if [ "$JAVA_HOME" != "" ]; then

# #echo "run java in $JAVA_HOME"

# JAVA_HOME=$JAVA_HOME

#fi

#

#if [ "$JAVA_HOME" = "" ]; then

# echo "Error: JAVA_HOME is not set."

# exit 1

#fi

3. vi slaves — edit the slaves file on the NameNode and SecondaryNameNode hosts:

JacK7

JacK8

4. vi core-site.xml — configure core-site:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://JacK6:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/data/tmp_data/hadoop_data/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

5. vi hdfs-site.xml — configure the SecondaryNameNode, replication, and storage directories:

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>JacK7:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/tmp_data/hadoop_data/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/tmp_data/hadoop_data/hdfs</value>
  </property>
</configuration>

6.cp mapred-site.xml.template mapred-site.xml

vi mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>JacK6:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>JacK6:19888</value>
  </property>
</configuration>

7.vi yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>JacK6</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
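The files above reference several local directories; create them on every node before starting the cluster (a sketch, assuming the paths used in these configs):

mkdir -p /data/tmp_data/hadoop_data/{tmp,name,hdfs,logs}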

8. Copy to the other nodes:

scp -r Hadoop_2.6.5/ JacK7:/data/hadoop/

scp -r Hadoop_2.6.5/ JacK8:/data/hadoop/

9. Start/stop test:

1. hdfs namenode -format (format HDFS)

On the first startup, the NameNode must first be formatted on the master node; later startups skip this step:

2. start-dfs.sh — run on the master node to start the daemons; verify with jps on each node

start-yarn.sh

mr-jobhistory-daemon.sh start historyserver

3. hdfs dfsadmin -report — on the master node, verify that the DataNodes have joined the cluster
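As an optional smoke test once all daemons are up (assuming the examples jar that ships in the 2.6.5 tarball is at its default location):

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar pi 2 10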

4. stop-yarn.sh

stop-dfs.sh

mr-jobhistory-daemon.sh stop historyserver
