Hostname | IP address | Installed software | Running processes
Node10 | 192.168.18.23 | jdk,hadoop,spark | namenode,resourcemanager,zkfc,Master
Node20 | 192.168.18.230 | jdk,hadoop,spark | namenode,resourcemanager,zkfc,Master
Node30 | 192.168.18.248 | jdk,hadoop,zookeeper,spark | datanode,nodemanager,journalnode,QuorumPeerMain,Worker
Node40 | 192.168.18.246 | jdk,hadoop,zookeeper,spark | datanode,nodemanager,journalnode,QuorumPeerMain,Worker
Node50 | 192.168.18.232 | jdk,hadoop,zookeeper,spark | datanode,nodemanager,journalnode,QuorumPeerMain,Worker
1. Disable SELinux and iptables
vi /etc/sysconfig/selinux
SELINUX=disabled
service iptables stop
chkconfig iptables off
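Note that SELINUX=disabled only takes effect after a reboot; to also turn enforcement off in the running system:
setenforce 0
getenforce
getenforce should now report Permissive (or Disabled after a reboot).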
2. Configure /etc/hosts
vi /etc/hosts
192.168.18.23 node10
192.168.18.230 node20
192.168.18.248 node30
192.168.18.246 node40
192.168.18.232 node50
3. Create the user and configure passwordless SSH login
[root@localhost ~]# useradd heren
[root@localhost ~]# passwd heren
[root@localhost ~]# su - heren
[heren@localhost ~]$ ssh-keygen -t rsa
[heren@localhost ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[heren@localhost ~]$ chmod 600 .ssh/authorized_keys
The following steps only need to be performed on node10 (they gather every node's public key into one authorized_keys file and then push it back out to all nodes):
[heren@node10 ~]$ ssh node20 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[heren@node10 ~]$ ssh node30 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[heren@node10 ~]$ ssh node40 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[heren@node10 ~]$ ssh node50 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[heren@node10 ~]$ scp ~/.ssh/authorized_keys node20:/home/heren/.ssh/authorized_keys
[heren@node10 ~]$ scp ~/.ssh/authorized_keys node30:/home/heren/.ssh/authorized_keys
[heren@node10 ~]$ scp ~/.ssh/authorized_keys node40:/home/heren/.ssh/authorized_keys
[heren@node10 ~]$ scp ~/.ssh/authorized_keys node50:/home/heren/.ssh/authorized_keys
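A quick way to verify that passwordless login now works from node10 to every node (assumes the hostnames configured in /etc/hosts above):
[heren@node10 ~]$ for h in node20 node30 node40 node50; do ssh $h hostname; done
Each hostname should print without a password prompt.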
4. Install JDK 1.7 and configure environment variables
First remove any preinstalled Java packages:
rpm -qa | grep java | xargs rpm -e --nodeps
mkdir -p /usr/java
[root@localhost ~]# cp /root/jdk-7u80-linux-x64.tar.gz /usr/java/
[root@localhost ~]# cd /usr/java
[root@localhost java]# tar xf jdk-7u80-linux-x64.tar.gz
[root@localhost java]# vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_80
export JRE_HOME=/usr/java/jdk1.7.0_80/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export ZOOKEEPER_HOME=/usr/local/software/zookeeper
export CLASSPATH=$CLASSPATH:$ZOOKEEPER_HOME/lib
export PATH=$PATH:$ZOOKEEPER_HOME/bin
export HADOOP_HOME=/usr/local/software/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export SCALA_HOME=/usr/local/software/scala
export PATH=$SCALA_HOME/bin:$PATH
export SPARK_HOME=/usr/local/software/spark
export PATH=$SPARK_HOME/bin:$PATH
[root@localhost ~]# source /etc/profile
[root@localhost ~]# java -version
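If the environment variables are correct, the output should look similar to:
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)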
5. Create directories
Create the application directory:
[root@node10 ~]# mkdir -p /usr/local/software
Make it owned by the heren user:
[root@node10 ~]# chown -R heren:heren /usr/local/software
Create the data directories:
mkdir -p /data/hadoop/
mkdir -p /data/spark/
mkdir -p /data/zookeeper/
chown -R heren:heren /data/
6. Install the ZooKeeper cluster (node30, node40, node50)
[root@node30 software]# su - heren
[heren@node30 ~]$ cd /usr/local/software/
[heren@node30 software]$ tar xf zookeeper-3.4.6.tar.gz
[heren@node30 software]$ mv zookeeper-3.4.6 zookeeper
Edit the configuration file:
[heren@node30 software]$ cp zookeeper/conf/zoo_sample.cfg zookeeper/conf/zoo.cfg
[heren@node30 software]$ vi zookeeper/conf/zoo.cfg
dataLogDir=/data/zookeeper/logs
dataDir=/data/zookeeper/data
server.1=node30:2888:3888
server.2=node40:2888:3888
server.3=node50:2888:3888
Create the myid file; the value on each node must match its server.N line in zoo.cfg:
mkdir -p /data/zookeeper/data/
echo '1' > /data/zookeeper/data/myid   # on node30
echo '2' > /data/zookeeper/data/myid   # on node40
echo '3' > /data/zookeeper/data/myid   # on node50
Start ZooKeeper on each node (node30, node40, node50):
[heren@node30 zookeeper]$ ./bin/zkServer.sh start
JMX enabled by default
Using config: /usr/local/software/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[heren@node30 ~]$ /usr/local/software/zookeeper/bin/zkServer.sh status
JMX enabled by default
Using config: /usr/local/software/zookeeper/bin/../conf/zoo.cfg
Mode: leader
Start ZooKeeper: sh /usr/local/software/zookeeper/bin/zkServer.sh start
Check whether ZooKeeper started successfully: sh /usr/local/software/zookeeper/bin/zkServer.sh status
Stop ZooKeeper: sh /usr/local/software/zookeeper/bin/zkServer.sh stop
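Besides zkServer.sh status, each server can be probed with ZooKeeper's four-letter commands (this assumes nc is installed on the host):
[heren@node30 ~]$ echo ruok | nc node30 2181
imok
A healthy server answers imok.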
7. Install and configure the Hadoop cluster (node10-node50)
[root@node10 ~]# su - heren
[heren@node10 ~]$ cd /usr/local/software/
[heren@node10 software]$ tar xf hadoop-2.6.0.tar.gz
[heren@node10 software]$ mv hadoop-2.6.0 hadoop
Edit hadoop-env.sh (this and the following configuration files all live under /usr/local/software/hadoop/etc/hadoop):
[heren@node10 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_80
Edit core-site.xml:
[heren@node10 hadoop]$ vi core-site.xml
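The original does not show the file contents; below is a minimal HA sketch consistent with the rest of this guide. The nameservice name mycluster matches the hdfs://mycluster URL used later for Spark event logs; the hadoop.tmp.dir value is an assumption based on the tmp/ directory copied to node20 after formatting:
<configuration>
  <!-- Logical name of the HA nameservice -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <!-- Where the NameNode keeps its metadata (the tmp/ dir copied to node20 later) -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/software/hadoop/tmp</value>
  </property>
  <!-- ZooKeeper quorum used by the ZKFC for automatic failover -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node30:2181,node40:2181,node50:2181</value>
  </property>
</configuration>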
Edit hdfs-site.xml (the fencing methods are sshfence, with shell(/bin/true) as a fallback so failover cannot hang on an unreachable host):
[heren@node10 hadoop]$ vi hdfs-site.xml
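The contents are again omitted in the original; the following is a minimal sketch of the HA settings this guide implies. The RPC port 9000 and the JournalNode edits directory under /data/hadoop are assumptions; the hostnames, fencing methods, and heren key path come from earlier steps:
<configuration>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>node10:9000</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>node20:9000</value></property>
  <property><name>dfs.namenode.http-address.mycluster.nn1</name><value>node10:50070</value></property>
  <property><name>dfs.namenode.http-address.mycluster.nn2</name><value>node20:50070</value></property>
  <!-- Shared edit log on the three JournalNodes -->
  <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://node30:8485;node40:8485;node50:8485/mycluster</value></property>
  <property><name>dfs.journalnode.edits.dir</name><value>/data/hadoop/journal</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>dfs.client.failover.proxy.provider.mycluster</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <!-- Fencing: try sshfence first, fall back to a no-op so failover never blocks -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
  </property>
  <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/heren/.ssh/id_rsa</value></property>
</configuration>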
Edit mapred-site.xml:
[heren@node10 hadoop]$ vi mapred-site.xml
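For this setup mapred-site.xml usually only needs to tell MapReduce to run on YARN; in the 2.6.0 tarball the file is created by copying the shipped template:
[heren@node10 hadoop]$ cp mapred-site.xml.template mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>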
Edit yarn-site.xml:
[heren@node10 hadoop]$ vi yarn-site.xml
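A minimal ResourceManager-HA sketch matching the node layout above; the cluster-id value is an arbitrary label chosen here:
<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>yarncluster</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>node10</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>node20</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>node30:2181,node40:2181,node50:2181</value></property>
  <!-- Needed so NodeManagers can serve MapReduce shuffle data -->
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
</configuration>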
Edit slaves (the DataNode/NodeManager hosts):
[heren@node10 hadoop]$ vi slaves
node30
node40
node50
Copy the finished Hadoop configuration to the other nodes:
[heren@node10 etc]$ scp -r hadoop heren@node20:/usr/local/software/hadoop/etc/
[heren@node10 etc]$ scp -r hadoop heren@node30:/usr/local/software/hadoop/etc/
[heren@node10 etc]$ scp -r hadoop heren@node40:/usr/local/software/hadoop/etc/
[heren@node10 etc]$ scp -r hadoop heren@node50:/usr/local/software/hadoop/etc/
Format and start the Hadoop cluster
Start the JournalNodes (run on node30, node40, and node50):
[heren@node30 software]$ hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/software/hadoop/logs/hadoop-heren-journalnode-node30.out
Format HDFS (run on node10 only):
[heren@node10 software]$ hdfs namenode -format
Then copy the newly formatted NameNode metadata directory to the standby NameNode on node20:
[heren@node10 hadoop]$ scp -r tmp/ heren@node20:/usr/local/software/hadoop/
Format the failover state in ZooKeeper (run on node10):
[heren@node10 hadoop]$ hdfs zkfc -formatZK
Start HDFS:
[heren@node10 hadoop]$ start-dfs.sh
Start YARN:
[heren@node10 hadoop]$ start-yarn.sh
The standby ResourceManager on node20 has to be started manually:
[heren@node20 hadoop]$ yarn-daemon.sh start resourcemanager
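At this point jps on each host should show the processes listed in the table at the top of this section (the Spark Master/Worker processes come later), for example:
[heren@node10 ~]$ jps   # expect NameNode, DFSZKFailoverController (the zkfc), ResourceManager
[heren@node30 ~]$ jps   # expect DataNode, NodeManager, JournalNode, QuorumPeerMain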
Check cluster status through the web UIs
NameNode:
http://node10:50070/
http://node20:50070/
ResourceManager:
http://node10:8088/
http://node20:8088/
Check cluster status with the hdfs command:
[heren@node10 hadoop]$ hdfs dfsadmin -report
After downloading the 64-bit native library bundle (hadoop-native-64-2.6.0.tar), extract it into Hadoop's lib directory and move the libraries into lib/native, overwriting the bundled files:
[heren@node10 lib]$ tar xf hadoop-native-64-2.6.0.tar
[heren@node10 lib]$ mv libh* native
Verify HDFS HA
First upload a file to HDFS:
[heren@node10 hadoop]$ hadoop fs -put /etc/profile /profile
[heren@node10 hadoop]$ hadoop fs -ls /
Then kill the active NameNode (find its PID with jps; 10020 below is just the PID in this example):
[heren@node10 hadoop]$ kill -9 10020
The file is still accessible:
[heren@node10 hadoop]$ hadoop fs -ls /
Manually restart the NameNode that was killed:
[heren@node10 hadoop]$ sbin/hadoop-daemon.sh start namenode
Verify YARN:
Run the WordCount program from the examples that ship with Hadoop:
[heren@node10 mapreduce]$ hadoop jar /usr/local/software/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /profile /out
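If the job completes, the word counts can be read back from /out (part-r-00000 is the standard reducer output file name):
[heren@node10 mapreduce]$ hadoop fs -cat /out/part-r-00000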
Cluster start/stop commands
/usr/local/software/hadoop/sbin/start-dfs.sh
/usr/local/software/hadoop/sbin/stop-dfs.sh
8. Install the Spark cluster
[root@node10 ~]#mkdir -p /usr/scala
[root@node10 scala]# tar xf scala-2.11.7.tgz
[heren@node10 software]$ tar xf spark-1.5.2-bin-hadoop2.6.tgz
[heren@node10 software]$ mv spark-1.5.2-bin-hadoop2.6 spark
[heren@node10 software]$ cd spark/conf
[heren@node10 conf]$ cp spark-env.sh.template spark-env.sh
[heren@node10 conf]$ vi spark-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_80
export SCALA_HOME=/usr/scala/scala-2.11.7
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/usr/local/software/hadoop/etc/hadoop
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_HOME/lib/native"
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=node30:2181,node40:2181,node50:2181 -Dspark.deploy.zookeeper.dir=/spark"
[heren@node10 conf]$ cp spark-defaults.conf.template spark-defaults.conf
[heren@node10 conf]$ vi spark-defaults.conf
spark.master spark://node10:7077,node20:7077
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.eventLog.enabled true
spark.eventLog.dir hdfs://mycluster/sparklogs
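The spark.eventLog.dir directory is not created automatically; create it once in HDFS before starting any applications, otherwise they will fail on startup:
[heren@node10 conf]$ hadoop fs -mkdir /sparklogs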
[heren@node10 conf]$ vi slaves
node20
node30
node40
node50
Start/stop the Spark cluster
/usr/local/software/spark/sbin/start-all.sh
/usr/local/software/spark/sbin/stop-all.sh
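As an end-to-end check of the standalone HA setup, the bundled SparkPi example can be submitted against both masters (the examples jar sits in lib/ inside the spark-1.5.2-bin-hadoop2.6 distribution):
[heren@node10 spark]$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://node10:7077,node20:7077 lib/spark-examples-1.5.2-hadoop2.6.0.jar 10
The driver output should contain a line like "Pi is roughly 3.14".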