Hadoop + HBase + ZooKeeper + Spark + Scala cluster setup

Notes on setting up the environment while learning big data:

1. Create three VMs: master, slave1, slave2 (CentOS 6.4, 64-bit)
2. (All nodes, root) Create the user hadoop with password hadoop123
3. (All nodes, root) Set the hostname
$ vi /etc/sysconfig/network


On master:
NETWORKING=yes
HOSTNAME=master

On slave1:
NETWORKING=yes
HOSTNAME=slave1

On slave2:
NETWORKING=yes
HOSTNAME=slave2


4. (All nodes, root) Configure a static IP; change only the lines shown below and leave the rest of the file untouched
$ vi /etc/sysconfig/network-scripts/ifcfg-eth0




BOOTPROTO="static"
IPADDR=192.168.246.175
NETMASK=255.255.0.0
GATEWAY=192.168.246.1
DNS1=8.8.8.8
#TYPE="Ethernet"


BOOTPROTO="static"
IPADDR=192.168.246.176
NETMASK=255.255.0.0
GATEWAY=192.168.246.1
DNS1=8.8.8.8
#TYPE="Ethernet"


BOOTPROTO="static"
IPADDR=192.168.246.177
NETMASK=255.255.0.0
GATEWAY=192.168.246.1
DNS1=8.8.8.8
#TYPE="Ethernet"


5. (All nodes, root) Restart the network service so the static IP takes effect
$ /etc/init.d/network restart
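
A quick check (not in the original notes) to confirm the new address is active:

$ ifconfig eth0
$ route -n

ifconfig should show the IPADDR configured above, and route -n should list 192.168.246.1 as the default gateway.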


6. (All nodes, root) Configure hosts
$ vi /etc/hosts


192.168.246.175 master
192.168.246.176 slave1
192.168.246.177 slave2
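
With the hosts file in place, an optional connectivity check between the nodes (assuming the IPs above):

$ ping -c 2 slave1
$ ping -c 2 slave2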


7. (All nodes, root) Permanently disable the firewall, then reboot for the change to take effect
$ chkconfig iptables off
$ shutdown -r now


8. Configure passwordless SSH
(On master, hadoop user) Generate a key pair and copy the public key to master, slave1, and slave2
$ ssh-keygen -t rsa
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@master
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave1
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave2
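
To confirm passwordless login works before moving on (a quick check, not in the original notes):

$ ssh slave1 hostname
$ ssh slave2 hostname

Each command should print the remote hostname without prompting for a password.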


9. (All nodes, root) Install the JDK
Place the installation package in /opt
$ cd /opt
$ tar -zxvf jdk-7u79-linux-x64.tar.gz


10. (All nodes, root) Configure environment variables and reload the profile
$ vi /etc/profile


export JAVA_HOME=/opt/jdk1.7.0_79
export PATH=$PATH:$JAVA_HOME/bin


$ source /etc/profile
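
A quick check that the JDK is picked up (not in the original notes):

$ java -version

This should report 1.7.0_79; if it still shows a preinstalled OpenJDK, that one comes first on the PATH.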




-- (All nodes, root) Give the hadoop user ownership of /opt
$ chown -R hadoop /opt


11. (On master, hadoop user) Install Hadoop


Place the installation package in /opt
$ cd /opt
$ tar -zxvf hadoop-2.5.2.tar.gz
$ mv hadoop-2.5.2 hadoop


12. (On master, hadoop user) Edit the configuration files


$ vi /opt/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/opt/jdk1.7.0_79
export HADOOP_HOME=/opt/hadoop


$ vi /opt/hadoop/etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>


$ vi /opt/hadoop/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/opt/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/opt/hdfs/data</value>
  </property>
</configuration>




$ vi /opt/hadoop/etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>




Set the contents of the masters file to:
$ vi /opt/hadoop/etc/hadoop/masters
master


Set the contents of the slaves file to:
$ vi /opt/hadoop/etc/hadoop/slaves
slave1
slave2


13. (On master, hadoop user) Copy the Hadoop files to the slaves
$ scp -r /opt/hadoop hadoop@slave1:/opt
$ scp -r /opt/hadoop hadoop@slave2:/opt


14. (All nodes, root) Configure environment variables
$ vi /etc/profile


export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin


$ source /etc/profile
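
A quick check (not in the original notes) that the Hadoop binaries are reachable; this assumes $HADOOP_HOME/bin was added to the PATH in the previous step:

$ hadoop version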


15. (On master, hadoop user) Format HDFS
$ hadoop namenode -format


16. (All nodes, hadoop user) Make the Hadoop scripts executable
$ chmod -R +x /opt/hadoop/sbin/




17. (On master, hadoop user) Start Hadoop
$ cd /opt/hadoop/sbin
$ ./start-all.sh
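
To confirm the daemons came up (a quick check, not in the original notes), run jps on each node: master should show NameNode, SecondaryNameNode, and ResourceManager; slave1 and slave2 should show DataNode and NodeManager. The HDFS report should also list the live DataNodes (assuming $HADOOP_HOME/bin is on the PATH):

$ jps
$ hdfs dfsadmin -report

The NameNode web UI is normally reachable at http://master:50070.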


18. (On master, hadoop user) Install ZooKeeper
Place the installation package in /opt
$ cd /opt
$ tar -zxvf zookeeper-3.4.6.tar.gz
$ mv zookeeper-3.4.6 zookeeper


Create the data directory and write the myid file
$ mkdir /opt/zookeeperdata 
$ echo 1 > /opt/zookeeperdata/myid






Give ownership to the hadoop user
$ chown -R hadoop /opt/zookeeperdata
$ chown -R hadoop /opt/zookeeper


Configure zoo.cfg
$ vi /opt/zookeeper/conf/zoo.cfg


# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeperdata
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
# upper bound on client session timeouts, in milliseconds
maxSessionTimeout=180000
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=1


server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888




19. (On master, hadoop user) Copy the ZooKeeper files
$ scp -r /opt/zookeeper hadoop@slave1:/opt
$ scp -r /opt/zookeeper hadoop@slave2:/opt


$ scp -r /opt/zookeeperdata hadoop@slave1:/opt
$ scp -r /opt/zookeeperdata hadoop@slave2:/opt


20. Edit the myid files
On slave1, change the contents of /opt/zookeeperdata/myid to 2


On slave2, change the contents of /opt/zookeeperdata/myid to 3
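
Since passwordless SSH from master was set up in step 8, one way to do this without logging in to each slave (a sketch, using the paths above):

$ ssh hadoop@slave1 'echo 2 > /opt/zookeeperdata/myid'
$ ssh hadoop@slave2 'echo 3 > /opt/zookeeperdata/myid'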


21. (All nodes, hadoop user) Start ZooKeeper
$ cd /opt/zookeeper/bin
$ ./zkServer.sh start
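
Once ZooKeeper has been started on all three nodes, a quick status check on each node (not in the original notes):

$ /opt/zookeeper/bin/zkServer.sh status

One node should report Mode: leader and the other two Mode: follower.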


22. (On master, hadoop user) Install HBase
Place the installation package in /opt
$ cd /opt
$ tar -zxvf hbase-0.98.6-hadoop2-bin.tar.gz
$ mv hbase-0.98.6-hadoop2 hbase


Give ownership to the hadoop user
$ chown -R hadoop /opt/hbase






23. (On master, hadoop user) Edit the configuration files
$ vi /opt/hbase/conf/hbase-env.sh


export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
export JAVA_HOME=/opt/jdk1.7.0_79
export HBASE_MANAGES_ZK=false
export HADOOP_HOME=/opt/hadoop
export HBASE_HOME=/opt/hbase


$ source /opt/hbase/conf/hbase-env.sh




$ vi /opt/hbase/conf/hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>master:60000</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
    <description>The port master should bind to.</description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>



$ vi /opt/hbase/conf/regionservers
master
slave1
slave2




24. (All nodes, root) Configure environment variables
$ vi /etc/profile


export HBASE_HOME=/opt/hbase
export PATH=$PATH:$HBASE_HOME/bin


$ source /etc/profile




25. (On master, hadoop user) Copy the HBase files
$ scp -r /opt/hbase hadoop@slave1:/opt
$ scp -r /opt/hbase hadoop@slave2:/opt




26. (On master, hadoop user) Start HBase
$ cd /opt/hbase/bin
$ ./start-hbase.sh
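
A quick sanity check (not in the original notes): jps should now also show HMaster on master and HRegionServer on every host listed in regionservers, and the HBase shell can report the cluster status:

$ jps
$ /opt/hbase/bin/hbase shell
hbase(main):001:0> status
hbase(main):002:0> exit

For HBase 0.98 the master web UI is normally at http://master:60010.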




27. (On master, hadoop user) Install Scala
Place the installation package in /opt
$ cd /opt
$ tar -zxvf scala-2.9.3.tgz
$ mv scala-2.9.3 scala




Give ownership to the hadoop user
$ chown -R hadoop /opt/scala




28. (On master, hadoop user) Copy the Scala files
$ scp -r /opt/scala hadoop@slave1:/opt
$ scp -r /opt/scala hadoop@slave2:/opt




29. (All nodes, root) Configure environment variables
$ vi /etc/profile


export SCALA_HOME=/opt/scala
export PATH=$PATH:$SCALA_HOME/bin
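
As in the earlier environment-variable steps, reload the profile and verify the Scala install (a quick check, not in the original notes):

$ source /etc/profile
$ scala -version

This should report version 2.9.3.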






30. (On master, hadoop user) Install Spark
Place the installation package in /opt
$ cd /opt
$ tar -zxvf spark-1.6.1-bin-hadoop2.4.tgz
$ mv spark-1.6.1-bin-hadoop2.4 spark




Give ownership to the hadoop user
$ chown -R hadoop /opt/spark




31. (All nodes, root) Configure environment variables
$ vi /etc/profile


export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/sbin


32. (On master, hadoop user) Edit the configuration files
$ cp /opt/spark/conf/spark-env.sh.template /opt/spark/conf/spark-env.sh
$ vi /opt/spark/conf/spark-env.sh
 
export JAVA_HOME=/opt/jdk1.7.0_79  
export HADOOP_HOME=/opt/hadoop 
export SCALA_HOME=/opt/scala
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_JAR=/opt/spark/lib/spark-assembly-1.6.1-hadoop2.4.0.jar


$ source /opt/spark/conf/spark-env.sh




33. (On master, hadoop user) Edit the slaves file
$ cp /opt/spark/conf/slaves.template /opt/spark/conf/slaves
$ vi /opt/spark/conf/slaves


slave1
slave2




34. (On master, hadoop user) Copy the Spark files
$ scp -r /opt/spark hadoop@slave1:/opt
$ scp -r /opt/spark hadoop@slave2:/opt




35. (On master, hadoop user) Start Spark
$ cd /opt/spark/sbin
$ ./start-all.sh
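
To confirm the standalone cluster is up (a quick check, not in the original notes): jps on master should show a Master process and jps on slave1/slave2 a Worker, and the Spark master web UI is normally at http://master:8080. A short smoke test against the cluster, assuming the default master port 7077:

$ jps
$ /opt/spark/bin/spark-shell --master spark://master:7077
scala> sc.parallelize(1 to 1000).count()
scala> :quit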




36. Startup order (see the command sketch below):
1. On master (hadoop user), start Hadoop
2. On all nodes (hadoop user), start ZooKeeper
3. On master (hadoop user), start Spark
4. On master (hadoop user), start HBase
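
A minimal sketch of that sequence, using the paths from the steps above:

1. On master (hadoop user):
$ /opt/hadoop/sbin/start-all.sh

2. On every node (hadoop user):
$ /opt/zookeeper/bin/zkServer.sh start

3. On master (hadoop user):
$ /opt/spark/sbin/start-all.sh

4. On master (hadoop user):
$ /opt/hbase/bin/start-hbase.sh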





