Environment setup notes taken while learning big data:
1. Create 3 VMs: master, slave1, slave2 (CentOS 6.4, 64-bit)
2. (all machines, root user) Create the user/password hadoop/hadoop123
3. (all machines, root user) Set the hostname
$ vi /etc/sysconfig/network
on master:
NETWORKING=yes
HOSTNAME=master
on slave1:
NETWORKING=yes
HOSTNAME=slave1
on slave2:
NETWORKING=yes
HOSTNAME=slave2
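Note: the HOSTNAME setting in /etc/sysconfig/network only takes effect after a reboot; to change the hostname of the running session immediately as well, you can run, for example on master:
$ hostname master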
4. (all machines, root user) Configure a static IP; change only the lines shown below and leave the rest of the file untouched
$ vi /etc/sysconfig/network-scripts/ifcfg-eth0
on master:
BOOTPROTO="static"
IPADDR=192.168.246.175
NETMASK=255.255.0.0
GATEWAY=192.168.246.1
DNS1=8.8.8.8
#TYPE="Ethernet"
on slave1:
BOOTPROTO="static"
IPADDR=192.168.246.176
NETMASK=255.255.0.0
GATEWAY=192.168.246.1
DNS1=8.8.8.8
#TYPE="Ethernet"
on slave2:
BOOTPROTO="static"
IPADDR=192.168.246.177
NETMASK=255.255.0.0
GATEWAY=192.168.246.1
DNS1=8.8.8.8
#TYPE="Ethernet"
5. (all machines, root user) Restart the network service so the static IP configuration takes effect
$ /etc/init.d/network restart
6. (all machines, root user) Configure hosts
$ vi /etc/hosts
192.168.246.175 master
192.168.246.176 slave1
192.168.246.177 slave2
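A quick way to confirm that name resolution works, for example from master:
$ ping -c 1 slave1
$ ping -c 1 slave2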
7. (all machines, root user) Permanently disable the firewall, then reboot for the change to take effect
$ chkconfig iptables off
$ shutdown -r now
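chkconfig only changes what happens at the next boot, which is why the reboot follows; on CentOS 6 the running firewall can also be stopped immediately with:
$ service iptables stop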
8. Configure passwordless SSH
(master, hadoop user) Generate a key pair and copy the public key to master, slave1 and slave2
$ ssh-keygen -t rsa
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@master
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave1
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave2
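To verify that passwordless login works (no password prompt should appear):
$ ssh slave1 hostname
$ ssh slave2 hostname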
9. (all machines, root user) Install the JDK
Place the installation package in /opt
$ cd /opt
$ tar -zxvf jdk-7u79-linux-x64.tar.gz
10. (all machines, root user) Configure environment variables and make them take effect
$ vi /etc/profile
export JAVA_HOME=/opt/jdk1.7.0_79
export PATH=$PATH:$JAVA_HOME/bin
$ source /etc/profile
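To verify:
$ java -version
It should report java version "1.7.0_79".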
-- (all machines, root user) Give the hadoop user ownership of /opt
$ chown -R hadoop /opt
11. (master, hadoop user) Install Hadoop
Place the installation package in /opt
$ cd /opt
$ tar -zxvf hadoop-2.5.2.tar.gz
$ mv hadoop-2.5.2 hadoop
12. (master, hadoop user) Edit the configuration files
$ vi /opt/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/opt/jdk1.7.0_79
export HADOOP_HOME=/opt/hadoop
$ vi /opt/hadoop/etc/hadoop/core-site.xml
$ vi /opt/hadoop/etc/hadoop/hdfs-site.xml
$ vi /opt/hadoop/etc/hadoop/mapred-site.xml
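The note does not record the contents of these three files. A minimal sketch for this 3-node layout, assuming the NameNode listens on master:9000 and /opt/hadoop/tmp is used as a scratch directory (both values chosen here for illustration), might look like:
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
If MapReduce is run on YARN as above, yarn-site.xml (for example yarn.resourcemanager.hostname=master) would also need to be configured; the original note does not cover it.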
Set the contents of the masters file to
$ vi /opt/hadoop/etc/hadoop/masters
master
Set the contents of the slaves file to
$ vi /opt/hadoop/etc/hadoop/slaves
slave1
slave2
13. (master, hadoop user) Copy the Hadoop files to the slaves
$ scp -r /opt/hadoop hadoop@slave1:/opt
$ scp -r /opt/hadoop hadoop@slave2:/opt
14. (all machines, root user) Configure environment variables
$ vi /etc/profile
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
$ source /etc/profile
15. (master, hadoop user) Format HDFS
$ hadoop namenode -format
16. (all machines, hadoop user) Make the scripts executable
$ chmod +x -R /opt/hadoop/sbin/
17. (master, hadoop user) Start Hadoop
$ cd /opt/hadoop/sbin
$ ./start-all.sh
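A quick check that the daemons came up, assuming the sketch configuration above:
$ jps
master should typically list NameNode, SecondaryNameNode and ResourceManager; slave1/slave2 should list DataNode and NodeManager. The HDFS web UI is at http://master:50070.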
18. (master, hadoop user) Install ZooKeeper
Place the installation package in /opt
$ cd /opt
$ tar -zxvf zookeeper-3.4.6.tar.gz
$ mv zookeeper-3.4.6 zookeeper
Create the data directory and the myid file
$ mkdir /opt/zookeeperdata
$ echo 1 > /opt/zookeeperdata/myid
Give ownership to the hadoop user
$ chown -R hadoop /opt/zookeeperdata
$ chown -R hadoop /opt/zookeeper
Configure zoo.cfg
$ vi /opt/zookeeper/conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeperdata
# the port at which the clients will connect
clientPort=2181
maxSessionTimeout=180000
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=1
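# Each server.N line names a member of the ensemble as host:peerPort:electionPort:
# 2888 is used by followers to connect to the leader, 3888 for leader election,
# and N must match the number written to that host's myid file (see the next steps).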
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
19. (master, hadoop user) Copy the ZooKeeper files to the slaves
$ scp -r /opt/zookeeper hadoop@slave1:/opt
$ scp -r /opt/zookeeper hadoop@slave2:/opt
$ scp -r /opt/zookeeperdata hadoop@slave1:/opt
$ scp -r /opt/zookeeperdata hadoop@slave2:/opt
20. Edit the myid files
On slave1, change the contents of /opt/zookeeperdata/myid to 2
On slave2, change the contents of /opt/zookeeperdata/myid to 3
21. (all machines, hadoop user) Start ZooKeeper
$ cd /opt/zookeeper/bin
$ ./zkServer.sh start
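Once it has been started on all three machines, each node's role can be checked with:
$ ./zkServer.sh status
One node should report Mode: leader and the other two Mode: follower.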
22. (master, hadoop user) Install HBase
Place the installation package in /opt
$ cd /opt
$ tar -zxvf hbase-0.98.6-hadoop2-bin.tar.gz
$ mv hbase-0.98.6-hadoop2 hbase
Give ownership to the hadoop user
$ chown -R hadoop /opt/hbase
23. (master, hadoop user) Edit the configuration files
$ vi /opt/hbase/conf/hbase-env.sh
export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
export JAVA_HOME=/opt/jdk1.7.0_79
export HBASE_MANAGES_ZK=false
export HADOOP_HOME=/opt/hadoop
export HBASE_HOME=/opt/hbase
$ source /opt/hbase/conf/hbase-env.sh
$ vi /opt/hbase/conf/hbase-site.xml
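The note does not record the contents of hbase-site.xml. A minimal sketch for this layout, assuming HBase stores its data in the HDFS configured above (the hdfs://master:9000 URL must match fs.defaultFS in core-site.xml) and uses the external ZooKeeper ensemble:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
</configuration>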
$ vi /opt/hbase/conf/regionservers
master
slave1
slave2
24. (all machines, root user) Configure environment variables
$ vi /etc/profile
export HBASE_HOME=/opt/hbase
export PATH=$PATH:$HBASE_HOME/bin
$ source /etc/profile
25. (master, hadoop user) Copy the HBase files to the slaves
$ scp -r /opt/hbase hadoop@slave1:/opt
$ scp -r /opt/hbase hadoop@slave2:/opt
26. (master, hadoop user) Start HBase
$ cd /opt/hbase/bin
$ ./start-hbase.sh
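A quick sanity check: jps should show HMaster on master and HRegionServer on every host listed in regionservers, and the shell should answer a status query:
$ hbase shell
then run the status command at the shell prompt. The HBase master web UI is at http://master:60010 in this version.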
27. (master, hadoop user) Install Scala
Place the installation package in /opt
$ cd /opt
$ tar -zxvf scala-2.9.3.tgz
$ mv scala-2.9.3 scala
Give ownership to the hadoop user
$ chown -R hadoop /opt/scala
28. (master, hadoop user) Copy the Scala files to the slaves
$ scp -r /opt/scala hadoop@slave1:/opt
$ scp -r /opt/scala hadoop@slave2:/opt
29. (all machines, root user) Configure environment variables and make them take effect
$ vi /etc/profile
export SCALA_HOME=/opt/scala
export PATH=$PATH:$SCALA_HOME/bin
$ source /etc/profile
30. (master, hadoop user) Install Spark
Place the installation package in /opt
$ cd /opt
$ tar -zxvf spark-1.6.1-bin-hadoop2.4.tgz
$ mv spark-1.6.1-bin-hadoop2.4 spark
Give ownership to the hadoop user
$ chown -R hadoop /opt/spark
31. (all machines, root user) Configure environment variables and make them take effect
$ vi /etc/profile
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/sbin
$ source /etc/profile
32. (master, hadoop user) Edit the configuration files
$ cp /opt/spark/conf/spark-env.sh.template /opt/spark/conf/spark-env.sh
$ vi /opt/spark/conf/spark-env.sh
export JAVA_HOME=/opt/jdk1.7.0_79
export HADOOP_HOME=/opt/hadoop
export SCALA_HOME=/opt/scala
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_JAR=/opt/spark/lib/spark-assembly-1.6.1-hadoop2.4.0.jar
$ source /opt/spark/conf/spark-env.sh
33. (master, hadoop user) Edit the slaves file
$ cp /opt/spark/conf/slaves.template /opt/spark/conf/slaves
$ vi /opt/spark/conf/slaves
slave1
slave2
34. (master, hadoop user) Copy the Spark files to the slaves
$ scp -r /opt/spark hadoop@slave1:/opt
$ scp -r /opt/spark hadoop@slave2:/opt
35. (master, hadoop user) Start Spark
$ cd /opt/spark/sbin
$ ./start-all.sh
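To verify: jps should show a Master process on master and a Worker on slave1/slave2, and the Spark master web UI should be reachable at http://master:8080. A quick interactive test against the standalone master (default URL spark://master:7077):
$ /opt/spark/bin/spark-shell --master spark://master:7077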
36. Startup order:
1. On master, as the hadoop user, start Hadoop
2. On all machines, as the hadoop user, start ZooKeeper
3. On master, as the hadoop user, start Spark
4. On master, as the hadoop user, start HBase