HBase Installation and Deployment
Prerequisites:
1. jre-7-linux-x64.rpm (http://www.oracle.com/technetwork/java/javase/downloads/java-se-jre-7-download-432155.html)
2. cloudera-cdh-4-0 (http://archive.cloudera.com/cdh4/one-click-install/redhat/5/x86_64/cloudera-cdh-4-0.x86_64.rpm)
3. Update /etc/hosts with an hbaseX.test.example.com entry for every node. Write a separate record for the local machine as well; do not map hbaseX.test.example.com to 127.0.0.1.
4. SSH keys: on the NameNode (and Secondary NameNode), generate a key pair with ssh-keygen -t rsa, then append the public key to the other machines in the cluster.
5. Configure NTP via a crontab entry: * * * * * /usr/sbin/ntpdate ntpserver && /sbin/hwclock -w
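Step 3 above can be sketched as a small loop. The 192.168.1.x addresses and the four-node count here are illustrative assumptions, not values from these notes; substitute the real cluster IPs:

```shell
# Generate /etc/hosts entries for the cluster (illustrative IPs).
# Each node gets a real, non-loopback address; never map the local
# hostname to 127.0.0.1, or Hadoop daemons will bind to loopback.
for i in 1 2 3 4; do
  printf '192.168.1.%d\thbase%d.test.example.com\thbase%d\n' "$i" "$i" "$i"
done
```

Append the generated lines to /etc/hosts on every node so all hosts resolve each other identically.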
HDFS Installation and Deployment
Machine roles and the packages each requires (https://ccp.cloudera.com/display/CDH4DOC/CDH4+Installation):

Host     Role         Packages
hbase1   NameNode     hadoop-hdfs-namenode
hbase2   JobTracker   hadoop-0.20-mapreduce-jobtracker
hbase3   DataNode     hadoop-0.20-mapreduce-tasktracker, hadoop-hdfs-datanode
hbase4   DataNode     hadoop-0.20-mapreduce-tasktracker, hadoop-hdfs-datanode
HDFS deployment (https://ccp.cloudera.com/display/CDH4DOC/Deploying+HDFS+on+a+Cluster#DeployingHDFSonaCluster-CustomizingConfigurationFiles)
1. Copy the default configuration into a directory of our own: sudo cp -r /etc/hadoop/conf.empty /etc/hadoop/conf.my_cluster
2. Register that directory as the active configuration; every file modified in the steps below lives in conf.my_cluster:
sudo alternatives --verbose --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.my_cluster 50
sudo alternatives --set hadoop-conf /etc/hadoop/conf.my_cluster
3. Configure core-site.xml and scp it to all the other cluster nodes:
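Under the hood, the alternatives system just maintains a symlink to the chosen directory. A minimal local simulation of what the two commands above establish (scratch paths for illustration, not /etc/hadoop):

```shell
# Simulate what `alternatives --set hadoop-conf ...` does: point a
# symlink at the selected configuration directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/conf.empty" "$tmp/conf.my_cluster"
ln -sfn "$tmp/conf.my_cluster" "$tmp/conf"   # the "active" config dir
readlink -f "$tmp/conf"                      # resolves to .../conf.my_cluster
```

On the real system, `sudo alternatives --display hadoop-conf` shows which directory is currently active.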
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hbase1.test.example.com:9000/</value>
  </property>
  <property>
    <!-- files deleted via the shell are moved to .Trash instead -->
    <name>fs.trash.interval</name>
    <!-- retention time in minutes -->
    <value>1440</value>
  </property>
  <property>
    <name>fs.trash.checkpoint.interval</name>
    <value>0</value>
  </property>
</configuration>
4. Configure hdfs-site.xml
a. On non-DataNode hosts:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <!-- "xcievers" is the property's actual (historically misspelled) name -->
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>
</configuration>
b. On DataNodes, add one more property on top of (a):
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- create this directory on each DataNode, owned by hdfs:hdfs -->
  <value>/data/1/dfs/dn</value>
</property>
c. On the NameNode, add one more property on top of (a):
<property>
  <name>dfs.namenode.name.dir</name>
  <!-- create this directory on the NameNode, owned by hdfs:hdfs, mode 700 -->
  <value>/data/1/dfs/nn</value>
</property>
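Since the three hdfs-site.xml variants above share a common base, one way to keep them from drifting apart is to generate the file per role. This is a sketch only; the ROLE variable and the heredoc-free layout are my own convention, not from these notes:

```shell
# Emit hdfs-site.xml for a given role: the shared base properties plus
# the role-specific dfs.*.dir property. ROLE is an assumed convention.
ROLE=datanode   # one of: base, datanode, namenode
{
  echo '<?xml version="1.0"?>'
  echo '<configuration>'
  echo '  <property><name>dfs.datanode.max.xcievers</name><value>4096</value></property>'
  echo '  <property><name>dfs.permissions.superusergroup</name><value>hadoop</value></property>'
  case "$ROLE" in
    datanode) echo '  <property><name>dfs.datanode.data.dir</name><value>/data/1/dfs/dn</value></property>' ;;
    namenode) echo '  <property><name>dfs.namenode.name.dir</name><value>/data/1/dfs/nn</value></property>' ;;
  esac
  echo '</configuration>'
} > hdfs-site.xml
grep -c '<property>' hdfs-site.xml   # datanode/namenode roles emit 3, base emits 2
```

Run once per role and scp the resulting file to the matching hosts, as in step 3.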
5. Format the NameNode:
sudo -u hdfs hadoop namenode -format   (note: run the format as the hdfs user)
6. Deploy MRv1 (https://ccp.cloudera.com/display/CDH4DOC/Deploying+MapReduce+v1+%28MRv1%29+on+a+Cluster)
a. Configure mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hbase2.test.example.com:8021</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <!-- create this directory on the JobTracker, owned by mapred:hadoop -->
    <value>/data/1/mapred/local</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.restart.recover</name>
    <value>true</value>
  </property>
</configuration>
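Before scp-ing a hand-edited file like this around the cluster, it is worth confirming it is well-formed XML. A sketch using python3 from the shell; the file content below is abbreviated from the config above:

```shell
# Write an abbreviated mapred-site.xml and verify it parses,
# printing each property it declares.
cat > /tmp/mapred-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hbase2.test.example.com:8021</value>
  </property>
</configuration>
EOF
python3 - <<'EOF'
import xml.etree.ElementTree as ET
root = ET.parse('/tmp/mapred-site.xml').getroot()
for prop in root.findall('property'):
    print(prop.findtext('name'), '=', prop.findtext('value'))
EOF
```

A parse error here is far cheaper to catch than a daemon that silently falls back to defaults after reading a broken file.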
b. scp the file above to the other cluster nodes.
c. On each node, start the matching services: for service in /etc/init.d/hadoop-hdfs-*; do sudo $service start; done
d. Create the /tmp directory on HDFS:
sudo -u hdfs hadoop fs -mkdir /tmp
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
e. Create the MapReduce /var directories:
sudo -u hdfs hadoop fs -mkdir /var
sudo -u hdfs hadoop fs -mkdir /var/lib
sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs
sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs/cache
sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs/cache/mapred
sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs/cache/mapred/mapred
sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
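The chain above builds one nested path level by level (newer Hadoop releases accept `hadoop fs -mkdir -p`, but it is worth checking whether the CDH4 shell does before relying on it, hence the explicit chain). A local demonstration of the resulting layout and the sticky-bit mode on the staging directory, using a scratch path instead of HDFS:

```shell
# Local simulation: the same hierarchy as the HDFS commands above,
# with mode 1777 (world-writable + sticky) on the staging directory.
root=$(mktemp -d)
mkdir -p "$root/var/lib/hadoop-hdfs/cache/mapred/mapred/staging"
chmod 1777 "$root/var/lib/hadoop-hdfs/cache/mapred/mapred/staging"
stat -c '%a' "$root/var/lib/hadoop-hdfs/cache/mapred/mapred/staging"   # 1777
```

Mode 1777 lets every user write job files into staging while the sticky bit stops users from deleting each other's entries.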
f. Verify the directories were created: sudo -u hdfs hadoop fs -ls -R /
g. Create the system directory:
sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system
sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system
h. On each TaskTracker node, start the service: sudo service hadoop-0.20-mapreduce-tasktracker start
i. On the JobTracker node, start the service: sudo service hadoop-0.20-mapreduce-jobtracker start
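The glob loop from step (c) generalizes to any family of init scripts. A local demonstration of the pattern with stub scripts in a scratch directory (stand-ins, not real services):

```shell
# Demonstrate the init-script glob loop: iterate over every matching
# script and invoke it with "start". Stub scripts echo what they got.
d=$(mktemp -d)
for name in hadoop-hdfs-namenode hadoop-hdfs-datanode; do
  printf '#!/bin/sh\necho "%s $1"\n' "$name" > "$d/$name"
  chmod +x "$d/$name"
done
for service in "$d"/hadoop-hdfs-*; do "$service" start; done
```

The glob expands in sorted order, so on a real node the same loop starts each installed hadoop-hdfs daemon exactly once without naming them individually.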