Deploying HDFS

HBase Installation and Deployment

Prerequisites:

1. jre-7-linux-x64.rpm (http://www.oracle.com/technetwork/java/javase/downloads/java-se-jre-7-download-432155.html)

2. cloudera-cdh-4-0  (http://archive.cloudera.com/cdh4/one-click-install/redhat/5/x86_64/cloudera-cdh-4-0.x86_64.rpm)     


3. Update /etc/hosts: add an entry for every node, hbase[X].test.example.com. On each machine, also write a separate record for the local host itself; do not map hbaseX.test.example.com to 127.0.0.1 (see the example entries after this list).

4. SSH keys: on the NameNode (and the Secondary NameNode), generate an SSH key pair [ssh-keygen -t rsa], then append the public key to the other machines in the cluster (see the sketch after this list).

5. Configure NTP via cron:  * * * * * /usr/sbin/ntpdate ntpserver && /sbin/hwclock -w
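
A minimal sketch of steps 3 and 4 above. The IP addresses and the ntpserver hostname are placeholders; substitute your own.

# /etc/hosts on every node (do NOT bind the local hbaseX name to 127.0.0.1)
192.168.1.11  hbase1.test.example.com  hbase1
192.168.1.12  hbase2.test.example.com  hbase2
192.168.1.13  hbase3.test.example.com  hbase3
192.168.1.14  hbase4.test.example.com  hbase4

# on the NameNode: generate the key pair, then push the public key to the other nodes
ssh-keygen -t rsa
for host in hbase2 hbase3 hbase4; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub $host.test.example.com
done

# sanity-check that the NTP server is reachable (query only, no clock change)
ntpdate -q ntpserver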


HDFS Installation and Deployment

Host roles and the packages to install (https://ccp.cloudera.com/display/CDH4DOC/CDH4+Installation)

Host     Role         Packages
hbase1   NameNode     hadoop-hdfs-namenode
hbase2   JobTracker   hadoop-0.20-mapreduce-jobtracker
hbase3   DataNode     hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode
hbase4   DataNode     hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode
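
The packages come from the Cloudera yum repository added by the one-click-install rpm from the prerequisites. A sketch of the install flow (standard wget/yum usage; verify package names against the Cloudera doc linked above):

# every host: add the CDH4 repository
wget http://archive.cloudera.com/cdh4/one-click-install/redhat/5/x86_64/cloudera-cdh-4-0.x86_64.rpm
sudo yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm

# then install each host's role packages from the table, e.g. on hbase1:
sudo yum install hadoop-hdfs-namenode

# and on hbase3 / hbase4:
sudo yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode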


HDFS Deployment (https://ccp.cloudera.com/display/CDH4DOC/Deploying+HDFS+on+a+Cluster#DeployingHDFSonaCluster-CustomizingConfigurationFiles)

1. Copy the default configuration into our own directory: sudo cp -r /etc/hadoop/conf.empty /etc/hadoop/conf.my_cluster

2. Register our directory as the active configuration directory; every file modified in the steps below lives in conf.my_cluster:

sudo alternatives --verbose --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.my_cluster 50 
sudo alternatives --set hadoop-conf /etc/hadoop/conf.my_cluster
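
To confirm the switch took effect, alternatives can show where the hadoop-conf link currently points (a quick check, not part of the Cloudera procedure):

alternatives --display hadoop-conf   # should list /etc/hadoop/conf.my_cluster as the current target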

3. Configure core-site.xml and scp it to all other cluster nodes:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hbase1.test.example.com:9000/</value>
  </property>

  <!-- files deleted via the shell are moved to .Trash instead of being removed immediately -->
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>  <!-- retention time, in minutes (1440 = one day) -->
  </property>

  <property>
    <name>fs.trash.checkpoint.interval</name>
    <value>0</value>
  </property>
</configuration>
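
With trash enabled as above, a shell delete is recoverable until the interval expires. A quick illustration (the file path is hypothetical):

sudo -u hdfs hadoop fs -rm /tmp/somefile        # moved into the user's trash, not deleted
sudo -u hdfs hadoop fs -ls /user/hdfs/.Trash    # the file lingers here for up to 1440 minutes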

4. Configure hdfs-site.xml

    a. On non-DataNode hosts (this is also the base that b and c build on):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
</property>

<property>
      <name>dfs.permissions.superusergroup</name>
      <value>hadoop</value>
</property>
</configuration>

    b. On DataNodes, add one more property on top of a:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/1/dfs/dn</value>  <!-- create this directory on each DataNode, owned by hdfs:hdfs -->
</property>

    c. On the NameNode, add one more property on top of a:

<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/1/dfs/nn</value>  <!-- create this directory on the NameNode, owned by hdfs:hdfs, mode 700 -->
</property>
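
A sketch of creating those local directories with the ownership and permissions stated in the comments above:

# on each DataNode
sudo mkdir -p /data/1/dfs/dn
sudo chown -R hdfs:hdfs /data/1/dfs/dn

# on the NameNode
sudo mkdir -p /data/1/dfs/nn
sudo chown -R hdfs:hdfs /data/1/dfs/nn
sudo chmod 700 /data/1/dfs/nn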


5. Format the NameNode

sudo -u hdfs hadoop namenode -format  (note: run the format as the hdfs user)

6. Deploy MRv1 (https://ccp.cloudera.com/display/CDH4DOC/Deploying+MapReduce+v1+%28MRv1%29+on+a+Cluster)

    a. Configure mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hbase2.test.example.com:8021</value>
  </property>

  <property>
    <name>mapred.local.dir</name>
    <value>/data/1/mapred/local</value>  <!-- create this directory on the JobTracker, owned by mapred:hadoop -->
  </property>

  <property>
    <name>mapreduce.jobtracker.restart.recover</name>
    <value>true</value>
  </property>
</configuration>
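
Matching local directory setup for mapred.local.dir (a sketch; run it on the JobTracker, and on any TaskTracker that uses the same setting):

sudo mkdir -p /data/1/mapred/local
sudo chown -R mapred:hadoop /data/1/mapred/local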

    b. scp the file above to the other cluster nodes

    c. Start the HDFS services on each node: for service in /etc/init.d/hadoop-hdfs-*; do sudo $service start; done
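
The same loop with status in place of start makes a quick post-startup check:

for service in /etc/init.d/hadoop-hdfs-*; do sudo $service status; done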

    d. Create the /tmp directory in HDFS

sudo -u hdfs hadoop fs -mkdir /tmp

sudo -u hdfs hadoop fs -chmod -R 1777 /tmp

    e. Create the MapReduce /var directories

sudo -u hdfs hadoop fs -mkdir /var

sudo -u hdfs hadoop fs -mkdir /var/lib

sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs

sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs/cache

sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs/cache/mapred

sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs/cache/mapred/mapred

sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-hdfs/cache/mapred/mapred/staging

sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging

sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
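
Aside: if your hadoop fs supports -mkdir -p (present in the Hadoop 2 line that CDH4 is based on, but worth verifying on your build), the chain of mkdir calls above collapses into a single command:

sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging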

    f. Verify everything was created correctly: sudo -u hdfs hadoop fs -ls -R /

    g. Create the mapred system directory

sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system   
sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system

    h. On each TaskTracker node, start the TaskTracker: sudo service hadoop-0.20-mapreduce-tasktracker start

    i. On the JobTracker node, start the JobTracker: sudo service hadoop-0.20-mapreduce-jobtracker start
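
With both daemons up, a small example job makes a reasonable smoke test. The examples jar path is an assumption about the CDH4 MRv1 package layout; adjust it if your install differs:

# compute pi with 2 maps of 10 samples each (jar path assumed for CDH4 MRv1)
sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 2 10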
