How to Deploy Hadoop 2.7.4 on CentOS 7.4 (aarch64)

1. Introduction to Hadoop


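Hadoop is a distributed-systems infrastructure developed by the Apache Foundation. It lets users write distributed programs without understanding the low-level details of distribution, and harness the power of a cluster for high-speed computation and storage.

Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data, which suits applications with large data sets. HDFS relaxes some POSIX requirements so that file system data can be accessed as a stream (streaming access).

The core of the Hadoop framework is HDFS and MapReduce: HDFS provides storage for massive amounts of data, and MapReduce provides computation over that data.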

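Advantages:

  • High reliability: Hadoop's ability to store and process data bit by bit can be trusted.

  • High scalability: Hadoop distributes data and computation across clusters of available machines, and these clusters can easily be extended to thousands of nodes.

  • High efficiency: Hadoop can move data dynamically between nodes and keeps the nodes balanced, so processing is very fast.

  • High fault tolerance: Hadoop automatically keeps multiple replicas of the data and automatically reassigns failed tasks.

  • Low cost: compared with all-in-one appliances, commercial data warehouses, and data marts such as QlikView and Yonghong Z-Suite, Hadoop is open source, which greatly reduces a project's software cost.

     The Hadoop framework is written in Java, so it is well suited to running on Linux production platforms. Applications on Hadoop can also be written in other languages, such as C++.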

[Figure: HDFS write data flow]

[Figure: HDFS read data flow]

2. Environment: two ARM (aarch64) machines running CentOS 7.4

IP OS Hostname
10.2.151.138 centos7.4 localhost-master
10.2.151.140 centos7.4 localhost-slave1

3. Hostname setup

(1) Check the current hostname

[root@localhost ~]# hostname
localhost

(2) Change the hostname (as needed)

$:hostnamectl set-hostname xxxxx

(3) Configure the hosts file

$:vim /etc/hosts
# IP            hostname
10.2.151.138    localhost-master
...
...
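
The same entries can be added without opening an editor. The following is a minimal sketch, assuming the two nodes from the environment table; `add_host_entry` is a hypothetical helper name, and the demo writes to a scratch file rather than the real /etc/hosts.

```shell
#!/usr/bin/env bash
# add_host_entry <ip> <hostname> [hosts-file]
# Appends "IP    hostname" only if the hostname is not already present.
# The file defaults to /etc/hosts; a scratch file allows a dry run.
add_host_entry() {
    local ip="$1" name="$2" file="${3:-/etc/hosts}"
    # -F: fixed string, -w: whole-word match, -q: no output
    if ! grep -qwF -- "$name" "$file" 2>/dev/null; then
        printf '%s    %s\n' "$ip" "$name" >> "$file"
    fi
}

# Dry run against a scratch file
scratch="$(mktemp)"
add_host_entry 10.2.151.138 localhost-master "$scratch"
add_host_entry 10.2.151.140 localhost-slave1 "$scratch"
add_host_entry 10.2.151.140 localhost-slave1 "$scratch"   # repeated call adds nothing
cat "$scratch"
```

Dropping the third argument makes the helper write to /etc/hosts, which requires root.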

4. Install the Java environment

$:mkdir /usr/java
$:tar -xvf jdk-8u161-linux-arm64-vfp-hflt.tar.gz -C /usr/java

$:vim /etc/profile
#java
export JAVA_HOME=/usr/java/jdk1.8.0_161/
export JRE_HOME=/usr/java/jdk1.8.0_161/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

Apply the configuration:
$:source /etc/profile

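5. Passwordless SSH login

(1) Install and start the SSH service

Check whether openssh and rsync are already installed with the commands below; if they are, the output looks like this: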

$:rpm -qa | grep openssh
openssh-7.4p1-11.el7.aarch64
openssh-server-7.4p1-11.el7.aarch64
openssh-clients-7.4p1-11.el7.aarch64
$:rpm -qa | grep rsync
rsync-3.1.2-4.el7.aarch64

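If they are not installed, install them with yum:

          yum install openssh-server openssh-clients      # install the SSH server and client

          yum install rsync      # rsync is a remote synchronization tool that quickly syncs files between hosts over LAN/WAN

          service sshd restart      # restart the SSH service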

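(2) Configure passwordless login from the Master to all Slaves

        a. How passwordless SSH login works

       The Master (NameNode | JobTracker) acts as the client. To connect to a Slave (DataNode | TaskTracker) server with passwordless public-key authentication, a key pair consisting of a public key and a private key is generated on the Master, and the public key is then copied to every Slave. When the Master connects to a Slave over SSH, the Slave generates a random number, encrypts it with the Master's public key, and sends it to the Master. The Master decrypts it with its private key and sends the decrypted number back to the Slave, which allows the connection once it confirms the number is correct. This is the public-key authentication process, and it requires no manual password entry. The key step is copying the Master's public key to the Slaves.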
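
The challenge-response exchange described above can be imitated locally with the openssl command-line tool. This is only an illustrative sketch of the flow as described here: the file names are made up, both "sides" run in one script, and current OpenSSH (protocol 2) proves possession of the private key with a digital signature rather than by decrypting a challenge.

```shell
#!/usr/bin/env bash
set -euo pipefail
work="$(mktemp -d)"

# "Master" side: generate an RSA key pair (private key plus derived public key)
openssl genrsa -out "$work/master_key.pem" 2048 2>/dev/null
openssl rsa -in "$work/master_key.pem" -pubout -out "$work/master_pub.pem" 2>/dev/null

# "Slave" side: create a random challenge and encrypt it with the Master's public key
challenge="$(openssl rand -hex 16)"
printf '%s' "$challenge" |
    openssl pkeyutl -encrypt -pubin -inkey "$work/master_pub.pem" -out "$work/challenge.enc"

# "Master" side: decrypt the challenge with the private key and return it
response="$(openssl pkeyutl -decrypt -inkey "$work/master_key.pem" -in "$work/challenge.enc")"

# "Slave" side: allow the connection only if the returned value matches
if [ "$response" = "$challenge" ]; then
    echo "challenge verified: access granted"
else
    echo "challenge mismatch: access denied"
fi
rm -rf "$work"
```

Only the holder of the private key can recover the challenge, which is why distributing the public key alone is enough to authorize the Master.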

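        b. Generate the key pair on the Master

        Run the following command on the Master node: ssh-keygen -t rsa -P ''

This command generates a key pair with an empty passphrase. When prompted for the save path, press Enter to accept the default. The resulting key pair, id_rsa and id_rsa.pub, is stored in ~/.ssh by default (/root/.ssh for the root user).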

$: ssh-keygen -t rsa -P ''
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Ubr8vGxPKdtUHUSaTgjjC80nRoOzVuPMSz1rKu7iNpc root@localhost
The key's randomart image is:
+---[RSA 2048]----+
|        .=.   .o |
|       o==+ . +  |
|       .X*oo + . |
|       +oB+oo  ..|
|      . S.. o.. .|
|         + o o   |
|        . * +    |
|     + E o.B     |
|    o.*o..+.o    |
+----[SHA256]-----+

Check that the keys were generated:

[root@localhost ~]# ll -a
total 7237680
dr-xr-x---.  6 root root       4096 Aug  3 13:28 .
dr-xr-xr-x. 17 root root        264 Aug  3 11:36 ..
-rw-------.  1 root root       1774 Dec 30  2017 anaconda-ks.cfg
-rw-------.  1 root root       3194 Aug  3 10:05 .bash_history
-rw-r--r--.  1 root root         18 Dec 29  2013 .bash_logout
-rw-r--r--.  1 root root        176 Dec 29  2013 .bash_profile
-rw-r--r--.  1 root root        176 Dec 29  2013 .bashrc
drwx------.  3 root root         17 Aug  3 11:09 .cache
-rwxr-xr-x.  1 root root 7411329024 Mar 20 14:20 CentOS-7-aarch64-Everything.iso
-rw-r--r--.  1 root root        100 Dec 29  2013 .cshrc
-rwxrwxrwx.  1 root root         74 May 24 15:04 force-eth0-100Mbps.sh
-rwxr-xr-x.  1 root root        493 May 30 15:08 lvm-resize-sda.sh
drwx------.  2 root root         38 Aug  3 13:28 .ssh
-rw-r--r--.  1 root root        129 Dec 29  2013 .tcshrc
drwxr-xr-x.  2 root root       4096 Apr 19 10:41 updates
-rw-------.  1 root root       5360 Aug  3 12:43 .viminfo
[root@localhost ~]# cd .ssh/
[root@localhost .ssh]# ll
total 8
-rw-------. 1 root root 1679 Aug  3 13:28 id_rsa
-rw-r--r--. 1 root root  396 Aug  3 13:28 id_rsa.pub

        c. On the Master, append id_rsa.pub to the authorized keys

[root@localhost .ssh]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@localhost .ssh]# ll
total 12
-rw-r--r--. 1 root root  792 Aug  3 13:35 authorized_keys
-rw-------. 1 root root 1679 Aug  3 13:28 id_rsa
-rw-r--r--. 1 root root  396 Aug  3 13:28 id_rsa.pub

        d. Change the permissions of the "authorized_keys" file

[root@localhost .ssh]# ll
total 12
-rw-r--r--. 1 root root  792 Aug  3 13:35 authorized_keys
-rw-------. 1 root root 1679 Aug  3 13:28 id_rsa
-rw-r--r--. 1 root root  396 Aug  3 13:28 id_rsa.pub
[root@localhost .ssh]# chmod 600 ~/.ssh/authorized_keys
[root@localhost .ssh]# ll
total 12
-rw-------. 1 root root  792 Aug  3 13:35 authorized_keys
-rw-------. 1 root root 1679 Aug  3 13:28 id_rsa
-rw-r--r--. 1 root root  396 Aug  3 13:28 id_rsa.pub

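e. As root, edit "/etc/ssh/sshd_config"

[root@localhost .ssh]# vim /etc/ssh/sshd_config
Set the following options:
# RSA authentication (an SSH protocol 1 option; newer OpenSSH releases treat it as deprecated)
RSAAuthentication yes
# Enable public-key authentication
PubkeyAuthentication yes
# Path of the public key file (the authorized_keys file created above)
AuthorizedKeysFile .ssh/authorized_keys

Restart the SSH service to apply the configuration: service sshd restart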
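
Before restarting, it can help to confirm what the file actually sets. The sketch below reads a keyword from an sshd_config-style file; `sshd_option` is a hypothetical helper, and the sample file exists only for the dry run. It relies on the documented sshd behavior that the first value found for a keyword is the one used.

```shell
#!/usr/bin/env bash
# sshd_option <keyword> [config-file]
# Prints the first uncommented value for a keyword (keywords are case-insensitive).
sshd_option() {
    local keyword="$1" file="${2:-/etc/ssh/sshd_config}"
    awk -v key="$keyword" '
        /^[[:space:]]*#/ { next }                        # skip comment lines
        tolower($1) == tolower(key) { print $2; exit }   # first occurrence wins
    ' "$file"
}

# Dry run against a sample fragment
sample="$(mktemp)"
cat > "$sample" <<'EOF'
# sample fragment for a dry run
#PubkeyAuthentication no
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
EOF
sshd_option PubkeyAuthentication "$sample"
sshd_option authorizedkeysfile "$sample"
```

On a live system, `sshd -t` additionally checks the configuration file for validity before the service is restarted.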

f. Verify that it works

[root@localhost ~]# ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:a1NFzC3BwML16Ic2ZDgqOjyrX9DWWFTaipmSU3AQC34.
ECDSA key fingerprint is MD5:d7:cd:5c:29:db:b0:b1:33:47:fe:9a:91:48:f1:32:5c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Last login: Fri Aug  3 12:42:53 2018 from 10.2.154.39
[root@localhost ~]# ls
anaconda-ks.cfg  CentOS-7-aarch64-Everything.iso  force-eth0-100Mbps.sh  lvm-resize-sda.sh  updates  zhaochuang
[root@localhost ~]# exit
logout
Connection to localhost closed.
[root@localhost ~]# 

g. Copy the public key to every Slave machine

[root@localhost ~]# scp ~/.ssh/id_rsa.pub root@10.2.151.140:~/
The authenticity of host '10.2.151.140 (10.2.151.140)' can't be established.
ECDSA key fingerprint is SHA256:a1NFzC3BwML16Ic2ZDgqOjyrX9DWWFTaipmSU3AQC34.
ECDSA key fingerprint is MD5:d7:cd:5c:29:db:b0:b1:33:47:fe:9a:91:48:f1:32:5c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.2.151.140' (ECDSA) to the list of known hosts.
root@10.2.151.140's password: 
id_rsa.pub                                                                                                                                 100%  396   194.4KB/s   00:00    
[root@localhost ~]# 

h. Log in to the Slave node and configure it

《1》Create the .ssh directory under ~/

$:mkdir ~/.ssh
$:chmod 700 ~/.ssh

《2》Append the public key to the "authorized_keys" file

[root@localhost-slave1 ~]# cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
[root@localhost-slave1 ~]# chmod 600 ~/.ssh/authorized_keys

《3》Edit "/etc/ssh/sshd_config"

[root@localhost-slave1 ~]# vim /etc/ssh/sshd_config
RSAAuthentication yes 
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys

Restart the SSH service: service sshd restart

《4》On the Master, test passwordless SSH login to the Slave

[root@localhost ~]# ssh 10.2.151.140
Last login: Fri Aug  3 12:42:57 2018 from 10.2.154.39
[root@localhost-slave1 ~]# 

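On the Slave, delete the "id_rsa.pub" file from the ~/ directory: rm -rf ~/id_rsa.pub

This completes passwordless SSH login from the Master to the Slave. Repeat the steps above for any other Slave nodes.

(3) Passwordless SSH login from the Slave to the Master

a. Create the Slave's own public and private keys, and append its public key to the "authorized_keys" file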

[root@localhost-slave1 ~]# rm –r ~/id_rsa.pub
rm: cannot remove ‘–r’: No such file or directory
rm: remove regular file ‘/root/id_rsa.pub’? y
[root@localhost-slave1 ~]# ssh-keygen -t rsa -P ''
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:bb17EEH7OwT1/CR/UZRCt09ZOfeW36TDdUlMrqb1aU0 root@localhost-slave1
The key's randomart image is:
+---[RSA 2048]----+
|           .o.++*|
|            .+oO*|
|            o.+=@|
|         . ..o.BX|
|        S o .*o=E|
|         .  =o=o*|
|           ...o=.|
|             .o. |
|            ..   |
+----[SHA256]-----+
[root@localhost-slave1 ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@localhost-slave1 ~]# 

b. On the Slave, use "scp" to copy the Slave's public key "id_rsa.pub" to the "~/" directory on the Master, then append it to the Master's "authorized_keys".

slave:

[root@localhost-slave1 ~]# scp ~/.ssh/id_rsa.pub root@10.2.151.138:~/
The authenticity of host '10.2.151.138 (10.2.151.138)' can't be established.
ECDSA key fingerprint is SHA256:a1NFzC3BwML16Ic2ZDgqOjyrX9DWWFTaipmSU3AQC34.
ECDSA key fingerprint is MD5:d7:cd:5c:29:db:b0:b1:33:47:fe:9a:91:48:f1:32:5c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.2.151.138' (ECDSA) to the list of known hosts.
root@10.2.151.138's password: 
id_rsa.pub                                                                                                                                 100%  403    72.3KB/s   00:00    
[root@localhost-slave1 ~]# 

master:

[root@localhost ~]# cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
[root@localhost ~]# ll
total 7237648
-rw-------. 1 root root       1774 Dec 30  2017 anaconda-ks.cfg
-rwxr-xr-x. 1 root root 7411329024 Mar 20 14:20 CentOS-7-aarch64-Everything.iso
-rwxrwxrwx. 1 root root         74 May 24 15:04 force-eth0-100Mbps.sh
-rw-r--r--. 1 root root        403 Aug  3 14:21 id_rsa.pub
-rwxr-xr-x. 1 root root        493 May 30 15:08 lvm-resize-sda.sh
drwxr-xr-x. 2 root root       4096 Apr 19 10:41 updates
drwxr-xr-x. 2 root root        107 Aug  3 08:49 zhaochuang
[root@localhost ~]# rm -rf id_rsa.pub 

c. Test passwordless login between the Slave and the Master

Passwordless login from the Slave to the Master:

[root@localhost-slave1 ~]# ssh 10.2.151.138
Last login: Fri Aug  3 13:44:30 2018 from ::1
[root@localhost ~]# 

Passwordless login from the Master to the Slave:

[root@localhost ~]# ssh 10.2.151.140
Last login: Fri Aug  3 14:11:05 2018 from 10.2.151.138
[root@localhost-slave1 ~]# 

This completes passwordless login in both directions between the Master and the Slave. (The procedure is the same for any other Master and Slave nodes.)
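
The interactive tests above can also be scripted. The sketch below is one possible check, assuming root logins as used throughout this guide; `check_passwordless` is a hypothetical helper name. The `BatchMode=yes` option makes ssh fail instead of prompting, so a host that still requires a password (or whose host key has not been accepted yet) is reported as failed.

```shell
#!/usr/bin/env bash
# check_passwordless <host>...
# Reports, for each host, whether key-based login works without any prompt.
check_passwordless() {
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true; then
            echo "$host: passwordless login OK"
        else
            echo "$host: passwordless login FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Runs only when hosts are given on the command line, for example:
#   bash check_ssh.sh 10.2.151.138 10.2.151.140
if [ "$#" -gt 0 ]; then
    check_passwordless "$@"
fi
```

The function returns a non-zero status if any host fails, so it can gate later steps such as starting the cluster.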

6. Install Hadoop 2.7.4

(1) Extract Hadoop 2.7.4 and create the tmp directory

$:mkdir /usr/hadoop
$:tar -xvf hadoop-2.7.4-aarch64.tar.gz -C /usr/hadoop
$:cd /usr/hadoop/hadoop-2.7.4
$:mv * ../
$:cd ../
$:rm -rf hadoop-2.7.4
$:mkdir tmp
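
The move-and-delete sequence above flattens the archive's top-level hadoop-2.7.4 directory into /usr/hadoop. tar's `--strip-components` option can produce the same layout in one step. The sketch below demonstrates this on a stand-in archive built in a scratch directory; the placeholder file is not the real Hadoop launcher.

```shell
#!/usr/bin/env bash
set -euo pipefail
work="$(mktemp -d)"

# Build a stand-in archive with a single top-level directory, like the Hadoop tarball
mkdir -p "$work/src/hadoop-2.7.4/bin" "$work/hadoop"
echo "placeholder" > "$work/src/hadoop-2.7.4/bin/hadoop"
tar -czf "$work/sample.tar.gz" -C "$work/src" hadoop-2.7.4

# Extract while dropping the leading "hadoop-2.7.4/" path component
tar -xzf "$work/sample.tar.gz" -C "$work/hadoop" --strip-components=1
mkdir -p "$work/hadoop/tmp"
ls "$work/hadoop"
```

Applied to the real archive, the equivalent would be `tar -xf hadoop-2.7.4-aarch64.tar.gz -C /usr/hadoop --strip-components=1`, assuming the tarball has the single top-level directory shown above.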

(2) Edit the Hadoop configuration files

       1. Configure "/etc/profile"

$:vim /etc/profile
#hadoop
export HADOOP_HOME=/usr/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

Apply the configuration: source /etc/profile

       2. Configure hadoop-env.sh

$:cd /usr/hadoop/etc/hadoop
$:vim hadoop-env.sh
Append at the end of the file:
# set java environment
export JAVA_HOME=/usr/java/jdk1.8.0_161

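     3. Configure core-site.xml

$:vim core-site.xml
<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/usr/hadoop/tmp</value>
                <description>A base for other temporary directories.</description>
        </property>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://10.2.151.138:9000</value>
        </property>
</configuration>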

    4. Configure hdfs-site.xml

$:vim hdfs-site.xml
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
</configuration>

     5. Configure mapred-site.xml

$:vim mapred-site.xml
<configuration>
        <property>
                <name>mapred.job.tracker</name>
                <value>http://10.2.151.138:9001</value>
        </property>
</configuration>

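     6. Configure the masters file

     Remove "localhost" and add the Master machine's IP: 10.2.151.138

     7. Configure the slaves file (Master host only)

     Remove "localhost" and add the IPs of all Slave machines in the cluster, one per line: 10.2.151.140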
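
The mapred.job.tracker property in step 5 comes from the Hadoop 1 JobTracker model. Hadoop 2.x normally runs MapReduce on YARN, which is selected through mapreduce.framework.name. The sketch below is not part of the procedure above; it shows one plausible minimal YARN setup, assuming the Master from section 2 (10.2.151.138) also hosts the ResourceManager. `write_yarn_conf` is a hypothetical helper, and the demo writes to a scratch directory rather than /usr/hadoop/etc/hadoop.

```shell
#!/usr/bin/env bash
# write_yarn_conf <conf-dir> <resourcemanager-host>
# Writes a minimal mapred-site.xml and yarn-site.xml for MapReduce on YARN.
write_yarn_conf() {
    local conf_dir="$1" rm_host="$2"
    cat > "$conf_dir/mapred-site.xml" <<EOF
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
</configuration>
EOF
    cat > "$conf_dir/yarn-site.xml" <<EOF
<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>$rm_host</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>
EOF
}

# Dry run: generate both files in a scratch directory (the host is an assumption)
out_dir="$(mktemp -d)"
write_yarn_conf "$out_dir" 10.2.151.138
cat "$out_dir/yarn-site.xml"
```

If this approach is used, the same files would need to be present on every node so that the NodeManagers can locate the ResourceManager.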

7. Start and verify

1) Format the HDFS file system

cd  /usr/hadoop/bin

hdfs namenode -format

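2) Start Hadoop

Before starting, stop the firewall on every machine in the cluster; otherwise the DataNode starts and then shuts down again by itself. CentOS 7 uses firewalld by default; if the iptables service is in use instead, run service iptables stop.

systemctl stop firewalld

cd  /usr/hadoop/sbin

./start-all.sh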

3) Verify Hadoop

      (1) jps

      (2) "hdfs dfsadmin -report"

       This command shows the status of the Hadoop cluster.

     (3) Open the web UI at "http://10.2.151.138:50070"
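
The jps check can be automated by comparing its output against the daemons expected on each node. The sketch below assumes start-all.sh launched both HDFS and YARN, as it does in Hadoop 2.x; `missing_daemons` is a hypothetical helper, and the sample output (including the process IDs) is made up for the dry run.

```shell
#!/usr/bin/env bash
# missing_daemons <jps-output> <daemon>...
# Prints every expected daemon name that does not appear in the jps output.
missing_daemons() {
    local jps_output="$1"; shift
    local daemon
    for daemon in "$@"; do
        # jps prints "<pid> <name>", so compare the second column exactly
        if ! printf '%s\n' "$jps_output" |
                awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Dry run with illustrative Master-side output
sample='2481 NameNode
2675 SecondaryNameNode
2830 ResourceManager
3120 Jps'
missing_daemons "$sample" NameNode SecondaryNameNode ResourceManager   # nothing missing
missing_daemons "$sample" DataNode NodeManager                         # both are reported
```

On a live node the first argument would be `"$(jps)"`, with DataNode and NodeManager as the expected names on a Slave.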
