Hadoop Fully Distributed Cluster + Hive + Sqoop + HBase + Spark + ZooKeeper

================

IP address    Username  Password  Hostname
10.1.1.101    root      password  master
10.1.1.102    root      password  slave1
10.1.1.103    root      password  slave2

Hostname  IP          Service  Username  Password
master    10.1.1.101  mysql    root      password

Component  Version  Download URL (Linux)
java       8        https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html
hadoop     2.7.1    https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
hive       2.0.0    https://archive.apache.org/dist/hive/hive-2.0.0/apache-hive-2.0.0-bin.tar.gz
sqoop      1.4.7    https://archive.apache.org/dist/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
zookeeper  3.4.8    https://archive.apache.org/dist/zookeeper/zookeeper-3.4.8/zookeeper-3.4.8.tar.gz
spark      2.0.0    https://archive.apache.org/dist/spark/spark-2.0.0/spark-2.0.0-bin-hadoop2.7.tgz
hbase      1.2.1    https://archive.apache.org/dist/hbase/1.2.1/hbase-1.2.1-bin.tar.gz

Note: all installation packages are located under /h3cu, and MySQL is already installed in this environment.

1. Basic virtual machine configuration

1.1. Change the hostnames

  1. master

    [root@localhost ~]# hostnamectl set-hostname master 
    [root@localhost ~]# bash
    [root@master ~]# 
    
  2. slave1

    [root@localhost ~]# hostnamectl set-hostname slave1
    [root@localhost ~]# bash 
    [root@slave1 ~]#
    
  3. slave2

    [root@localhost ~]# hostnamectl set-hostname slave2
    [root@localhost ~]# bash 
    [root@slave2 ~]# 
    

1.2. Configure hosts name resolution

1.2.1. Check the IP addresses of the three hosts

  1. master

    [root@master ~]# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host 
           valid_lft forever preferred_lft forever
    2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:5a:51:a2 brd ff:ff:ff:ff:ff:ff
        inet 10.1.1.101/24 brd 10.1.1.255 scope global eno16777736
           valid_lft forever preferred_lft forever
        inet6 fe80::20c:29ff:fe5a:51a2/64 scope link 
           valid_lft forever preferred_lft forever
    
  2. slave1

    [root@slave1 ~]# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host 
           valid_lft forever preferred_lft forever
    2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:c8:8c:15 brd ff:ff:ff:ff:ff:ff
        inet 10.1.1.102/24 brd 10.1.1.255 scope global eno16777736
           valid_lft forever preferred_lft forever
        inet6 fe80::20c:29ff:fec8:8c15/64 scope link 
           valid_lft forever preferred_lft forever
    
  3. slave2

    [root@slave2 ~]# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host 
           valid_lft forever preferred_lft forever
    2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:43:d4:07 brd ff:ff:ff:ff:ff:ff
        inet 10.1.1.103/24 brd 10.1.1.255 scope global eno16777736
           valid_lft forever preferred_lft forever
        inet6 fe80::20c:29ff:fe43:d407/64 scope link 
           valid_lft forever preferred_lft forever
    
    

1.2.2. Configure the hosts file on master

[root@master ~]# vi /etc/hosts
10.1.1.101 master
10.1.1.102 slave1
10.1.1.103 slave2

1.2.3. Test connectivity

[root@master ~]# ping master -c 2
PING master (10.1.1.101) 56(84) bytes of data.
64 bytes from master (10.1.1.101): icmp_seq=1 ttl=64 time=0.137 ms
64 bytes from master (10.1.1.101): icmp_seq=2 ttl=64 time=0.084 ms

--- master ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1008ms
rtt min/avg/max/mdev = 0.084/0.110/0.137/0.028 ms
[root@master ~]# ping slave1 -c 2
PING slave1 (10.1.1.102) 56(84) bytes of data.
64 bytes from slave1 (10.1.1.102): icmp_seq=1 ttl=64 time=0.497 ms
64 bytes from slave1 (10.1.1.102): icmp_seq=2 ttl=64 time=0.388 ms

--- slave1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1003ms
rtt min/avg/max/mdev = 0.388/0.442/0.497/0.058 ms
[root@master ~]# ping slave2 -c 2
PING slave2 (10.1.1.103) 56(84) bytes of data.
64 bytes from slave2 (10.1.1.103): icmp_seq=1 ttl=64 time=0.430 ms
64 bytes from slave2 (10.1.1.103): icmp_seq=2 ttl=64 time=0.214 ms

--- slave2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1005ms
rtt min/avg/max/mdev = 0.214/0.322/0.430/0.108 ms
  1. Distribute the hosts file to slave1 and slave2

    [root@master ~]# scp /etc/hosts slave1:/etc/
    The authenticity of host 'slave1 (10.1.1.102)' can't be established.
    ECDSA key fingerprint is a2:ec:2f:3a:b9:33:d3:a7:fd:51:9d:d7:cf:ce:fb:ea.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'slave1,10.1.1.102' (ECDSA) to the list of known hosts.
    root@slave1's password: 
    hosts                                                                                      100%  212     0.2KB/s   00:00    
    [root@master ~]# scp /etc/hosts slave2:/etc/
    The authenticity of host 'slave2 (10.1.1.103)' can't be established.
    ECDSA key fingerprint is a2:ec:2f:3a:b9:33:d3:a7:fd:51:9d:d7:cf:ce:fb:ea.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'slave2,10.1.1.103' (ECDSA) to the list of known hosts.
    root@slave2's password: 
    hosts                                                                                      100%  212     0.2KB/s   00:00    
    
  2. Disable the firewall

    1. master

      [root@master ~]# systemctl status firewalld 
      ● firewalld.service - firewalld - dynamic firewall daemon
         Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
         Active: active (running) since Sat 2021-06-19 03:14:26 CST; 5h 7min ago
       Main PID: 872 (firewalld)
         CGroup: /system.slice/firewalld.service
                 └─872 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
      
      Jun 19 03:14:25 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
      Jun 19 03:14:26 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
      [root@master ~]# systemctl stop firewalld 
      [root@master ~]# systemctl disable firewalld 
      Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
      Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
      
    2. slave1

      [root@slave1 ~]# systemctl status firewalld 
      ● firewalld.service - firewalld - dynamic firewall daemon
         Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
         Active: active (running) since Sat 2021-06-19 03:14:50 CST; 6h ago
       Main PID: 875 (firewalld)
         CGroup: /system.slice/firewalld.service
                 └─875 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
      
      Jun 19 03:14:50 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
      Jun 19 03:14:50 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
      [root@slave1 ~]# systemctl stop firewalld 
      [root@slave1 ~]# systemctl disable firewalld 
      Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
      Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
      
      
    3. slave2

      [root@slave2 ~]# systemctl status firewalld 
      ● firewalld.service - firewalld - dynamic firewall daemon
         Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
         Active: active (running) since Sat 2021-06-19 03:15:00 CST; 6h ago
       Main PID: 873 (firewalld)
         CGroup: /system.slice/firewalld.service
                 └─873 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
      
      Jun 19 03:15:00 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewall daemon...
      Jun 19 03:15:00 localhost.localdomain systemd[1]: Started firewalld - dynamic firewall daemon.
      [root@slave2 ~]# systemctl stop firewalld 
      [root@slave2 ~]# systemctl disable firewalld 
      Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
      Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.
      

2. Configure SSH

2.1. Generate the SSH key pairs

  1. master

    [root@master ~]# ssh-keygen -t rsa -P ''
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa): 
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    42:04:1c:9e:1d:ad:f2:ed:ae:4c:21:92:30:9d:3b:f8 root@master
    The key's randomart image is:
    +--[ RSA 2048]----+
    |   .ooo.         |
    | . o.+ ..        |
    |o o o o.         |
    | + o...          |
    |. = .oo.S        |
    | . o ..o.        |
    |  E   ..         |
    |     o  .        |
    |      oo.        |
    +-----------------+
    
  2. slave1

    [root@slave1 ~]# ssh-keygen -t rsa -P ''
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa): 
    Created directory '/root/.ssh'.
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    30:22:3d:06:35:2d:ac:65:33:b4:0d:43:c2:ee:d0:c1 root@slave1
    The key's randomart image is:
    +--[ RSA 2048]----+
    | oo=B.           |
    |  E+B*.          |
    | o.=*++          |
    |. +o o o         |
    | o      S        |
    |  .              |
    |                 |
    |                 |
    |                 |
    +-----------------+
    
  3. slave2

    [root@slave2 ~]# ssh-keygetn -t rsa -P ''
    bash: ssh-keygetn: command not found
    [root@slave2 ~]# ssh-keygen -t rsa -P ''
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa): 
    Created directory '/root/.ssh'.
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    a3:01:98:3c:a6:ec:20:d0:57:0b:4b:56:0c:27:cf:c3 root@slave2
    The key's randomart image is:
    +--[ RSA 2048]----+
    |    *++          |
    | o = X..         |
    |. B + E          |
    |oo o . .         |
    |+.    . S        |
    |+      o .       |
    | .    .          |
    |                 |
    |                 |
    +-----------------+
    

2.2. Distribute the public key from master

[root@master ~]# ssh-copy-id -i master 
The authenticity of host 'master (10.1.1.101)' can't be established.
ECDSA key fingerprint is a2:ec:2f:3a:b9:33:d3:a7:fd:51:9d:d7:cf:ce:fb:ea.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@master's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'master'"
and check to make sure that only the key(s) you wanted were added.

[root@master ~]# ssh-copy-id -i slave1 
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave1's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'slave1'"
and check to make sure that only the key(s) you wanted were added.

[root@master ~]# ssh-copy-id -i slave2
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave2's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'slave2'"
and check to make sure that only the key(s) you wanted were added.

2.3. Test logging in to slave1 from master

[root@master ~]# ssh slave1
Last login: Sat Jun 19 03:21:18 2021 from 10.1.1.1
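
After exiting the slave1 session and returning to master, a one-line loop confirms that passwordless login works for every node (a minimal check based on the keys distributed above; each ssh call should print the remote hostname without asking for a password):

for h in master slave1 slave2; do ssh "$h" hostname; done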

3. Configure the Java environment

3.1. Uninstall OpenJDK

[root@master ~]# rpm -qa |grep openjdk
Uninstall every package returned by the query above with rpm -e --nodeps "<package name>", for example with the loop below.
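
A short loop removes every matching package in one go (a sketch that assumes the query above returned one or more OpenJDK packages; if it printed nothing, there is nothing to uninstall):

for pkg in $(rpm -qa | grep openjdk); do rpm -e --nodeps "$pkg"; done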

3.2. Install Java

  1. Extract the JDK from /h3cu into /usr/local/src

    [root@master ~]# tar -xzf /h3cu/jdk-8u144-linux-x64.tar.gz -C /usr/local/src/
    
  2. Rename the extracted JDK directory to java

    [root@master ~]# mv /usr/local/src/jdk1.8.0_144 /usr/local/src/java
    
  3. Configure the Java environment variables (current user only)

    [root@master ~]# vi /root/.bash_profile
    export JAVA_HOME=/usr/local/src/java
    export PATH=$PATH:$JAVA_HOME/bin
    
  4. Load the environment variables and check the Java version

    [root@master ~]# source /root/.bash_profile 
    [root@master ~]# java -version 
    java version "1.8.0_144"
    Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
    Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
    
  5. Distribute Java to slave1 and slave2 (a quick check on the slaves follows the commands)

    [root@master ~]# scp -r /usr/local/src/java slave1:/usr/local/src/
    [root@master ~]# scp -r /usr/local/src/java slave2:/usr/local/src/
    [root@master ~]# scp /root/.bash_profile slave1:/root/
    [root@master ~]# scp /root/.bash_profile slave2:/root/
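
    A quick optional check that the copies landed and the variables resolve on the slaves (a sketch, assuming the scp commands above completed without errors):

    ssh slave1 "source /root/.bash_profile && java -version"
    ssh slave2 "source /root/.bash_profile && java -version"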
    

4. Configure fully distributed Hadoop

  1. Extract Hadoop from /h3cu into /usr/local/src (on master)

    [root@master ~]# tar -xzf /h3cu/hadoop-2.7.1.tar.gz -C /usr/local/src/
    
  2. Rename the extracted Hadoop directory to hadoop

    [root@master ~]# ll /usr/local/src/
    total 8
    drwxr-xr-x. 9 10021 10021 4096 Jun 29  2015 hadoop-2.7.1
    drwxr-xr-x. 8    10   143 4096 Jul 22  2017 java
    [root@master ~]# mv /usr/local/src/hadoop-2.7.1 /usr/local/src/hadoop
    [root@master ~]# ll /usr/local/src/
    total 8
    drwxr-xr-x. 9 10021 10021 4096 Jun 29  2015 hadoop
    drwxr-xr-x. 8    10   143 4096 Jul 22  2017 java
    
  3. Configure the Hadoop environment variables (current user only)

    [root@master ~]# vi /root/.bash_profile 
    export HADOOP_HOME=/usr/local/src/hadoop
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
  4. Load the environment variables and check the Hadoop version

    [root@master ~]# source /root/.bash_profile 
    [root@master ~]# hadoop version 
    Hadoop 2.7.1
    Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a
    Compiled by jenkins on 2015-06-29T06:04Z
    Compiled with protoc 2.5.0
    From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
    This command was run using /usr/local/src/hadoop/share/hadoop/common/hadoop-common-2.7.1.jar
    
  5. Configure hadoop-env.sh

    [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hadoop-env.sh 
    export JAVA_HOME=/usr/local/src/java
    
  6. Configure core-site.xml

    1. Command

      [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/core-site.xml

    2. Add the following to the configuration file

      <property>
        <!-- Default filesystem URI -->
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
      </property>
      <property>
        <!-- Read/write buffer size -->
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>
      <property>
        <!-- Base directory for Hadoop temporary files -->
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/src/hadoop/dfs/tmp</value>
      </property>
      
  7. Configure hdfs-site.xml

    1. Command

      [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/hdfs-site.xml

    2. Add the following to the configuration file

      <property>
        <!-- Number of block replicas -->
        <name>dfs.replication</name>
        <value>3</value>
      </property>
      <property>
        <!-- NameNode metadata directory -->
        <name>dfs.namenode.name.dir</name>
        <value>/usr/local/src/hadoop/dfs/name</value>
      </property>
      <property>
        <!-- DataNode data directory -->
        <name>dfs.datanode.data.dir</name>
        <value>/usr/local/src/hadoop/dfs/data</value>
      </property>
      <property>
        <!-- Number of NameNode handler threads -->
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
      </property>
      
  8. Configure mapred-site.xml

    1. Command

      [root@master ~]# cp /usr/local/src/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/src/hadoop/etc/hadoop/mapred-site.xml
      [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/mapred-site.xml

    2. Add the following to the configuration file

      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      
  9. Configure yarn-site.xml

    1. Command

      [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/yarn-site.xml

    2. Add the following to the configuration file (all of it is required)

      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
      </property>
      
  10. Configure slaves

    [root@master ~]# vi /usr/local/src/hadoop/etc/hadoop/slaves
    master
    slave1
    slave2

  11. Distribute the Hadoop directory and the environment variable file to slave1 and slave2

    [root@master ~]# scp -r /usr/local/src/hadoop slave1:/usr/local/src/
    [root@master ~]# scp -r /usr/local/src/hadoop slave2:/usr/local/src/
    [root@master ~]# scp /root/.bash_profile slave1:/root
    [root@master ~]# scp /root/.bash_profile slave2:/root
    
  12. Format the NameNode

    [root@master ~]# hdfs namenode -format 
    21/06/19 08:27:08 INFO namenode.NameNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = master/10.1.1.101
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 2.7.1
    STARTUP_MSG:   classpath = /usr/local/src/hadoop/etc/hadoo... (omitted)
    STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a; compiled by 'jenkins' on 2015-06-29T06:04Z
    STARTUP_MSG:   java = 1.8.0_144
    ************************************************************/
    ... (output omitted)
    21/06/19 08:27:09 INFO namenode.FSImage: Allocated new BlockPoolId: BP-83508879-10.1.1.101-1624062429594
    21/06/19 08:27:09 INFO common.Storage: Storage directory /usr/local/src/hadoop/dfs/name has been successfully formatted.
    21/06/19 08:27:09 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    21/06/19 08:27:09 INFO util.ExitUtil: Exiting with status 0
    21/06/19 08:27:09 INFO namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at master/10.1.1.101
    ************************************************************/
    
  13. Start the Hadoop cluster and check the daemons (an HDFS smoke test follows at the end of this step)

    1. Start

      [root@master ~]# start-all.sh 
      This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
      Starting namenodes on [master]
      master: starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-root-namenode-master.out
      slave1: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-root-datanode-slave1.out
      slave2: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-root-datanode-slave2.out
      master: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-root-datanode-master.out
      Starting secondary namenodes [0.0.0.0]
      0.0.0.0: starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-root-secondarynamenode-master.out
      starting yarn daemons
      starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-root-resourcemanager-master.out
      slave1: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-root-nodemanager-slave1.out
      slave2: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-root-nodemanager-slave2.out
      master: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-root-nodemanager-master.out
      
    2. Check with jps

      1. master

        [root@master ~]# jps
        10802 SecondaryNameNode
        11042 NodeManager
        10948 ResourceManager
        10535 NameNode
        10651 DataNode
        11326 Jps
        
      2. slave1

        [root@slave1 ~]# jps
        2660 NodeManager
        2777 Jps
        2555 DataNode
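
As a final smoke test of HDFS (a minimal sketch, assuming all the daemons above are running; the NameNode web UI at http://master:50070 and the YARN UI at http://master:8088 are also worth a look):

    hdfs dfs -mkdir -p /input
    hdfs dfs -put /usr/local/src/hadoop/etc/hadoop/core-site.xml /input/
    hdfs dfs -ls /input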
        

5. Deploy the Hive component

  1. Extract Hive from /h3cu into /usr/local/src (on master)

    [root@master ~]# tar -xzvf /h3cu/apache-hive-2.0.0-bin.tar.gz -C /usr/local/src/
    
  2. Rename the extracted Hive directory to hive

    [root@master ~]# mv /usr/local/src/apache-hive-2.0.0-bin /usr/local/src/hive 
    
  3. Configure the Hive environment variables (current user only)

    [root@master ~]# vi /root/.bash_profile 
    export HIVE_HOME=/usr/local/src/hive
    export PATH=$PATH:$HIVE_HOME/bin
    [root@master ~]# source /root/.bash_profile 
    
  4. Extract the MySQL Connector/J package and copy the jar into Hive's lib directory

    [root@master ~]# tar -xzf /h3cu/mysql-connector-java-5.1.27.tar.gz -C /usr/local/src/
    [root@master ~]# cp /usr/local/src/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /usr/local/src/hive/lib/
    
  5. Configure hive-site.xml

    1. Command

      [root@master ~]# cp /usr/local/src/hive/conf/hive-default.xml.template /usr/local/src/hive/conf/hive-site.xml
      [root@master ~]# vi /usr/local/src/hive/conf/hive-site.xml

    2. Content (note that & must be escaped as &amp; inside the XML value)

      <property>
        <!-- JDBC connection URL for the metastore database -->
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
      </property>
      <property>
        <!-- JDBC driver class -->
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
      </property>
      <property>
        <!-- Metastore database user -->
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
      </property>
      <property>
        <!-- Metastore database password -->
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>password</value>
      </property>
      <property>
        <!-- Skip metastore schema version verification -->
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
      </property>
      <property>
        <!-- Scratch directory on HDFS -->
        <name>hive.exec.scratchdir</name>
        <value>/hive/warehouse/tmp</value>
      </property>
      <property>
        <!-- Warehouse directory on HDFS -->
        <name>hive.metastore.warehouse.dir</name>
        <value>/hive/warehouse/home</value>
      </property>
      <property>
        <!-- Print column headers in query results -->
        <name>hive.cli.print.header</name>
        <value>true</value>
      </property>
      <property>
        <!-- Show the current database in the CLI prompt -->
        <name>hive.cli.print.current.db</name>
        <value>true</value>
      </property>
      <property>
        <!-- Quoted identifier behavior -->
        <name>hive.support.quoted.identifiers</name>
        <value>none</value>
      </property>
      
  6. Initialize the Hive metastore schema

    [root@master ~]# schematool -dbType mysql -initSchema
    which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/mysql/bin:/root/bin:/usr/local/src/java/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/root/bin:/usr/local/src/java/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/hive/bin)
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Metastore connection URL:	 jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&useSSL=false
    Metastore Connection Driver :	 com.mysql.jdbc.Driver
    Metastore connection User:	 root
    Starting metastore schema initialization to 2.0.0
    Initialization script hive-schema-2.0.0.mysql.sql
    Initialization script completed
    schemaTool completed
    
  7. Enter the Hive shell (a warehouse-directory check follows the session below)

    [root@master ~]# hive 
    which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/mysql/bin:/root/bin:/usr/local/src/java/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/root/bin:/usr/local/src/java/bin:/usr/local/src/hadoop/bin:/usr/local/src/hadoop/sbin:/usr/local/src/hive/bin)
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    
    Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.0.0.jar!/hive-log4j2.properties
    Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
    hive (default)> show databases;
    OK
    database_name
    default
    Time taken: 0.707 seconds, Fetched: 1 row(s)
    hive (default)> create database test;
    OK
    Time taken: 0.178 seconds
    hive (default)> show databases;
    OK
    database_name
    default
    test
    Time taken: 0.021 seconds, Fetched: 2 row(s)
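
Because hive.metastore.warehouse.dir points at /hive/warehouse/home, the database created above should also be visible from HDFS (a quick cross-check, assuming the configuration shown earlier):

    hdfs dfs -ls /hive/warehouse/home
    # a test.db directory is expected for the database created above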
    

6. Deploy the Sqoop component

  1. Extract Sqoop from /h3cu into /usr/local/src

    [root@master ~]# tar -xzvf /h3cu/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C   /usr/local/src/
    
  2. Rename the extracted Sqoop directory to sqoop

    [root@master ~]# mv /usr/local/src/sqoop-1.4.7.bin__hadoop-2.6.0 /usr/local/src/sqoop
    
  3. Configure the Sqoop environment variables

    [root@master ~]# vi /root/.bash_profile 
    export SQOOP_HOME=/usr/local/src/sqoop
    export PATH=$PATH:$SQOOP_HOME/bin
    
  4. Load the environment variables and check the Sqoop version

    [root@master ~]# source /root/.bash_profile 
    [root@master ~]# sqoop version 
    Warning: /usr/local/src/sqoop/../hbase does not exist! HBase imports will fail.
    Please set $HBASE_HOME to the root of your HBase installation.
    Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /usr/local/src/sqoop/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    21/06/21 02:13:30 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Sqoop 1.4.7
    git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
    Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
    
  5. Add the MySQL driver

    [root@master ~]# cp /usr/local/src/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /usr/local/src/sqoop/lib/
    
  6. Configure sqoop-env.sh

    [root@master ~]# cp /usr/local/src/sqoop/conf/sqoop-env-template.sh /usr/local/src/sqoop/conf/sqoop-env.sh
    [root@master ~]# vi /usr/local/src/sqoop/conf/sqoop-env.sh
    export HADOOP_COMMON_HOME=/usr/local/src/hadoop
    export HADOOP_MAPRED_HOME=/usr/local/src/hadoop
    export HIVE_HOME=/usr/local/src/hive
    
  7. Test Sqoop (an import sketch follows the output below)

    [root@master ~]# sqoop list-databases --connect jdbc:mysql://master:3306 --username root --password password
    Warning: /usr/local/src/sqoop/../hbase does not exist! HBase imports will fail.
    Please set $HBASE_HOME to the root of your HBase installation.
    Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /usr/local/src/sqoop/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    21/06/21 02:20:05 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    21/06/21 02:20:05 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    21/06/21 02:20:05 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    information_schema
    hive
    mysql
    performance_schema
    sys
    test
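
Beyond listing databases, the usual next step is importing a table into HDFS. The sketch below is illustrative only: some_table stands for a hypothetical table in the MySQL test database, not one created in this walkthrough:

    sqoop import \
      --connect "jdbc:mysql://master:3306/test?useSSL=false" \
      --username root --password password \
      --table some_table \
      --target-dir /user/root/some_table \
      -m 1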
    

7. Deploy the ZooKeeper cluster

  1. Extract ZooKeeper from /h3cu into /usr/local/src

    [root@master ~]# tar -xvzf /h3cu/zookeeper-3.4.8.tar.gz -C /usr/local/src/
    
  2. Rename the extracted directory to zookeeper

    [root@master ~]# mv /usr/local/src/zookeeper-3.4.8 /usr/local/src/zookeeper
    
  3. Configure and load the ZooKeeper environment variables (current user only)

    [root@master ~]# vi /root/.bash_profile 
    export ZOOKEEPER_HOME=/usr/local/src/zookeeper
    export PATH=$PATH:$ZOOKEEPER_HOME/bin
    [root@master ~]# source /root/.bash_profile 
    
  4. Configure zoo.cfg

    Modify dataDir and append the three server lines below.

    [root@master ~]# cp /usr/local/src/zookeeper/conf/zoo_sample.cfg /usr/local/src/zookeeper/conf/zoo.cfg
    [root@master ~]# vi /usr/local/src/zookeeper/conf/zoo.cfg 
    dataDir=/usr/local/src/zookeeper/data
    server.1=master:2888:3888
    server.2=slave1:2888:3888
    server.3=slave2:2888:3888
    
  5. Create the myid file

    [root@master ~]# mkdir /usr/local/src/zookeeper/data
    [root@master ~]# echo "1" > /usr/local/src/zookeeper/data/myid
    
  6. Distribute the files to slave1 and slave2

    [root@master ~]# scp -r /usr/local/src/zookeeper slave1:/usr/local/src/ 
    [root@master ~]# scp -r /usr/local/src/zookeeper slave2:/usr/local/src/
    [root@master ~]# scp /root/.bash_profile slave1:/root/
    [root@master ~]# scp /root/.bash_profile slave2:/root/
    
  7. Update the myid file on slave1 and slave2

    • slave1

      [root@slave1 ~]# echo 2 > /usr/local/src/zookeeper/data/myid 
      
    • slave2

      [root@slave2 ~]# echo 3 > /usr/local/src/zookeeper/data/myid 
      
  8. Start ZooKeeper on each node

    • master

      [root@master ~]# source /root/.bash_profile 
      [root@master ~]# zkServer.sh start 
      ZooKeeper JMX enabled by default
      Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
      Starting zookeeper ... STARTED
      
    • slave1

      [root@slave1 ~]# source /root/.bash_profile 
      [root@slave1 ~]# zkServer.sh start 
      ZooKeeper JMX enabled by default
      Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
      Starting zookeeper ... STARTED
      
    • slave2

      [root@slave2 ~]# source /root/.bash_profile 
      [root@slave2 ~]# zkServer.sh start 
      ZooKeeper JMX enabled by default
      Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
      Starting zookeeper ... STARTED
      
  9. Check the ZooKeeper status on each node (a CLI connectivity check follows the output below)

    Note: the leader and followers are elected, so the roles are not tied to a particular machine.

    [root@master ~]# zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
    Mode: follower
    [root@master ~]# jps
    17120 QuorumPeerMain
    16449 SecondaryNameNode
    16594 ResourceManager
    16695 NodeManager
    16297 DataNode
    17230 Jps
    16175 NameNode
    [root@slave1 ~]# zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
    Mode: leader
    [root@slave1 ~]# jps
    15587 NodeManager
    15481 DataNode
    15721 Jps
    15050 QuorumPeerMain
    [root@slave2 ~]# zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /usr/local/src/zookeeper/bin/../conf/zoo.cfg
    Mode: follower
    [root@slave2 ~]# jps
    14965 QuorumPeerMain
    15480 NodeManager
    15374 DataNode
    15647 Jps
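
Optionally, connect to the ensemble with the ZooKeeper CLI to confirm that it answers client requests (a minimal check; type quit to leave the shell):

    zkCli.sh -server master:2181
    # at the zk prompt:
    ls /
    quit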
    

8. Deploy fully distributed HBase

  1. Extract HBase from /h3cu into /usr/local/src

    [root@master ~]# tar -xvzf /h3cu/hbase-1.2.1-bin.tar.gz -C /usr/local/src/
    
  2. Rename the extracted directory to hbase

    [root@master ~]# mv /usr/local/src/hbase-1.2.1/ /usr/local/src/hbase
    
  3. Configure the HBase environment variables

    [root@master ~]# vi /root/.bash_profile 
    export HBASE_HOME=/usr/local/src/hbase
    export PATH=$PATH:$HBASE_HOME/bin
    [root@master ~]# source /root/.bash_profile 
    
  4. Configure hbase-site.xml

    • Command

      [root@master ~]# vi /usr/local/src/hbase/conf/hbase-site.xml

    • Configuration file content

      <property>
        <!-- Run HBase in fully distributed mode -->
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <!-- HBase root directory on HDFS -->
        <name>hbase.rootdir</name>
        <value>hdfs://master:9000/hbase</value>
      </property>
      <property>
        <!-- ZooKeeper property data directory -->
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/usr/local/src/zookeeper/ZKdata</value>
      </property>
      <property>
        <!-- ZooKeeper client port -->
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
      </property>
      <property>
        <!-- ZooKeeper quorum hosts -->
        <name>hbase.zookeeper.quorum</name>
        <value>master,slave1,slave2</value>
      </property>
      <property>
        <!-- HMaster web UI port -->
        <name>hbase.master.info.port</name>
        <value>16010</value>
      </property>
      
  5. Configure hbase-env.sh

    [root@master ~]# vim /usr/local/src/hbase/conf/hbase-env.sh 
    export JAVA_HOME=/usr/local/src/java
    export HBASE_MANAGES_ZK=false
    
  6. Configure regionservers

    [root@master ~]# vim /usr/local/src/hbase/conf/regionservers 
    master
    slave1
    slave2
    
  7. Configure the standby master (backup-masters)

    [root@master ~]# vim /usr/local/src/hbase/conf/backup-masters
    slave1
    
  8. Distribute to slave1 and slave2

    [root@master ~]# scp /root/.bash_profile slave1:/root/
    [root@master ~]# scp /root/.bash_profile slave2:/root/
    [root@master ~]# scp -r /usr/local/src/hbase slave1:/usr/local/src/ 
    [root@master ~]# scp -r /usr/local/src/hbase slave2:/usr/local/src/ 
    
  9. Start HBase (a quick shell check follows the output below)

    [root@master ~]# start-hbase.sh 
    starting master, logging to /usr/local/src/hbase/logs/hbase-root-master-localhost.localdomain.out
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
    slave2: starting regionserver, logging to /usr/local/src/hbase/bin/../logs/hbase-root-regionserver-slave2.out
    slave1: starting regionserver, logging to /usr/local/src/hbase/bin/../logs/hbase-root-regionserver-slave1.out
    master: starting regionserver, logging to /usr/local/src/hbase/bin/../logs/hbase-root-regionserver-master.out
    slave1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
    slave1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
    slave2: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
    slave2: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
    master: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
    master: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
    slave1: starting master, logging to /usr/local/src/hbase/bin/../logs/hbase-root-master-slave1.out
    slave1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
    slave1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
    [root@master ~]# jps
    17120 QuorumPeerMain
    18273 SecondaryNameNode
    18114 DataNode
    19602 HRegionServer
    17987 NameNode
    18423 ResourceManager
    18537 NodeManager
    19465 HMaster
    19803 Jps
    [root@slave1 ~]# jps
    15587 NodeManager
    16099 HRegionServer
    16184 HMaster
    15481 DataNode
    15050 QuorumPeerMain
    16412 Jps
    [root@slave2 ~]# jps
    14965 QuorumPeerMain
    15480 NodeManager
    16090 Jps
    15374 DataNode
    15935 HRegionServer
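
A short HBase shell session verifies that the cluster accepts reads and writes (a sketch, assuming the HMaster and RegionServers shown above are healthy; the table name t1 is just an example):

    hbase shell
    # inside the shell:
    status
    create 't1', 'cf'
    put 't1', 'row1', 'cf:a', 'value1'
    scan 't1'
    exit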
    

9. Deploy fully distributed Spark

  1. Extract Spark from /h3cu into /usr/local/src

    [root@master ~]# tar -xvzf /h3cu/spark-2.0.0-bin-hadoop2.7.tgz -C /usr/local/src/
    
  2. Rename the extracted directory to spark

    [root@master ~]# mv /usr/local/src/spark-2.0.0-bin-hadoop2.7 /usr/local/src/spark 
    
  3. Configure spark-env.sh

    [root@master ~]# cp  /usr/local/src/spark/conf/spark-env.sh.template  /usr/local/src/spark/conf/spark-env.sh
    [root@master ~]# vi /usr/local/src/spark/conf/spark-env.sh
    # Java location
    export JAVA_HOME=/usr/local/src/java
    # IP address or hostname of the master node
    export SPARK_MASTER_IP=master
    # Memory available to each worker
    export SPARK_WORKER_MEMORY=1G
    # Number of CPU cores per worker
    SPARK_WORKER_CORES=1
    # Path to the Hadoop configuration directory
    export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop
    
  4. Configure slaves

    [root@master ~]# cp /usr/local/src/spark/conf/slaves.template /usr/local/src/spark/conf/slaves
    [root@master ~]# vi /usr/local/src/spark/conf/slaves
    master
    slave1
    slave2
    
  5. Distribute the files

    [root@master ~]# scp -r /usr/local/src/spark slave1:/usr/local/src/  
    [root@master ~]# scp -r /usr/local/src/spark slave2:/usr/local/src/
    
  6. Start the Spark cluster (a SparkPi smoke test follows the output below)

    [root@master ~]# /usr/local/src/spark/sbin/start-all.sh 
    starting org.apache.spark.deploy.master.Master, logging to /usr/local/src/spark/logs/spark-root-org.apache.spark.deploy.master.Master-1-localhost.localdomain.out
    slave2: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/src/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out
    slave1: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/src/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
    master: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/src/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-master.out
    [root@master ~]# jps
    17120 QuorumPeerMain
    18273 SecondaryNameNode
    18114 DataNode
    19602 HRegionServer
    17987 NameNode
    20949 Worker
    18423 ResourceManager
    20999 Jps
    18537 NodeManager
    19465 HMaster
    20860 Master
    [root@slave1 ~]# jps
    15587 NodeManager
    16099 HRegionServer
    17091 Worker
    17141 Jps
    16184 HMaster
    15481 DataNode
    15050 QuorumPeerMain
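
Finally, the bundled SparkPi example exercises the standalone cluster end to end (a sketch; the examples jar path below follows the Spark 2.0.0 binary distribution layout, so adjust it if your package differs):

    /usr/local/src/spark/bin/spark-submit \
      --master spark://master:7077 \
      --class org.apache.spark.examples.SparkPi \
      /usr/local/src/spark/examples/jars/spark-examples_2.11-2.0.0.jar 100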
    
    
