
019 Hadoop 2.6 Multi Node Cluster Setup and Hadoop Installation

1. Hadoop 2.6 Multi Node Cluster Setup Tutorial – Objective


In this tutorial on installing Hadoop 2.6 multi-node cluster setup on Ubuntu, we will learn how to install a Hadoop 2.6 multi-node cluster with YARN. We will walk through the steps for installing Hadoop 2.6 on Ubuntu to set up a multi-node cluster: the recommended platform, the prerequisites to install Hadoop on master and slaves, the software required, how to start the Hadoop cluster, and how to stop it. It also covers installing Hadoop CDH5 to help you start programming in Hadoop.


Hadoop 2.6 Multi Node Cluster Setup and Hadoop Installation


2. Hadoop 2.6 Multi Node Cluster Setup

Let us now start with the steps to set up a Hadoop multi-node cluster on Ubuntu. First, let us understand the recommended platform for installing Hadoop on a multi-node cluster on Ubuntu.


2.1. Recommended Platform for Hadoop 2.6 Multi Node Cluster Setup


  • OS: Linux is supported as a development and production platform. You can use Ubuntu 14.04 or 16.04 or later (you can also use other Linux flavors like CentOS, Redhat, etc.)

  • Hadoop: Cloudera Distribution for Apache Hadoop CDH5.x (you can use Apache Hadoop 2.x)


2.2. Install Hadoop on Master


Let us now start by installing Hadoop on the master node in distributed mode.


I. Prerequisites for Hadoop 2.6 Multi Node Cluster Setup


Let us now start with the prerequisites to install Hadoop:
a. Add Entries in hosts file
Edit the hosts file and add entries for the master and slaves:


sudo nano /etc/hosts
MASTER-IP master
SLAVE01-IP slave01
SLAVE02-IP slave02

(NOTE: Replace MASTER-IP, SLAVE01-IP, and SLAVE02-IP with the actual IP address of the corresponding machine)
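For illustration only, with hypothetical private addresses the finished entries might look like this (your IPs will differ):

192.168.1.10 master
192.168.1.11 slave01
192.168.1.12 slave02

You can verify that the names resolve with a quick ping master / ping slave01 from each machine.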
b. Install Java 8 (Recommended Oracle Java)


  • Install Python Software Properties


sudo apt-get install python-software-properties


  • Add Repository


sudo add-apt-repository ppa:webupd8team/java


  • Update the source list


sudo apt-get update


  • Install Java


sudo apt-get install oracle-java8-installer
c. Configure SSH


  • Install Open SSH Server-Client


sudo apt-get install openssh-server openssh-client


  • Generate Key Pairs


ssh-keygen -t rsa -P ""


  • Configure passwordless SSH


Copy the content of .ssh/id_rsa.pub (of master) to .ssh/authorized_keys (of all the slaves as well as master)

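One way to do this, as a sketch (assuming the same username, e.g. ubuntu, on every node and the OpenSSH client installed in the previous step):

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys   # authorize the master's own key
ssh-copy-id ubuntu@slave01                        # append the master's public key on slave01
ssh-copy-id ubuntu@slave02                        # append the master's public key on slave02

ssh-copy-id appends id_rsa.pub to the remote ~/.ssh/authorized_keys and fixes permissions; you will be prompted for each slave's password one last time.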

  • Check by SSH to all the Slaves


ssh slave01
ssh slave02

II. Install Apache Hadoop in distributed mode


Let us now learn how to download and install Hadoop.
a. Download Hadoop
Below is the link to download Hadoop 2.x.
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.2.tar.gz
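If you are working on a headless server, the tarball can be fetched directly from the command line (a sketch using the URL above):

wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.2.tar.gz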
b. Untar Tarball
tar xzf hadoop-2.5.0-cdh5.3.2.tar.gz
(Note: All the required jars, scripts, configuration files, etc. are available in the HADOOP_HOME directory (hadoop-2.5.0-cdh5.3.2))


III. Hadoop Multi-Node Cluster Setup Configuration


Let us now learn how to set up the Hadoop configuration while installing Hadoop.
a. Edit .bashrc
Edit the .bashrc file located in the user's home directory and add the following environment variables:


export HADOOP_PREFIX="/home/ubuntu/hadoop-2.5.0-cdh5.3.2"
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}

(Note: After the above step, restart the Terminal/Putty so that all the environment variables come into effect)
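Alternatively, instead of restarting the terminal you can reload the file in the current shell:

source ~/.bashrc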
b. Check environment variables
Check whether the environment variables added in the .bashrc file are available:

bash
hdfs

(Neither command should give the error: command not found)
c. Edit hadoop-env.sh
Edit configuration file hadoop-env.sh (located in HADOOP_HOME/etc/hadoop) and set JAVA_HOME:
export JAVA_HOME=<path-to-java> (e.g., /usr/lib/jvm/java-8-oracle/)
d. Edit core-site.xml
Edit configuration file core-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/ubuntu/hdata</value>
</property>
</configuration>

Note: /home/ubuntu/hdata is a sample location; please specify a location where you have read/write privileges
e. Edit hdfs-site.xml
Edit configuration file hdfs-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>

f. Edit mapred-site.xml
Edit configuration file mapred-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

g. Edit yarn-site.xml
Edit configuration file yarn-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8040</value>
</property>
</configuration>

h. Edit slaves
Edit configuration file slaves (located in HADOOP_HOME/etc/hadoop) and add the following entries:

slave01
slave02

“Hadoop is set up on Master, now setup Hadoop on all the Slaves”
Refer to this guide to learn Hadoop features and design principles.


2.3. Install Hadoop On Slaves


I. Setup Prerequisites on all the slaves


Run the following steps on all the slaves:


  • Add Entries in hosts file

  • Install Java 8 (Recommended Oracle Java)


II. Copy configured setups from master to all the slaves


a. Create tarball of configured setup
tar czf hadoop.tar.gz hadoop-2.5.0-cdh5.3.2
(NOTE: Run this command on Master)
b. Copy the configured tarball to all the slaves
scp hadoop.tar.gz slave01:~
(NOTE: Run this command on Master)
scp hadoop.tar.gz slave02:~
(NOTE: Run this command on Master)
c. Untar the configured Hadoop setup on all the slaves
tar xzf hadoop.tar.gz
(NOTE: Run this command on all the slaves)
“Hadoop is set up on all the Slaves. Now Start the Cluster”
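As a convenience, steps b and c can be combined into a small loop run on the master (a sketch assuming the slave hostnames from /etc/hosts and the passwordless SSH configured earlier):

for host in slave01 slave02; do
  scp hadoop.tar.gz $host:~          # copy the configured tarball to the slave
  ssh $host "tar xzf hadoop.tar.gz"  # extract it in the slave's home directory
done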


2.4. Start the Hadoop Cluster


Let us now learn how to start the Hadoop cluster.


I. Format the name node


bin/hdfs namenode -format
(Note: Run this command on Master)
(NOTE: This should be done only once, when you install Hadoop; running it again later will delete all the data from HDFS)


II. Start HDFS Services


sbin/start-dfs.sh
(Note: Run this command on Master)


III. Start YARN Services


sbin/start-yarn.sh
(Note: Run this command on Master)


IV. Check for Hadoop services


a. Check daemons on Master


jps

NameNode
ResourceManager

b. Check daemons on Slaves

jps

DataNode
NodeManager
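With passwordless SSH in place, you can also check the slave daemons from the master in one pass (a sketch; assumes jps is on the remote user's PATH):

for host in slave01 slave02; do
  echo "--- $host ---"
  ssh $host jps   # should list DataNode and NodeManager
done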

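As an additional sanity check (optional, not part of the original steps), the HDFS admin report run on the master should list both slaves as live datanodes:

bin/hdfs dfsadmin -report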

2.5. Stop The Hadoop Cluster


Let us now see how to stop the Hadoop cluster.


I. Stop YARN Services


sbin/stop-yarn.sh
(Note: Run this command on Master)


II. Stop HDFS Services


sbin/stop-dfs.sh
(Note: Run this command on Master)
This is how we do Hadoop 2.6 multi-node cluster setup on Ubuntu.
After learning how to do Hadoop 2.6 multi-node cluster setup, follow this comparison guide to get a feature-wise comparison between Hadoop 2.x and Hadoop 3.x.
If you like this tutorial on Hadoop multi-node cluster setup, do let us know in the comment section. Our support team is happy to help you with any queries about Hadoop 2.6 multi-node cluster setup.


Related Topics –


Installation of Hadoop 3.x on Ubuntu on Single Node Cluster


Hadoop NameNode Automatic Failover


Reference


https://data-flair.training/blogs/hadoop-2-6-multinode-cluster-setup
