004 Hadoop Architecture in Detail - HDFS, Yarn & MapReduce

Hadoop has now become a popular solution for today's data needs. The design of Hadoop keeps various goals in mind: fault tolerance, handling of large datasets, data locality, portability across heterogeneous hardware and software platforms, and so on. In this blog, we will explore the Hadoop Architecture in detail. Also, we will see the Hadoop Architecture Diagram, which will help you understand it better.


So, let’s explore Hadoop Architecture.


Hadoop Architecture in Detail - HDFS, Yarn & MapReduce

What is Hadoop Architecture?

Hadoop has a master-slave topology. In this topology, we have one master node and multiple slave nodes. The master node's function is to assign tasks to the various slave nodes and to manage resources. The slave nodes do the actual computing. Slave nodes store the real data, whereas the master stores the metadata, that is, data about data. What this metadata comprises, we will see in a moment.


Hadoop Application Architecture in Detail


Hadoop Architecture comprises three major layers. They are:-


  • HDFS (Hadoop Distributed File System)
  • Yarn
  • MapReduce

1. HDFS

HDFS stands for Hadoop Distributed File System. It provides the data storage layer of Hadoop. HDFS splits the data unit into smaller units called blocks and stores them in a distributed manner. It has two daemons running: NameNode on the master node and DataNode on the slave nodes.


a. NameNode and DataNode

HDFS has a master-slave architecture. The daemon called NameNode runs on the master server. It is responsible for namespace management and regulates file access by clients. The DataNode daemon runs on the slave nodes and is responsible for storing the actual business data. Internally, a file gets split into a number of data blocks and stored on a group of slave machines. The NameNode manages modifications to the file system namespace, such as opening, closing and renaming files or directories. It also keeps track of the mapping of blocks to DataNodes. The DataNodes serve read/write requests from the file system's clients, and they create, delete and replicate blocks on demand from the NameNode.


[Figure: Hadoop Architecture Diagram]

Java is the native language of HDFS. Hence one can deploy DataNode and NameNode on any machine having Java installed. In a typical deployment, there is one dedicated machine running the NameNode, and all the other nodes in the cluster run DataNodes. The NameNode holds metadata, such as the location of blocks on the DataNodes, and arbitrates resources among the various competing DataNodes.

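Since Java is HDFS's native language, a client interaction is easiest to show in Java. Below is a minimal, illustrative sketch using the standard org.apache.hadoop.fs.FileSystem API; the NameNode address hdfs://namenode-host:9000 and the file path are hypothetical placeholders, not values from this post.

```java
// Minimal sketch of an HDFS client write, assuming a reachable cluster;
// "hdfs://namenode-host:9000" and the file path are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode-host:9000"); // hypothetical NameNode

        // The client asks the NameNode for metadata; the file bytes themselves
        // are streamed to and from DataNodes behind the scenes.
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/user/demo/hello.txt");
        try (FSDataOutputStream out = fs.create(file)) {
            out.writeUTF("Hello HDFS");
        }
        System.out.println("File exists: " + fs.exists(file));
        fs.close();
    }
}
```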

You must read about Hadoop High Availability Concept


b. Block in HDFS

Block is nothing but the smallest unit of storage on a computer system. It is the smallest contiguous storage allocated to a file. In Hadoop, the default block size is 128MB or 256MB.


One should select the block size very carefully. To explain why, let us take the example of a file which is 700MB in size. If our block size is 128MB, then HDFS divides the file into 6 blocks: five blocks of 128MB and one block of 60MB. But what would happen if the block size were 4KB? In HDFS we deal with files whose sizes are on the order of terabytes to petabytes. With a 4KB block size, we would have an enormous number of blocks. This, in turn, would create huge metadata and overload the NameNode. Hence we have to choose our HDFS block size judiciously.

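To make the arithmetic above concrete, here is a tiny sketch in plain Java with the example's numbers hard-coded:

```java
// Tiny sketch reproducing the example: a 700MB file with 128MB blocks.
public class BlockMath {
    public static void main(String[] args) {
        long fileSizeMb = 700;
        long blockSizeMb = 128;
        long fullBlocks = fileSizeMb / blockSizeMb;                 // 5
        long lastBlockMb = fileSizeMb % blockSizeMb;                // 60
        long totalBlocks = fullBlocks + (lastBlockMb > 0 ? 1 : 0);  // 6
        System.out.printf("%d blocks: %d x %dMB, plus one %dMB block%n",
                totalBlocks, fullBlocks, blockSizeMb, lastBlockMb);
    }
}
```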

c. Replication Management

To provide fault tolerance, HDFS uses a replication technique: it makes copies of the blocks and stores them on different DataNodes. The replication factor decides how many copies of each block get stored. It is 3 by default, but we can configure it to any value.

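As a hedged sketch of how the replication factor can be set, the snippet below shows both the cluster-wide dfs.replication property and a per-file override through the FileSystem.setReplication() call; the file path is hypothetical.

```java
// Hedged sketch: cluster-wide default via dfs.replication, plus a per-file
// override through FileSystem.setReplication(). The path is hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("dfs.replication", "3"); // the out-of-the-box default

        FileSystem fs = FileSystem.get(conf);
        // Ask the NameNode to keep only 2 replicas of this particular file.
        boolean accepted = fs.setReplication(new Path("/user/demo/hello.txt"), (short) 2);
        System.out.println("Replication change accepted: " + accepted);
        fs.close();
    }
}
```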

[Figure: Hadoop Replication Factor]

The above figure shows how the replication technique works. Suppose we have a file of 1GB; with a replication factor of 3, it will require 3GB of total storage.


To maintain the replication factor, the NameNode collects a block report from every DataNode. Whenever a block is under-replicated or over-replicated, the NameNode adds or deletes replicas accordingly.


d. What is Rack Awareness?

[Figure: Hadoop Architecture]

A rack contains many DataNode machines, and there are several such racks in production. HDFS follows a rack awareness algorithm to place the replicas of the blocks in a distributed fashion. This rack awareness algorithm provides for low latency and fault tolerance. Suppose the configured replication factor is 3. The rack awareness algorithm will place the first replica on a node in the local rack, and it will keep the other two replicas on a different rack. If possible, it does not store more than two replicas in the same rack.

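Rack awareness is driven by configuration rather than application code: the NameNode resolves each DataNode to a rack ID by running an administrator-supplied topology script. The sketch below merely shows the relevant property; the script path is a hypothetical example.

```java
// Hedged sketch: rack awareness is configured, not coded. The NameNode maps
// each DataNode to a rack ID via an admin-supplied topology script; the
// script path below is a hypothetical example.
import org.apache.hadoop.conf.Configuration;

public class RackAwarenessSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // The script receives DataNode hostnames/IPs and prints rack IDs
        // such as "/datacenter1/rack1"; replica placement uses these IDs.
        conf.set("net.topology.script.file.name", "/etc/hadoop/topology.sh");
        System.out.println(conf.get("net.topology.script.file.name"));
    }
}
```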

2. MapReduce

MapReduce is the data processing layer of Hadoop. It is a software framework that allows you to write applications for processing a large amount of data. MapReduce runs these applications in parallel on a cluster of low-end machines. It does so in a reliable and fault-tolerant manner.


A MapReduce job comprises a number of map tasks and reduce tasks. Each task works on a part of the data. This distributes the load across the cluster. The function of the map tasks is to load, parse, transform and filter data. Each reduce task works on a subset of the output from the map tasks, applying grouping and aggregation to this intermediate data.


The input file for a MapReduce job lives on HDFS. The InputFormat decides how to divide the input file into input splits. An input split is nothing but a byte-oriented view of a chunk of the input file. Each input split gets loaded by a map task. The map task runs on the node where the relevant data is present, so the data need not move over the network and gets processed locally.

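As a hedged illustration, the snippet below wires a job to its input: TextInputFormat produces roughly one input split per block, and the standard mapreduce.input.fileinputformat.split.minsize property can nudge the split size. The input path is hypothetical, and the job is only configured here, not submitted.

```java
// Hedged sketch: TextInputFormat yields roughly one input split per block;
// the min-split property can nudge this. The input path is hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class InputSplitSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Lower bound on split size in bytes (here, 128MB).
        conf.setLong("mapreduce.input.fileinputformat.split.minsize", 128L * 1024 * 1024);

        Job job = Job.getInstance(conf, "split-demo");
        job.setInputFormatClass(TextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path("/user/demo/input"));
    }
}
```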

[Figure: Hadoop Architecture - MapReduce]

i. Map Task

The map task runs in the following phases:-


a. RecordReader

The RecordReader transforms the input split into records. It parses the data into records but does not interpret the records themselves. It provides the data to the mapper function as key-value pairs. Usually, the key is the positional information and the value is the data that comprises the record.


b. Map

In this phase, the mapper, which is a user-defined function, processes the key-value pairs from the RecordReader. It produces zero or more intermediate key-value pairs.


The decision of what the key-value pair will be lies with the mapper function. The key is usually the data on which the reducer function performs the grouping operation, and the value is the data which gets aggregated in the reducer function to produce the final result.

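A hedged WordCount-style mapper illustrates these ideas: with TextInputFormat, the RecordReader delivers the byte offset as the key and the line of text as the value, and the mapper chooses the intermediate key-value pairs, here one (word, 1) pair per token. The class name is mine, for illustration only.

```java
// Hedged WordCount-style mapper. With TextInputFormat, the RecordReader
// delivers (byte offset, line of text) pairs; the mapper decides the
// intermediate key-value pairs, here (word, 1).
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        for (String token : line.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE); // zero or more pairs per input record
            }
        }
    }
}
```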

c. Combiner

The combiner is actually a localized reducer which groups the data in the map phase. It is optional. The combiner takes the intermediate data from the mapper and aggregates it, within the small scope of one mapper. In many situations, this decreases the amount of data that needs to move over the network. For example, moving (Hello World, 1) three times consumes more network bandwidth than moving (Hello World, 3) once. The combiner can provide a significant performance gain at little cost. However, the combiner is not guaranteed to execute, so the correctness of the overall algorithm cannot depend on it.

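Wiring in the combiner is a one-line job setting, sketched below. Because word-count summation is associative and commutative, the reducer class itself can double as the combiner; TokenMapper is the mapper sketched above, and SumReducer is sketched in the Reduce section below.

```java
// Hedged sketch of wiring in the optional combiner. TokenMapper and
// SumReducer are the example classes sketched elsewhere in this post.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CombinerWiring {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "wordcount");
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class); // optional; may run zero or more times
        job.setReducerClass(SumReducer.class);
    }
}
```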

d. Partitioner

The partitioner pulls the intermediate key-value pairs from the mapper and splits them into shards, one shard per reducer. By default, the partitioner takes the hashcode of the key and performs a modulus operation by the number of reducers: key.hashCode() % (number of reducers). This distributes the keyspace evenly over the reducers. It also ensures that equal keys from different mappers end up in the same reducer. The partitioned data from each map task gets written to the local file system, where it waits for the reducers to pull it.

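The default behaviour described above can be mirrored in a custom partitioner, sketched here for the word-count pipeline; this is essentially what Hadoop's built-in HashPartitioner does.

```java
// Hedged sketch mirroring the default hash-and-mod contract described above.
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class HashLikePartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        // Mask the sign bit so the modulus is never negative, then spread
        // the keyspace evenly across the reducers.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```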

ii. Reduce Task

The various phases in reduce task are as follows:


i. Shuffle and Sort

The reducer starts with the shuffle and sort step. This step downloads the data written by the partitioner to the machine where the reducer is running and sorts the individual data pieces into one large data list. The purpose of this sort is to collect equivalent keys together, so that the reduce task can iterate over them easily. This phase is not customizable; the framework handles everything automatically. However, the developer has control over how the keys get sorted and grouped through a comparator object.


ii. Reduce

The reducer performs the reduce function once per key grouping. The framework passes the function the key and an iterator object containing all the values pertaining to that key.

We can write a reducer to filter, aggregate and combine data in a number of different ways. Once the reduce function finishes, it gives zero or more key-value pairs to the OutputFormat. Like the map function, the reduce function changes from job to job, as it is the core logic of the solution.

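Continuing the word-count sketch, here is a hedged SumReducer: the framework calls reduce() once per key grouping with an iterator over all the values for that key, and the reducer emits the aggregated result. The class name is mine, for illustration only.

```java
// Hedged SumReducer pairing with TokenMapper above: reduce() runs once per
// key grouping, with an iterator over all the values for that key.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable total = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : counts) {
            sum += count.get(); // aggregate all values for this key
        }
        total.set(sum);
        context.write(key, total); // handed on to the OutputFormat
    }
}
```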

iii. OutputFormat

This is the final step. It takes the key-value pairs from the reducer and writes them to a file through the RecordWriter. By default, it separates the key and value with a tab and each record with a newline character. We can customize it to provide a richer output format, but nonetheless the final data gets written to HDFS.

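To tie the phases together, here is a hedged end-to-end driver for the word-count sketch. TextOutputFormat is the default tab-separated writer described above; the input and output paths are hypothetical.

```java
// Hedged end-to-end driver for the word-count sketch. TextOutputFormat is
// the default key<TAB>value writer; paths are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "wordcount");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path("/user/demo/input"));
        FileOutputFormat.setOutputPath(job, new Path("/user/demo/output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```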

[Figure: Hadoop MapReduce Architecture Diagram]

3. YARN

YARN, or Yet Another Resource Negotiator, is the resource management layer of Hadoop. The basic principle behind YARN is to separate resource management and job scheduling/monitoring into separate daemons. In YARN there is one global ResourceManager and a per-application ApplicationMaster. An application can be a single job or a DAG of jobs.


Inside the YARN framework, we have two daemons, ResourceManager and NodeManager. The ResourceManager arbitrates resources among all the competing applications in the system. The job of the NodeManager is to monitor resource usage by the containers and report it to the ResourceManager. The resources include CPU, memory, disk, network and so on.

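As an illustrative sketch, the YarnClient API below asks the ResourceManager for the node reports that NodeManagers heartbeat in; each report carries the node's resource capability (memory, vcores).

```java
// Hedged sketch: YarnClient queries the ResourceManager for the reports
// that NodeManagers send in, including each node's resource capability.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class ClusterReportSketch {
    public static void main(String[] args) throws Exception {
        YarnClient yarn = YarnClient.createYarnClient();
        yarn.init(new Configuration());
        yarn.start();
        for (NodeReport node : yarn.getNodeReports(NodeState.RUNNING)) {
            System.out.println(node.getNodeId() + " capability=" + node.getCapability());
        }
        yarn.stop();
    }
}
```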

The ApplicationMaster negotiates resources with the ResourceManager and works with the NodeManagers to execute and monitor the job.


[Figure: Hadoop Architecture]

The ResourceManager has two important components – Scheduler and ApplicationManager.


i. Scheduler

The Scheduler is responsible for allocating resources to the various applications. It is a pure scheduler: it does not track the status of applications, and it does not reschedule tasks which fail due to software or hardware errors. The Scheduler allocates resources based on the requirements of the applications.


ii. Application Manager

Following are the functions of the ApplicationManager:-


  • Accepts job submission.

  • Negotiates the first container for executing ApplicationMaster. A container incorporates elements such as CPU, memory, disk, and network.

  • Restarts the ApplicationMaster container on failure.


Functions of ApplicationMaster:-

  • Negotiates resource container from Scheduler.

  • Tracks the resource container status.

  • Monitors progress of the application.


We can scale YARN beyond a few thousand nodes through the YARN Federation feature. This feature enables us to tie multiple YARN clusters into a single massive cluster, allowing independent clusters to be clubbed together for a very large job.


iii. Features of Yarn

YARN has the following features:-

a. Multi-tenancy


YARN allows a variety of access engines (open-source or proprietary) on the same Hadoop data set. These access engines can do batch processing, real-time processing, iterative processing and so on.


b. Cluster Utilization

With the dynamic allocation of resources, YARN allows for good use of the cluster, as compared to the static map-reduce rules in previous versions of Hadoop, which gave lower utilization of the cluster.


c. Scalability

A data center's processing power keeps on expanding. YARN's ResourceManager focuses on scheduling and copes with the ever-expanding cluster, processing petabytes of data.


d. Compatibility

A MapReduce program developed for Hadoop 1.x can still run on YARN, without any disruption to processes that already work.



Best Practices For Hadoop Architecture Design


i. Embrace Redundancy, Use Commodity Hardware


Many companies venture into Hadoop through business users or an analytics group; the infrastructure folks pitch in later. These people often have no idea about Hadoop, and the result is an over-sized cluster which increases the budget manyfold. Hadoop was mainly created to provide cheap storage and deep data analysis. To achieve this, use JBOD, i.e. Just a Bunch Of Disks, and a single power supply.


ii. Start Small and Keep Focus


Many projects fail because of their complexity and expense. To avoid this, start with a small cluster of nodes and add nodes as you go along. Start with a small project so that the infrastructure and development folks can understand the internal working of Hadoop.


iii. Create Procedure For Data Integration


One of the features of Hadoop is that it allows dumping the data first and defining the data structure later. We can get data in easily with tools such as Flume and Sqoop. But it is essential to create a data integration process. This includes various layers such as staging, naming standards, location etc. Make proper documentation of data sources and where they live in the cluster.


iv. Use Compression Technique


Enterprises have a love-hate relationship with compression. There is a trade-off between performance and storage: although compression decreases the storage used, it decreases performance too. But Hadoop thrives on compression; it can decrease the storage used by up to 80%.

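As a hedged sketch, compressing the intermediate map output is the usual first place to trade CPU for disk and network; the standard mapreduce.* property names are shown below, with Snappy as one common codec choice.

```java
// Hedged sketch: enable compression of intermediate map output (and,
// optionally, of the final job output).
import org.apache.hadoop.conf.Configuration;

public class CompressionSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.set("mapreduce.map.output.compress.codec",
                 "org.apache.hadoop.io.compress.SnappyCodec");
        // The final job output can be compressed as well:
        conf.setBoolean("mapreduce.output.fileoutputformat.compress", true);
        System.out.println("map output compression: "
                + conf.get("mapreduce.map.output.compress"));
    }
}
```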

v. Create Multiple Environments


It is a best practice to build multiple environments for development, testing, and production. As Apache Hadoop has a wide ecosystem, different projects in it have different requirements. Hence there is a need for a non-production environment for testing upgrades and new functionalities.


Summary


Hence, in this Hadoop Application Architecture post, we saw that the design of Hadoop Architecture is such that it recovers itself whenever needed. Its redundant storage structure makes it fault-tolerant and robust. We are able to scale the system linearly. The MapReduce part of the design works on the principle of data locality: the framework moves the computation close to the data, decreasing the network traffic which would otherwise have consumed major bandwidth moving large datasets. Thus the overall architecture of Hadoop makes it an economical, scalable and efficient big data technology.


Hadoop Architecture is a very important topic for your Hadoop interview. We recommend you check the most-asked Hadoop interview questions once, as you will get many questions from Hadoop Architecture.


Did you enjoy reading Hadoop Architecture? Do share your thoughts with us.


https://data-flair.training/blogs/hadoop-architecture
