Windows Azure HDInsight

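Azure HDInsight is a service that deploys and provisions Apache™ Hadoop™ clusters in the cloud, providing a software framework designed to manage, analyze, and report on big data. It makes the HDFS/MapReduce software framework and related projects (such as Pig, Hive, and Sqoop) available in a simpler, more flexible, and more cost-effective environment.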

Azure HDInsight is a 100% Hadoop-based service in the cloud:

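1) Scales on demand and can process petabytes of data

2) Handles both structured and unstructured data

3) Supports development in Java or .NET

4) Requires no hardware to buy or maintain

5) Can be deployed on either Windows or Linux

6) Creates a Hadoop cluster within minutes

7) Lets you present results directly in Excel

8) Integrates easily with on-premises Hadoop clusters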

1. What is Hadoop in HDInsight?

Azure HDInsight deploys and provisions Apache Hadoop clusters in the cloud, providing a software framework designed to manage, analyze, and report on big data with high reliability and availability.

Hadoop often refers to the entire Hadoop ecosystem of components, which includes Storm and HBase clusters, as well as other technologies under the Hadoop umbrella.

It includes implementations of Storm, HBase, Pig, Hive, Sqoop, Oozie, Ambari, and so on. HDInsight also integrates with business intelligence (BI) tools such as Excel, SQL Server Analysis Services, and SQL Server Reporting Services.

Azure HDInsight deploys and provisions Hadoop clusters in the cloud, by using either Linux or Windows as the underlying OS.

*) HDInsight on Linux (Preview) - A Hadoop cluster on Ubuntu. Use this if you are familiar with Linux or Unix, are migrating from an existing Linux-based Hadoop solution, or want easy integration with Hadoop ecosystem components built for Linux.

*) HDInsight on Windows - A Hadoop cluster on Windows Server. Use this if you are familiar with Windows, are migrating from an existing Windows-based Hadoop solution, or want to integrate with .NET or other Windows capabilities.

The following table compares the two:

CATEGORY      | HADOOP ON LINUX                           | HADOOP ON WINDOWS
Cluster OS    | Ubuntu 12.04 Long Term Support (LTS)      | Windows Server 2012 R2
Cluster Type  | Hadoop                                    | Hadoop, HBase, Storm
Deployment    | Azure Portal, Azure CLI, Azure PowerShell | Azure Portal, Azure CLI, Azure PowerShell, HDInsight .NET SDK
Cluster UI    | Ambari                                    | Cluster Dashboard
Remote Access | Secure Shell (SSH)                        | Remote Desktop Protocol (RDP)

2. What are the Hadoop components?

In addition to the previous overall configurations, the following individual components are also included on HDInsight clusters.

 1) Ambari: Cluster provisioning, management, and monitoring.

 2) Avro (Microsoft .NET Library for Avro): Data serialization for the Microsoft .NET environment.

 3) Hive & HCatalog: Structured Query Language (SQL)-like querying, and a table and storage management layer.

 4) Mahout: Machine learning.

 5) MapReduce and YARN: Distributed processing and resource management.

 6) Oozie: Workflow management.

 7) Phoenix: Relational database layer over HBase.

 8) Pig: Simpler scripting for MapReduce transformations.

 9) Sqoop: Data import and export.

 10) Tez: Allows data-intensive processes to run efficiently at scale.

 11) ZooKeeper: Coordination of processes in distributed systems.
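
As a rough illustration of the MapReduce processing model named in the list above, the sketch below runs a word count entirely inside one Python process. It is not HDInsight or Hadoop API code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical and exist only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, used only to exercise the sketch.
    sample = ["big data on Hadoop", "Hadoop in the cloud"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel on many nodes, with YARN allocating the resources; this single-process version only shows how the data flows between the phases.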

3. Advantages of Hadoop in the cloud

As part of the Azure cloud ecosystem, Hadoop in HDInsight offers a number of benefits, among them:

·  Automatic provisioning of Hadoop clusters. HDInsight clusters are much easier to create than manually configuring Hadoop clusters. For details, see Provision Hadoop clusters in HDInsight.

·  State-of-the-art Hadoop components. For details, see What's new in the Hadoop cluster versions provided by HDInsight?.

·  High availability and reliability of clusters. See Availability and reliability of Hadoop clusters in HDInsight for details.

·  Efficient and economical data storage with Azure Blob storage, a Hadoop-compatible option. See Use Azure Blob storage with Hadoop in HDInsight for details.

·  Integration with other Azure services, including Web apps and SQL Database.

·  Low entry cost. Start a free trial, or consult HDInsight pricing details.

To read more about the advantages of Hadoop in HDInsight, see the Azure features page for HDInsight.




