Manage Apache Hadoop clusters in HDInsight by using the .NET SDK

05/14/2018

Learn how to manage HDInsight clusters by using the HDInsight .NET SDK.

Prerequisites

Before you begin this article, you must have the following:

An Azure subscription.

Connect to Azure HDInsight

You need the following NuGet packages:

Install-Package Microsoft.Rest.ClientRuntime.Azure.Authentication -Pre

Install-Package Microsoft.Azure.Management.ResourceManager -Pre

Install-Package Microsoft.Azure.Management.HDInsight

The following code sample shows how to connect to Azure so that you can administer the HDInsight clusters under your Azure subscription.

using System;
using Microsoft.Azure;
using Microsoft.Azure.Management.HDInsight;
using Microsoft.Azure.Management.HDInsight.Models;
using Microsoft.Azure.Management.ResourceManager;
using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Microsoft.Rest;
using Microsoft.Rest.Azure.Authentication;

namespace HDInsightManagement
{
    class Program
    {
        private static HDInsightManagementClient _hdiManagementClient;

        // Replace with your AAD tenant ID if necessary
        private const string TenantId = UserTokenProvider.CommonTenantId;

        // Replace the placeholder with your Azure subscription ID
        private const string SubscriptionId = "<Your Azure Subscription ID>";

        // This is the GUID for the PowerShell client. Used for interactive logins in this example.
        private const string ClientId = "1950a258-227b-4e31-a9cf-717495945fc2";

        static void Main(string[] args)
        {
            // Authenticate and get a token
            var authToken = Authenticate(TenantId, ClientId, SubscriptionId);

            // Flag subscription for HDInsight, if it isn't already.
            EnableHDInsight(authToken);

            // Get an HDInsight management client
            _hdiManagementClient = new HDInsightManagementClient(authToken);

            // insert code here

            System.Console.WriteLine("Press ENTER to continue");
            System.Console.ReadLine();
        }

        /// <summary>
        /// Authenticate to an Azure subscription and retrieve an authentication token
        /// </summary>
        static TokenCloudCredentials Authenticate(string TenantId, string ClientId, string SubscriptionId)
        {
            var authContext = new AuthenticationContext("https://login.microsoftonline.com/" + TenantId);
            var tokenAuthResult = authContext.AcquireToken("https://management.core.windows.net/",
                ClientId,
                new Uri("urn:ietf:wg:oauth:2.0:oob"),
                PromptBehavior.Always,
                UserIdentifier.AnyUser);
            return new TokenCloudCredentials(SubscriptionId, tokenAuthResult.AccessToken);
        }

        /// <summary>
        /// Marks your subscription as one that can use HDInsight, if it has not already been marked as such.
        /// </summary>
        /// <remarks>
        /// This is essentially a one-time action; if you have already done something with HDInsight
        /// on your subscription, then this isn't needed at all and will do nothing.
        /// </remarks>
        /// <param name="authToken">An authentication token for your Azure subscription</param>
        static void EnableHDInsight(TokenCloudCredentials authToken)
        {
            // Create a client for the Resource manager and set the subscription ID
            var resourceManagementClient = new ResourceManagementClient(new TokenCredentials(authToken.Token));
            resourceManagementClient.SubscriptionId = SubscriptionId;

            // Register the HDInsight provider
            var rpResult = resourceManagementClient.Providers.Register("Microsoft.HDInsight");
        }
    }
}

When you run this program, a sign-in prompt appears.

List clusters

The following code snippet lists clusters and some of their properties:

var results = _hdiManagementClient.Clusters.List();

foreach (var cluster in results.Clusters)
{
    Console.WriteLine("Cluster Name: " + cluster.Name);
    Console.WriteLine("\t Cluster type: " + cluster.Properties.ClusterDefinition.ClusterType);
    Console.WriteLine("\t Cluster location: " + cluster.Location);
    Console.WriteLine("\t Cluster version: " + cluster.Properties.ClusterVersion);
}

Delete clusters

Use the following code snippet to delete a cluster synchronously or asynchronously:

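// Replace the placeholders with your resource group name and cluster name.
_hdiManagementClient.Clusters.Delete("<Resource Group Name>", "<Cluster Name>");
_hdiManagementClient.Clusters.DeleteAsync("<Resource Group Name>", "<Cluster Name>");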
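The asynchronous variant returns a task, so the caller can await it and react to a failure. The following is a minimal sketch of that pattern, not the SDK's prescribed usage: the `deleteAsync` delegate is a stand-in for `_hdiManagementClient.Clusters.DeleteAsync` so that the sketch compiles without the SDK, and `ClusterCleanup` and `TryDeleteAsync` are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class ClusterCleanup
{
    // deleteAsync is a stand-in (assumption) for
    // _hdiManagementClient.Clusters.DeleteAsync(resourceGroupName, clusterName).
    public static async Task<bool> TryDeleteAsync(
        Func<string, string, Task> deleteAsync,
        string resourceGroupName,
        string clusterName)
    {
        try
        {
            // Control returns to the caller here until the task completes.
            await deleteAsync(resourceGroupName, clusterName);
            Console.WriteLine("Delete request completed for cluster: " + clusterName);
            return true;
        }
        catch (Exception ex)
        {
            // A sketch-level catch-all; real code would catch narrower exception types.
            Console.WriteLine("Delete failed for " + clusterName + ": " + ex.Message);
            return false;
        }
    }
}
```

To wire this to the management client, pass a lambda such as `(rg, name) => _hdiManagementClient.Clusters.DeleteAsync(rg, name)` as the first argument.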

Scale clusters

The cluster scaling feature lets you change the number of worker nodes used by a cluster that is running in Azure HDInsight without having to re-create the cluster.

Note

Only clusters with HDInsight version 3.1.3 or higher are supported. If you are unsure of the version of your cluster, you can check the Properties page.

Changing the number of data nodes affects each type of cluster supported by HDInsight differently:

Apache Hadoop

You can seamlessly increase the number of worker nodes in a running Hadoop cluster without affecting any pending or running jobs. New jobs can also be submitted while the operation is in progress. Failures in a scaling operation are handled gracefully, so the cluster is always left in a functional state.

When a Hadoop cluster is scaled down by reducing the number of data nodes, some of the services in the cluster are restarted. This causes all running and pending jobs to fail when the scaling operation completes. You can, however, resubmit the jobs once the operation is complete.

Apache HBase

You can seamlessly add nodes to or remove nodes from your HBase cluster while it is running. Region servers are automatically balanced within a few minutes of completing the scaling operation. However, you can also balance the region servers manually by logging in to the head node of the cluster and running the following commands from a command prompt window:

>pushd %HBASE_HOME%\bin

>hbase shell

>balancer

Apache Storm

You can seamlessly add data nodes to or remove data nodes from your Storm cluster while it is running. After the scaling operation completes successfully, however, you need to rebalance the topology.

Rebalancing can be accomplished in two ways:

Storm web UI

Command-line interface (CLI) tool

Please refer to the Apache Storm documentation for more details.

The Storm web UI is available on the HDInsight cluster.

Here is an example of how to use the CLI command to rebalance the Storm topology:

## Reconfigure the topology "mytopology" to use 5 worker processes,

## the spout "blue-spout" to use 3 executors, and

## the bolt "yellow-bolt" to use 10 executors

$ storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10

The following code snippet shows how to resize a cluster synchronously or asynchronously:

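// Replace the placeholders with your resource group name, cluster name, and target worker node count.
_hdiManagementClient.Clusters.Resize("<Resource Group Name>", "<Cluster Name>", <New Size>);
_hdiManagementClient.Clusters.ResizeAsync("<Resource Group Name>", "<Cluster Name>", <New Size>);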
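Because the effect of a resize depends on the cluster type and on whether the cluster grows or shrinks, it can help to surface the relevant caveat before calling Resize. The following is a minimal sketch of such a pre-check that only restates the scaling notes from this section. `ResizePlanner` and `Advise` are hypothetical names, the cluster type strings ("Hadoop", "HBase", "Storm") are an assumption about what the cluster's ClusterType property contains, and the minimum of one worker node is also an assumption.

```csharp
using System;

public static class ResizePlanner
{
    // Returns a short advisory for a planned resize, based on the notes in this article.
    public static string Advise(string clusterType, int currentCount, int targetCount)
    {
        // Assumption: a cluster needs at least one worker node.
        if (targetCount < 1)
            throw new ArgumentOutOfRangeException(nameof(targetCount),
                "Target worker node count must be at least 1.");

        if (targetCount == currentCount)
            return "No change: the cluster already has the requested number of worker nodes.";

        bool scalingDown = targetCount < currentCount;

        // Assumption: cluster type names match these strings, ignoring case.
        switch ((clusterType ?? string.Empty).ToLowerInvariant())
        {
            case "hadoop":
                return scalingDown
                    ? "Scaling down restarts some services; running and pending jobs fail and must be resubmitted."
                    : "Scaling up does not affect pending or running jobs.";
            case "hbase":
                return "Region servers rebalance automatically within a few minutes of the resize.";
            case "storm":
                return "Rebalance the topology after the resize completes.";
            default:
                return "No specific guidance in this article for cluster type '" + clusterType + "'.";
        }
    }
}
```

A caller could print `ResizePlanner.Advise("Hadoop", 4, 2)` and ask for confirmation before invoking Resize with the new size.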

Grant/revoke access

HDInsight clusters have the following HTTP web services (all of these services have RESTful endpoints):

ODBC

JDBC

Apache Ambari

Apache Oozie

Apache Templeton

By default, access to these services is granted. You can revoke or grant access. To revoke:

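var httpParams = new HttpSettingsParameters
{
    HttpUserEnabled = false,
    HttpUsername = "admin",
    HttpPassword = "*******",
};

// Replace the placeholders with your resource group name and cluster name.
_hdiManagementClient.Clusters.ConfigureHttpSettings("<Resource Group Name>", "<Cluster Name>", httpParams);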

To grant:

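var httpParams = new HttpSettingsParameters
{
    HttpUserEnabled = true,
    HttpUsername = "admin",
    HttpPassword = "*******",
};

// Replace the placeholders with your resource group name and cluster name.
_hdiManagementClient.Clusters.ConfigureHttpSettings("<Resource Group Name>", "<Cluster Name>", httpParams);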

Note

Granting or revoking access resets the cluster user name and password.

Update HTTP user credentials

The procedure is the same as for granting or revoking HTTP access. If the cluster has already been granted HTTP access, you must first revoke it, and then grant access again with the new HTTP user credentials.
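The revoke-then-grant order can be sketched as a small helper. This is an illustration under stated assumptions rather than an SDK facility: `HttpCredentialUpdater` and `Update` are hypothetical names, and the `configureHttp` delegate stands in for a call to `_hdiManagementClient.Clusters.ConfigureHttpSettings` with an `HttpSettingsParameters` object, so the sketch compiles without the SDK.

```csharp
using System;

public static class HttpCredentialUpdater
{
    // configureHttp(enabled, userName, password) is a stand-in (assumption) for:
    //   _hdiManagementClient.Clusters.ConfigureHttpSettings(
    //       "<Resource Group Name>", "<Cluster Name>",
    //       new HttpSettingsParameters
    //       {
    //           HttpUserEnabled = enabled,
    //           HttpUsername = userName,
    //           HttpPassword = password
    //       });
    public static void Update(
        Action<bool, string, string> configureHttp,
        string userName,
        string newPassword)
    {
        if (string.IsNullOrEmpty(userName))
            throw new ArgumentException("A user name is required.", nameof(userName));
        if (string.IsNullOrEmpty(newPassword))
            throw new ArgumentException("A password is required.", nameof(newPassword));

        // Step 1: revoke the existing HTTP access. The revoke sample in this
        // article also supplies a user name and password, so both are passed here.
        configureHttp(false, userName, newPassword);

        // Step 2: grant access again, this time with the new credentials.
        configureHttp(true, userName, newPassword);
    }
}
```

Keeping both calls in one helper makes it harder to forget the revoke step when rotating a password.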

Find the default storage account

The following code snippet demonstrates how to get the default storage account name and the default storage account key for a cluster.

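// Replace the placeholders with your resource group name and cluster name.
var results = _hdiManagementClient.Clusters.GetClusterConfigurations("<Resource Group Name>", "<Cluster Name>", "core-site");

foreach (var key in results.Configuration.Keys)
{
    Console.WriteLine(String.Format("{0} => {1}", key, results.Configuration[key]));
}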
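The snippet above prints every core-site setting, so the storage account still has to be picked out of the output. The following is a minimal sketch of that extraction. It assumes that the cluster uses Azure Blob storage, that `fs.defaultFS` has the form `wasb://<container>@<account>.blob.core.windows.net`, that the key is stored under `fs.azure.account.key.<account host>`, and that `results.Configuration` can be passed as an `IDictionary<string, string>`. `DefaultStorage` and `TryParse` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class DefaultStorage
{
    // Extracts the default container, storage account host, and account key
    // from core-site settings. Returns false when fs.defaultFS is missing or
    // is not in the assumed "scheme://container@host" form.
    public static bool TryParse(
        IDictionary<string, string> coreSite,
        out string accountHost,
        out string container,
        out string accountKey)
    {
        accountHost = null;
        container = null;
        accountKey = null;

        string defaultFs;
        if (!coreSite.TryGetValue("fs.defaultFS", out defaultFs) || string.IsNullOrEmpty(defaultFs))
            return false;

        int schemeEnd = defaultFs.IndexOf("://", StringComparison.Ordinal);
        int at = defaultFs.IndexOf('@');
        if (schemeEnd < 0 || at < schemeEnd + 3)
            return false;

        int containerStart = schemeEnd + 3;
        container = defaultFs.Substring(containerStart, at - containerStart);
        accountHost = defaultFs.Substring(at + 1).TrimEnd('/');

        // The key stays null if the configuration does not contain it.
        coreSite.TryGetValue("fs.azure.account.key." + accountHost, out accountKey);
        return true;
    }
}
```

With the assumptions above, calling `DefaultStorage.TryParse(results.Configuration, out host, out container, out key)` yields the values that the printed listing contains in raw form.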

Submit jobs

To submit MapReduce jobs

To submit Apache Hive jobs

To submit Apache Sqoop jobs

To submit Apache Oozie jobs

Upload data to Azure Blob storage
