hadoop.job.ugi no longer takes effect as of Cloudera CDH3B3



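This puzzled me for several days before I finally found the cause. My company used to run stock hadoop-0.20.2, where setting hadoop.job.ugi from Java to a valid Hadoop user and group was enough to access HDFS normally, including creating and deleting files.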
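For reference, the old-style hadoop.job.ugi value is a comma-separated string: the user name first, followed by one or more group names. The sketch below only illustrates that layout in plain Java. JobUgiValue is a hypothetical helper, not a Hadoop class, and "hadoop,supergroup" is an assumed example value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Hadoop) showing the "user,group1,group2" layout. */
final class JobUgiValue {
    final String user;
    final List<String> groups;

    private JobUgiValue(String user, List<String> groups) {
        this.user = user;
        this.groups = groups;
    }

    /** Splits a hadoop.job.ugi style value into the user and its groups. */
    static JobUgiValue parse(String value) {
        String[] parts = value.split(",");
        if (parts.length < 2) {
            throw new IllegalArgumentException(
                    "expected a user name plus at least one group: " + value);
        }
        List<String> groups = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            groups.add(parts[i].trim());
        }
        return new JobUgiValue(parts[0].trim(), groups);
    }

    public static void main(String[] args) {
        // Assumed example value; use the user and groups of your own cluster.
        JobUgiValue v = JobUgiValue.parse("hadoop,supergroup");
        System.out.println(v.user + " -> " + v.groups);
    }
}
```

On stock hadoop-0.20.2 a string of this shape was simply set on the client configuration under the key hadoop.job.ugi; from CDH3b3 on, the key is ignored.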

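After upgrading to CDH3B4 this no longer worked. I searched through a lot of material without finding the reason, until I finally came across this: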


The hadoop.job.ugi configuration no longer has any effect. Instead, please use the UserGroupInformation.doAs API to impersonate other users on a non-secured cluster. (As of CDH3b3)

In other words, setting hadoop.job.ugi is now ignored. To act as another user on a non-secured cluster, call UserGroupInformation.doAs instead.


Incompatible changes from previous releases:


  • The TaskTracker configuration parameter mapreduce.tasktracker.local.cache.numberdirectories has been renamed to mapreduce.tasktracker.cache.local.numberdirectories. (As of CDH3u0)
  • The Job-level configuration parameters mapred.max.maps.per.node, mapred.max.reduces.per.node, mapred.running.map.limit, and mapred.running.reduce.limit configurations have been removed. (As of CDH3b4)
  • CDH3 no longer contains packages for Debian Lenny, Ubuntu Hardy, Jaunty, or Karmic. Check out these upgrade instructions if you are using an Ubuntu release past its end of life. If you are using a release for which Cloudera's Debian or RPM packages are not available, you can always use the tarballs from the CDH download page. (As of CDH3b4)
  • The hadoop.job.ugi configuration no longer has any effect. Instead, please use the UserGroupInformation.doAs API to impersonate other users on a non-secured cluster. (As of CDH3b3)
  • The UnixUserGroupInformation class has been removed. Please see the new methods in the UserGroupInformation class. (As of CDH3b3)
  • The resolution of groups for a user is now performed on the server side. For a user's group membership to take effect, it must be visible on the NameNode and JobTracker machines. (As of CDH3b3)
  • The mapred.tasktracker.procfsbasedprocesstree.sleeptime-before-sigkill configuration has been renamed to mapred.tasktracker.tasks.sleeptime-before-sigkill. (As of CDH3b3)
  • The HDFS and MapReduce daemons no longer run as a single shared hadoop user. Instead, the HDFS daemons run as hdfs and the MapReduce daemons run as mapred. See Changes in User Accounts and Groups in CDH3. (As of CDH3b3)
  • Due to a change in the internal compression APIs, CDH3 is incompatible with versions of the hadoop-lzo open source project prior to 0.4.9. (As of CDH3b3)
  • CDH3 changes the wire format for Hadoop's RPC mechanism. Thus, you must upgrade any existing client software at the same time as the cluster is upgraded. (All versions)
  • Zero values for the dfs.socket.timeout and dfs.datanode.socket.write.timeout configuration parameters are now respected. Previously zero values for these parameters resulted in a 5 second timeout. (As of CDH3u1)
  • When Hadoop's Kerberos integration is enabled, it is now required that either kinit be on the path for user accounts running the Hadoop client, or that the hadoop.kerberos.kinit.command configuration option be manually set to the absolute path to kinit. (As of CDH3u1)

Hive

  • The upgrade of Hive from CDH2 to CDH3 requires several manual steps. Please be sure to follow the upgrade guide closely. See Upgrading Hive and Hue in CDH3.
Source: https://ccp.cloudera.com/display/CDHDOC/Incompatible+Changes


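Back to the original problem: how do you use UserGroupInformation.doAs?

Suppose oozie wants to access HDFS, but only joe can access HDFS normally. In that case oozie needs to impersonate joe.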

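......
// Needs: java.security.PrivilegedExceptionAction, org.apache.hadoop.fs.FileSystem,
//        org.apache.hadoop.mapred.JobClient, org.apache.hadoop.security.UserGroupInformation
// user, conf (a JobConf) and someFilePath (a Path) come from the elided code above.
// Create a proxy UGI for "user", backed by the credentials of the logged-in superuser.
UserGroupInformation ugi =
        UserGroupInformation.createProxyUser(user, UserGroupInformation.getLoginUser());
ugi.doAs(new PrivilegedExceptionAction<Void>() {
    public Void run() throws Exception {
        // Submit a job
        JobClient jc = new JobClient(conf);
        jc.submitJob(conf);
        // OR access HDFS
        FileSystem fs = FileSystem.get(conf);
        fs.mkdirs(someFilePath);
        return null; // run() is declared to return Void, so it must return null
    }
});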
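To make the scoping of doAs easier to see, here is a toy model in plain Java. It is only a sketch: ToyUgi is a hypothetical stand-in that tracks a per-thread "effective user", and it is not how UserGroupInformation is implemented. The point it illustrates is that code inside run() acts as the proxy user (joe), while code outside still acts as the login user (oozie).

```java
import java.security.PrivilegedExceptionAction;

/** Toy stand-in for UserGroupInformation; hypothetical, for illustration only. */
final class ToyUgi {
    // Effective user for the current thread; null means "no doAs in progress".
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private final String effectiveUser;

    private ToyUgi(String effectiveUser) {
        this.effectiveUser = effectiveUser;
    }

    /** Mirrors the shape of createProxyUser(user, realUser); realUser is unused in this toy. */
    static ToyUgi createProxyUser(String user, String realUser) {
        return new ToyUgi(user);
    }

    /** The user that code on this thread acts as; falls back to loginUser outside doAs. */
    static String currentUser(String loginUser) {
        String u = CURRENT.get();
        return u != null ? u : loginUser;
    }

    /** Runs the action as the proxy user and restores the previous user afterwards. */
    <T> T doAs(PrivilegedExceptionAction<T> action) throws Exception {
        String previous = CURRENT.get();
        CURRENT.set(effectiveUser);
        try {
            return action.run();
        } finally {
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ToyUgi ugi = ToyUgi.createProxyUser("joe", "oozie");
        String inside = ugi.doAs(new PrivilegedExceptionAction<String>() {
            public String run() {
                return ToyUgi.currentUser("oozie");
            }
        });
        System.out.println(inside);                      // joe
        System.out.println(ToyUgi.currentUser("oozie")); // oozie
    }
}
```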

The following must be configured on the NameNode and JobTracker:

             <property>
               <name>hadoop.proxyuser.oozie.groups</name>
               <value>group1,group2</value>
               <description>Allow the superuser oozie to impersonate any members of the group group1 and group2</description>
             </property>
             <property>
               <name>hadoop.proxyuser.oozie.hosts</name>
               <value>host1,host2</value>
               <description>The superuser can connect only from host1 and host2 to impersonate a user</description>
             </property>
Without this configuration, the impersonation will not succeed.
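Read together, the two properties say that oozie may impersonate a user only when the request comes from a listed host and the impersonated user belongs to a listed group. The following plain-Java sketch models that rule as described in the property descriptions above. ProxyUserCheck is a hypothetical class, not Hadoop's implementation, and the real server-side check may differ in details.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Simplified model of the proxy-user rule; hypothetical, not Hadoop's code. */
final class ProxyUserCheck {
    private final Set<String> allowedGroups;
    private final Set<String> allowedHosts;

    /** Arguments mirror the values of hadoop.proxyuser.oozie.groups and .hosts. */
    ProxyUserCheck(String groupsValue, String hostsValue) {
        this.allowedGroups = split(groupsValue);
        this.allowedHosts = split(hostsValue);
    }

    private static Set<String> split(String csv) {
        Set<String> out = new HashSet<>();
        for (String s : csv.split(",")) {
            String t = s.trim();
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** True only if the host is allowed AND the impersonated user is in an allowed group. */
    boolean mayImpersonate(String clientHost, Set<String> userGroups) {
        if (!allowedHosts.contains(clientHost)) {
            return false;
        }
        for (String g : userGroups) {
            if (allowedGroups.contains(g)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ProxyUserCheck check = new ProxyUserCheck("group1,group2", "host1,host2");
        Set<String> joeGroups = new HashSet<>(Arrays.asList("group1"));
        System.out.println(check.mayImpersonate("host1", joeGroups)); // true
        System.out.println(check.mayImpersonate("host3", joeGroups)); // false
    }
}
```

Note that, per the release notes quoted earlier, the groups of the impersonated user are resolved on the server side, so joe's group membership must be visible on the NameNode and JobTracker machines.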

Caveats

The superuser must have kerberos credentials to be able to impersonate another user. It cannot use delegation tokens for this feature. It would be wrong if superuser adds its own delegation token to the proxy user ugi, as it will allow the proxy user to connect to the service with the privileges of the superuser.

However, if the superuser does want to give a delegation token to joe, it must first impersonate joe and get a delegation token for joe, in the same way as the code example above, and add it to the ugi of joe. In this way the delegation token will have the owner as joe.



For a detailed explanation of Secure Impersonation using UserGroupInformation.doAs, see:

http://hadoop.apache.org/common/docs/stable/Secure_Impersonation.html

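According to the above, for Java code to access Hadoop and operate normally, it needs Kerberos authentication, the proxy-user configuration shown earlier, and the UserGroupInformation.doAs approach.

If you do not do this, does the application really have to run as the hadoop user in order to work?!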
