Hive Permission Management

For related material, see Chapter 18, "Security", of Programming Hive (《Hive编程指南》).

Environment:

HDP 2.4, hive-1.2.0, managed centrally through Ambari

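Hive currently supports simple permission management, but it is disabled by default. With it disabled, every user has the same privileges and is effectively a superuser who can view and modify every table in Hive, which violates the usual security principles of a data warehouse. The sections below describe Hive's permission management.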

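Hive users

1. Hive as a table storage layer.
    Users: clients of Hive's HCatalog API, such as Apache Pig, MapReduce, and some massively parallel databases. They have direct access to HDFS and the metastore server.
2. Hive as a SQL query engine.
    a. Hive CLI: also has direct access to HDFS and the Hive metastore.
    b. ODBC/JDBC and other HiveServer2 API users (Beeline CLI is an example): all operations go through HiveServer2.

Three Hive authorization models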

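1. Storage Based Authorization in the Metastore Server (SBA)

Typically used to authorize Metastore Server API calls. For Hive users 1 and 2a, Hive's own configuration does not control privileges; access is controlled through the permissions on the HDFS files. For Hive users 2b, this model requires hive.server2.enable.doAs=true.

Note: supported since Hive 0.12.0. HDFS ACLs (available from Hadoop 2.4 onward) allow more flexible control.
SBA protects the metadata in the metastore from malicious users and can control access to databases, tables, and partitions, but it does not provide fine-grained (column-level or row-level) access control.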
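
To make the storage-based model more concrete, here is a minimal Python sketch of the kind of check it relies on: whether a user may read or write a table's directory, given the directory's owner, group, and permission bits. It is a simplified illustration, not Hive's or HDFS's actual implementation. It ignores HDFS ACLs, the HDFS superuser, and special bits, and the names HdfsDir and allowed are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HdfsDir:
    """Hypothetical stand-in for the metadata of a table directory."""
    owner: str
    group: str
    mode: str  # nine permission characters, e.g. "rwxr-x---"


def allowed(user, groups, directory, access):
    """Return True if `user` has `access` ('r', 'w' or 'x') on `directory`.

    Follows the classic POSIX rule: the owner triad applies to the owner,
    the group triad to members of the directory's group, the last triad
    to everyone else. Exactly one triad is consulted.
    """
    if len(directory.mode) != 9:
        raise ValueError("mode must have nine characters, e.g. 'rwxr-x---'")
    offset = {"r": 0, "w": 1, "x": 2}[access]
    if user == directory.owner:
        triad = directory.mode[0:3]
    elif directory.group in groups:
        triad = directory.mode[3:6]
    else:
        triad = directory.mode[6:9]
    return triad[offset] == access
```

For a directory described as HdfsDir("hive", "hdfs", "rwxr-x---"), a member of group hdfs would be allowed to read but not to write, which is the level of granularity (whole directories, not columns or rows) that this model can express.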

hive-site.xml configuration:

<property>
  <name>hive.metastore.pre.event.listeners</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
  <description>turns on metastore-side security</description>
</property>

<property>
  <name>hive.security.metastore.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
  <description>This tells Hive which metastore-side authorization provider to use. The default setting uses DefaultHiveMetastoreAuthorizationProvider, which implements the standard Hive grant/revoke model. To use an HDFS permission-based model (recommended) to do your authorization, use StorageBasedAuthorizationProvider as instructed above.</description>
</property>

<property>
  <name>hive.security.metastore.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
  <description>authenticator manager class name to be used in the metastore for authentication.
  The user defined authenticator should implement interface
  org.apache.hadoop.hive.ql.security.HiveAuthenticationProvider.</description>
</property>

<property>
  <name>hive.security.metastore.authorization.auth.reads</name>
  <value>true</value>
  <description>default=true. When this is set to true, Hive metastore authorization also checks for read access.</description>
</property>

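2. SQL Standards Based Authorization in HiveServer2 (SSBA)

SQL standards based authorization gives Hive a third authorization option. It is fully compatible with the SQL authorization model and does not introduce backward-compatibility problems for existing users, so it is the recommended choice. Once users have migrated to this more secure mechanism, the default authorization mechanism can be retired.

SQL standards based authorization can be combined with storage based authorization (in the Hive Metastore Server).

Authorization is checked against the identity of the user who submits the SQL statement, but the statement is executed as the HiveServer2 user (the user that runs the HiveServer2 process). The HiveServer2 user must therefore have the required permissions on the corresponding directories and files; which permissions are needed depends on the statement.

Because enforcement happens in HiveServer2, this model can provide fine-grained, column-level and row-level access control.
Under this model, users who are allowed to use the Hive CLI, HDFS commands, the Pig command line, 'hadoop jar', and similar tools are called privileged users. Within an organization, usually only the teams that run ETL jobs need this kind of access. These tools do not go through HiveServer2, so they are not governed by this authorization model. For users who need to access Hive tables through the Hive CLI, Pig, or MapReduce, enable Storage Based Authorization in the Hive Metastore Server (otherwise the Hive CLI cannot be access-controlled at all). Other cases may need to be combined with Hadoop's own security mechanisms.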
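Restrictions:

(1) When authorization is enabled, the commands dfs, add, delete, compile, and reset are disabled.

(2) The transform clause is disabled.

(3) The set of configuration parameters that a user may change in a session is restricted to a whitelist, which can be adjusted through hive.security.authorization.sqlstd.confwhitelist (in hive-site.xml).

(4) Adding or dropping functions and macros is restricted to users with the admin role.

(5) So that users can still use (custom) functions, the ability to create permanent functions was added. A user with the admin role can run this command to add functions, and every function added this way can be used by all users.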

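hive-site.xml configuration:

<property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
</property>

<property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
</property>

<property>
    <name>hive.users.in.admin.role</name>
    <value>hive</value>
</property>

<property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>

<property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>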
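
As a quick sanity check on the settings above, the sketch below parses a hive-site.xml document and reports which of the SSBA properties are missing or set to a different value. It is only an illustration built on Python's standard xml.etree.ElementTree. The helper names read_properties and check_ssba are hypothetical, and the code assumes the usual configuration/property/name/value layout of Hadoop configuration files.

```python
import xml.etree.ElementTree as ET

# Values taken from the hive-site.xml settings listed in this section.
EXPECTED = {
    "hive.security.authorization.enabled": "true",
    "hive.server2.enable.doAs": "false",
    "hive.security.authorization.manager":
        "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd."
        "SQLStdHiveAuthorizerFactory",
    "hive.security.authenticator.manager":
        "org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator",
}


def read_properties(xml_text):
    """Return a {name: value} dict for every <property> in the document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_ssba(xml_text):
    """Return {name: actual_value} for settings that differ from EXPECTED.

    A value of None means the property is absent from the file.
    """
    props = read_properties(xml_text)
    return {
        name: props.get(name)
        for name, wanted in EXPECTED.items()
        if props.get(name) != wanted
    }
```

An empty result from check_ssba means all four properties match. hive.users.in.admin.role is left out of the check on purpose, since its value depends on the site.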


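Notes:

(1) A user who has the admin role must run the command "set role admin" to obtain the privileges of the admin role.

After the following property is configured, the mapping can be seen in the metastore tables:

<property>
    <name>hive.users.in.admin.role</name>
    <value>hive</value>
</property>

Query: select * from Role_Map;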

a. [Important] Before restarting HiveServer2, first grant the admin role to the user in Beeline.

grant admin to user mapr;
This makes sure the specified admin user actually has the admin role.
If this step is skipped in Hive 0.13, the user cannot switch to the admin role later, even if the user is listed in hive.users.in.admin.role.
For example:
0: jdbc:hive2://xxx:10000/default> set hive.users.in.admin.role;
+----------------------------------------------+
|                     set                      |
+----------------------------------------------+
| hive.users.in.admin.role=mapr                |
+----------------------------------------------+
1 row selected (0.05 seconds)

0: jdbc:hive2://xxx:10000/default> set role admin;    
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. mapr doesn't belong to role admin (state=08S01,code=1)
b. Start HiveServer2 with the following additional command-line options.

-hiveconf hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
-hiveconf hive.security.authorization.enabled=true
-hiveconf hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator
c.  Test admin role.

0: jdbc:hive2://xxx:xxx/default> set role admin;                                           
No rows affected (0.824 seconds)
0: jdbc:hive2://xxx:xxx/default> show current roles;
+--------+
|  role  |
+--------+
| admin  |
|        |
+--------+
2 rows selected (0.391 seconds)

(2) HiveServer2 can be configured to use an embedded metastore (I have not tested this).

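hiveserver2-site.xml configuration:

<property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
</property>

<property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>

<property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>

<property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
</property>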

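3. Hive Default Authorization

 Drawback: it resembles the authorization model of relational databases, but it does not define who is permitted to grant privileges, so any user can grant and revoke them.
 Like model 2, it controls privileges through grant/revoke statements. It is also supported in the Hive CLI.

Hive privileges can be defined at several levels: users, groups, and roles. If the privilege check passes at any one of these levels, the Hive operation is allowed to run.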
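
The "any one level passes" rule can be sketched in a few lines of Python. This is a toy model for illustration only, not Hive's implementation. The function name is_authorized, the shape of the grants dictionary, and the example principals are all assumptions made for the sketch.

```python
def is_authorized(privilege, user, groups, roles, grants):
    """Return True if any principal associated with the user holds `privilege`.

    `grants` maps a (principal_type, principal_name) pair, such as
    ("USER", "tom") or ("ROLE", "etl"), to the set of privileges
    granted to that principal.
    """
    principals = [("USER", user)]
    principals += [("GROUP", group) for group in groups]
    principals += [("ROLE", role) for role in roles]
    # One successful check at any level is enough.
    return any(privilege in grants.get(p, set()) for p in principals)
```

With grants such as {("USER", "tom"): {"SELECT"}, ("ROLE", "etl"): {"SELECT", "INSERT"}}, user tom may INSERT only while holding the hypothetical role etl, because the check at the role level succeeds even though the check at the user level does not.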

By default, the metastore uses HadoopDefaultAuthenticator.

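hive-site.xml configuration:

<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
  <description>Enable or disable the hive client authorization</description>
</property>

<property>
  <name>hive.security.authorization.createtable.owner.grants</name>
  <value>ALL</value>
  <description>The privileges automatically granted to the owner whenever
  a table gets created. An example like "select,drop" will grant select
  and drop privilege to the owner of the table</description>
</property>

<property>
  <name>hive.security.authorization.createtable.user.grants</name>
  <value>ALL</value>
</property>

<property>
  <name>hive.security.authorization.createtable.group.grants</name>
  <value>ALL</value>
</property>

<property>
  <name>hive.security.authorization.createtable.role.grants</name>
  <value>ALL</value>
</property>

<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider</value>
  <description>The hive client authorization manager class name.</description>
</property>

<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator</value>
</property>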

My configuration:

In Ambari, configuring Hive as above produces an error. After setting hive.security.authorization.enabled=true, Ambari reports: hive_security_authorization should not be None if hive.security.authorization.enabled is set.

For how to configure it, see: https://issues.apache.org/jira/browse/AMBARI-11575

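Privileges, Users, and Roles

(1) Privileges can be granted to users and to roles.

(2) A user can have one or more roles.

The default role is public, and every user has the public role. Only the admin role can create and drop roles or list all roles with show roles.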
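
The relationship between granted roles and the roles that are active in a session can be modelled with a small Python sketch. It is a simplified illustration under stated assumptions, not Hive's code: the class name Session and its methods are hypothetical, and the model assumes that the admin role stays inactive until it is selected explicitly with set role and that selecting a role makes it the only active one.

```python
class Session:
    """Toy model of one user's roles within a single session."""

    def __init__(self, user, granted_roles):
        self.user = user
        # Every user implicitly holds the public role.
        self.granted = set(granted_roles) | {"public"}
        # Assumption: admin is not active until selected explicitly.
        self.current = self.granted - {"admin"}

    def set_role(self, role):
        """Mimic `set role <role>`: the role becomes the only active one."""
        if role not in self.granted:
            raise PermissionError(
                f"{self.user} doesn't belong to role {role}")
        self.current = {role}

    def can_manage_roles(self):
        """Creating and dropping roles needs an active admin role."""
        return "admin" in self.current
```

In this model a user listed as an administrator still fails can_manage_roles() until set_role("admin") has been called, and a user without the role gets an error similar to the "doesn't belong to role admin" message Hive prints.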

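Metastore tables:

Db_privs: records the privileges of users/roles on databases
Tbl_privs: records the privileges of users/roles on tables
Tbl_col_privs: records the privileges of users/roles on table columns
Roles: records all roles that have been created
Role_map: records the mapping between users and roles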

hive> set role admin;
hive> show roles;
OK
admin
public
role_test1
hive> create role role_test2;
OK
Time taken: 0.644 seconds
hive> drop  role role_test1;
hive> show roles;
OK
admin
public
role_test2

Grant/Revoke Roles

SHOW GRANT principal_specification
[ON object_specification [(column_list)]]
 
principal_specification:
    USER user
  | GROUP group
  | ROLE role
 
object_specification:
    TABLE tbl_name
  | DATABASE db_name

Example: GRANT UPDATE ON table test to user hive;
hive> GRANT UPDATE  ON table test to user hive;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Permission denied: Principal [name=hive, type=USER] does not have following privileges for operation GRANT_PRIVILEGE [[UPDATE with grant] on Object [type=TABLE_OR_VIEW, name=default.test]]
hive> set role admin;
OK
Time taken: 0.037 seconds
hive> show current roles;
OK
admin
Time taken: 0.13 seconds, Fetched: 1 row(s)
hive> GRANT UPDATE  ON table test to user hive;
OK
Time taken: 1.006 seconds
hive> alter table test rename to test01;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. No privilege 'Alter' found for outputs { database:default, table:test}
hive> create database test02;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:No privilege 'Create' found for outputs { database:test02})
I don't understand this: why does the grant not take effect as expected?
0: jdbc:hive2://XXXX:10000> GRANT select ON DATABASE test TO USER hive;
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error getting object from metastore for Object [type=DATABASE, name=test] (state=08S01,code=1)
hive> GRANT SELECT ON TABLE test TO USER hive;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Permission denied: Principal [name=hive, type=USER] does not have following privileges for operation GRANT_PRIVILEGE [[SELECT with grant] on Object [type=TABLE_OR_VIEW, name=default.test]]
hive> set role admin;
OK
Time taken: 0.06 seconds
hive> GRANT SELECT ON TABLE test TO USER hive;
OK
Time taken: 0.566 seconds
hive> GRANT ALL ON TABLE test TO USER hive; ####### I don't understand why this fails #######
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error granting privileges: null 
 
  
hive> revoke insert on table test_hive from user hive;
OK
Time taken: 0.191 seconds
hive> show grant user hive on table test_hive;
OK
default    test_hive            hive    USER    DELETE    true    1467708923000    hive
default    test_hive            hive    USER    SELECT    true    1467708923000    hive
default    test_hive            hive    USER    UPDATE    true    1467708923000    hive
Time taken: 0.031 seconds, Fetched: 3 row(s)
hive> insert into test_hive values(5,"rows5");
Query ID = hive_20160707015802_fadf5833-7367-4538-b2a9-71230cd10efd
Total jobs = 1
Launching Job 1 out of 1
Tez session was closed. Reopening...
Session re-established.


Status: Running (Executing on YARN cluster with App id application_1467857886938_0008)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 56.18 s    
--------------------------------------------------------------------------------
Loading data to table default.test_hive
Failed with exception Unable to alter table. No privilege 'Alter' found for outputs { database:default, table:test_hive}
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
Error reported in hivemetastore.log:
2016-07-06 23:46:52,613 ERROR [pool-3-thread-161]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(159)) - MetaException(message:No privilege 'Alter' found for outputs { database:default, table:test}) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(HiveMetaStore.java:2026)

A very common error:

[root@l1035lab ~]# hive
Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.4.2-2/hive/lib/hive-jdbc-0.14.0.2.2.4.2-2-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException:
Permission denied: user=root, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)

Solution: the user needs a home directory on HDFS. Log in as the hdfs user and create a home directory for root.
# su  hdfs
$ hdfs dfs -mkdir /user/root
$ hdfs dfs -chown root:root /user/root
If /user/root already exists, check its ownership and permissions:
[root@ambari-2 ~]# hdfs dfs -ls /user
Found 6 items
drwxrwx---   - ambari-qa hdfs          0 2016-07-01 04:56 /user/ambari-qa
drwxr-xr-x   - hcat      hdfs          0 2016-06-29 00:13 /user/hcat
drwxr-xr-x   - hdfs      hdfs          0 2016-06-30 18:26 /user/hdfs
drwxr-xr-x   - hive      hdfs          0 2016-07-05 04:21 /user/hive
drwxr-xr-x   - root      root          0 2016-07-05 02:07 /user/root
drwxr-xr-x   - tom       tom           0 2016-07-06 20:20 /user/tom
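
When there are many users, the same ownership check can be scripted. The following Python sketch parses the text printed by hdfs dfs -ls /user and tells whether a user has a home directory that the user also owns. It is an illustration only: the function names are hypothetical, and it assumes the eight-column listing format shown above (permissions, replication, owner, group, size, date, time, path).

```python
def home_dir_owners(listing):
    """Map each directory path in `hdfs dfs -ls` output to its owner."""
    owners = {}
    for line in listing.splitlines():
        fields = line.split()
        # Keep only directory entries with the expected eight columns;
        # this skips header lines such as "Found 6 items".
        if len(fields) == 8 and fields[0].startswith("d"):
            owners[fields[7]] = fields[2]
    return owners


def has_own_home(listing, user):
    """Return True if /user/<user> exists and is owned by <user>."""
    return home_dir_owners(listing).get("/user/" + user) == user
```

The listing text could be captured, for example, by running the hdfs command through subprocess. A user for whom has_own_home returns False needs the mkdir and chown steps described above.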

References:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization#LanguageManualAuthorization-1StorageBasedAuthorizationintheMetastoreServer

http://m.blog.csdn.net/article/details?id=51312153

http://hadooptutorial.info/hive-authorization-models-and-hive-security/

http://www.cnblogs.com/yurunmiao/p/4441735.html
