Table of Contents
1. Creating a Spark Resource
2. Granting Resource Permissions
Spark serves as an external compute resource that Doris uses to carry out ETL work, so Doris introduces resource management to manage these external resources. Before submitting a Spark Load job in Doris, you must create the Spark Resource that will execute the ETL task. The syntax for creating a Spark Resource is as follows:
-- create spark resource
CREATE EXTERNAL RESOURCE resource_name
PROPERTIES
(
type = spark,
spark_conf_key = spark_conf_value,
working_dir = path,
broker = broker_name,
broker.property_key = property_value
)
-- drop spark resource
DROP RESOURCE resource_name
-- show resources
SHOW RESOURCES
SHOW PROC "/resources"
Here we create the Spark Resources. A Spark Resource can be configured for Spark Standalone client or cluster mode, or for YARN client or YARN cluster mode. Both a Standalone client example and a YARN cluster example are shown below; YARN cluster mode is the one used in the later demonstrations.
-- Spark Standalone client mode. Note: in our testing, Standalone client and cluster modes currently have a problem: the appid of the submitted job cannot be retrieved, so data cannot subsequently be loaded into Doris.
CREATE EXTERNAL RESOURCE "spark0"
PROPERTIES
(
"type" = "spark",
"spark.master" = "spark://node1:7077",
"spark.submit.deployMode" = "client",
"working_dir" = "hdfs://node1:8020/tmp/doris-standalone",
"broker" = "broker_name"
);
-- YARN cluster mode
CREATE EXTERNAL RESOURCE "spark1"
PROPERTIES
(
"type" = "spark",
"spark.master" = "yarn",
"spark.submit.deployMode" = "cluster",
"spark.executor.memory" = "1g",
"spark.hadoop.yarn.resourcemanager.address" = "node1:8032",
"spark.hadoop.fs.defaultFS" = "hdfs://node1:8020",
"working_dir" = "hdfs://node1:8020/tmp/doris-yarn",
"broker" = "broker_name"
);
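If your YARN cluster runs ResourceManager in HA mode, the single `spark.hadoop.yarn.resourcemanager.address` property above is replaced by the HA-related properties. A sketch, assuming hypothetical hostnames node1/node2 and the rm-ids rm1,rm2 (adjust all of these to your cluster):

```sql
-- Sketch: Spark Resource against a YARN cluster with ResourceManager HA
-- (hostnames node1/node2 and rm-ids rm1,rm2 are placeholders)
CREATE EXTERNAL RESOURCE "spark_ha"
PROPERTIES
(
    "type" = "spark",
    "spark.master" = "yarn",
    "spark.submit.deployMode" = "cluster",
    "spark.hadoop.yarn.resourcemanager.ha.enabled" = "true",
    "spark.hadoop.yarn.resourcemanager.ha.rm-ids" = "rm1,rm2",
    "spark.hadoop.yarn.resourcemanager.hostname.rm1" = "node1",
    "spark.hadoop.yarn.resourcemanager.hostname.rm2" = "node2",
    "spark.hadoop.fs.defaultFS" = "hdfs://node1:8020",
    "working_dir" = "hdfs://node1:8020/tmp/doris-yarn-ha",
    "broker" = "broker_name"
);
```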
After the Spark Resources are created, you can view and delete them with the following commands:
# View resources
mysql> show resources;
+--------+--------------+-------------------------+----------------------------------------+
| Name   | ResourceType | Item                    | Value                                  |
+--------+--------------+-------------------------+----------------------------------------+
| spark0 | spark        | spark.master            | spark://node1:7077                     |
| spark0 | spark        | spark.submit.deployMode | client                                 |
| spark0 | spark        | working_dir             | hdfs://node1:8020/tmp/doris-standalone |
| spark0 | spark        | broker                  | broker_name                            |
+--------+--------------+-------------------------+----------------------------------------+
# Delete a resource
mysql> drop resource spark0;
Query OK, 0 rows affected (0.03 sec)
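If you want to confirm that the drop took effect, or inspect a single resource, SHOW RESOURCES accepts a WHERE filter (a sketch; syntax per the Doris SHOW RESOURCES statement):

```sql
-- Check a single resource by name (returns no rows after the DROP above)
SHOW RESOURCES WHERE NAME = "spark0";
-- Or list only resources of a given type
SHOW RESOURCES WHERE RESOURCETYPE = "spark";
```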
Recreate the spark0 and spark1 resources above so they are available for the Spark Load jobs later.
Ordinary accounts can only see resources on which they hold the USAGE_PRIV permission; the root and admin accounts can see all resources. Resource permissions are managed with GRANT/REVOKE, and currently only the USAGE_PRIV permission is supported. You can grant USAGE_PRIV to a user as follows:
-- Grant usage of the spark0 resource to user user0
GRANT USAGE_PRIV ON RESOURCE "spark0" TO "user0"@"%";
-- Grant usage of all resources to user user0
GRANT USAGE_PRIV ON RESOURCE * TO "user0"@"%";
-- Revoke user0's usage permission on the spark0 resource
REVOKE USAGE_PRIV ON RESOURCE "spark0" FROM "user0"@"%";
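Note that the target user must already exist before you grant to it. A sketch of the full flow, assuming a hypothetical user user0 and a placeholder password:

```sql
-- The user must exist before granting; create it first if needed
-- ('password' is a placeholder)
CREATE USER 'user0'@'%' IDENTIFIED BY 'password';
-- Grant usage on the spark0 resource, then verify from a root/admin session
GRANT USAGE_PRIV ON RESOURCE "spark0" TO 'user0'@'%';
SHOW GRANTS FOR 'user0'@'%';
```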
Since we are operating as the root user here, no additional resource permission grants are needed.