Chapter 1: DolphinScheduler Basic Environment Setup

  • Official site
https://dolphinscheduler.apache.org

1. Preparation

(1) Extract the installation package

tar -xzvf apache-dolphinscheduler-1.3.9-bin.tar.gz -C /opt/module/
# extracted as: apache-dolphinscheduler-1.3.9-bin (referred to below as dolphinscheduler-bin; rename or symlink accordingly)

(2) Symlink the JDK's java binary into /usr/bin

sudo ln -s /opt/module/jdk1.8.0_212/bin/java /usr/bin/java
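
DolphinScheduler's scripts look up java on the PATH, so it is worth confirming the link resolves before moving on:

# the link should point at the JDK above and report version 1.8.0_212
ls -l /usr/bin/java
java -version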

2. MySQL Setup

(1) Copy the MySQL JDBC driver into DolphinScheduler's lib directory

cp /opt/software/3_mysql/mysql-connector-java-5.1.37.jar  /opt/module/dolphinscheduler-bin/lib/

(2) Create the dolphinscheduler database in MySQL

CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'atguigu'@'%' IDENTIFIED BY '123456';

GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'atguigu'@'localhost' IDENTIFIED BY '123456';

FLUSH PRIVILEGES;
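
The GRANT ... IDENTIFIED BY form above works on MySQL 5.x but was removed in MySQL 8.0; if you happen to be on 8.0, a sketch of the equivalent (same account and password as above, run as root):

mysql -uroot -p <<'SQL'
CREATE USER IF NOT EXISTS 'atguigu'@'%' IDENTIFIED BY '123456';
CREATE USER IF NOT EXISTS 'atguigu'@'localhost' IDENTIFIED BY '123456';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'atguigu'@'%';
GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'atguigu'@'localhost';
FLUSH PRIVILEGES;
SQL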

(3) Edit conf/datasource.properties

vi conf/datasource.properties

# change the following settings
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://192.168.6.102:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&allowMultiQueries=true

spring.datasource.username=atguigu
spring.datasource.password=123456
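
Before running the initialization script in the next step, a quick connectivity check with the same host and credentials can save a round of debugging (assumes the mysql client is installed on this node):

mysql -h192.168.6.102 -P3306 -uatguigu -p123456 -e "SHOW DATABASES LIKE 'dolphinscheduler';"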

(4) Initialize the DolphinScheduler metadata schema (the script runs the DDL against the database configured above)

sh script/create-dolphinscheduler.sh
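
If the script ran cleanly, the metadata tables (prefixed t_ds_ in 1.3.x) should now exist; a quick way to confirm:

mysql -h192.168.6.102 -uatguigu -p123456 -e "USE dolphinscheduler; SHOW TABLES LIKE 't_ds_%';"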

3. Environment Configuration

(1) Edit the conf/config/install_config.conf file

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# ---------------------------------------------------------
# INSTALL MACHINE
# ---------------------------------------------------------
# A comma separated list of machine hostname or IP would be installed DolphinScheduler,
# including master, worker, api, alert. If you want to deploy in pseudo-distributed
# mode, just write a pseudo-distributed hostname
# Example for hostnames: ips="ds1,ds2,ds3,ds4,ds5", Example for IP: ips="192.168.8.1,192.168.8.2,192.168.8.3,192.168.8.4,192.168.8.5"
# Hosts on which to deploy DS services; use localhost for a single machine, or list the cluster hosts as below
ips="hadoop102,hadoop103,hadoop104"

# Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
# modify it if you use different ssh port
# SSH port, default 22
sshPort="22"

# A comma separated list of machine hostname or IP would be installed Master server, it
# must be a subset of configuration `ips`.
# Example for hostnames: ips="ds1,ds2", Example for IP: ips="192.168.8.1,192.168.8.2"
# Host(s) for the Master server; localhost for a single machine, a cluster host as below
masters="hadoop102"

# A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>. All hostname or IP must be a
# subset of configuration `ips`, and workerGroup defaults to `default`, but we recommend declaring it behind each host
# Example for hostnames: ips="ds1:default,ds2:default,ds3:default", Example for IP: ips="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
# Hosts for the Worker servers, each tagged with the worker group it belongs to; `default` in the example below is the group name
workers="hadoop102:default,hadoop103:default,hadoop104:default"

# A comma separated list of machine hostname or IP would be installed Alert server, it
# must be a subset of configuration `ips`.
# Example for hostnames: ips="ds3", Example for IP: ips="192.168.8.3"
# Host for the Alert server
alertServer="hadoop103"

# A comma separated list of machine hostname or IP would be installed API server, it
# must be a subset of configuration `ips`.
# Example for hostnames: ips="ds1", Example for IP: ips="192.168.8.1"
# Host for the backend API server
apiServers="hadoop102"

# The directory to install DolphinScheduler for all machine we config above. It will automatically created by `install.sh` script if not exists.
# **DO NOT** set this configuration same as the current path (pwd)
# Directory to install DS into, e.g. /opt/soft/dolphinscheduler; it must differ from the current directory
installPath="/opt/module/dolphinscheduler"

# The user to deploy DolphinScheduler for all machine we config above. For now user must create by yourself before run `install.sh`
# script. The user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled then the root directory needs
# to be created by this user
# User to deploy as; for a cluster, passwordless SSH must be configured between nodes for this user
deployUser="atguigu"

# The directory to store local data for all machine we config above. Make sure user `deployUser` have permissions to read and write this directory.
# Local data directory; make sure the folder exists and `deployUser` can read and write it
dataBasedirPath="/tmp/dolphinscheduler"

# ---------------------------------------------------------
# DolphinScheduler ENV
# ---------------------------------------------------------
# JAVA_HOME, we recommend use same JAVA_HOME in all machine you going to install DolphinScheduler
# and this configuration only support one parameter so far.
# JAVA_HOME path
javaHome="/opt/module/jdk1.8.0_212"

# DolphinScheduler API service port, also the port of your DolphinScheduler UI component's URL, default value is 12345
apiServerPort="12345"

# ---------------------------------------------------------
# Database
# NOTICE: If database value has special characters, such as `.*[]^${}\+?|()@#&`, Please add prefix `\` for escaping.
# ---------------------------------------------------------
# The type for the metadata database
# Supported values: ``postgresql``, ``mysql``.
# Metadata database type; choose one of mysql or postgresql
dbtype="mysql"

# The <host>:<port> pair DolphinScheduler uses to connect to the metadata database
# database connection address
dbhost="hadoop102:3306"

# The username DolphinScheduler connect to the metadata database
# MySQL username; change it to the {user} value set above
username="atguigu"

# The password DolphinScheduler connect to the metadata database
# database password; escape special characters with \; change it to the {password} value set above
password="123456"

# The database DolphinScheduler connect to the metadata database
# database name
dbname="dolphinscheduler"

# ---------------------------------------------------------
# Registry Server
# ---------------------------------------------------------
# Registry Server plugin dir. DolphinScheduler will find and load the registry plugin jar package from this dir.
# For now default registry server is zookeeper, so the default value is `lib/plugin/registry/zookeeper`.
# If you want to implement your own registry server, please see https://dolphinscheduler.apache.org/en-us/docs/dev/user_doc/registry_spi.html
registryPluginDir="lib/plugin/registry/zookeeper"

# Registry Server plugin name, should be a substring of `registryPluginDir`, DolphinScheduler use this for verifying configuration consistency
registryPluginName="zookeeper"

# Registry Server address.
registryServers="hadoop102:2181,hadoop103:2181,hadoop104:2181"

# The root of zookeeper, for now DolphinScheduler default registry server is zookeeper.
zkRoot="/dolphinscheduler"

# ---------------------------------------------------------
# Alert Server
# ---------------------------------------------------------
# Alert Server plugin dir. DolphinScheduler will find and load the alert plugin jar package from this dir.
alertPluginDir="lib/plugin/alert"

# ---------------------------------------------------------
# Worker Task Server
# ---------------------------------------------------------
# Worker Task Server plugin dir. DolphinScheduler will find and load the worker task plugin jar package from this dir.
taskPluginDir="lib/plugin/task"

# resource storage type: HDFS, S3, NONE
resourceStorageType="HDFS"

# resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions. "/dolphinscheduler" is recommended
resourceUploadPath="/user/dolphinscheduler"

# if resourceStorageType is HDFS, set defaultFS to the namenode address; for HA, put core-site.xml and hdfs-site.xml in the conf directory
# if S3, set the S3 address, e.g. s3a://dolphinscheduler
# Note: for S3, be sure to create the root directory /dolphinscheduler
# If uploaded resources are stored on Hadoop and the NameNode is HA, copy Hadoop's core-site.xml and hdfs-site.xml into the conf directory under the install path (here /opt/module/dolphinscheduler/conf) and configure the namenode cluster name; if the NameNode is not HA, just replace mycluster with the actual IP or hostname
defaultFS="hdfs://hadoop102:8020"

# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
# s3Endpoint="http://192.168.xx.xx:9010"
# s3AccessKey="xxxxxxxxxx"
# s3SecretKey="xxxxxxxxxx"

# resourcemanager port, the default value is 8088 if not specified
resourceManagerHttpAddressPort="8088"

# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
# If Yarn is not used, keep the default; if ResourceManager is HA, list the active/standby ResourceManager IPs or hostnames, e.g. "192.168.xx.xx,192.168.xx.xx"; for a single ResourceManager set yarnHaIps=""
yarnHaIps="hadoop102,hadoop104"

# if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
# If ResourceManager is HA or Yarn is not used, keep the default; for a single ResourceManager, set the real ResourceManager hostname or IP
singleYarnIp="hadoop102"

# who have permissions to create directory under HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
# user with permission to create resourceUploadPath
hdfsRootUser="hdfs"

# kerberos config
# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username; note the @ sign must be escaped with \\ as in the example below
keytabUserName="hdfs-mycluster\\@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"
# kerberos expire time, the unit is hour
kerberosExpireTime="2"

# use sudo or not
sudoEnable="true"

# worker tenant auto create
workerTenantAutoCreate="false"

# Mail settings, using QQ mail as the example
# mail protocol
mailProtocol="SMTP"

# mail server host
mailServerHost="smtp.qq.com"

# mail server port
mailServerPort="25"

# mailSender and mailUser can simply be set to the same value
# mail setup is fiddly; it is covered in more detail later
# sender address
mailSender="[email protected]"

# sending user
mailUser="[email protected]"

# mailbox password
mailPassword="xxx"

# set true for a mailbox that uses STARTTLS, otherwise false
starttlsEnable="true"

# set true for a mailbox that uses SSL, otherwise false. Note: starttlsEnable and sslEnable must not both be true
sslEnable="false"

# mail server host again; see mailServerHost above
sslTrust="smtp.qq.com"
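
With a file this long it is easy to miss a setting; a convenience check (not part of the official procedure) that prints just the values changed in this walkthrough:

grep -E '^(ips|masters|workers|alertServer|apiServers|installPath|deployUser|dbtype|dbhost|username|dbname|registryServers|resourceStorageType|defaultFS|yarnHaIps|singleYarnIp|hdfsRootUser)=' conf/config/install_config.conf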

(2) Symlink the Hadoop config files into dolphinscheduler/conf

[root@hadoop102 conf]# ln -s /opt/module/hadoop-3.1.3/etc/hadoop/core-site.xml core-site.xml
[root@hadoop102 conf]# ln -s /opt/module/hadoop-3.1.3/etc/hadoop/hdfs-site.xml hdfs-site.xml
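
A quick check that both links resolve to real files (run in the same conf directory; -L follows the symlinks):

ls -lL core-site.xml hdfs-site.xml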

(3) As the deploy user, switch to the dolphinscheduler-bin directory and run install.sh to deploy

[atguigu@hadoop102 dolphinscheduler-bin]$ sh install.sh

4. Running DolphinScheduler

(1) Start the services

# start Hadoop and ZooKeeper first (myhadoop.sh and zk.sh are custom helper scripts for this cluster)
myhadoop.sh start
zk.sh start

# start all DolphinScheduler services across the cluster in one step
bin/start-all.sh
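
start-all.sh brings up every role defined in install_config.conf. In 1.3.x each role can also be managed individually with bin/dolphinscheduler-daemon.sh, which helps when a single service misbehaves; for example, on the node hosting a worker:

# restart just the worker on this node
bin/dolphinscheduler-daemon.sh stop worker-server
bin/dolphinscheduler-daemon.sh start worker-server
# other service names: master-server, api-server, alert-server, logger-server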

(2) Check the processes with jpsall

[Screenshot: jpsall output across the three nodes]

On a healthy 1.3.x deployment of this layout you would expect MasterServer and ApiApplicationServer on hadoop102, AlertServer on hadoop103, and WorkerServer plus LoggerServer on all three workers.

(3) Check the web UI

http://hadoop102:12345/dolphinscheduler

[Screenshot: DolphinScheduler login page]

(4) Log in

Username: admin
Password: dolphinscheduler123 (memo for later: admin123)
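
If the UI will not load, you can at least confirm the API server is up by calling the login endpoint directly; a sketch with curl against the 1.3 API (adjust host and credentials to your own):

curl -s -X POST "http://hadoop102:12345/dolphinscheduler/login" \
  -d "userName=admin&userPassword=dolphinscheduler123"
# expect a JSON response; code 0 means the API server and the metadata database are both healthy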

5. Troubleshooting

  • A collection of common problems and fixes
https://blog.csdn.net/samz5906/article/details/106434430/?utm_term=dolphinscheduler%E4%B8%ADZookeeper%E7%8A%B6%E6%80%81&utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~all~sobaiduweb~default-0-106434430&spm=3001.4430
5.1 The ZooKeeper monitor shows -1 for every metric

(1) Symptom

[Screenshot: DS ZooKeeper monitor page with every metric at -1]

(2) Fix

1. In zkServer.sh, add the following where ZOOMAIN is defined, before it is used to start the server (adding it at the very end of the file has no effect):

ZOOMAIN="-Dzookeeper.4lw.commands.whitelist=* ${ZOOMAIN}"

2. Alternatively, configure the whitelist in the ZooKeeper server's zoo.cfg:

4lw.commands.whitelist=*

Note: apply the change on every ZooKeeper node, then restart ZooKeeper.
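
After restarting ZooKeeper, the whitelist can be verified with the four-letter-word commands that the DS monitor page relies on:

# should print "imok" once the whitelist is active
echo ruok | nc hadoop102 2181
# mntr dumps the metrics the monitor page reads
echo mntr | nc hadoop102 2181 | head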
5.2 The Resource Center cannot upload files

(1) Symptom

[Screenshot: upload failure in the Resource Center]

(2) Fix

  • Give the DS deploy user ownership of the HDFS resource directory (the user and path below come from the linked post; substitute your own deployUser and resourceUploadPath)
sudo -u hdfs hadoop fs -chown -R dolphinscheduler:dolphinscheduler /data/dolphinscheduler
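
For the setup in this chapter the user would be atguigu and the path /user/dolphinscheduler (matching deployUser and resourceUploadPath above); creating the directory first if it does not exist:

sudo -u hdfs hadoop fs -mkdir -p /user/dolphinscheduler
sudo -u hdfs hadoop fs -chown -R atguigu:atguigu /user/dolphinscheduler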
5.3 Permission denied when copying files between cluster nodes during deployment

(1) Symptom

scp: Permission denied (or words to that effect)

(2) Fix

  • Create the deployment directory on each node and give the deploy user ownership of it (a scripted version follows the per-node commands below)
# on hadoop103
mkdir /opt/dolphinscheduler
chown -R dolphinscheduler:dolphinscheduler /opt/dolphinscheduler

# on hadoop104
mkdir /opt/dolphinscheduler
chown -R dolphinscheduler:dolphinscheduler /opt/dolphinscheduler
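
A scripted version of the same fix, assuming the current user can ssh to each node and run sudo there:

for h in hadoop103 hadoop104; do
  ssh "$h" "sudo mkdir -p /opt/dolphinscheduler && sudo chown -R dolphinscheduler:dolphinscheduler /opt/dolphinscheduler"
done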
