Apache Druid Installation and Deployment Guide

I. Apache Druid Architecture

1. Coordinator

Monitors the Historical processes; responsible for assigning segments to specific servers and for keeping segments balanced across Historicals.

2. Overlord

Monitors the MiddleManager processes and controls data loading into the Druid cluster; responsible for assigning ingestion tasks to MiddleManagers and for coordinating segment publishing.

3. Broker

Handles queries from clients, parses them, and routes sub-queries to the Historicals and MiddleManagers; the Broker receives the results of those sub-queries, merges them, and returns the final result to the caller.

4. Router

Provides a unified routing gateway in front of the Brokers, Overlords, and Coordinators.

5. Historical

A worker that handles storage of, and queries over, historical data. A Historical loads segments from deep storage and answers the queries on those segments that the Brokers send it.

6. MiddleManager

Ingests new data into the cluster; responsible for reading from external data sources (new, real-time data) and publishing new Druid segments. It is a worker node that executes submitted tasks, forwarding each task to a Peon running in a separate JVM. Peons run in isolated JVMs so that task resources and logs stay isolated; each Peon runs exactly one task at a time, and one MiddleManager manages multiple Peons.

7. External dependencies

Deep storage: shared file storage accessible to every Druid server, e.g. a distributed file system such as HDFS, S3, or a network-mounted file system; it stores all data that has been ingested.
Metadata store: a shared metadata store, typically a relational database such as PostgreSQL or MySQL.
ZooKeeper: used for internal service discovery, coordination, and leader election.

II. Prerequisites

1. MySQL (as Druid's metadata storage)

Create the database:

CREATE DATABASE druid DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

Grant privileges:

GRANT ALL PRIVILEGES ON druid.* TO druid@'%' IDENTIFIED BY 'druid';

Note: `GRANT ... IDENTIFIED BY` implicitly creates the user on MySQL 5.x; on MySQL 8.0, run `CREATE USER druid@'%' IDENTIFIED BY 'druid';` first and then grant without the `IDENTIFIED BY` clause.

2. Cluster node plan

IP address        Role           Services
10.0.111.140      Master Server  Coordinator, Overlord
10.0.111.141~143  Data Server    Historical, MiddleManager
10.0.111.144      Query Server   Broker, Router

3. Druid access URLs

  • Coordinator
    http://10.0.111.140:8081
  • Router
    http://10.0.111.144:8888
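Once the cluster is running, these addresses can be probed: every Druid service exposes a `/status/health` endpoint that returns `true` when the process is up. A minimal sketch (hostnames taken from the table above; `health_url` and `is_healthy` are hypothetical helpers, not part of Druid):

```python
from urllib.request import urlopen
from urllib.error import URLError

# Endpoints from the cluster plan above.
SERVICES = {
    "coordinator": "http://10.0.111.140:8081",
    "router": "http://10.0.111.144:8888",
}

def health_url(base):
    """Build the /status/health URL for a Druid service."""
    return base.rstrip("/") + "/status/health"

def is_healthy(base, timeout=3):
    """Return True if the service answers its health endpoint with 'true'."""
    try:
        with urlopen(health_url(base), timeout=timeout) as resp:
            return resp.read().strip() == b"true"
    except URLError:
        return False

for name, base in SERVICES.items():
    print(name, health_url(base))
```

The same `/status` family of endpoints also reports version and loaded extensions, which is handy when verifying section III below took effect.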

III. Cluster Configuration

1. Download the distribution

Download the distribution on the planned Master Server node and extract it under /opt/tools:

wget https://mirrors.tuna.tsinghua.edu.cn/apache/druid/0.18.1/apache-druid-0.18.1-bin.tar.gz

All of the following steps are performed under /opt/tools/apache-druid-0.18.1/.
Note: the parameter values below must satisfy the following sizing rules:

druid.processing.numMergeBuffers = max(2, druid.processing.numThreads / 4)
druid.processing.numThreads =  Number of cores - 1 (or 1)
druid.server.http.numThreads = max(10, (Number of cores * 17) / 16 + 2) + 30
MaxDirectMemorySize >= druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)
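These rules can be turned into a small calculator to pre-compute values for a given machine before editing runtime.properties; `recommended_settings` is a hypothetical helper built only from the formulas above. With 24 cores and 200 MB buffers it reproduces the Historical settings used later in this document (23 threads, 5 merge buffers, 57 HTTP threads):

```python
def recommended_settings(num_cores, buffer_size_bytes):
    """Apply the sizing rules above for a machine with num_cores CPU cores."""
    num_threads = max(num_cores - 1, 1)
    num_merge_buffers = max(2, num_threads // 4)
    http_threads = max(10, (num_cores * 17) // 16 + 2) + 30
    # Lower bound for -XX:MaxDirectMemorySize
    min_direct = buffer_size_bytes * (num_merge_buffers + num_threads + 1)
    return {
        "druid.processing.numThreads": num_threads,
        "druid.processing.numMergeBuffers": num_merge_buffers,
        "druid.server.http.numThreads": http_threads,
        "min MaxDirectMemorySize": min_direct,
    }

# Example: a 24-core data server with 200 MB processing buffers
print(recommended_settings(24, 200_000_000))
```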

2. Edit the common configuration

  • common.runtime.properties
    The cluster uses MySQL as the metadata store and HDFS as deep storage.
vim conf/druid/cluster/_common/common.runtime.properties
druid.extensions.loadList=["druid-hdfs-storage", "druid-kafka-indexing-service", "druid-datasketches","mysql-metadata-storage"]
druid.extensions.hadoopDependenciesDir=/opt/tools/druid/hadoop-dependencies
druid.host=hadoop10
druid.startup.logging.logProperties=true
druid.zk.service.host=hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181
druid.zk.paths.base=/druid
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://10.0.111.134:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=druid
# For HDFS:
druid.indexer.task.hadoopWorkingPath=/tmp/druid-indexing
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://nameservice1/druid/segments
# For HDFS:
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=hdfs://nameservice1/druid/indexing-logs
druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator
druid.monitoring.monitors=["org.apache.druid.java.util.metrics.JvmMonitor"]
druid.emitter=logging
druid.emitter.logging.logLevel=info
druid.indexing.doubleStorage=double
druid.server.hiddenProperties=["druid.s3.accessKey","druid.s3.secretKey","druid.metadata.storage.connector.password"]
druid.sql.enable=true
druid.lookup.enableLookupSyncOnStartup=false

Copy the HDFS client configuration files (e.g. core-site.xml and hdfs-site.xml) into conf/druid/cluster/_common/.
Then set druid.host in this file on each node to that node's own hostname.

3. Sync the druid directory to the other nodes

4. Edit the Master node configuration

  • jvm.config
vim conf/druid/cluster/master/coordinator-overlord/jvm.config
-server
-Xms8g
-Xmx8g
-XX:+ExitOnOutOfMemoryError
-XX:+UseG1GC
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
-Dderby.stream.error.file=var/druid/derby.log
  • runtime.properties
vim conf/druid/cluster/master/coordinator-overlord/runtime.properties
druid.service=druid/coordinator
druid.plaintextPort=8081
druid.coordinator.startDelay=PT10S
druid.coordinator.period=PT5S
druid.coordinator.asOverlord.enabled=true
druid.coordinator.asOverlord.overlordService=druid/overlord
druid.indexer.queue.startDelay=PT5S
druid.indexer.runner.type=remote
druid.indexer.storage.type=metadata

5. Edit the Query node configuration

  • broker jvm.config
vim conf/druid/cluster/query/broker/jvm.config
-server
-Xms6g
-Xmx6g
-XX:MaxDirectMemorySize=8g
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
  • broker runtime.properties
vim conf/druid/cluster/query/broker/runtime.properties
druid.service=druid/broker
druid.plaintextPort=8082
druid.server.http.numThreads=60
druid.broker.http.numConnections=50
druid.broker.http.maxQueuedBytes=10000000
druid.processing.buffer.sizeBytes=500000000
druid.processing.numMergeBuffers=6
druid.processing.numThreads=1
druid.processing.tmpDir=var/druid/processing
druid.broker.cache.useCache=false
druid.broker.cache.populateCache=false
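As a sanity check, the Broker values above can be plugged into the direct-memory rule listed earlier (a sketch; 8g comes from the Broker jvm.config above):

```python
# Values from the broker runtime.properties above.
size_bytes = 500_000_000   # druid.processing.buffer.sizeBytes
merge_buffers = 6          # druid.processing.numMergeBuffers
threads = 1                # druid.processing.numThreads

# Rule: MaxDirectMemorySize >= sizeBytes * (mergeBuffers + threads + 1)
required = size_bytes * (merge_buffers + threads + 1)
configured = 8 * 1024**3   # -XX:MaxDirectMemorySize=8g

print(required, configured, configured >= required)
```

With these numbers the requirement is 4 GB of direct memory, so the configured 8g leaves ample headroom.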
  • router jvm.config
vim conf/druid/cluster/query/router/jvm.config
-server
-Xms1g
-Xmx1g
-XX:+UseG1GC
-XX:MaxDirectMemorySize=128m
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
  • router runtime.properties
vim conf/druid/cluster/query/router/runtime.properties
druid.service=druid/router
druid.plaintextPort=8888
druid.router.http.numConnections=50
druid.router.http.readTimeout=PT5M
druid.router.http.numMaxThreads=100
druid.server.http.numThreads=100
druid.router.defaultBrokerServiceName=druid/broker
druid.router.coordinatorServiceName=druid/coordinator
druid.router.managementProxy.enabled=true

6. Edit the Data Server node configuration

  • historical jvm.config
vim conf/druid/cluster/data/historical/jvm.config
-server
-Xms2g
-Xmx2g
-XX:MaxDirectMemorySize=8g
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
  • historical runtime.properties
vim conf/druid/cluster/data/historical/runtime.properties
druid.service=druid/historical
druid.plaintextPort=8083
druid.server.http.numThreads=57
druid.processing.buffer.sizeBytes=200000000
druid.processing.numMergeBuffers=5
druid.processing.numThreads=23
druid.processing.tmpDir=var/druid/processing
druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":3000000000}]
druid.server.maxSize=3000000000
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
druid.cache.type=caffeine
druid.cache.sizeInBytes=256000000
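One consistency check worth automating: `druid.server.maxSize` should cover the total `maxSize` of all `druid.segmentCache.locations` entries, otherwise the Historical advertises more capacity than it can cache. A sketch using the values above:

```python
import json

# Values from the historical runtime.properties above.
locations = json.loads('[{"path":"var/druid/segment-cache","maxSize":3000000000}]')
server_max_size = 3_000_000_000  # druid.server.maxSize

total_cache = sum(loc["maxSize"] for loc in locations)
print(total_cache, server_max_size >= total_cache)
```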
  • middleManager-jvm.config
vim conf/druid/cluster/data/middleManager/jvm.config
-server
-Xms128m
-Xmx128m
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
  • middleManager-runtime.properties
vim conf/druid/cluster/data/middleManager/runtime.properties
druid.service=druid/middleManager
druid.plaintextPort=8091
druid.worker.capacity=5
druid.indexer.runner.javaOpts=-server -Xms1g -Xmx1g -XX:MaxDirectMemorySize=1g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+ExitOnOutOfMemoryError -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
druid.indexer.task.baseTaskDir=var/druid/task
druid.server.http.numThreads=57
druid.indexer.fork.property.druid.processing.numMergeBuffers=5
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=100000000
druid.indexer.fork.property.druid.processing.numThreads=1
druid.indexer.task.hadoopWorkingPath=/tmp/druid-indexing
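Putting the Data Server settings together, a rough per-node memory budget (heap plus direct memory for the Historical, the MiddleManager, and `druid.worker.capacity` Peons; actual usage adds metaspace and page cache on top) can be sketched as:

```python
GB = 1024**3
MB = 1024**2

# Values from the data-server jvm.config / runtime.properties above.
historical = 2 * GB + 8 * GB   # -Xmx2g + MaxDirectMemorySize=8g
middle_manager = 128 * MB      # -Xmx128m (the MM itself is lightweight)
peon = 1 * GB + 1 * GB         # per-task -Xmx1g + MaxDirectMemorySize=1g
capacity = 5                   # druid.worker.capacity

total = historical + middle_manager + capacity * peon
print(total / GB)              # 20.125 GiB with these settings
```

So each Data Server node should have comfortably more than ~21 GiB of RAM available for Druid alone.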

IV. Starting and Stopping the Cluster

1. Start and stop in the background

nohup ./bin/start-cluster-master-no-zk-server start >/dev/null 2>&1 & 
nohup ./bin/start-cluster-query-server start  >/dev/null 2>&1 &
nohup ./bin/start-cluster-data-server start >/dev/null 2>&1 &
./bin/service --down

2. Start and stop as a systemd service

  • CentOS 7 systemd unit
vim /lib/systemd/system/druidmaster.service
[Unit]
Description=druidmaster
After=network.target
[Service]
Type=forking
EnvironmentFile=/home/path
WorkingDirectory=/opt/tools/apache-druid-0.18.1/
ExecStart=/opt/tools/apache-druid-0.18.1/bin/start-cluster-master-no-zk-server start
ExecStop=/opt/tools/apache-druid-0.18.1/bin/service --down
Restart=on-failure

[Install]
WantedBy=multi-user.target
vim /home/path
JAVA_HOME=/opt/tools/jdk
PATH=/opt/tools/jdk/bin:/opt/tools/jdk/jre/bin:/usr/local/jdk/bin:/usr/local/jdk/jre/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
source /home/path
chmod 764 /lib/systemd/system/druidmaster.service
systemctl daemon-reload
systemctl start druidmaster.service

V. Troubleshooting

1. HDFS deep storage did not take effect

Fix: copy the HDFS client configuration files (core-site.xml, hdfs-site.xml) into conf/druid/cluster/_common/.

2. Services fail to start and many hs_err_pid*.log files appear in the working directory

Check the service log: tail -n 100 -f var/sv/historical.log
Check the JVM crash log: tail -n 100 hs_err_pid2416.log
Fix: the machine's available memory was lower than what the role's JVM settings required; reduce the heap/direct-memory sizes in the corresponding jvm.config, after which the services start successfully.

3. Druid services started in the background are killed after exiting Xshell

The Druid machines are reached over SSH through a public-facing jump host; when the Xshell window is closed directly, the correct termination signal is not delivered and the nohup'ed processes do not receive a clean hangup.
Fix: leave Xshell with a normal exit instead of closing the window.

4. Preventing runaway log growth

Add the following to each role's jvm.config, naming that role:
-Ddruidrole=coordinator-overlord
Edit log4j2.xml to configure the log rolling policy:

<Configuration status="WARN" monitorInterval="30">
  <Properties>
    <Property name="baseDir">var/log/druid</Property>
    <Property name="filename">${sys:druidrole}</Property>
  </Properties>
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
    </Console>
    <RollingFile name="RollingFile"
                 fileName="${baseDir}/${filename}.log"
                 filePattern="${baseDir}/${filename}.%i.log.gz">
      <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
      <Policies>
        <SizeBasedTriggeringPolicy size="200 MB"/>
      </Policies>
      <DefaultRolloverStrategy max="5"/>
    </RollingFile>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="RollingFile"/>
    </Root>
  </Loggers>
</Configuration>