1. Install OEL 7.2 64-bit and create the user bdd
Configure passwordless sudo.
Sometimes the NOPASSWD setting for the user does not take effect because a later group entry overrides it; change the group entry to NOPASSWD as well.
bdd ALL=(ALL) NOPASSWD: ALL
%wheel: if the bdd entry does not take effect, apply NOPASSWD to the %wheel entry as well.
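A minimal sketch of the relevant /etc/sudoers entries (edit with visudo); the %wheel line appears later in the file, so it must also say NOPASSWD or it will override the bdd entry:
bdd     ALL=(ALL)       NOPASSWD: ALL
%wheel  ALL=(ALL)       NOPASSWD: ALL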
Set up SSH mutual trust (passwordless login)
$ ssh-keygen -t rsa
$ ssh-keygen -t dsa
$ ssh myserver2 cat ~/.ssh/id_rsa.pub >> authorized_keys
$ ssh myserver2 cat ~/.ssh/id_dsa.pub >> authorized_keys
$ cat ~/.ssh/id_dsa.pub >> authorized_keys
$ cat ~/.ssh/id_rsa.pub >> authorized_keys
$ scp authorized_keys bdd@myserver2:~/.ssh
$ chmod 600 authorized_keys
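To confirm the trust works, ssh to the other host; it should print the remote hostname without prompting for a password:
$ ssh myserver2 hostname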
Install Oracle JDK 7; some Hive versions require it and do not work with JDK 8.
vi /etc/hosts to map hostnames to IP addresses.
2. Establish mutual trust (passwordless login) between all hosts.
3. Pick one host and install MySQL.
sudo rpm -Uvh http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm
sudo yum -y install mysql-community-server
sudo /usr/bin/systemctl enable mysqld    # start automatically at boot
sudo /usr/bin/systemctl start mysqld
sudo /usr/bin/mysql_secure_installation    # set the root password, etc.; MySQL must be running
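A quick check that MySQL is up and the root password works (wonder is the password later passed to scm_prepare_database.sh; use whatever you set in mysql_secure_installation):
$ mysql -uroot -pwonder -e 'select version();'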
4. Install CM 5.9
tar -xvf xxx and put the extracted contents under /opt, as /opt/cloudera and /opt/cm-5.9.0.
Put the parcel files below (the parcel plus its .sha and .torrent) into /opt/cloudera/parcel-repo so they do not have to be downloaded during installation:
CDH-5.9.0-1.cdh5.9.0.p0.23-el7.parcel
CDH-5.9.0-1.cdh5.9.0.p0.23-el7.parcel.sha
CDH-5.9.0-1.cdh5.9.0.p0.23-el7.parcel.torrent
5. Install via Cloudera Manager
Install and configure MySQL to hold CM's data.
Copy the MySQL JDBC jar into cm-5.9.0/share/cmf/lib/.
$ /opt/cm-5.9.0/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -pwonder --scm-host localhost scm scm scm    # run the database preparation script as root
Create the OS user:
sudo useradd --system --home=/opt/cloudera/cm-5.9.0/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
Disable the firewall and SELinux:
$ sudo systemctl stop firewalld
$ sudo systemctl mask firewalld    # disable the firewall
$ sestatus    # check SELinux status
Edit /etc/selinux/config to disable SELinux.
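A minimal sketch of the SELinux change, assuming the default SELINUX=enforcing setting:
$ sudo setenforce 0                                                           # permissive immediately, until reboot
$ sudo sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config    # persistent across reboots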
Edit /etc/hosts so the hostname resolves; it must match the hostname set when the OS was installed (mine is myserver2.localdomain).
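For example (the IP below is a placeholder for the machine's real address):
192.168.1.10   myserver2.localdomain   myserver2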
Start the CM server and agent:
$sudo ./cloudera-scm-server start
$sudo ./cloudera-scm-agent start
If startup fails, check the logs in the log directory.
Wait ten or twenty seconds, then open a browser and log in at http://ip:7180/cmf with admin/admin. The first start of the SCM server is slow because it has to update the schema.
Follow the prompts; choose CDH-5.9.0-1.cdh5.9.0.p0.23 to match the staged parcel, and do not install a JDK.
In the wizard, connect to all servers as the bdd user with password wonder.
4.2. CDH installation error
ERROR Error, CM server guid updated, expected 26e2c7d5-dd47-4368-811f-a7d1d13e1b9a, received 24171d15-06a4-43f1-b237-cb0e0540017
Fix:
Delete /opt/cm-5.9.1/lib/cloudera-scm-agent/cm_guid
4.3. errno 111 (connection refused) errors
Check /etc/hosts; it should contain a line like:
<ip> bdd.localdomain bdd
5. Create the MySQL databases
for hive/oozie/hue (plus BDD Studio and Workflow Manager):
CREATE DATABASE hive DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
CREATE DATABASE studio DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci; -- used by BDD Studio
CREATE DATABASE workflow DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci; -- used by BDD Workflow Manager 1.4
create user 'hive'@'%' identified by 'hive';
create user 'oozie'@'%' identified by 'oozie';
create user 'hue'@'%' identified by 'hue';
create user 'bdd'@'%' identified by 'bdd';
grant all on hive.* to 'hive'@'%';
grant all on oozie.* to 'oozie'@'%';
grant all on hue.* to 'hue'@'%';
grant all on studio.* to 'bdd'@'%';
grant all on workflow.* to 'bdd'@'%';
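A quick sanity check that the accounts and grants work, using the names created above (run from any node that can reach the MySQL host, here myserver2):
$ mysql -h myserver2 -uhive -phive hive -e 'select 1;'
$ mysql -h myserver2 -ubdd -pbdd studio -e 'select 1;'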
5. When installing CDH, the download may succeed but the distribute step fails. In that case check the hostnames of the machines selected for the cluster and ssh to each one to confirm passwordless login works.
5.1.ScmActive: Unable to retrieve non-local non-loopback IP address. Seeing address: localhost/127.0.0.1
Edit cm5.9.1/etc/cloudera-scm-agent/config.ini and change the hostname in it to something other than localhost,
or check /etc/hosts and remove the localhost6 line.
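A sketch of the config.ini change, assuming the tarball layout used here (server_host points the agent at the CM server; listening_hostname can pin the name the agent reports for itself):
# /opt/cm-5.9.1/etc/cloudera-scm-agent/config.ini
server_host=myserver2
# listening_hostname=myserver2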
6. Errors during cluster creation
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver at org.apache.hive.beeline.HiveSchemaHelper.getConnectionToMetastore(HiveSchemaHelper.java:79)
For Hive, copy: # cp /opt/cm-5.8.3/share/cmf/lib/mysql-connector-java-5.1.33-bin.jar /opt/cloudera/parcels/CDH-5.8.3-1.cdh5.8.3.p0.3/lib/hive/lib/
[bdd@mypc1 soft]$ sudo cp mysql-connector-java-5.1.39-bin.jar /usr/share/java/mysql-connector-java.jar
[bdd@mypc1 soft]$ sudo chmod 777 /usr/share/java/mysql-connector-java.jar
For Oozie, copy: # cp /opt/cm-5.8.3/share/cmf/lib/mysql-connector-java-5.1.33-bin.jar /var/lib/oozie/
Hue needs no copy. If the installer cannot find the MySQL driver during installation, check whether it is a permissions problem: chmod 777 mysql-xxx.jar
====================
If the download reaches 100% during installation but the parcel is never distributed, check the corresponding agent log under cm-5.9.0/log/cloudera-manager:
Traceback (most recent call last):
File "/opt/cm-5.9.0/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/agent.py", line 758, in start
self._init_after_first_heartbeat_response(resp_data)
File "/opt/cm-5.9.0/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/agent.py", line 938, in _init_after_first_heartbeat_response
self.client_configs.load()
File "/opt/cm-5.9.0/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 682, in load
new_deployed.update(self._lookup_alternatives(fname))
File "/opt/cm-5.9.0/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 432, in _lookup_alternatives
return self._parse_alternatives(alt_name, out)
File "/opt/cm-5.9.0/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.9.0-py2.7.egg/cmf/client_configs.py", line 444, in _parse_alternatives
path, _, _, priority_str = line.rstrip().split(" ")
ValueError: too many values to unpack
One fix is to remove OpenJDK and use the Oracle JDK.
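A sketch of removing OpenJDK so that only the Oracle JDK remains (package names vary by release; check with rpm first):
$ rpm -qa | grep -i openjdk
$ sudo yum -y remove 'java-*-openjdk*'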
Another approach is given below.
===============================
Another pitfall: the agent log keeps printing "Deleting unmanaged parcel" and the web UI stops making progress. What to do:
Delete the contents of /opt/cloudera/parcel-cache and the contents of the parcels directory, including .flood,
then restart the agent and try again.
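A sketch of the cleanup, assuming the CM tarball's init scripts live under /opt/cm-5.9.0/etc/init.d (adjust paths to your install):
$ sudo rm -rf /opt/cloudera/parcel-cache/*
$ sudo rm -rf /opt/cloudera/parcels/* /opt/cloudera/parcels/.flood
$ sudo /opt/cm-5.9.0/etc/init.d/cloudera-scm-agent restart
===================================================
Below is the BDD 1.4 installer configuration file (bdd.conf) used for this environment; the ## lines are comments from the original template: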
###############################################################################
############################# Required Settings ###############################
# These settings are required. #
# Review and update them according to your environment, as the defaults may #
# not be appropriate. #
###############################################################################
############################# Must Set ########################################
## Root folder path of the installed product components on all servers
## E.g. /localdisk/Oracle/Middleware
ORACLE_HOME=/opt/bdd140
## Oracle inventory pointer file
## If any Oracle software is already installed, this file should exist on the disk.
## Please change the default value to the existing file.
## Otherwise provide a path where the file will be created.
## Cannot be placed under ORACLE_HOME
## E.g. /localdisk/Oracle/oraInst.loc
ORACLE_INV_PTR=/home/bdd/Oracle/oraInst.log
## Path to the software packages downloaded from Oracle Software Delivery Cloud
## If not set, installer will prompt for it at runtime
## This value can be overwritten by an environment variable
## E.g. /localdisk/packages
INSTALLER_PATH=/opt/soft/bdd140/packages
## Path of shared Dgraph index files
## Do not set the index folder under ORACLE_HOME, otherwise it will be deleted during uninstall.
## E.g. /share/bdd_dgraph_index
DGRAPH_INDEX_DIR=/opt/share/bdd_dgraph_index
## Cloudera Manager/Ambari Web UI host server
## Value: host.domain
HADOOP_UI_HOST=myserver2:7180
## JDBC URL for the Studio database
STUDIO_JDBC_URL=jdbc:mysql://myserver2:3306/studio?useUnicode=true&characterEncoding=UTF-8&useFastDateParsing=false
## Template for MySQL 5.5.3 or later - driver will be com.mysql.jdbc.Driver
#STUDIO_JDBC_URL=jdbc:mysql://localhost:3306/studio?useUnicode=true&characterEncoding=UTF-8&useFastDateParsing=false
## Template for Oracle DB 11g or 12c - driver will be oracle.jdbc.OracleDriver
#STUDIO_JDBC_URL=jdbc:oracle:thin:@localhost:1521:orcl
## Template for Hypersonic - not supported in production environments, driver will be org.hsqldb.jdbcDriver
#STUDIO_JDBC_URL=jdbc:hsqldb:${ORACLE_HOME}/user_projects/domains/${WEBLOGIC_DOMAIN_NAME}/config/studio/data/hsql/lportal
## JDBC URL for the Workflow Manager database
WORKFLOW_MANAGER_JDBC_URL=
## Template for MySQL 5.5.3 or later - driver will be com.mysql.jdbc.Driver
#WORKFLOW_MANAGER_JDBC_URL=jdbc:mysql://localhost:3306/workflow?useUnicode=true&characterEncoding=UTF-8&useFastDateParsing=false
## Template for Oracle DB 11g or 12c - driver will be oracle.jdbc.OracleDriver
#WORKFLOW_MANAGER_JDBC_URL=jdbc:oracle:thin:@localhost:1521:orcl
############################# General #########################################
## See value of ORACLE_HOME, ORACLE_INV_PTR, INSTALLER_PATH in ### Must Set ###
## Values: BDA, OPC, CDH, HW, MAPR
## BDA = Oracle Big Data Appliance
## CDH = general purpose hardware with CDH
## HW = general purpose hardware with Hortonworks
## OPC = Oracle Public Cloud
## MAPR = general purpose hardware with MapR
INSTALL_TYPE=CDH
## Path to JDK on all servers
## Requires 1.7.0_67 and above
JAVA_HOME=/usr/java/jdk1.8.0_111
## Temp folder used during the installation on each server
## Requires 10GB free space on Weblogic and Dgraph servers and 3GB for the rest
TEMP_FOLDER_PATH=/tmp
############################# CDH/HW ##########################################
## See value of HADOOP_UI_HOST in ### Must Set ###
## Cloudera Manager/Ambari/MapR Control System Web UI port
## For Cloudera Manager, default value is 7180
## For Ambari, default value is 8080
## For MapR Control System, default value is 8443
HADOOP_UI_PORT=7180
## Cloudera/Hortonworks/MapR cluster name, can be found on the Cloudera Manager/Ambari/MapR Control System Web UI.
## Space char need to be URL encoded to %20
HADOOP_UI_CLUSTER_NAME=Cluster%201
## Hue server and port in a Hortonworks cluster
## If not set, Hue feature in Studio will be disabled.
## This is only required when INSTALL_TYPE is HW because Ambari does not manage Hue.
## Values: http://server:port/ or https://server:port/
HUE_URI=http://myserver2:8888/
## A list of folder paths to the Hadoop client lib files.
## These must be on the ADMIN SERVER before installation
## Values: folder1[,folderN]
## See the BDD documentation for more details
## If left empty when installing on CDH, installer will try to download the necessary packages from the Cloudera website and set this value automatically
HADOOP_CLIENT_LIB_PATHS=/opt/soft/client_packages
## Template for CDH 5.7.1 and above
## Download the Spark, Hive, Avro and Hadoop tgz packages from Cloudera Website and unpack to ADMIN SERVER and replace with the absolute path to the component's client library.
#HADOOP_CLIENT_LIB_PATHS=,/lib,/lib,/share/hadoop/common,/share/hadoop/common/lib,/share/hadoop/hdfs,/share/hadoop/yarn,/share/hadoop/mapreduce2,/dist/java
## Template for HDP 2.4.2 and above
## Copy the folders shown in the template from the HDP servers to the ADMIN SERVER. Update the paths to point to the location you copied the libraries to.
#HADOOP_CLIENT_LIB_PATHS=/usr/hdp/2.4.2.0-212/hive/lib,/usr/hdp/2.4.2.0-212/spark/lib,/usr/hdp/2.4.2.0-212/hadoop,/usr/hdp/2.4.2.0-212/hadoop/lib,/usr/hdp/2.4.2.0-212/hadoop-hdfs,/usr/hdp/2.4.2.0-212/hadoop-hdfs/lib,/usr/hdp/2.4.2.0-212/hadoop-yarn,/usr/hdp/2.4.2.0-212/hadoop-yarn/lib,/usr/hdp/2.4.2.0-212/hadoop-mapreduce,/usr/hdp/2.4.2.0-212/hadoop-mapreduce/lib
## Template for MapR 5.1
## Copy the folders shown in the template from the MapR servers to the ADMIN SERVER. Update the paths to point to the location you copied the libraries to.
## In /opt/mapr/lib, only copy the maprfs-5.1.0-mapr.jar file
#HADOOP_CLIENT_LIB_PATHS=/opt/mapr/spark/spark-1.6.1/lib,/opt/mapr/hive/hive-1.2/lib,/opt/mapr/zookeeper/zookeeper-3.4.5,/opt/mapr/zookeeper/zookeeper-3.4.5/lib,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/lib,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/hdfs,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/hdfs/lib,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/lib,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/tools/lib,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn,/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/lib
## TLS/SSL certificates folder path on ADMIN_SERVER
## If TLS/SSL is enabled for either HDFS,YARN,HIVE or KMS, export all the certificates from the Keystore of corresponding Hadoop service hosts.
HADOOP_CERTIFICATES_PATH=
############################# Kerberos ########################################
## Enable Kerberos in BDD cluster
## When set to TRUE, KERBEROS_PRINCIPAL, KERBEROS_KEYTAB_PATH and KRB5_CONF_PATH must also be set
## Value: TRUE, FALSE
ENABLE_KERBEROS=FALSE
## Kerberos principal
KERBEROS_PRINCIPAL=
## Path to Kerberos keytab file on ADMIN SERVER
KERBEROS_KEYTAB_PATH=
## Path to Kerberos config file
## Default value is /etc/krb5.conf
KRB5_CONF_PATH=/etc/krb5.conf
############################# Weblogic (BDD Server) ###########################
## See value of STUDIO_JDBC_URL in ### Must Set ###
## Weblogic Admin server
## Provide the hostname that will be installed as Weblogic Admin server
## If left empty, it will default to the hostname of localhost
## Value: host.domain
ADMIN_SERVER=myserver2
## Weblogic Managed servers
## This list must also contain the Admin server
## Value: hostname1.domain[,hostnameN.domain]
MANAGED_SERVERS=${ADMIN_SERVER}
############################# Dgraph and HDFS Agent ###########################
## See value of DGRAPH_INDEX_DIR in ### Must Set ###
## List of the Dgraph and HDFS Agent servers
## Value: hostname1.domain[,hostnameN.domain]
DGRAPH_SERVERS=${ADMIN_SERVER}
## The number of threads for the Dgraph. The following are recommended:
## If the machine will run only the Dgraph, the number of threads = number of CPU cores
## If the machine will run the Dgraph + other BDD components, the number of threads = number of CPU cores - 2
## If the machine will run the Dgraph + CDH/HW services, the number of threads = number of CPU cores - cores needed by CDH/HW
## Please make sure the number specified is in compliance with the license agreement
## It is not recommended to run the Dgraph on machines running Spark
## If left empty, it will default to number of CPU cores - 2
DGRAPH_THREADS=
## The amount of cache for the Dgraph, in MB.
## For enhanced performance, this should be 50% of available memory
## If left empty, it will default to 50% of available memory
DGRAPH_CACHE=
## Dgraph cluster name in Zookeeper
ZOOKEEPER_INDEX=cluster1
############################# Data Processing #################################
## HDFS sub-directory under /user used by Data Processing
## This will contain Data Processing jar files and a sandbox used during file uploading
## Installer will create this directory if it doesn't exist
HDFS_DP_USER_DIR=bdd
## The YARN queue the EDP job is submitted to
YARN_QUEUE=default
## Name of Hive database used by Data Processing
HIVE_DATABASE_NAME=default
## The path to the spark_on_yarn jar
SPARK_ON_YARN_JAR=/opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar
## Template for CDH 5.5
#SPARK_ON_YARN_JAR=/opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar
## Template for HDP 2.3.4
#SPARK_ON_YARN_JAR=/usr/hdp/2.3.4.17-1/hive/lib/hive-metastore.jar:/usr/hdp/2.3.4.17-1/hive/lib/hive-exec.jar:/usr/hdp/2.3.4.17-1/spark/lib/spark-assembly-1.5.2.2.3.4.17-1-hadoop2.7.1.2.3.4.17-1.jar
## Template for MapR 5.1
#SPARK_ON_YARN_JAR=/opt/mapr/spark/spark-1.6.1/lib/spark-assembly-1.6.1-mapr-1605-hadoop2.7.0-mapr-1602.jar
############################# Micro Service ###################################
## List of the Transform service servers
## Value: hostname1.domain[,hostnameN.domain]
TRANSFORM_SERVICE_SERVERS=${ADMIN_SERVER}
## TRANSFORM service port
## Default value is 7203
TRANSFORM_SERVICE_PORT=7203
## Enable Clustering Service at install time
## Value: TRUE, FALSE
ENABLE_CLUSTERING_SERVICE=FALSE
## List of the Clustering service servers
## Value: hostname1.domain[,hostnameN.domain]
CLUSTERING_SERVICE_SERVERS=${ADMIN_SERVER}
## Clustering service port
## Default value is 7204
CLUSTERING_SERVICE_PORT=7204
## List of the Workflow Manager servers
## Value: hostname1.domain[,hostnameN.domain]
WORKFLOW_MANAGER_SERVERS=${ADMIN_SERVER}
## Workflow Manager port
## Default value is 7207
WORKFLOW_MANAGER_PORT=7207
###############################################################################
############################# Optional Settings ###############################
# These settings are optional. #
# The default values should work in most cases. Review and update them as #
# necessary. #
###############################################################################
############################# General #########################################
## If set to TRUE, installer will remove any previous installations under ORACLE_HOME
## If set to FALSE, installer will quit if ORACLE_HOME exists
## Value: FALSE, TRUE
FORCE=FALSE
## Enable the BDD cluster to autostart after reboot
## This will automatically restart WebLogic (which includes the Dgraph Gateway and Studio), the Dgraph, and the HDFS Agent
## Value: TRUE, FALSE
ENABLE_AUTOSTART=TRUE
## Temp folder path on Admin server used during backup and restore
BACKUP_LOCAL_TEMP_FOLDER_PATH=${TEMP_FOLDER_PATH}
## Temp folder path on HDFS server used during backup and restore
BACKUP_HDFS_TEMP_FOLDER_PATH=${TEMP_FOLDER_PATH}
############################# WebLogic (BDD Server) ###########################
## Start WebLogic in prod mode or dev mode
## Prod mode requires username and password when starting
## Value: prod, dev
WLS_START_MODE=prod
## Set to TRUE if WebLogic will be installed on servers with no swap space configured
## This will disable system prerequisite checks during WebLogic installation
## Value: TRUE, FALSE
WLS_NO_SWAP=FALSE
## Name of the WebLogic domain
WEBLOGIC_DOMAIN_NAME=bdd-${BDD_VERSION}_domain
## WebLogic ports
## Default values are 7001 for Admin Server and 7003 for Managed Server
ADMIN_SERVER_PORT=7001
MANAGED_SERVER_PORT=7003
## Enable SSL for Studio and Admin Server
## Value: TRUE, FALSE
WLS_SECURE_MODE=TRUE
## WebLogic secure ports
## Default values are 7002 for Admin server and 7004 for Managed server
ADMIN_SERVER_SECURE_PORT=7002
MANAGED_SERVER_SECURE_PORT=7004
## Dgraph Gateway (formerly called Endeca Server) log level
## Values: INCIDENT_ERROR, ERROR, WARNING, NOTIFICATION, TRACE.
ENDECA_SERVER_LOG_LEVEL=ERROR
## The timeout value (in milliseconds) used when responding to requests sent to all Dgraph Gateway web services except the Data Ingest Web Service.
## A value of 0 means there is no timeout. The default value is 300000.
SERVER_TIMEOUT=300000
## The timeout value (in milliseconds) used when responding to requests sent to the Data Ingest Web Service.
## A value of 0 means there is no timeout. The default value is 1680000.
SERVER_INGEST_TIMEOUT=1680000
## The timeout value (in milliseconds) used when checking data source availability when connections are initialized.
## A value of 0 means there is no timeout. The default value is 10000.
SERVER_HEALTHCHECK_TIMEOUT=10000
## Enable DB cache in Studio
## Value: TRUE FALSE
STUDIO_JDBC_CACHE=TRUE
## Studio administrator user's screen name
## Accepts ascii characters (a-z, A-Z), digits 0-9, period, hyphen.
STUDIO_ADMIN_SCREEN_NAME=admin
## Studio administrator user's email address
[email protected]
## Determines whether the Studio administrator must reset their initial password
## Value: TRUE FALSE
STUDIO_ADMIN_PASSWORD_RESET_REQUIRED=FALSE
## Studio administrator user's first name
STUDIO_ADMIN_FIRST_NAME=Admin
## Studio administrator user's middle name
STUDIO_ADMIN_MIDDLE_NAME=
## Studio administrator user's last name
STUDIO_ADMIN_LAST_NAME=Admin
############################# Dgraph and HDFS Agent ###########################
## Dgraph web service port
## Default value is 7010
DGRAPH_WS_PORT=7010
## Dgraph bulk load port
## Default value is 7019
DGRAPH_BULKLOAD_PORT=7019
## Dgraph stdout/stderr file
DGRAPH_OUT_FILE=${ORACLE_HOME}/BDD-${BDD_VERSION}/logs/dgraph.out
## Dgraph log level
## Defined in format: topic1 level1|topic2 level2|..|topicN levelN
DGRAPH_LOG_LEVEL=
## Additional start up parameters
## Should not include any of the above Dgraph settings
DGRAPH_ADDITIONAL_ARG=
## Whether --mount_hdfs should be specified when start Dgraph
## Value: TRUE|FALSE
DGRAPH_USE_MOUNT_HDFS=FALSE
## Where to mount the HDFS root directory on local file system
## Effective when DGRAPH_USE_MOUNT_HDFS is TRUE
## If changed after setup, user need to ensure the folder exists and is empty
DGRAPH_HDFS_MOUNT_DIR=${ORACLE_HOME}/BDD-${BDD_VERSION}/dgraph/hdfs_root
## Enable distributed EVE mode
## Value: TRUE|FALSE
DGRAPH_ENABLE_MPP=FALSE
## Dgraph MPP port
## Default value is 7029
DGRAPH_MPP_PORT=7029
## The interval in minutes that Dgraph refreshes Kerberos ticket
## Effective only when Kerberos is enabled
## Default value is 60
KERBEROS_TICKET_REFRESH_INTERVAL=60
## The Kerberos ticket lifetime used by Dgraph
## Value should be in a format recognized by the Kerberos libraries for specifying times,
## such as 10h(ten hours) or 10m(ten minutes). Known units are s,m,h, and d.
## Default value is 10h
KERBEROS_TICKET_LIFETIME=10h
# Enable Cgroup
## When set to TRUE, DGRAPH_CGROUP_NAME must also be set
## Value: TRUE, FALSE
DGRAPH_ENABLE_CGROUP=FALSE
## The name of the CGroup that controls the Dgraph process
## Default value is dgraph
DGRAPH_CGROUP_NAME=dgraph
## HDFS Agent port
## Default value is 7102
AGENT_PORT=7102
## HDFS Agent export port
## Default value is 7101
AGENT_EXPORT_PORT=7101
## HDFS Agent output file
AGENT_OUT_FILE=${ORACLE_HOME}/BDD-${BDD_VERSION}/logs/dgraphHDFSAgent.out
############################# Data Processing #################################
## Enable/disable the Hive Table Detector
## The Detector monitors the Hive table changes and auto-updates data sets in the Dgraph through a cron job
## Values: TRUE, FALSE
ENABLE_HIVE_TABLE_DETECTOR=FALSE
## Hive Table Detector server
## This should be one of the WebLogic Managed servers
DETECTOR_SERVER=${ADMIN_SERVER}
## Specifies the Hive database the Detector will monitor
DETECTOR_HIVE_DATABASE=default
## Maximum amount of time (in seconds) the Detector waits before submitting update jobs
## Default value: 1800
DETECTOR_MAXIMUM_WAIT_TIME=1800
## Cron format schedule to run the detector
DETECTOR_SCHEDULE=0 0 * * *
## Enable Data Processing enrichments
ENABLE_ENRICHMENTS=true
## The maximum number of records included in a data set
MAX_RECORDS=1000000
## The HDFS directory that stores Avro files created when users export data
SANDBOX_PATH=/user/${HDFS_DP_USER_DIR}
## ISO-639 code used to set the language of all attributes in a data set
## Can also be set to "unknown"
LANGUAGE=unknown
## A colon-separated list of paths of jars on each cluster node
## Usually this path will simply be empty
## This will be used by end users to add custom serde's onto the classpath
## User need to copy the jars to each cluster node
DP_ADDITIONAL_JARS=
###############################################################################
############################# Internal Settings ###############################
# These settings are intended for use by Oracle Support only. #
# Do not edit them unless instructed to do so by Oracle Support. #
###############################################################################
## The maximum number of calls Studio can make to DP
DP_POOL_SIZE=50
## The maximum number of jobs Studio can add to the DP queue
DP_TASK_QUEUE_SIZE=1000
# Maximum partition size for Spark inputs, in MB
MAX_INPUT_SPLIT_SIZE=32
## Indicates whether Data Processing will dynamically compute the executor resources or use static executor resource configuration.
## When set to true, resource parameters(SPARK_DRIVER_CORES, SPARK_DRIVER_MEMORY, SPARK_EXECUTORS, SPARK_EXECUTOR_CORES and SPARK_EXECUTOR_MEMORY) are not required and will be ignored even if specified.
## When set to false, set values for SPARK_DRIVER_CORES, SPARK_DRIVER_MEMORY, SPARK_EXECUTORS, SPARK_EXECUTOR_CORES and SPARK_EXECUTOR_MEMORY.
## Values: true, false
SPARK_DYNAMIC_ALLOCATION=true
## The number of cores used by the EDP Spark job driver
SPARK_DRIVER_CORES=2
## The maximum memory heap size for EDP Spark job driver, in the same format as JVM memory strings (e.g. 512m, 2g)
SPARK_DRIVER_MEMORY=4g
## The total number of Spark executors to launch
SPARK_EXECUTORS=2
## The number of cores for each Spark executor
SPARK_EXECUTOR_CORES=1
## The maximum memory heap size for each Spark executor, in the same format as JVM memory strings (e.g. 512m, 2g)
SPARK_EXECUTOR_MEMORY=4g
## The threshold length of a Dgraph attribute to enable record-search
RECORD_SEARCH_THRESHOLD=200
## The threshold length of a Dgraph attribute to enable value-search
VALUE_SEARCH_THRESHOLD=200
## BDD Version
BDD_VERSION=1.4.0.37.1301
## BDD hotfix/patch Version
BDD_RELEASE_VERSION=1.4.0.0
===================================================
Can't open /opt/cm-5.9.0/run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.hdfs_2617210448726206040/hadoop-conf/hive-env.sh: No such file or directory. Fixed by reinstalling and going back to the initial environment.
Sometimes the default maximum number of open files is not large enough. To make the change permanent, edit /etc/security/limits.conf, add the lines below, then log out and back in:
* hard nofile 65535
* soft nofile 65535
Applications that need the new limit must be restarted.
Alternatively, add ulimit -n 65535 to /etc/rc.sysinit (requires a system reboot),
or set HARD_LIMIT_NOFILE=65535 in /etc/sysconfig/init (probably also requires a reboot).
With OEL 7 these parameters are already adequate by default and the checks pass.
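To verify in a new login session:
$ ulimit -n     # soft limit
$ ulimit -Hn    # hard limit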
===================================================
On RHEL 6, change the yum repos to use the Oracle Linux mirror and install the EL6 packages.
[base]
name=base
baseurl=http://yum.oracle.com/repo/OracleLinux/OL6/latest/x86_64/
failovermethod=priority
enabled=1
gpgcheck=0
[epel]
name=Extra Packages for Enterprise Linux 6 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch
baseurl=https://linuxmirrors.ir/pub/epel/6/x86_64/
enabled=1
gpgcheck=0
==================================================
Error while installing Cloudera:
Error, CM server guid updated, expected 25cf17b3-391a-4368-848a-07118d6f11fb, received b3d4a47a-1476-4ea4-b236-24426b1b8540
Delete /opt/cm-5.9.1/lib/cloudera-scm-agent/cm_guid
so that a new guid is generated.
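A sketch of the fix on the affected agent host (paths as above; a new guid is issued after the agent restarts and heartbeats):
$ sudo rm /opt/cm-5.9.1/lib/cloudera-scm-agent/cm_guid
$ sudo /opt/cm-5.9.1/etc/init.d/cloudera-scm-agent restart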