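Because Hadoop is so popular with customers, many companies have released their own Hadoop distributions, while others build products around Hadoop. Within the Hadoop ecosystem, the largest and best-known company is Cloudera.
Cloudera was founded in 2008 by former engineers from Facebook, Google, and Yahoo, namely Jeff Hammerbacher, Christophe Bisciglia, and Amr Awadallah, together with Mike Olson, a former Oracle executive who became its CEO.
CDH (Cloudera's Distribution, including Apache Hadoop) is one of the many Hadoop distributions. It is maintained by Cloudera, built on a stable Apache Hadoop release with many patches integrated, and can be used directly in production.
Cloudera Manager is the component for installing, monitoring, and managing Hadoop and other big-data services across a cluster. It greatly simplifies installing and configuring the cluster's hosts and services such as Hadoop, Hive, and Spark.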
https://www.cnblogs.com/haozhengfei/p/d90e8f4da465036fabbb1d1e1eae886a.html
https://www.cnblogs.com/fujiangong/p/5620050.html
vi /etc/sysconfig/network
vi /etc/hosts (on all nodes)
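A minimal sketch of adding the cluster's name-to-address mappings to the hosts file without creating duplicate entries. Only master and 192.168.1.100 appear elsewhere in this walkthrough; the slave addresses, the add_host helper, and the HOSTS_FILE variable are assumptions made for illustration. The sketch writes to a scratch file by default so it can be tried safely.

```shell
#!/bin/sh
# Target file; defaults to a scratch file so the sketch is safe to try.
# Set HOSTS_FILE=/etc/hosts (as root) to apply it on a real node.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# add_host IP NAME: append "IP NAME" unless NAME is already mapped.
add_host() {
    if ! grep -q "[[:space:]]$2\$" "$HOSTS_FILE" 2>/dev/null; then
        printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
    fi
}

add_host 192.168.1.100 master
add_host 192.168.1.101 slave01   # assumed address
add_host 192.168.1.102 slave02   # assumed address
add_host 192.168.1.103 slave03   # assumed address
add_host 192.168.1.100 master    # repeated call leaves the file unchanged

cat "$HOSTS_FILE"
```

Running the same script on every node keeps the mappings identical across the cluster.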
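Preparing the virtual machines carefully matters. The master needs passwordless SSH to every slave, which means the Cloudera Manager server needs passwordless SSH to each agent;
likewise, the node that hosts a service's master role, such as the HDFS NameNode, needs passwordless SSH to its DataNode nodes.
In all sessions: [root@master ~] # ssh slave01 --> yes --> (enter the password) --> exit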
[root@master ~] # cd ./.ssh
[root@master ~] # ssh-keygen -t rsa
[root@master ~] # ssh-copy-id slave01 (repeat for slave02 and slave03)
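The key-distribution steps above can be sketched as a loop. This is a dry-run sketch under assumptions: the distribute_keys function and the RUN and SLAVES variables are hypothetical names, and logging in as root on each slave is inferred from the prompts shown above. By default the commands are only printed; set RUN to an empty value to execute them.

```shell
#!/bin/sh
# Hosts that should accept key-based logins from this node.
SLAVES="slave01 slave02 slave03"
# RUN=echo (the default here) only prints the commands; use RUN= to execute.
RUN="${RUN-echo}"

distribute_keys() {
    # Create a key pair only if this user does not have one yet.
    [ -f "$HOME/.ssh/id_rsa" ] || $RUN ssh-keygen -t rsa -f "$HOME/.ssh/id_rsa"
    # Install the public key on every slave (prompts for each password once).
    for host in $SLAVES; do
        $RUN ssh-copy-id "root@$host"
    done
}

distribute_keys
```

After a real run, ssh slave01 should log in without asking for a password.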
service firewalld stop
chkconfig firewalld off
vi /etc/selinux/config ( SELINUX=disabled )
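Instead of editing the file by hand, the SELINUX setting can be changed with sed. The sketch below is an illustration only: disable_selinux is a hypothetical helper, and the demonstration runs against a scratch file with made-up contents rather than the real /etc/selinux/config.

```shell
#!/bin/sh
# disable_selinux FILE: set SELINUX=disabled in an SELinux config file.
# The pattern leaves SELINUXTYPE= and comment lines untouched.
disable_selinux() {
    tmp=$(mktemp) || return 1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a scratch copy; pass /etc/selinux/config on a real node.
demo=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo"
disable_selinux "$demo"
cat "$demo"
```

The file is only read at boot, so the change takes effect after a restart; setenforce 0 switches the running system to permissive mode in the meantime.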
export JAVA_HOME=/usr/local/software/jdk1.7.0_79   # adjust to your Java path
export PATH=$JAVA_HOME/bin:$PATH
Required on all three machines.
yum install ntp -y
Enable at boot: chkconfig ntpd on
Synchronize the time: ntpdate -u s2c.time.edu.cn (all sessions)
https://blog.csdn.net/oschina_41140683/article/details/80613819
Official reference documentation:
https://www.cloudera.com/documentation/enterprise/release-notes/topics/rn_consolidated_pcm.html#concept_jpd_hpz_jdb
Cloudera Manager download location:
http://archive.cloudera.com/cm5/cm/5/
Download : cloudera-manager-centos7-cm5.15.0_x86_64.tar.gz
CDH parcel location: http://archive.cloudera.com/cdh5/parcels/
Download Files:
Upload the packages to the /usr/local/src directory (install on every machine)
mkdir /opt/cloudera-manager
chmod -R 777 /opt/cloudera-manager
tar xvzf cloudera-manager*.tar.gz -C /opt/cloudera-manager
Note: upload the mysql-connector-java-xxx.jar file to the master node (the server node);
cp ./mysql-connector-java.jar /opt/cloudera-manager/cm-5.15.0/share/cmf/lib/
# ls /usr/local/src
CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel
CDH-5.15.1-1.cdh5.15.1.p0.4-el7.parcel.sha1
manifest.json
cloudera-manager-centos7-cm5.15.0_x86_64.tar.gz
mysql-connector-java-*.*.*.jar
mv ./cloudera-manager-centos7-cm5.15.0_x86_64.tar.gz /opt/cloudera-manager
# tar zxvf cloudera-manager-centos7-cm5.15.0_x86_64.tar.gz -C /opt/cloudera-manager/
# ls /opt/cloudera-manager/
cloudera cm-5.15.0
# useradd --system --home=/opt/cloudera-manager/cm-5.15.0/run/cloudera-scm-server --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
vi /opt/cloudera-manager/cm-5.15.0/etc/cloudera-scm-agent/config.ini
server_host=master or server_host=192.168.1.100 (on all machines)
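Because the same edit is needed on every machine, it can be scripted. A minimal sketch, assuming the file already contains a server_host= line: set_server_host is a hypothetical helper, and the scratch file below is a stand-in for the agent's config.ini, not a copy of it.

```shell
#!/bin/sh
# set_server_host FILE HOST: point the agent at the given CM server host.
set_server_host() {
    tmp=$(mktemp) || return 1
    sed "s/^server_host=.*/server_host=$2/" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demonstration on a scratch file with illustrative contents.
demo=$(mktemp)
printf '[General]\nserver_host=localhost\nserver_port=7182\n' > "$demo"
set_server_host "$demo" master
cat "$demo"
```

On a real node the first argument would be /opt/cloudera-manager/cm-5.15.0/etc/cloudera-scm-agent/config.ini.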
# mysql -uroot -p
Enter password: (enter the database password)
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 10
Server version: 5.5.47-MariaDB MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
sql> create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
sql> create database amon DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
sql> create database hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
sql> create database monitor DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
sql> create database oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
sql> grant all on *.* to root@"%" Identified by "root";
Query OK, 0 rows affected (0.00 sec)
sql> exit
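The five CREATE DATABASE statements above differ only in the database name, so they can be generated in a loop. This is a sketch, not part of the original procedure: gen_sql is a hypothetical helper, and IF NOT EXISTS is added so the script can be re-run without errors.

```shell
#!/bin/sh
# Databases created in the session above.
DATABASES="hive amon hue monitor oozie"

# gen_sql: print one CREATE DATABASE statement per database name.
gen_sql() {
    for db in $DATABASES; do
        printf 'CREATE DATABASE IF NOT EXISTS %s DEFAULT CHARSET utf8 COLLATE utf8_general_ci;\n' "$db"
    done
}

# Print the statements for review.
gen_sql
# To apply them, pipe the output into the client: gen_sql | mysql -uroot -p
```

Printing the statements first makes it easy to check them before they reach the server.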
######### Run on the master node ########
# /opt/cloudera-manager/cm-5.15.0/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -proot --scm-host localhost scm scm scm
#### Format: ./scm_prepare_database.sh <database type> <database name> <database server> <user name> <password> <CM server host>
Console log (success):
JAVA_HOME=/usr/local/jdk1.7.0_79
Verifying that we can write to /opt/cloudera-manager/cm-5.15.0/etc/cloudera-scm-server
Creating SCM configuration file in /opt/cloudera-manager/cm-5.15.0/etc/cloudera-scm-server
Executing: /usr/java/jdk1.7.0_79/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/opt/cloudera-manager/cm-5.15.0/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /opt/cloudera-manager/cm-5.15.0/etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
2018-08-17 13:32:20,562 [main] INFO com.cloudera.enterprise.dbutil.DbCommandExecutor - Successfully connected to database.
All done, your SCM database is configured correctly!
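Create the parcel directories. The server and the agents use them to receive and distribute data: the server's parcel-repo directory is where all installation files are downloaded.
The agents need the installation packages too; the parcels directory stores the packages assigned to them, and the cloudera-scm user needs permission to work in these directories.
Server node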
– mkdir -p /opt/cloudera/parcel-repo
– chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
Agent nodes
– mkdir -p /opt/cloudera/parcels
– chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
cd /opt/cloudera/parcel-repo
ls
CDH-5.15.0-1.cdh5.15.0.p0.45-el7.parcel
CDH-5.15.0-1.cdh5.15.0.p0.45-el7.parcel.sha1
manifest.json
mv CDH-5.15.0-1.cdh5.15.0.p0.45-el7.parcel.sha1 CDH-5.15.0-1.cdh5.15.0.p0.45-el7.parcel.sha
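The rename from .sha1 to .sha does not have to spell out the parcel version. A minimal sketch, with rename_sha1 as a hypothetical helper; the demonstration uses an empty placeholder file in a scratch directory instead of the real parcel-repo.

```shell
#!/bin/sh
# rename_sha1 DIR: rename every *.parcel.sha1 in DIR to *.parcel.sha.
rename_sha1() {
    for f in "$1"/*.parcel.sha1; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv "$f" "${f%1}"          # strip the trailing "1"
    done
}

# Demonstration in a scratch directory; use /opt/cloudera/parcel-repo for real.
demo=$(mktemp -d)
touch "$demo/CDH-5.15.0-1.cdh5.15.0.p0.45-el7.parcel.sha1"
rename_sha1 "$demo"
ls "$demo"
```

Because the file name is matched by pattern, the same call works for whichever parcel version was downloaded.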
init 0 (or: poweroff)
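Make sure MySQL is running: service mysqld start (the service is named mariadb on a stock CentOS 7 MariaDB install)
In all sessions: cd /opt/cloudera-manager/cm-5.15.0/etc/init.d/ ;
On the server: ./cloudera-scm-server start ;
On the server and the agents: ./cloudera-scm-agent start ;
Watch the server startup log
cd /opt/cloudera-manager/cm-5.15.0/log/cloudera-scm-server
tail -f xxx.log (the log file)
Open the web UI in a browser on port 7180; the username and password are both admin.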
Fixing the warning "Cloudera recommends setting /proc/sys/vm/swappiness to a maximum of 10. Current setting is 30":
## All nodes
vi /etc/sysctl.conf
vm.swappiness=10
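Before and after changing the setting, the running value can be compared with the recommended maximum. A small sketch: swappiness_ok and RECOMMENDED_MAX are hypothetical names, and the check only reports, it does not change anything.

```shell
#!/bin/sh
# Maximum value recommended by the Cloudera Manager warning.
RECOMMENDED_MAX=10

# swappiness_ok VALUE: succeed when VALUE does not exceed the maximum.
swappiness_ok() {
    [ "$1" -le "$RECOMMENDED_MAX" ]
}

# Report on the running kernel when the value is readable (Linux only).
if [ -r /proc/sys/vm/swappiness ]; then
    current=$(cat /proc/sys/vm/swappiness)
    if swappiness_ok "$current"; then
        echo "vm.swappiness=$current is within the recommended maximum"
    else
        echo "vm.swappiness=$current is too high; run: sysctl -w vm.swappiness=$RECOMMENDED_MAX"
    fi
fi
```

The line in /etc/sysctl.conf is applied at boot or after sysctl -p, while sysctl -w changes the running value immediately.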
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
# vi /etc/rc.local
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
/usr/bin/mkdir -p /usr/java
/usr/bin/ln -s /usr/local/jdk1.7.0_79 /usr/java/default
Solution 1:
[root@cdh01 ~]# ps -ef
clouder+ 27338 1688 2 Sep05 ? 00:22:31 /usr/local/software/jdk1.7.0_79/bin/java -serve
clouder+ 27340 1688 1 Sep05 ? 00:17:12 /usr/local/software/jdk1.7.0_79/bin/java -serve
clouder+ 27342 1688 0 Sep05 ? 00:04:27 /usr/local/software/jdk1.7.0_79/bin/java -serve
clouder+ 27358 1688 4 Sep05 ? 00:42:09 /usr/local/software/jdk1.7.0_79/bin/java -serve
zookeep+ 27535 1688 0 Sep05 ? 00:03:20 /usr/local/software/jdk1.7.0_79/bin/java -cp /o
zookeep+ 27542 27535 0 Sep05 ? 00:00:00 python2.7 /opt/cloudera-manager/cm-5.15.0/lib64
hdfs 27760 1688 0 Sep05 ? 00:03:57 /usr/local/software/jdk1.7.0_79/bin/java -Dproc
hdfs 27765 1688 0 Sep05 ? 00:10:16 /usr/local/software/jdk1.7.0_79/bin/java -Dproc
hdfs 27782 27760 0 Sep05 ? 00:00:00 python2.7 /opt/cloudera-manager/cm-5.15.0/lib64
hdfs 27855 27765 0 Sep05 ? 00:00:00 python2.7 /opt/cloudera-manager/cm-5.15.0/lib64
root 27902 2 0 01:26 ? 00:00:00 [kworker/0:3]
root 29111 2 0 01:34 ? 00:00:00 [kworker/0:1]
root 29747 2 0 01:39 ? 00:00:00 [kworker/0:0]
root 30113 2 0 01:42 ? 00:00:00 [kworker/0:2]
root 30304 1211 0 01:44 pts/1 00:00:00 ps -ef
[root@cdh01 ~]# su - hdfs
[hdfs@cdh01 ~]$ hdfs dfsadmin -safemode leave
Safe mode is OFF
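Solution 2:
On the server, edit the Hadoop configuration file conf/hdfs-site.xml: find the dfs.permissions property and set its value to false. If the property is missing, add it:
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>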
Solution 3:
Steps in the Cloudera Manager web UI:
1) HDFS -> Configuration -> uncheck "Check HDFS Permissions" (dfs.permissions) -> Save Changes -> restart HDFS; after that the root user can be used.
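[Problem analysis]
When Hive or Oozie is installed with MySQL as the metadata store, the MySQL driver is missing, because Hive and Oozie do not ship with it by default; the driver file has to be added manually.
[Solution]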
cp /usr/local/src/mysql-connector-java.jar /opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4/lib/hive/lib
cp /usr/local/src/mysql-connector-java.jar /opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4/lib/oozie/lib/
cp /usr/local/src/mysql-connector-java.jar /var/lib/oozie/
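From this it is easy to see that the user running the Spark application has no write permission on the HDFS "/user" directory. The problem comes up often when submitting Spark applications, whether from Windows or from Linux. Common fixes are described below.
HDFS user permissions are tied to the user permissions of the local file system. Take the error:
Permission denied: user=root, access=EXECUTE, inode="/tmp":root:supergroup:d-wx------
It shows that the directory named in inode (/tmp in this sample; the same reasoning applies to /user) is owned by user root in group supergroup. This suggests the following fixes:
Run the operation as the user that owns the directory. The drawback is that many different users may interact with HDFS, so the user running such operations would have to be switched often.
For that reason, I recommend the following approach instead:
On Linux, add the user performing the operation to the supergroup group.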
groupadd supergroup
usermod -a -G supergroup root
adduser Administrator
groupadd supergroup
usermod -a -G supergroup Administrator
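After the usermod commands above, group membership can be verified with id. A minimal sketch in which in_group is a hypothetical helper; it assumes HDFS resolves groups from the operating system, in which case the check is most meaningful on the NameNode host.

```shell
#!/bin/sh
# in_group USER GROUP: succeed when USER is a member of GROUP.
in_group() {
    for g in $(id -nG "$1" 2>/dev/null); do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

if in_group root supergroup; then
    echo "root is in supergroup"
else
    echo "root is not in supergroup yet"
fi
```

A user who is already logged in may need to log in again before the new membership shows up in that session.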
Solution: hdfs dfsadmin -safemode leave ;
Solution:
# Run on the host that reports the error:
# rm -rf /opt/cloudera-manager/cm-5.15.0/lib/cloudera-scm-agent/*
1> Delete the UUID on the agent nodes
# rm -rf /opt/cloudera-manager/cm-5.15.0/lib/cloudera-scm-agent/*
2> Empty the CM database on the master node
Log in to MySQL on the master node, then run drop database cm;
3> Delete the NameNode and DataNode data on the agent nodes
# rm -rf /dfs/nn/*
# rm -rf /dfs/dn/*
4> Re-initialize the CM database on the master node
5> Run the startup scripts
Server node: # /opt/cloudera-manager/cm-5.15.0/etc/init.d/cloudera-scm-server start
Agent nodes: # /opt/cloudera-manager/cm-5.15.0/etc/init.d/cloudera-scm-agent start
CDH can then be reinstalled through port 7180 on the server node: http://master:7180/cmf/login.