Install Hadoop with Cloudera Manager 5.2 Using Parcels on CentOS 6.5

Host role allocation considerations:

master:
HDFS NameNode 1 + HDFS NameNode 2
YARN ResourceManager 1 + YARN ResourceManager 2

slave: (these roles co-deployed)
HDFS DataNode + YARN NodeManager + HBase RegionServer + Impala Daemon

others:
HBase Master: multiple, on dedicated nodes

ZooKeeper + JournalNode: >=3, odd number, recommended on dedicated nodes, not co-located with the Flume agent service


Hive + Hue + Impala + Oozie + Solr + Sqoop2

Spark (like MapReduce)


Cloudera Management Service


Partitioning considerations (do not use LVM):
root -- >50G

var -- >100G

opt -- >50G

/tmp -- >100G (if a job fails to run, check the free space in this directory)

swap -- 2x system memory

RAM -- >8GB

Master node:
RAID 10, dual Ethernet cards, dual power supplies, etc.

Slave node:
1. RAID is not necessary


2. HDFS partitions, not using LVM
/etc/fstab -- ext3/ext4    defaults,noatime
Mount at /data/N/, for N=0,1,2... (one partition per disk)
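As a rough sketch of the layout above, the helper below prints one /etc/fstab line per data disk, mounting the Nth device at /data/N with noatime. The function name gen_fstab_lines, the device names in the usage note, the choice of ext4 (one of the two filesystems listed) and the trailing "0 0" dump/fsck fields are assumptions for illustration.

```shell
# gen_fstab_lines DEV...: print one fstab entry per device, mounted at /data/0, /data/1, ...
# Hypothetical helper: it only prints lines, it does not edit /etc/fstab or create mount points.
gen_fstab_lines() {
    n=0
    for dev in "$@"; do
        # fields: device, mount point, fs type, options, dump, fsck pass (last two assumed)
        printf '%s\t/data/%d\text4\tdefaults,noatime\t0 0\n' "$dev" "$n"
        n=$((n + 1))
    done
}
```

For example, gen_fstab_lines /dev/sdb1 /dev/sdc1 prints entries for /data/0 and /data/1. Review the output before appending it to /etc/fstab, and create the mount points with mkdir -p first.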


Cloudera CDH repository:

http://archive.cloudera.com/cdh5
http://archive-primary.cloudera.com/cm5
http://archive.cloudera.com/gplextras5


Cloudera parcel repository:

http://archive.cloudera.com/cdh5/parcels/
http://archive.cloudera.com/gplextras5/parcels/
http://archive.cloudera.com/sqoop-connectors/parcels/
http://archive.cloudera.com/accumulo-c5/parcels/

http://archive.cloudera.com/kafka/parcels/


Cloudera Labs repository:

http://archive-primary.cloudera.com/cloudera-labs/


On the Cloudera Manager host and all cluster nodes (master and slave nodes):

The cluster needs at least 3 ZooKeeper servers and at least 3 HDFS DataNodes.


1. Disable SELinux and iptables
service iptables stop
chkconfig iptables off; chkconfig ip6tables off

setenforce 0
sed -i 's,SELINUX=enforcing,SELINUX=disabled,g' /etc/selinux/config
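The sed command above only rewrites the file when the current value is enforcing. A slightly more general sketch, assuming the usual SELINUX= line in the config file and GNU sed, replaces whatever mode is set. The function name is hypothetical.

```shell
# disable_selinux_in FILE: set SELINUX=disabled regardless of the current mode.
# The SELINUXTYPE= line is left alone because the pattern requires "=" right after SELINUX.
disable_selinux_in() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}
```

On a real host this would be called as disable_selinux_in /etc/selinux/config, followed by a reboot for the change to apply fully.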


2. Disable IPv6 and tune kernel parameters
echo "net.ipv6.conf.all.disable_ipv6 = 1" >> /etc/sysctl.conf

echo "vm.swappiness = 0" >> /etc/sysctl.conf

echo 'net.ipv4.tcp_retries2 = 2' >> /etc/sysctl.conf
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf
echo "fs.file-max = 6815744" >> /etc/sysctl.conf
echo "fs.aio-max-nr = 1048576" >> /etc/sysctl.conf
echo "net.core.rmem_default = 262144" >> /etc/sysctl.conf
echo "net.core.wmem_default = 262144" >> /etc/sysctl.conf
echo "net.core.rmem_max = 16777216" >> /etc/sysctl.conf
echo "net.core.wmem_max = 16777216" >> /etc/sysctl.conf
echo "net.ipv4.tcp_rmem = 4096 262144 16777216" >> /etc/sysctl.conf
echo "net.ipv4.tcp_wmem = 4096 262144 16777216" >> /etc/sysctl.conf

Only on the ResourceManager and JobHistory Server hosts:
echo "net.core.somaxconn = 1000" >> /etc/sysctl.conf


sysctl -p
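Appending with echo >> adds a duplicate line every time these commands are re-run on the same host. A minimal sketch of an idempotent alternative is shown below. It assumes GNU sed and "key = value" formatting, and set_sysctl_conf is a hypothetical helper name.

```shell
# set_sysctl_conf FILE KEY VALUE: write "KEY = VALUE" to FILE,
# replacing an existing line for KEY instead of appending a second one.
set_sysctl_conf() {
    sc_file=$1 sc_key=$2 sc_value=$3
    touch "$sc_file"
    # escape the dots in the key so they match literally
    sc_pat=$(printf '%s' "$sc_key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${sc_pat}[[:space:]]*=" "$sc_file"; then
        sed -i "s|^[[:space:]]*${sc_pat}[[:space:]]*=.*|${sc_key} = ${sc_value}|" "$sc_file"
    else
        printf '%s = %s\n' "$sc_key" "$sc_value" >> "$sc_file"
    fi
}
```

For example, set_sysctl_conf /etc/sysctl.conf vm.swappiness 0 can be run repeatedly and still leaves a single vm.swappiness line. Run sysctl -p afterwards as before.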


echo "echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled" >> /etc/rc.local
echo "echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag" >> /etc/rc.local
echo "echo no > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag" >> /etc/rc.local
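The rc.local lines only take effect at the next boot. To check the current state, note that these sysfs files list the available choices with the active one in square brackets. The sketch below extracts that bracketed value; the helper name thp_active is an assumption.

```shell
# thp_active FILE: print the active (bracketed) choice from a
# transparent hugepage sysfs file, e.g. "never" from "always [never]".
thp_active() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}
```

After a reboot, thp_active /sys/kernel/mm/redhat_transparent_hugepage/enabled should report never if the rc.local entries were applied.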


3. Edit /etc/hosts and add the FQDN of every host, for example:
192.168.1.19    cm5.local cm5 archive.cloudera.com

192.168.1.20    master1.local master1  # HDFS NameNode
192.168.1.21    master2.local master2  # YARN ResourceManager
192.168.1.22    slave1.local slave1
192.168.1.23    slave2.local slave2

192.168.1.24    slave3.local slave3
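The entries above put the FQDN first and the short name second. A quick sketch to flag lines that do not follow that pattern is given below. It treats any name containing a dot as an FQDN, which is a simplification, and check_hosts_fqdn is a hypothetical name.

```shell
# check_hosts_fqdn FILE: report hosts-file entries whose first name has no dot.
# Loopback entries (127.x.x.x and ::1) are skipped. Exit status is 1 if any line is flagged.
check_hosts_fqdn() {
    awk '
        BEGIN { bad = 0 }
        { sub(/#.*/, "") }                          # drop trailing comments
        NF == 0 { next }                            # skip blank lines
        $1 ~ /^127\./ || $1 == "::1" { next }       # skip loopback entries
        $2 !~ /\./ { print "FQDN not first: " $0; bad = 1 }
        END { exit bad }
    ' "$1"
}
```

Running check_hosts_fqdn /etc/hosts on each node prints nothing when every entry lists its FQDN first.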


Edit /etc/sysconfig/network and set HOSTNAME to the FQDN

yum -y install ntp openssh-clients lzo

service ntpd start; chkconfig ntpd on


cat << EOF > /etc/yum.repos.d/iso.repo
[iso]
name=iso
baseurl=http://mirrors.aliyun.com/centos/6.5/os/x86_64
enabled=1
gpgcheck=0
EOF
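The same kind of repo stanza is written more than once in these notes, so a small helper can generate it. This is only a sketch: write_repo is a made-up name, the optional third argument (target directory) exists mainly so it can be tried outside /etc, and it writes enabled=1, which is the option name yum reads.

```shell
# write_repo NAME BASEURL [DIR]: create DIR/NAME.repo with GPG checking off.
# DIR defaults to /etc/yum.repos.d (hypothetical helper, not part of yum).
write_repo() {
    wr_name=$1 wr_url=$2 wr_dir=${3:-/etc/yum.repos.d}
    cat > "$wr_dir/$wr_name.repo" <<EOF
[$wr_name]
name=$wr_name
baseurl=$wr_url
enabled=1
gpgcheck=0
EOF
}
```

For example, write_repo iso http://mirrors.aliyun.com/centos/6.5/os/x86_64 produces the same stanza as the heredoc above.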


vi /etc/security/limits.conf

*         soft    core         unlimited
*         hard    core         unlimited
*         soft    nofile       65536
*         hard    nofile       65536
*         soft    nproc        unlimited
*         hard    nproc        unlimited
*         soft    memlock      unlimited
*         hard    memlock      unlimited


vi /etc/grub.conf
Add elevator=deadline to the end of the kernel line.


Reboot for the changes to take effect.


4. On the Cloudera Manager host, install MySQL 5.6 and Apache httpd
rpm -e --nodeps mysql-libs
yum -y install libaio perl
rpm -ivh MySQL-shared-compat-5.6.20-1.el6.x86_64.rpm
rpm -ivh MySQL-shared-5.6.20-1.el6.x86_64.rpm
rpm -ivh MySQL-server-5.6.20-1.el6.x86_64.rpm
rpm -ivh MySQL-client-5.6.20-1.el6.x86_64.rpm


vi /etc/my.cnf
[mysqld]
transaction-isolation=READ-COMMITTED
symbolic-links=0


key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1

# Allow 100 maximum connections for each database and then add 50 extra connections

max_connections = 550

log-bin=mysql-bin
binlog_format=mixed
expire_logs_days=10
max_binlog_size=100M

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
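The max_connections value in this config follows the rule in its comment: 100 connections per database plus 50 spare. As a worked example of that rule (the function name is made up here), five databases give the 550 used above:

```shell
# max_connections_for N: 100 connections per database plus 50 extra
max_connections_for() {
    echo $(( $1 * 100 + 50 ))
}

max_connections_for 5   # prints 550
```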


service mysql start; chkconfig mysql on
cat ~/.mysql_secret
mysqladmin -uroot -p'oldpassword' password newpassword
mysql_secure_installation


mysql -u root -p

# for Activity Monitor
create database amon DEFAULT CHARACTER SET utf8;
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
grant all on amon.* TO 'amon'@'localhost' IDENTIFIED BY 'amon';

# for Reports Manager
create database rman DEFAULT CHARACTER SET utf8;
grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
grant all on rman.* TO 'rman'@'localhost' IDENTIFIED BY 'rman';

# for Hive Metastore Server
create database metastore DEFAULT CHARACTER SET utf8;
grant all on metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
grant all on metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hive';

# for Sentry Server
create database sentry DEFAULT CHARACTER SET utf8;
grant all on sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
grant all on sentry.* TO 'sentry'@'localhost' IDENTIFIED BY 'sentry';

# for Cloudera Navigator Audit Server
create database nav DEFAULT CHARACTER SET utf8;
grant all on nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
grant all on nav.* TO 'nav'@'localhost' IDENTIFIED BY 'nav';

flush privileges;
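The create/grant statements above repeat the same three lines per service. A sketch of a generator for them follows. The function name gen_db_sql is an assumption, and it simply mirrors the statements above, including their throwaway passwords, which should be changed on a real system.

```shell
# gen_db_sql DB USER PASSWORD: print the create-database and grant
# statements used above for one service database.
gen_db_sql() {
    g_db=$1 g_user=$2 g_pass=$3
    cat <<EOF
create database $g_db DEFAULT CHARACTER SET utf8;
grant all on $g_db.* TO '$g_user'@'%' IDENTIFIED BY '$g_pass';
grant all on $g_db.* TO '$g_user'@'localhost' IDENTIFIED BY '$g_pass';
EOF
}
```

The output can be piped into the client, for example gen_db_sql metastore hive hive | mysql -u root -p.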

yum -y install httpd

service httpd start; chkconfig httpd on


mkdir /var/www/html/cm520

mkdir /var/www/html/parcel520

mount -o loop cm520.iso /var/www/html/cm520

mount -o loop parcel520.iso /var/www/html/parcel520


cat << EOF > /etc/yum.repos.d/cm520.repo
[cm520]
name=cm520
baseurl=http://192.168.1.19/cm520
enabled=1
gpgcheck=0
EOF

yum -y install oracle-j2sdk1.7 cloudera-manager-daemons cloudera-manager-server

ln -s  /usr/java/jdk1.7.0_67-cloudera /usr/java/default
echo 'export JAVA_HOME=/usr/java/default' >> /etc/profile
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
source /etc/profile


Install mysql jdbc connector:

tar zxf mysql-connector-java-5.1.33.tar.gz
mkdir -p /usr/share/java
cp mysql-connector-java-5.1.33/mysql-connector-java-5.1.33-bin.jar /usr/share/java/mysql-connector-java.jar


/usr/share/cmf/schema/scm_prepare_database.sh mysql -uroot -ppassword cm5 cm5 cm5

(If MySQL is installed on a different host, run the script as follows.

On the MySQL server:
grant all on *.* to 'temp'@'%' identified by 'temp' with grant option;
flush privileges;

On the Cloudera Manager server:
/usr/share/cmf/schema/scm_prepare_database.sh mysql -h mysql-server-ip -utemp -ptemp --scm-host scm-host-ip cm5 cm5 cm5)


service cloudera-scm-server start


Wait several minutes for the server to start, then open http://192.168.1.19:7180

username/password: admin/admin
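Rather than guessing how long "several minutes" is, the wait can be scripted. Below is a minimal, generic retry loop. The name retry_until and its argument order are assumptions for this sketch, and it counts only the sleep intervals, not the time each attempt takes.

```shell
# retry_until TIMEOUT INTERVAL CMD...: run CMD every INTERVAL seconds
# until it succeeds (return 0) or about TIMEOUT seconds have passed (return 1).
retry_until() {
    r_timeout=$1 r_interval=$2
    shift 2
    r_elapsed=0
    while ! "$@"; do
        r_elapsed=$((r_elapsed + r_interval))
        if [ "$r_elapsed" -ge "$r_timeout" ]; then
            return 1
        fi
        sleep "$r_interval"
    done
    return 0
}
```

For the server above, something like retry_until 300 5 curl -fs -o /dev/null http://192.168.1.19:7180/ would poll the web UI for up to about five minutes.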


If the login page loads, install and start the agent on this host as well:

yum -y install cloudera-manager-agent
service cloudera-scm-agent start


5. On all cluster nodes

cat << EOF > /etc/yum.repos.d/cm520.repo
[cm520]
name=cm520
baseurl=http://192.168.1.19/cm520
enabled=1
gpgcheck=0
EOF


yum -y install oracle-j2sdk1.7 cloudera-manager-agent cloudera-manager-daemons


ln -s  /usr/java/jdk1.7.0_67-cloudera /usr/java/default
echo 'export JAVA_HOME=/usr/java/default' >> /etc/profile
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
source /etc/profile


vi /etc/cloudera-scm-agent/config.ini
server_host=cm5.local
server_port=7182
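Instead of editing the file by hand on every node, the two settings can be applied with sed. This sketch assumes the file already contains server_host= and server_port= lines, as the packaged config.ini normally does, and that GNU sed is available. The function name is hypothetical.

```shell
# set_agent_server CONF HOST [PORT]: point the agent at the Cloudera Manager server.
# Only rewrites existing server_host=/server_port= lines; PORT defaults to 7182.
set_agent_server() {
    a_conf=$1 a_host=$2 a_port=${3:-7182}
    sed -i -e "s/^server_host=.*/server_host=${a_host}/" \
           -e "s/^server_port=.*/server_port=${a_port}/" "$a_conf"
}
```

On a cluster node this would be set_agent_server /etc/cloudera-scm-agent/config.ini cm5.local, followed by starting the agent service.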


service cloudera-scm-agent start


dfs.datanode.failed.volumes.tolerated
If each DataNode has more than three or four data disks, consider setting this to 1; with many disks, set it to 2 or more.

Correct shutdown procedure:

1. Stop the cluster and the Cloudera Management Service

2. Power off the hosts
