Hadoop:
Broad sense: the ecosystem built around the Hadoop software
Narrow sense: the Hadoop software itself
hadoop.apache.org
hive.apache.org
spark.apache.org
flink.apache.org
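Hadoop software versions:
1.x
2.x (used in production; 2.6 here)
3.x
Hadoop 2.x components:
hdfs: storage; the distributed file system at the bottom layer; used in production
hive/hbase (keep their data on HDFS)
mapreduce: distributed computing --> hard to develop for and slow (the shuffle goes through disk)
hive sql/spark (used in practice instead of hand-written MapReduce)
yarn: resource (memory + cores) and job scheduling/management system; used in production
However:
plain Apache Hadoop is usually not what gets deployed.
Companies generally deploy with CDH, Ambari, or HDP.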
CDH:
Cloudera takes the Apache hadoop-2.6.0 source code,
fixes bugs, adds new features, and compiles it as its own version, cdh5.7.0
Apache hadoop-2.6.0 --> hadoop-2.6.0-cdh5.7.0
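The naming rule above can be sketched in shell. The function name cdh_release_name is hypothetical and exists only for this illustration; it joins an upstream Apache version and a CDH version the way the arrow above shows.

```shell
#!/bin/sh
# Hypothetical helper: build the CDH release name from the upstream
# Apache Hadoop version ($1) and the CDH version ($2).
cdh_release_name() {
    printf 'hadoop-%s-cdh%s\n' "$1" "$2"
}

cdh_release_name 2.6.0 5.7.0   # prints hadoop-2.6.0-cdh5.7.0
```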
Deployment:
1. Add a hadoop user with passwordless sudo
[root@hadoop002 ~]# useradd hadoop
[root@hadoop002 ~]# cat /etc/sudoers |grep hadoop
hadoop ALL=(ALL) NOPASSWD: ALL
[root@hadoop002 ~]#
[root@hadoop002 ~]# su - hadoop
[hadoop@hadoop002 ~]$
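The transcript shows the sudoers line but not how it was added. One possible way, as a sketch only: write the rule to a separate file, review it, and then copy it into /etc/sudoers.d/ as root. The helper name write_sudoers_rule and the use of a drop-in file are assumptions, not something taken from the notes.

```shell
#!/bin/sh
# Hypothetical helper: write a passwordless-sudo rule for one user.
# $1 = user name, $2 = file to write (reviewed before it is installed).
write_sudoers_rule() {
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1" > "$2"
    chmod 440 "$2"    # sudoers files are conventionally mode 0440
}

rule_file="$(mktemp)"             # stand-in for a file under /etc/sudoers.d/
write_sudoers_rule hadoop "$rule_file"
cat "$rule_file"
```

Before installing such a file, `visudo -c -f <file>` can be used to check its syntax.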
2. Download
[hadoop@hadoop002 ~]$ mkdir app
[hadoop@hadoop002 ~]$ cd app
[hadoop@hadoop002 app]$ wget http://archive-primary.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz
[hadoop@hadoop002 app]$ tar -xzvf hadoop-2.6.0-cdh5.7.0.tar.gz
[hadoop@hadoop002 app]$ cd hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
Required software for Linux include:
Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.
3. Deploy Java 1.7
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ll /usr/java/
total 319160
drwxr-xr-x 8 root root 4096 Apr 11 2015 jdk1.7.0_80
drwxr-xr-x 8 root root 4096 Apr 11 2015 jdk1.8.0_45
-rw-r--r-- 1 root root 153530841 Jul 8 2015 jdk-7u80-linux-x64.tar.gz
-rw-r--r-- 1 root root 173271626 Sep 19 11:49 jdk-8u45-linux-x64.gz
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ echo $JAVA_HOME
/usr/java/jdk1.7.0_80
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ which java
/usr/java/jdk1.7.0_80/bin/java
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
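The version check above can also be scripted. The sketch below parses a version line of the form shown in the transcript; the function name java_major_minor is hypothetical, and the sample string stands in for real command output.

```shell
#!/bin/sh
# Hypothetical helper: extract "major.minor" (for example 1.7) from the
# first line of `java -version` output, passed in as $1.
java_major_minor() {
    printf '%s\n' "$1" | sed -n 's/.*version "\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

sample='java version "1.7.0_80"'
if [ "$(java_major_minor "$sample")" = "1.7" ]; then
    echo "Java 1.7 detected"
else
    echo "unexpected Java version" >&2
fi
```

On a real host the line would come from `java -version 2>&1 | head -n 1`, because java prints its version to stderr.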
4. Preparation
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ cd etc/hadoop
[hadoop@hadoop002 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
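The two export lines in hadoop-env.sh were edited by hand with vi. As a sketch of a scripted alternative, the hypothetical helper set_export below replaces an existing `export NAME=` line or appends one. The temp file stands in for etc/hadoop/hadoop-env.sh, and its placeholder first line is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: set `export NAME=value` in a shell config file.
# $1 = file, $2 = variable name, $3 = value
set_export() {
    if grep -q "^export $2=" "$1" 2>/dev/null; then
        # Rewrite into a temp file, then move it into place (no sed -i needed).
        sed "s|^export $2=.*|export $2=$3|" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
    else
        printf 'export %s=%s\n' "$2" "$3" >> "$1"
    fi
}

env_file="$(mktemp)"                             # stand-in for hadoop-env.sh
echo 'export JAVA_HOME=${JAVA_HOME}' > "$env_file"   # assumed placeholder line
set_export "$env_file" JAVA_HOME /usr/java/jdk1.7.0_80
set_export "$env_file" HADOOP_PREFIX /home/hadoop/app/hadoop-2.6.0-cdh5.7.0
cat "$env_file"
```

Running the helper twice with the same arguments leaves the file unchanged, which makes it safe to re-run during a redeployment.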
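Three startup modes
Local (Standalone) Mode: single machine, no daemon processes; not used
Pseudo-Distributed Mode: pseudo-distributed, one machine, runs daemon processes; for learning
Fully-Distributed Mode: distributed, runs daemon processes; for production
5. Configuration files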
[hadoop@hadoop002 hadoop]$ vi core-site.xml
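The contents written to core-site.xml are not shown in the transcript. A minimal sketch follows, assuming the file only sets fs.defaultFS. The URI hdfs://hadoop002:9000 is an assumption: hadoop002 is the host name used in these notes, and 9000 is a commonly used NameNode port. The helper name write_core_site is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal core-site.xml.
# $1 = config directory, $2 = default file system URI (ASSUMED value below)
write_core_site() {
    cat > "$1/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>$2</value>
    </property>
</configuration>
EOF
}

conf_dir="$(mktemp -d)"                  # stand-in for etc/hadoop
write_core_site "$conf_dir" hdfs://hadoop002:9000
cat "$conf_dir/core-site.xml"
```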
[hadoop@hadoop002 hadoop]$ vi hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
"hdfs-site.xml" 23L, 866C written
6. Passwordless SSH
[hadoop@hadoop002 hadoop]$ cd
[hadoop@hadoop002 ~]$
[hadoop@hadoop002 ~]$
[hadoop@hadoop002 ~]$ rm -rf .ssh
[hadoop@hadoop002 ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Created directory '/home/hadoop/.ssh'.
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
a3:c7:ba:e9:2e:77:ff:6f:50:bd:bc:f7:1b:1d:a6:e1 hadoop@hadoop002
The key's randomart image is:
+--[ DSA 1024]----+
| |
| |
| . |
| . .|
| S o.o.|
| o . o +oo|
| . o E .o|
| . .+. ...o|
| =*o ...o...=|
+-----------------+
[hadoop@hadoop002 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop002 ~]$ cd .ssh
[hadoop@hadoop002 .ssh]$ ll
total 12
-rw-rw-r-- 1 hadoop hadoop 606 Sep 19 23:16 authorized_keys
-rw------- 1 hadoop hadoop 668 Sep 19 23:16 id_dsa
-rw-r--r-- 1 hadoop hadoop 606 Sep 19 23:16 id_dsa.pub
[hadoop@hadoop002 .ssh]$ chmod 600 authorized_keys
[hadoop@hadoop002 .ssh]$
[hadoop@hadoop002 .ssh]$ ssh hadoop002
The authenticity of host 'hadoop002 (172.31.236.240)' can't be established.
RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop002,172.31.236.240' (RSA) to the list of known hosts.
Last login: Wed Sep 19 18:21:09 2018 from 172.31.236.240
Welcome to Alibaba Cloud Elastic Compute Service !
[hadoop@hadoop002 ~]$
Passwordless ssh access:
fix the permissions (chmod 600 authorized_keys)
reconnect to the cloud host with the ssh command
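The permission fix matters because authorized_keys was created group-writable (-rw-rw-r-- in the listing), and sshd with its default StrictModes setting refuses keys from such a file. A small sketch of the fix, using a temporary directory in place of ~/.ssh; the helper name fix_ssh_perms is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: tighten permissions on an SSH directory.
# $1 = path to the .ssh directory
fix_ssh_perms() {
    chmod 700 "$1" && chmod 600 "$1/authorized_keys"
}

ssh_dir="$(mktemp -d)/.ssh"       # stand-in for ~/.ssh
mkdir -p "$ssh_dir"
: > "$ssh_dir/authorized_keys"    # empty file, only to demonstrate the modes
chmod 664 "$ssh_dir/authorized_keys"
fix_ssh_perms "$ssh_dir"
ls -ld "$ssh_dir" "$ssh_dir/authorized_keys"
```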
7. Environment variables
[hadoop@hadoop002 ~]$ vi .bash_profile
export MVN_HOME=/home/hadoop/app/apache-maven-3.3.9
export PROTOC_HOME=/home/hadoop/app/protobuf
export FINDBUGS_HOME=/home/hadoop/app/findbugs-1.3.9
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$HADOOP_PREFIX/bin:$JAVA_HOME/bin:$PATH
".bash_profile" 12L, 293C written
[hadoop@hadoop002 ~]$
[hadoop@hadoop002 ~]$
[hadoop@hadoop002 ~]$
[hadoop@hadoop002 ~]$ ssh hadoop002
Last login: Wed Sep 19 23:18:35 2018 from 172.31.236.240
Welcome to Alibaba Cloud Elastic Compute Service !
[hadoop@hadoop002 ~]$ which hdfs
~/app/hadoop-2.6.0-cdh5.7.0/bin/hdfs
[hadoop@hadoop002 ~]$
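Because .bash_profile is read on every login, a plain PATH export can add the same directory again if the file is sourced more than once in a session. A hedged sketch of an idempotent variant; the helper name path_prepend is hypothetical and the starting PATH value is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory to a PATH-style string unless
# it is already present. $1 = directory, $2 = current PATH value
path_prepend() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;       # already present: unchanged
        *)        printf '%s\n' "$1:$2" ;;
    esac
}

HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
new_path="$(path_prepend "$HADOOP_PREFIX/bin" "/usr/bin:/bin")"
echo "$new_path"
```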
[hadoop@hadoop002 ~]$ cd ~/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format
Format the NameNode
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps
27707 SecondaryNameNode
27820 Jps
27432 NameNode
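After the HDFS deployment finished,
the DN (DataNode) process turned out to have a problem (it is missing from the jps output above), so redeploy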
[root@hadoop002 tmp]# rm -rf /tmp/hadoop-hadoop
[root@hadoop002 tmp]#
[hadoop@hadoop002 hadoop]$ vi slaves
hadoop002
[hadoop@hadoop002 hadoop]$ cd ../../
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
18/09/19 23:29:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop002]
hadoop002: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
hadoop002: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out
18/09/19 23:29:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps
28288 NameNode
28686 Jps
28410 DataNode
28575 SecondaryNameNode
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
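The first jps run in these notes was missing DataNode, and that was only noticed by eye. A sketch of an automated check follows; it reads jps output on stdin and prints any expected HDFS daemon that is absent. The function name missing_hdfs_daemons is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list expected HDFS daemons missing from jps output.
# Reads jps output ("<pid> <name>" per line) on stdin.
missing_hdfs_daemons() {
    jps_output="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode; do
        # Compare against the whole second field, so "NameNode" does not
        # match "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# First run from the notes: DataNode is absent.
printf '27707 SecondaryNameNode\n27820 Jps\n27432 NameNode\n' | missing_hdfs_daemons
```

On a live host the input would be piped straight from the command: `jps | missing_hdfs_daemons`. Empty output means all three daemons are running.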
On a cloud host, open port 50070 in the firewall to reach the NameNode web UI:
http://47.75.249.8:50070