Setting Up a Hadoop Environment on Ubuntu 18.04

1. Install the JDK

  • Update the package index and install the JDK
sudo apt-get update
sudo apt install openjdk-8-jdk-headless

Be sure not to install the latest JDK. I wasted a lot of time earlier because the version I had installed was too new; Hadoop 2.7 works reliably with Java 8.

  • Check the JDK version
java -version
  • Find the JDK installation directory
update-alternatives --display java
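The `update-alternatives` output points at the `java` binary; the JDK home used later in `~/.bashrc` is that path minus the trailing `jre/bin/java`. A minimal sketch of the derivation (the path below is the typical openjdk-8 location on amd64 and is an assumption; substitute whatever `readlink -f /usr/bin/java` prints on your machine):

```shell
# Derive JAVA_HOME by stripping the binary suffix from the java path.
# The path is the usual `readlink -f /usr/bin/java` result for openjdk-8
# on amd64 -- an assumption; use the output on your own system.
JAVA_BIN=/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
JAVA_HOME=${JAVA_BIN%/jre/bin/java}   # shell suffix-removal expansion
echo "$JAVA_HOME"                     # /usr/lib/jvm/java-8-openjdk-amd64
```

The resulting value is what section 4 exports as JAVA_HOME.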

2. Set Up Passwordless SSH Login

  • Install SSH
sudo apt-get install ssh
  • Install rsync
sudo apt-get install rsync
  • Generate an SSH key pair for the later authentication steps
hduser@master:~$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Your identification has been saved in /home/hduser/.ssh/id_dsa.
Your public key has been saved in /home/hduser/.ssh/id_dsa.pub.
The key fingerprint is:
SHA256:OSfzo5m9QUvq6nsb4a2HbRxh8Y7wc1KFSrJgRRS6i4o hduser@master
The key's randomart image is:
+---[DSA 1024]----+
|       +=.   .   |
|      o.. o . .  |
|     ... + + .   |
|       .o.+ o    |
|      . S++=     |
|     . o %*.o    |
|    . . ++*=     |
| . .   .oB=o     |
|E .  .++B+o.     |
+----[SHA256]-----+
  • View the generated SSH key
ll /home/hduser/.ssh
  • Append the generated key to the authorized keys file
hduser@master:~$ ll /home/hduser/.ssh
total 16
drwx------  2 hduser hduser 4096 Aug  1 09:54 ./
drwxr-xr-x 42 hduser hduser 4096 Aug  1 08:57 ../
-rw-------  1 hduser hduser  668 Aug  1 09:54 id_dsa
-rw-r--r--  1 hduser hduser  603 Aug  1 09:54 id_dsa.pub
hduser@master:~$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
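Two caveats worth knowing here (not part of the original steps): OpenSSH 7.0 and later, including the version shipped with Ubuntu 18.04, disable ssh-dss signatures by default, so an RSA key (`ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa`, then append `id_rsa.pub` instead) is the safer choice; and sshd silently ignores `authorized_keys` when its permissions are too loose, which shows up as password prompts even though the key is in place. A sketch of tightening the permissions:

```shell
# sshd refuses keys from group/world-accessible files; lock them down.
mkdir -p ~/.ssh                      # no-op if ssh-keygen already created it
touch ~/.ssh/authorized_keys         # no-op if the cat >> step already ran
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
stat -c '%a' ~/.ssh/authorized_keys  # prints 600
```

After this, `ssh localhost` should log in without a password prompt.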

3. Download and Install Hadoop

wget http://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-2.7.6/hadoop-2.7.6.tar.gz

Since most companies currently run the 2.7 branch, version 2.7 is installed here.

  • Extract hadoop-2.7.6.tar.gz
sudo tar -zxvf hadoop-2.7.6.tar.gz 
  • Move Hadoop to /usr/local/hadoop
 sudo mv hadoop-2.7.6 /usr/local/hadoop
  • Inspect the directory
hduser@master:~$ ll /usr/local/hadoop/
total 144
drwxr-xr-x  9 20415 systemd-journal  4096 Apr 18  2018 ./
drwxr-xr-x 12 root  root             4096 Aug  1 11:18 ../
drwxr-xr-x  2 20415 systemd-journal  4096 Apr 18  2018 bin/
drwxr-xr-x  3 20415 systemd-journal  4096 Apr 18  2018 etc/
drwxr-xr-x  2 20415 systemd-journal  4096 Apr 18  2018 include/
drwxr-xr-x  3 20415 systemd-journal  4096 Apr 18  2018 lib/
drwxr-xr-x  2 20415 systemd-journal  4096 Apr 18  2018 libexec/
-rw-r--r--  1 20415 systemd-journal 86424 Apr 18  2018 LICENSE.txt
-rw-r--r--  1 20415 systemd-journal 14978 Apr 18  2018 NOTICE.txt
-rw-r--r--  1 20415 systemd-journal  1366 Apr 18  2018 README.txt
drwxr-xr-x  2 20415 systemd-journal  4096 Apr 18  2018 sbin/
drwxr-xr-x  4 20415 systemd-journal  4096 Apr 18  2018 share/
  • Main directories
Directory	Description
bin	executables for Hadoop, HDFS, YARN, etc.
sbin	shell scripts, including start-all.sh and stop-all.sh
etc	the etc/hadoop subdirectory holds the configuration files, e.g. hadoop-env.sh, core-site.xml, yarn-site.xml, mapred-site.xml, hdfs-site.xml
lib	Hadoop libraries
logs	system logs; check them to see the current state of the system and to track down problems when something goes wrong

4. Set Up the Hadoop Environment Variables

sudo gedit ~/.bashrc
  • Append the following to the end of the file
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 # JDK installation directory
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
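Once the file is saved and sourced, it is worth sanity-checking that the PATH additions actually took effect; a standalone sketch (the two relevant exports are repeated here so it runs on its own):

```shell
# Re-create the relevant exports from ~/.bashrc, then confirm the
# Hadoop bin directory really is on PATH.
export HADOOP_HOME=/usr/local/hadoop
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop bin is on PATH" ;;
  *)                      echo "hadoop bin is missing from PATH" ;;
esac
```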
  • Activate the environment
hduser@master:~$ source ~/.bashrc
hduser@master:~$ hadoop
Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  CLASSNAME            run the class named CLASSNAME
 or
  where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
                       note: please use "yarn jar" to launch
                             YARN applications, not this command.
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  credential           interact with credential providers
  daemonlog            get/set the log level for each daemon
  trace                view and modify Hadoop tracing settings

Most commands print help when invoked w/o parameters.

5. Edit the Hadoop Configuration Files

  • Edit hadoop-env.sh
sudo gedit /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Before: export JAVA_HOME=${JAVA_HOME}
After:  export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
  • Edit core-site.xml
sudo gedit /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
	<property>
		<name>fs.default.name</name>	
		<value>hdfs://localhost:9000</value>
	</property>
</configuration>
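As a side note, `fs.default.name` is the deprecated spelling of this property; Hadoop 2.x still accepts it but logs a deprecation warning. The current key is `fs.defaultFS`, so an equivalent modern form would be:

```xml
<property>
	<name>fs.defaultFS</name>
	<value>hdfs://localhost:9000</value>
</property>
```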
  • Edit yarn-site.xml
sudo gedit /usr/local/hadoop/etc/hadoop/yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
	<name>yarn.nodemanager.aux-services</name>
	<value>mapreduce_shuffle</value>
</property>
<property>
	<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
	<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
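A stray or misplaced tag in any of these *-site.xml files only surfaces as cryptic warnings when the daemons start, so it is worth checking well-formedness up front. A sketch using python3 (preinstalled on Ubuntu 18.04), shown here on a temporary sample file; point the last command at /usr/local/hadoop/etc/hadoop/yarn-site.xml on a real install:

```shell
# Write a sample site file, then parse it; ET.parse raises on malformed XML.
cat > /tmp/yarn-site-check.xml <<'EOF'
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
python3 -c "import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1]); print('well-formed')" /tmp/yarn-site-check.xml
```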
  • Create mapred-site.xml from the bundled template
sudo cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
  • Edit mapred-site.xml
sudo gedit /usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
</configuration>
  • Edit hdfs-site.xml
sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
	<property>
		<name>dfs.replication</name>
		<value>3</value>
	</property>
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>file:/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
	</property>
</configuration>
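One caveat about the values above: a `dfs.replication` of 3 only makes sense with at least three DataNodes; on this single-machine setup HDFS will report every block as under-replicated. For a pseudo-distributed install, 1 is the usual choice:

```xml
<property>
	<name>dfs.replication</name>
	<value>1</value>
</property>
```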
  • Create the HDFS directories
sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/namenode
sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/datanode
sudo chown hduser:hduser -R /usr/local/hadoop
  • Format HDFS (run as hduser; this initializes the namenode directory created above)
hdfs namenode -format

6. Start Hadoop

hduser@master:~$ start-dfs.sh
Starting namenodes on [localhost]
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:4TVFf1D79N8Df9Qg9JQLyX/1Lj/1h1reMBN6/W5sQ88.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
hduser@localhost's password: 
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-master.out
hduser@localhost's password: 
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-master.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:4TVFf1D79N8Df9Qg9JQLyX/1Lj/1h1reMBN6/W5sQ88.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
[email protected]'s password: 
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-secondarynamenode-master.out
hduser@master:~$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduser-resourcemanager-master.out
hduser@localhost's password: 
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduser-nodemanager-master.out

Before opening a browser, `jps` should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager. Then open localhost:8088 to reach the ResourceManager web UI (in Hadoop 2.7 the NameNode web UI is at localhost:50070).
(Screenshot: the ResourceManager web UI at localhost:8088)
