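Source repository: https://github.com/wangxiaoleiAI/big-data
卜算子·大数据 (Busuanzi Big Data): table of contents
The open-source "卜算子·大数据" series of articles and source code covers programming, applications, and architecture for big data (distributed computing), with weekly updates: Linux, Java, Hadoop, Spark, Sqoop, Hive, Pig, HBase, ZooKeeper, Oozie, Flink, and more.
This section shows how to quickly set up a pseudo-distributed Hadoop deployment on Linux. (If you have no Linux environment, see section 1.1, "VirtualBox quick start", for installing Ubuntu 18.04 in VirtualBox.)
Download URL: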
https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz
Alternatively, download it from the command line with wget:
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz
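The cleanup commands at the end of this section refer to /opt/hadoop/hadoop-3.1.0, so the archive is presumably unpacked under /opt/hadoop. A minimal sketch of that step, assuming that target directory; the function name extract_hadoop is hypothetical and not part of Hadoop:

```shell
# Hypothetical helper: unpack a Hadoop tarball into a target directory.
# Usage: extract_hadoop <tarball> <dest_dir>
extract_hadoop() {
    tarball=$1
    dest=$2
    if [ ! -f "$tarball" ]; then
        echo "archive not found: $tarball" >&2
        return 1
    fi
    # Create the destination if needed, then unpack into it.
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Example call (assumed paths). /opt is normally writable only by root, so
# create /opt/hadoop and hand it to your user first, for instance:
#   sudo mkdir -p /opt/hadoop && sudo chown "$USER" /opt/hadoop
#   extract_hadoop ~/Downloads/hadoop-3.1.0.tar.gz /opt/hadoop
```

The later bin/ and sbin/ commands are then run from inside the unpacked hadoop-3.1.0 directory.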
# Create the JDK directory
sudo mkdir -p /opt/java
# Change to the Downloads directory
cd ~/Downloads
# Download the JDK with wget
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u172-b11/a58eab1ec242421181065cdc37240b08/jdk-8u172-linux-x64.tar.gz
# Unpack the archive
tar -zxf jdk-8u172-linux-x64.tar.gz
# Move the JDK directory to /opt/java/
sudo mv jdk1.8.0_172/ /opt/java/jdk1.8.0_172/
1. Create the file jdk-1.8.sh with vim:
sudo vim /etc/profile.d/jdk-1.8.sh
2. Add the following content:
#!/bin/sh
# Author:wangxiaolei 王小雷
# Blog: http://blog.csdn.net/dream_an
# Github: https://github.com/wangxiaoleiai
# Date: 2018.05
# Path: /etc/profile.d/
export JAVA_HOME=/opt/java/jdk1.8.0_172
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
3. Load the new Java environment variables:
source /etc/profile
4. Check the configured Java installation:
java -version
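If java -version fails, the usual cause is a JAVA_HOME that does not point at the unpacked JDK. A small sketch of a sanity check; check_java_home is a hypothetical name introduced here for illustration:

```shell
# Hypothetical helper: succeed only if <dir> looks like a usable JDK home,
# that is, it contains an executable bin/java.
check_java_home() {
    dir=$1
    if [ -x "$dir/bin/java" ]; then
        echo "OK: $dir/bin/java is executable"
    else
        echo "FAIL: no executable $dir/bin/java" >&2
        return 1
    fi
}

# Example: check_java_home "$JAVA_HOME"
```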
sudo apt-get install ssh
sudo apt-get install rsync
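# Create an SSH key pair; press Enter at every prompt to accept the defaults
ssh-keygen -t rsa
# Append the generated public key to authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# Test passwordless login (the first login asks for confirmation, answer yes; later logins connect directly)
ssh localhost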
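Running the append command more than once adds duplicate entries to authorized_keys, and sshd may reject a key file with loose permissions. A sketch of an idempotent variant; add_authorized_key is a hypothetical helper, not an OpenSSH command:

```shell
# Hypothetical helper: append a public key to an authorized_keys file only if
# it is not already present, then restrict the file's permissions.
# Usage: add_authorized_key <public_key_file> <authorized_keys_file>
add_authorized_key() {
    pub=$1
    auth=$2
    key=$(cat "$pub") || return 1
    touch "$auth"
    # -x matches whole lines, -F treats the key as a fixed string.
    if ! grep -qxF -e "$key" "$auth"; then
        printf '%s\n' "$key" >> "$auth"
    fi
    chmod 600 "$auth"
}

# Example: add_authorized_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

For a fully non-interactive setup, ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa creates the key pair with an empty passphrase and no prompts.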
Set JAVA_HOME in etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/opt/java/jdk1.8.0_172
etc/hadoop/core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
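A typo in these XML files tends to surface only later as a confusing startup error, so it can help to read a value back after editing. A rough sketch using awk; get_property is a hypothetical helper, and it assumes each <name> and <value> element sits on its own line, which real configuration files need not do:

```shell
# Hypothetical helper: print the <value> that follows <name>KEY</name>.
# Assumes each <name> and <value> element sits on its own line.
# Usage: get_property <xml_file> <key>
get_property() {
    file=$1
    key=$2
    awk -v key="$key" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && match($0, /<value>.*<\/value>/) {
            # Strip the 7-character "<value>" prefix and the 8-character "</value>" suffix.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$file"
}

# Example: get_property etc/hadoop/core-site.xml fs.defaultFS
```

Hadoop also ships its own lookup, bin/hdfs getconf -confKey fs.defaultFS, which reads the effective configuration and is the more reliable check once the installation is in place.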
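# Format the HDFS filesystem (first run only)
bin/hdfs namenode -format
# Start the NameNode, DataNode, and SecondaryNameNode daemons
sbin/start-dfs.sh
# List the running Java processes to confirm the daemons are up
jps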
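After start-dfs.sh, jps should list NameNode, DataNode, and SecondaryNameNode. A sketch of checking that automatically; missing_daemons is a hypothetical helper that reads jps output on standard input:

```shell
# Hypothetical helper: read `jps` output on stdin and report which of the
# expected daemon names (given as arguments) are missing.
missing_daemons() {
    output=$(cat)
    missing=""
    for name in "$@"; do
        # jps prints "<pid> <main class>"; compare the class name exactly so
        # that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$output" | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }'; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all daemons running"
}

# Example: jps | missing_daemons NameNode DataNode SecondaryNameNode
```

Once YARN is running, the same check can be used with ResourceManager and NodeManager.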
Note: since Hadoop 3.0, the NameNode web UI has moved from http://localhost:50070/ to http://localhost:9870/. The official explanation lists these default port changes:
Namenode ports
----------------
50070 --> 9870
50470 --> 9871
Datanode ports
---------------
50010 --> 9866
50020 --> 9867
50075 --> 9864
50475 --> 9865
Secondary NN ports
---------------
50090 --> 9868
50091 --> 9869
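When updating scripts or bookmarks written for Hadoop 2.x, the Hadoop 3 default port changes can be captured as a small lookup. This is only an illustrative sketch; new_hdfs_port is a hypothetical name, and it covers default ports only, not values overridden in hdfs-site.xml:

```shell
# Hypothetical helper: map a Hadoop 2.x default HDFS port to its Hadoop 3.x default.
new_hdfs_port() {
    case $1 in
        50070) echo 9870 ;;  # NameNode HTTP
        50470) echo 9871 ;;  # NameNode HTTPS
        50010) echo 9866 ;;  # DataNode data transfer
        50020) echo 9867 ;;  # DataNode IPC
        50075) echo 9864 ;;  # DataNode HTTP
        50475) echo 9865 ;;  # DataNode HTTPS
        50090) echo 9868 ;;  # Secondary NameNode HTTP
        50091) echo 9869 ;;  # Secondary NameNode HTTPS
        *) echo "unknown port: $1" >&2; return 1 ;;
    esac
}

# Example: new_hdfs_port 50070
```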
etc/hadoop/mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
</configuration>
etc/hadoop/yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>
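# Start the ResourceManager and NodeManager daemons
sbin/start-yarn.sh
# Confirm that the YARN daemons now appear alongside the HDFS ones
jps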
The pseudo-distributed deployment is now complete.
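# Stop YARN, then HDFS
sbin/stop-yarn.sh
sbin/stop-dfs.sh
# Delete the logs and the temporary data (with default settings, HDFS data lives under /tmp/hadoop-<user>)
rm -rf /opt/hadoop/hadoop-3.1.0/logs/*
rm -rf /tmp/hadoop*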
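Because rm -rf with a mistyped or empty path can be destructive, the cleanup can be wrapped in a function that refuses empty arguments. A minimal sketch; clean_hadoop is a hypothetical helper, and the example paths are the ones used in this section:

```shell
# Hypothetical helper: remove Hadoop logs and temporary data.
# Usage: clean_hadoop <hadoop_home> <tmp_dir>
clean_hadoop() {
    hadoop_home=$1
    tmp_dir=$2
    # Refuse empty arguments, which would otherwise expand to /logs/* or /hadoop*.
    if [ -z "$hadoop_home" ] || [ -z "$tmp_dir" ]; then
        echo "usage: clean_hadoop <hadoop_home> <tmp_dir>" >&2
        return 1
    fi
    # Empty the logs directory and delete every hadoop* entry in the temp directory.
    rm -rf "$hadoop_home"/logs/* "$tmp_dir"/hadoop*
}

# Example: clean_hadoop /opt/hadoop/hadoop-3.1.0 /tmp
```

Stop the daemons before running it. After the temporary data is removed, HDFS has to be formatted again before the next start.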