1 Configure the Java environment
- Check the installed Java version
rpm -qa|grep java
- Find the Java installation directory
which java
- Set JAVA_HOME in hadoop-env.sh: copy the result of the previous command into JAVA_HOME and delete everything from /bin onward
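The path trimming described above can be sketched in shell. This is a minimal sketch, not part of the original notes: `java_home_from_bin` is a hypothetical helper name, and resolving symlinks with `readlink -f` is an added assumption (on many systems `which java` returns a symlink such as /usr/bin/java rather than the real install path).

```shell
# java_home_from_bin (hypothetical helper): strip the trailing /bin/java
# from a path to the java binary, leaving a JAVA_HOME candidate.
java_home_from_bin() {
    printf '%s\n' "${1%/bin/java}"
}

if command -v java >/dev/null 2>&1; then
    # Resolve symlinks first so the result points at the real JDK/JRE directory.
    real_java=$(readlink -f "$(command -v java)")
    # Print the line to paste into etc/hadoop/hadoop-env.sh.
    echo "export JAVA_HOME=$(java_home_from_bin "$real_java")"
fi
```

The printed line is only a candidate; check that the directory really contains bin/java before pasting it into hadoop-env.sh.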
2 Configure the four Hadoop modules: common, hdfs, yarn, mapreduce
- Configure common in core-site.xml
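<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:8020</value>
</property>
<!-- configure temp directory -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/module/hadoop-3.1.2/data/tmp</value>
</property>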
- Configure hdfs in hdfs-site.xml
Set the replication factor
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
3 Start the filesystem
- Format the filesystem
bin/hdfs namenode -format
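Formatting is meant to run once: formatting again generates a new cluster ID, and DataNodes that still hold the old one will refuse to register. A minimal guard might look like the following. It assumes dfs.namenode.name.dir is left at its default of ${hadoop.tmp.dir}/dfs/name; `hadoop_tmp_dir` and `namenode_is_formatted` are hypothetical names, not Hadoop settings.

```shell
# Assumption: same value as hadoop.tmp.dir in core-site.xml.
hadoop_tmp_dir=${hadoop_tmp_dir:-/opt/module/hadoop-3.1.2/data/tmp}

# namenode_is_formatted (hypothetical helper): succeeds when the NameNode
# metadata directory under the given tmp dir already contains a VERSION file.
namenode_is_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

if namenode_is_formatted "$hadoop_tmp_dir"; then
    echo "NameNode already formatted; skipping format"
elif [ -x bin/hdfs ]; then
    bin/hdfs namenode -format
else
    echo "bin/hdfs not found; run this from the Hadoop install directory" >&2
fi
```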
- Start the NameNode, DataNode, and SecondaryNameNode
sbin/start-dfs.sh
jps # list the running Java processes
9809 SecondaryNameNode
3752 DataNode
4171 Jps
3647 NameNode
netstat -ntlp # show the ports each process is listening on
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:9864 0.0.0.0:* LISTEN 3752/java
tcp 0 0 0.0.0.0:9866 0.0.0.0:* LISTEN 3752/java
tcp 0 0 0.0.0.0:9867 0.0.0.0:* LISTEN 3752/java
tcp 0 0 0.0.0.0:9868 0.0.0.0:* LISTEN 3922/java
tcp 0 0 0.0.0.0:9870 0.0.0.0:* LISTEN 3647/java
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.8:8020 0.0.0.0:* LISTEN 3647/java
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:33530 0.0.0.0:* LISTEN 3752/java
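To read the listing above, it helps to know which daemon owns which port. The sketch below maps the Hadoop 3 default HDFS ports to their roles; `hadoop3_port_role` is a hypothetical helper name, and the mapping only holds if the corresponding addresses have not been overridden in hdfs-site.xml.

```shell
# hadoop3_port_role (hypothetical helper): describe a Hadoop 3 default HDFS port.
hadoop3_port_role() {
    case "$1" in
        8020) echo "NameNode RPC (fs.defaultFS)" ;;
        9870) echo "NameNode web UI" ;;
        9868) echo "SecondaryNameNode web UI" ;;
        9864) echo "DataNode web UI" ;;
        9866) echo "DataNode data transfer" ;;
        9867) echo "DataNode IPC" ;;
        *)    echo "not a default HDFS port" ;;
    esac
}

# Annotate the Hadoop ports seen in the netstat output.
for port in 9864 9866 9867 9868 9870 8020; do
    printf '%s\t%s\n' "$port" "$(hadoop3_port_role "$port")"
done
```

Ports such as 22, 25, and 111 in the listing belong to sshd, the mail daemon, and rpcbind, not to Hadoop.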
- Check the NameNode status
http://ip:9870/
- Create the HDFS user directory
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
- Copy the input files into the distributed filesystem
$ bin/hdfs dfs -put etc/hadoop input
- Run the example
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar grep input output 'dfs[a-z.]+'
- Fetch and view the output
$ bin/hdfs dfs -get output output
$ cat output/*
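The grep example counts how often each string matching the regular expression occurs in the input. To get a feel for what 'dfs[a-z.]+' matches, roughly the same count can be reproduced locally with standard tools. This is a sketch, not the MapReduce job itself; `count_dfs_matches` is a hypothetical helper name.

```shell
# count_dfs_matches (hypothetical helper): read text on stdin and print
# "<count> <match>" for every distinct substring matching dfs[a-z.]+,
# most frequent first.
count_dfs_matches() {
    grep -oE 'dfs[a-z.]+' | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Small inline sample; on a real install, feed it etc/hadoop/*.xml instead.
printf '%s\n' 'dfs.replication' 'dfs.replication and dfsadmin' | count_dfs_matches
```

For comparison with the job output, the same function can be run as `cat etc/hadoop/* | count_dfs_matches` from the Hadoop install directory.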
- View in the browser
4 Start YARN
- Configure yarn in yarn-site.xml
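<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.env-whitelist</name>
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>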
- Configure mapreduce in mapred-site.xml
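<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>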
- Start
sbin/start-yarn.sh
jps # list the running Java processes
3922 SecondaryNameNode
6276 Jps
3752 DataNode
6056 ResourceManager
3647 NameNode
6191 NodeManager
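Rather than reading the jps listing by eye, the check for the five expected daemons can be scripted. A minimal sketch follows; `missing_daemons` is a hypothetical helper that reads jps output on stdin and prints every expected name that is absent.

```shell
# missing_daemons (hypothetical helper): print each daemon name given as an
# argument that does not appear in the jps output read from stdin.
missing_daemons() {
    jps_output=$(cat)
    for name in "$@"; do
        # Compare against the whole second field so that "NameNode"
        # is not satisfied by a "SecondaryNameNode" line.
        printf '%s\n' "$jps_output" \
            | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
            || echo "$name"
    done
}

if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons NameNode DataNode SecondaryNameNode ResourceManager NodeManager
fi
```

No output means all five daemons are running; any name printed points at the log file to inspect under the Hadoop logs directory.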
- View the cluster management UI
http://192.168.1.8:8088/cluster
Running the MapReduce example fails with this error:
Error:/bin/bash: /bin/java: No such file or directory
/bin/java does not exist, so create it as a symlink:
sudo ln -s /opt/module/jdk1.8.0_162/bin/java /bin/java
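A path of /bin/java typically appears when JAVA_HOME is empty in the environment that launches the container, so that $JAVA_HOME/bin/java collapses to /bin/java; that reading is an assumption about this setup, not something the notes confirm. If the symlink workaround is used, a guarded version avoids clobbering an existing file. `link_if_missing` is a hypothetical helper name.

```shell
# link_if_missing (hypothetical helper): create symlink $2 -> $1 only when
# nothing exists at $2 yet and $1 is an executable file.
link_if_missing() {
    target=$1
    link=$2
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "exists: $link"
    elif [ -x "$target" ]; then
        ln -s "$target" "$link" && echo "created: $link -> $target"
    else
        echo "target not executable: $target" >&2
        return 1
    fi
}

# Usage with the paths from these notes (run in a root shell):
#   link_if_missing /opt/module/jdk1.8.0_162/bin/java /bin/java
```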
- Stop
sbin/stop-dfs.sh
sbin/stop-yarn.sh
5 Configure hosts
- Specify the NameNode host: core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop-yarn.cloudyhadoop.com:8020</value>
</property>
- Specify the DataNode hosts: workers
- Specify the SecondaryNameNode host: hdfs-site.xml
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>0.0.0.0:9868</value>
  <description>The secondary namenode http server address and port.</description>
</property>
- Specify the ResourceManager hostname: yarn-site.xml
<property>
  <description>The hostname of the RM.</description>
  <name>yarn.resourcemanager.hostname</name>
  <value>0.0.0.0</value>
</property>
- Specify the NodeManager hostname: yarn-site.xml
<property>
  <description>The hostname of the NM.</description>
  <name>yarn.nodemanager.hostname</name>
  <value>0.0.0.0</value>
</property>
- Specify the HistoryServer host: mapred-site.xml
<property>
  <name>mapreduce.jobhistory.admin.address</name>
  <value>0.0.0.0:10033</value>
  <description>The address of the History server admin interface.</description>
</property>
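Unlike the XML files, the DataNode host list is a plain text file under etc/hadoop with one hostname per line. A minimal sketch of generating it follows; `write_workers` is a hypothetical helper, and the single hostname is simply the one used for fs.defaultFS in this section, on the assumption that everything runs on one node.

```shell
# write_workers (hypothetical helper): write one hostname per line to the
# file named by the first argument, dropping duplicates but keeping order.
write_workers() {
    out=$1
    shift
    printf '%s\n' "$@" | awk '!seen[$0]++' > "$out"
}

# Written to a scratch file here; on a real install the destination would be
# the DataNode host list under etc/hadoop.
scratch=$(mktemp)
write_workers "$scratch" hadoop-yarn.cloudyhadoop.com
cat "$scratch"
```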