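With Ubuntu 18.04 LTS installed, this post continues the Hadoop installation series. This installment covers Hadoop itself. The system and environment are: Ubuntu 18.04 LTS, Hadoop 3.0.3, and JDK 1.8.0_172.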
sudo apt-get install ssh openssh-server
ssh-keygen -t rsa
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
ssh localhost
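The key-append step above adds a duplicate entry every time it is re-run. A minimal sketch of a repeatable version follows, assuming a hypothetical helper named `authorize_key` (not part of OpenSSH) that appends the public key only when it is missing and sets the restrictive permissions sshd usually expects:

```shell
#!/bin/sh
# authorize_key PUBKEY_FILE SSH_DIR
# Hypothetical helper: adds PUBKEY_FILE to SSH_DIR/authorized_keys at most once.
authorize_key() {
    pubkey_file=$1
    ssh_dir=$2
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -F: fixed strings, -x: whole-line match, -f: read patterns from the key file.
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    # Keep the directory and key list private to the owner.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

It could be called as `authorize_key ~/.ssh/id_rsa.pub ~/.ssh` before testing with `ssh localhost`.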
tar -zxf /opt/software/hadoop-3.0.3.tar.gz -C /opt/modules/
The configuration files are hadoop-env.sh, core-site.xml, mapred-site.xml, and hdfs-site.xml, all located under etc/hadoop.
export JAVA_HOME=/opt/modules/jdk1.8.0_172
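A wrong JAVA_HOME only shows up later, when the daemons fail to start, so it can help to check the path first. A small sketch, assuming a hypothetical helper named `check_java_home` that only tests whether the directory contains an executable `bin/java`:

```shell
#!/bin/sh
# check_java_home DIR
# Hypothetical helper: succeeds only if DIR/bin/java exists and is executable.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1"
    else
        echo "JAVA_HOME is not a JDK directory: $1" >&2
        return 1
    fi
}
```

For the path used in this post the call would be `check_java_home /opt/modules/jdk1.8.0_172`.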
In core-site.xml, fs.defaultFS is the access entry point of the HDFS cluster.
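<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/modules/hadoop-3.0.3/tmp</value>
  <description>A base for other temporary directories</description>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://lee:8020</value>
</property>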
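After editing a site file, it is easy to confirm what value a property actually has. The sketch below assumes a hypothetical helper named `get_prop` and the usual Hadoop layout in which each `<name>` and `<value>` element sits on a single line. It is a plain text scan, not a real XML parser:

```shell
#!/bin/sh
# get_prop FILE NAME
# Hypothetical helper: prints the <value> that follows <name>NAME</name> in FILE.
get_prop() {
    awk -v name="$2" '
        /<name>/ {
            # Remember the most recent property name.
            n = $0
            sub(/.*<name>/, "", n)
            sub(/<\/name>.*/, "", n)
        }
        /<value>/ {
            # Print the value only when it belongs to the requested name.
            if (n == name) {
                v = $0
                sub(/.*<value>/, "", v)
                sub(/<\/value>.*/, "", v)
                print v
                exit
            }
        }
    ' "$1"
}
```

From the Hadoop directory, `get_prop etc/hadoop/core-site.xml fs.defaultFS` should print the address configured above. Nothing is printed if the property is absent.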
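In hdfs-site.xml, dfs.replication is the number of block replicas. If the NameNode web UI does not open, set dfs.http.address (a deprecated alias of dfs.namenode.http-address; Hadoop 3 changed the default web port from 50070 to 9870).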
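<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>replication of block</description>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/opt/modules/hadoop-3.0.3/tmp/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/opt/modules/hadoop-3.0.3/tmp/dfs/data</value>
</property>
<property>
  <name>dfs.http.address</name>
  <value>lee:50070</value>
</property>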
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
or
sbin/start-dfs.sh
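Once the start scripts return, the JDK's `jps` tool lists the running Java processes as `<pid> <MainClass>`. The sketch below assumes a hypothetical helper named `missing_daemons` that reads such a listing on standard input and prints every expected daemon that is not in it:

```shell
#!/bin/sh
# missing_daemons NAME...
# Hypothetical helper: reads `jps` output on stdin and prints each NAME not found.
missing_daemons() {
    listing=$(cat)
    for daemon in "$@"; do
        # The second column of jps output is the main class name.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

After sbin/start-dfs.sh, `jps | missing_daemons NameNode DataNode SecondaryNameNode` should print nothing if all three HDFS daemons are up.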
bin/hdfs dfs -mkdir /input
bin/hdfs dfs -put /opt/data/git.txt /input
bin/hdfs dfs -cat /input/git.txt
Add the JAVA_HOME setting to yarn-env.sh, then configure yarn-site.xml:
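<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>lee</value>
</property>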
Add the JAVA_HOME setting to mapred-env.sh, then configure mapred-site.xml:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
If running a MapReduce job reports an error, add the following configuration to mapred-site.xml:
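<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/modules/hadoop-3.0.3</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/modules/hadoop-3.0.3</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/modules/hadoop-3.0.3</value>
</property>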
sbin/start-yarn.sh
http://lee:8088
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.3.jar wordcount /input /output
View the result:
bin/hdfs dfs -cat /output/part*
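For a small input file, the job output can be cross-checked locally. The sketch below assumes a hypothetical function named `local_wordcount`, and assumes the example job splits words on whitespace and writes `word<TAB>count` lines sorted by word. It is an approximation for comparison, not the job's own implementation:

```shell
#!/bin/sh
# local_wordcount
# Hypothetical helper: reads text on stdin, prints "word<TAB>count" sorted by word.
local_wordcount() {
    tr -s '[:space:]' '\n' |        # one word per line
        grep -v '^$' |              # drop empty lines from leading whitespace
        LC_ALL=C sort |             # byte-order sort, independent of locale
        uniq -c |                   # count consecutive identical words
        awk '{ print $2 "\t" $1 }'  # reorder to word, tab, count
}
```

Running `local_wordcount < /opt/data/git.txt` gives a listing to compare against the contents of /output.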
Add the following configuration to yarn-site.xml:
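<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
  <description>Enable log aggregation</description>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>86400</value>
  <description>Log retention time in seconds (86400 s = 1 day)</description>
</property>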
Add the following configuration to mapred-site.xml:
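<property>
  <name>mapreduce.jobhistory.address</name>
  <value>lee:10020</value>
  <description>Inter-process communication (IPC) address</description>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>lee:19888</value>
  <description>Web access entry point for clients</description>
</property>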
Start the historyserver service:
sbin/mr-jobhistory-daemon.sh start historyserver