The official installation guide:
https://hadoop.apache.org/docs/r2.7.7/hadoop-project-dist/hadoop-common/ClusterSetup.html
Since the production environment runs on Docker, Hadoop has to be packaged into a Docker image.
The steps:
Start from ubuntu:16.04 to match the production servers, install vim and tzdata, and set the timezone to UTC+8 (Asia/Shanghai).
sources.list is the apt package source list and can be copied over from the host.
FROM ubuntu:16.04
COPY sources.list /etc/apt/
RUN apt-get update && apt-get install -y vim tzdata
RUN rm /etc/localtime && ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo "Asia/Shanghai" > /etc/timezone
ENV TZ="Asia/Shanghai"
WORKDIR /
COPY jdk1.8.0_171 /jdk1.8.0_171
ENV JAVA_HOME=/jdk1.8.0_171
RUN ln -s /jdk1.8.0_171/bin/java /usr/bin/java
WORKDIR /hadoop
COPY hadoop-2.7.7 .
ENV HADOOP_PREFIX=/hadoop
ENV HADOOP_CONF_DIR=/hadoop/etc/hadoop
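With the Dockerfile above, the image can be built and sanity-checked roughly as follows (a sketch; it assumes the build context directory contains the Dockerfile, sources.list, jdk1.8.0_171/ and hadoop-2.7.7/, and uses the sjfxhadoop:v1 tag expected by the run scripts below):

```shell
# Build the image from the directory holding the Dockerfile and its inputs
docker build -t sjfxhadoop:v1 .

# Quick sanity check: both the JVM and the Hadoop binaries should resolve
docker run --rm sjfxhadoop:v1 sh -c "java -version && /hadoop/bin/hadoop version"
```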
Copy the hadoop-2.7.7/etc/hadoop directory to /home/mo/sjfx-hadoop-data/config. This directory serves as the Hadoop configuration directory; from now on, configuration changes are made to the files under it.
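Seeding the host-side config directory is a one-time copy on the host, something like:

```shell
# One-time: seed the host-side config directory with the distribution defaults
mkdir -p /home/mo/sjfx-hadoop-data
cp -r hadoop-2.7.7/etc/hadoop /home/mo/sjfx-hadoop-data/config
```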
Edit core-site.xml to set the HDFS address: the fs.defaultFS property points at the server running the namenode.
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.1.26:9000</value>
  </property>
</configuration>
Edit hdfs-site.xml. The last property disables the reverse-DNS check so datanodes can register by IP address without a resolvable hostname:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop_data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/hadoop_data/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
    <value>false</value>
  </property>
</configuration>
Before starting the namenode, its data must be formatted.
Note that Hadoop's logs, data, and config directories are mapped out of the container.
#!/bin/sh
docker stop sjfxhadoop-namenode
docker rm sjfxhadoop-namenode
docker run -it --name sjfxhadoop-namenode --net=host \
-v /home/mo/sjfx-hadoop-data/data:/hadoop_data \
-v /home/mo/sjfx-hadoop-data/logs:/hadoop/logs \
-v /home/mo/sjfx-hadoop-data/config:/hadoop/etc/hadoop \
sjfxhadoop:v1 sh -c "/hadoop/bin/hdfs namenode -format sjfx-cluster"
# Start the namenode
Host network mode is used.
Note that the process started by hadoop-daemon.sh is daemonized, so the script itself exits immediately; to keep the container's main process from terminating, block with tail -f /dev/null.
#!/bin/sh
docker stop sjfxhadoop-namenode
docker rm sjfxhadoop-namenode
docker run -d --name sjfxhadoop-namenode --net=host \
-v /home/mo/sjfx-hadoop-data/data:/hadoop_data \
-v /home/mo/sjfx-hadoop-data/logs:/hadoop/logs \
-v /home/mo/sjfx-hadoop-data/config:/hadoop/etc/hadoop \
sjfxhadoop:v1 sh -c "/hadoop/sbin/hadoop-daemon.sh --script hdfs start namenode && tail -f /dev/null"
After startup, check the startup log in the mapped log directory for exceptions.
Then visit http://192.168.1.26:50070 to view namenode status.
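Besides the web UI, the same HTTP port serves machine-readable status via the /jmx servlet, which allows a scripted liveness check (a sketch, assuming the UI at 192.168.1.26:50070 is reachable):

```shell
# The /jmx servlet on the namenode web port returns metrics as JSON;
# the NameNodeStatus bean includes the namenode's current state
curl -s "http://192.168.1.26:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"
```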
# Start the datanode
#!/bin/sh
docker stop sjfxhadoop-datanode
docker rm sjfxhadoop-datanode
docker run -d --name sjfxhadoop-datanode --net=host \
-v /home/mo/sjfx-hadoop-data/data:/hadoop_data \
-v /home/mo/sjfx-hadoop-data/logs:/hadoop/logs \
-v /home/mo/sjfx-hadoop-data/config:/hadoop/etc/hadoop \
sjfxhadoop:v1 sh -c "/hadoop/sbin/hadoop-daemon.sh --script hdfs start datanode && tail -f /dev/null"
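Once the datanode container is up, its registration with the namenode can be verified from any machine with the Hadoop client (the -fs generic option names the namenode explicitly, so the client's own core-site.xml does not need editing):

```shell
# -report lists live datanodes along with their capacity and usage
./hadoop-2.7.7/bin/hdfs dfsadmin -fs hdfs://192.168.1.26:9000 -report
```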
# Testing
Upload the log files to HDFS:
./hadoop-2.7.7/bin/hadoop fs -mkdir hdfs://192.168.1.26:9000/sjfxlogs
This fails with:
mkdir: Permission denied: user=mo, access=WRITE, inode="/":root:supergroup:drwxr-xr-x
This happens because the client runs as user mo, while the HDFS root directory is owned by root (root:supergroup, drwxr-xr-x), so mo has no write access. Disable the permission check in hdfs-site.xml and restart the namenode:
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
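Disabling dfs.permissions.enabled turns off permission checks for the whole cluster. If that is too coarse, one alternative under simple authentication (no Kerberos) is to act as the superuser for a single command, since the client takes its identity from the HADOOP_USER_NAME environment variable:

```shell
# Run one command as root instead of disabling permissions cluster-wide
HADOOP_USER_NAME=root ./hadoop-2.7.7/bin/hadoop fs -mkdir hdfs://192.168.1.26:9000/sjfxlogs
```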
After creating the /sjfxlogs directory, log files can be uploaded into it:
./hadoop-2.7.7/bin/hadoop fs -put ~/sjfxlogs/bak/2019-05-31/gateway-json-2019-05-31-1.log hdfs://192.168.1.26:9000/sjfxlogs/
List the files:
./hadoop-2.7.7/bin/hadoop fs -ls hdfs://192.168.1.26:9000/sjfxlogs
Found 1 items
-rw-r--r--   3 mo supergroup 1004671801 2019-06-04 13:58 hdfs://192.168.1.26:9000/sjfxlogs/gateway-json-2019-05-31-1.log
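A quick content spot check on the uploaded file:

```shell
# -tail prints the last kilobyte of the file; much cheaper than -cat for a ~1 GB log
./hadoop-2.7.7/bin/hadoop fs -tail hdfs://192.168.1.26:9000/sjfxlogs/gateway-json-2019-05-31-1.log
```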
# Summary