Installing Hadoop 3.2.1 on macOS (pseudo-distributed)

1. Installation

Download the Hadoop tarball from a mirror: http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz

# Extract; the install path used here is /Users/zheng/hadoop/hadoop-3.2.1
$ tar -zxvf hadoop-3.2.1.tar.gz

# Set environment variables (editing /etc/profile requires sudo)
$ sudo vim /etc/profile
# Add the following lines
export HADOOP_HOME=/Users/zheng/hadoop/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin

# Apply the changes
$ source /etc/profile

# Check that the environment variables took effect
$ hadoop version
# Output like the following means it worked:
2020-04-09 09:20:20,371 DEBUG util.VersionInfo: version: 3.2.1
Hadoop 3.2.1
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r b3cbbb467e22ea829b3808f4b7b01d07e0bf3842
Compiled by rohithsharmaks on 2019-09-10T15:56Z
Compiled with protoc 2.5.0
From source with checksum 776eaf9eee9c0ffc370bcbc1888737
This command was run using /Users/zheng/hadoop-3.2.1/share/hadoop/common/hadoop-common-3.2.1.jar
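If `hadoop version` instead complains that JAVA_HOME is not set, Hadoop may not be picking up the JDK from the shell environment. A common fix on macOS (my own suggestion, not part of the original steps) is to set JAVA_HOME explicitly in hadoop-env.sh:

```shell
# /usr/libexec/java_home prints the active JDK path on macOS.
# Append it to hadoop-env.sh so the Hadoop scripts always find the JDK.
echo "export JAVA_HOME=$(/usr/libexec/java_home)" >> $HADOOP_HOME/etc/hadoop/hadoop-env.sh
```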

2. Configuration

core-site.xml: cluster-wide, system-level settings, such as the default filesystem URI (HDFS) and Hadoop's temp directory

# Edit /Users/zheng/hadoop-3.2.1/etc/hadoop/core-site.xml
<configuration>
	<!-- Filesystem host and port -->
	<property>
	     <name>fs.defaultFS</name>
	     <value>hdfs://localhost:9000</value>
	</property>

	<!-- Directory for files Hadoop generates at runtime; no need to create it in advance, it is generated automatically -->
	<property>
          <name>hadoop.tmp.dir</name>
          <value>file:/Users/zheng/hadoop/tmp</value>
    </property>
</configuration>

hdfs-site.xml: namenode and datanode storage locations, replication factor, file permission checks, etc.

# Edit /Users/zheng/hadoop-3.2.1/etc/hadoop/hdfs-site.xml
<configuration>
	<property>
	     <name>dfs.replication</name>
	     <value>1</value>
	</property>
	<property>
	     <name>dfs.permissions</name>
	     <value>false</value>
	</property>
	<!-- No need to create this directory in advance; it is generated automatically -->
	<property>
	     <name>dfs.namenode.name.dir</name>
	     <value>file:/Users/zheng/hadoop/dfs/name</value>
	</property>
	<!-- No need to create this directory in advance; it is generated automatically -->
	<property>
	     <name>dfs.datanode.data.dir</name>
	     <value>file:/Users/zheng/hadoop/dfs/data</value>
	</property>
</configuration>

mapred-site.xml: MapReduce settings

# Edit /Users/zheng/hadoop-3.2.1/etc/hadoop/mapred-site.xml
<configuration>
	<property>
	      <name>mapreduce.framework.name</name>
	      <value>yarn</value>
	</property>
</configuration>

yarn-site.xml: cluster resource-management settings, such as ResourceManager/NodeManager communication ports and web monitoring ports

# Edit /Users/zheng/hadoop-3.2.1/etc/hadoop/yarn-site.xml
<configuration>
	<property>
	      <name>yarn.nodemanager.aux-services</name>
	      <value>mapreduce_shuffle</value>
	</property>
</configuration>

Initialize HDFS

# Format HDFS. `hdfs namenode -format` should work directly; if the command
# is not found, run it from the bin directory instead:
cd /Users/zheng/hadoop/hadoop-3.2.1/bin
./hdfs namenode -format

Start Hadoop

cd /Users/zheng/hadoop/hadoop-3.2.1/sbin
./start-all.sh

This failed with the following errors:
WARNING: Attempting to start all Apache Hadoop daemons as zheng in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [zheng-2.local]
zheng-2.local: ERROR: Cannot set priority of namenode process 3282
Starting datanodes
Starting secondary namenodes [account.jetbrains.com]
account.jetbrains.com: ERROR: Cannot set priority of secondarynamenode process 3524
2020-04-09 12:01:03,083 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers

How to debug: the startup logs are in /Users/zheng/hadoop/hadoop-3.2.1/logs.
The namenode failed to start here, and its log shows the following error:

java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): file:/Users/zheng/hadoop/tmp has no authority.
        at org.apache.hadoop.hdfs.DFSUtilClient.getNNAddress(DFSUtilClient.java:780)
        at org.apache.hadoop.hdfs.DFSUtilClient.getNNAddressCheckLogical(DFSUtilClient.java:809)
        at org.apache.hadoop.hdfs.DFSUtilClient.getNNAddress(DFSUtilClient.java:771)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.getRpcServerAddress(NameNode.java:545)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(NameNode.java:676)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:696)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:926)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1692)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1759)


Fix:
At first this looked like a permissions problem on /Users/zheng/hadoop/tmp, but it persisted even after chmod 777 /Users/zheng/hadoop/tmp. It turned out the earlier core-site.xml edit had a copy-paste error; the correct property is:
<property>
          <name>hadoop.tmp.dir</name>
          <value>file:/Users/zheng/hadoop/tmp</value>
</property>
After changing the configuration, re-run hdfs namenode -format and then start the daemons again.
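The recovery steps above can be run as one sequence. Clearing the old storage directories first (safe here only because this is a fresh install, an assumption on my part) avoids a clusterID mismatch between the re-formatted namenode and stale datanode data:

```shell
cd /Users/zheng/hadoop/hadoop-3.2.1
./sbin/stop-all.sh                                        # stop any half-started daemons
rm -rf /Users/zheng/hadoop/tmp /Users/zheng/hadoop/dfs    # discard old state (fresh install only!)
./bin/hdfs namenode -format                               # re-initialize HDFS with the fixed config
./sbin/start-all.sh                                       # start all daemons again
```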

Confirm startup succeeded

# After startup, run jps to check that all daemons are running
$ jps
17521 SecondaryNameNode
17717 ResourceManager
17369 DataNode
17820 NodeManager
17262 NameNode
17886 Jps
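Beyond jps, a quick smoke test (my own addition, not part of the original write-up) is to write a file into HDFS and read the listing back:

```shell
# Create a home directory in HDFS, upload a config file, and list it back
hdfs dfs -mkdir -p /user/zheng
hdfs dfs -put $HADOOP_HOME/etc/hadoop/core-site.xml /user/zheng/
hdfs dfs -ls /user/zheng
```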

The Hadoop web UI's default address: http://localhost:9870/
The YARN UI's default address: http://localhost:8088
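To confirm that YARN can actually schedule work, the examples jar bundled with Hadoop can be submitted (a sanity check I would suggest, not in the original steps):

```shell
# Submit the bundled pi estimator: 2 map tasks, 5 samples each.
# The job should appear on the YARN UI at http://localhost:8088 while it runs.
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar pi 2 5
```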

3. Additional notes

1. A JDK must be installed beforehand (JDK installation is not covered here).
2. ssh localhost

# Passwordless ssh login. It was apparently already configured on this machine,
# so the command below works directly; without prior setup it will likely fail
# with a permission error.
ssh localhost

# If that fails, run the following to set up key-based login
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
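After generating the key, passwordless login can be verified non-interactively; with BatchMode enabled, ssh fails outright instead of prompting for a password:

```shell
# Should print "ok" without asking for a password if key-based login works
ssh -o BatchMode=yes localhost 'echo ok'
```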
