References: 1, 2
Goal: install Hadoop 3.3.1 in pseudo-distributed mode
Confirm that a Java development environment is already present (java -version). Use Oracle JDK 8, not OpenJDK (which is what yum install java-1.8 gives you).
Environment variables:
export JAVA_HOME=/usr/lib/jvm/java
export PATH=$JAVA_HOME/bin:$PATH
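If you are not sure where the JDK actually lives, one way to locate it (assuming java is already on the PATH) is to resolve the symlink chain; JAVA_HOME is typically the directory above the bin/ this resolves into:
# Resolve the real path of the java binary
readlink -f "$(which java)"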
Download the Hadoop tarball, hadoop-3.3.1.tar.gz (link).
Extract it to the chosen location (recommended: /usr/local/hadoop).
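A minimal sketch of the extract step, assuming the downloaded file is hadoop-3.3.1.tar.gz in the current directory:
# Unpack the tarball and move it into place
tar -zxf hadoop-3.3.1.tar.gz
mv hadoop-3.3.1 /usr/local/hadoop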
Configure environment variables (.bashrc):
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH # sbin is needed for start-all.sh / stop-all.sh
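To apply and check the change (hadoop version only prints version info, so it is a safe smoke test):
# Reload the shell config, then confirm the hadoop command is found
source ~/.bashrc
hadoop version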
Edit the configuration files (under hadoop/etc/hadoop/): core-site.xml, hdfs-site.xml, hadoop-env.sh [the later ones, mapred-site.xml and yarn-site.xml, do not need configuring]. Set file paths according to your own setup.
My machine name for hadoop is hadoop, so /etc/hosts needs a new entry 127.0.0.1 hadoop (or just use 0.0.0.0).
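A one-liner for that hosts entry, assuming the machine name is hadoop (run as root):
# Map the machine name to the loopback address
echo "127.0.0.1 hadoop" >> /etc/hosts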
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/hdfs/data</value>
    <description>Physical storage location of data blocks on the datanode</description>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.datanode.hostname</name>
    <value>hadoop</value>
  </property>
</configuration>
hadoop-env.sh: add the following at the # export JAVA_HOME= line. Set JAVA_HOME to your own path, and change root to your own username.
export JAVA_HOME=/usr/lib/jvm/java
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
Configure passwordless SSH login
In words:
ssh-keygen -t rsa
Press Enter 3 times (you may need to type y once in addition), then:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Enter! Enter! Enter!
Important things get said three times: do not type any passphrase.
Only an empty passphrase gives passwordless login.
[root@iZbp18y7b5jm99960ajdloZ ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): # press Enter
/root/.ssh/id_rsa already exists. # this line and the next appear only when id_rsa already exists
Overwrite (y/n)? y # yes, just overwrite
Enter passphrase (empty for no passphrase): # press Enter
Enter same passphrase again: # press Enter
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:lZE2eWufB/OfpKZhzmJcNOSUbwR+hZw391ze+hUB9/0 root@iZbp18y7b5jm99960ajdloZ
The key's randomart image is:
+---[RSA 2048]----+
| .++.+o |
| **.=o+=|
| .*+oo.oX|
| . ++oo.*|
| S ..o. *E|
| . +.+|
| . .o oo+|
| ++ .o .o|
| . .+o |
+----[SHA256]-----+
[root@iZbp18y7b5jm99960ajdloZ ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
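If ssh still asks for a password afterwards, permissions on ~/.ssh are a common culprit (sshd ignores authorized_keys when the files are too open); a sketch of the fix plus a quick test:
# sshd requires strict permissions on the key files
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
# Should log in and exit without a password prompt (answer yes to the host-key question the first time)
ssh hadoop exit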
hdfs namenode -format
This should normally be run only once; if it has been run multiple times, the fix is described later.
After start-all.sh, use jps to check whether DataNode, NameNode, NodeManager, SecondaryNameNode, and ResourceManager are all running; stop-all.sh shuts the cluster down.
# Format the namenode
hdfs namenode -format # same as: hadoop namenode -format
# Start all hadoop daemons
start-all.sh
# Stop them
stop-all.sh
# List Java processes; normally you should see Jps, NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager
jps
# Leave safe mode; if it stays on, HBase will fail
hdfs dfsadmin -safemode leave
hdfs dfsadmin -safemode get # query safe mode status
hdfs dfsadmin -safemode leave # force the NameNode out of safe mode
hdfs dfsadmin -safemode enter # enter safe mode
hdfs dfsadmin -safemode wait # wait until safe mode ends
# Confirm the Java environment
[root@main ~]# java -version
java version "1.8.0_321"
Java(TM) SE Runtime Environment (build 1.8.0_321-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.321-b07, mixed mode)
# Format the namenode
[root@main ~]# hdfs namenode -format
2022-04-09 14:14:57,705 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hadoop/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.3.1
STARTUP_MSG: classpath = /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/shar... # a long list of paths
# what follows is a long run of "timestamp INFO/WARN details" lines
************************************************************/
2022-04-09 14:14:57,742 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2022-04-09 14:14:57,907 INFO namenode.NameNode: createNameNode [-format]
2022-04-09 14:14:58,160 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2022-04-09 14:14:59,233 INFO namenode.NameNode: Formatting using clusterid: CID-d78dc564-61ff-4ee8-82af-428cdd0aa923
2022-04-09 14:14:59,289 INFO namenode.FSEditLog: Edit logging is async:true
2022-04-09 14:14:59,357 INFO namenode.FSNamesystem: KeyProvider: null
2022-04-09 14:14:59,362 INFO namenode.FSNamesystem: fsLock is fair: true
2022-04-09 14:14:59,362 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2022-04-09 14:14:59,377 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)
2022-04-09 14:14:59,377 INFO namenode.FSNamesystem: supergroup = supergroup
2022-04-09 14:14:59,377 INFO namenode.FSNamesystem: isPermissionEnabled = false
2022-04-09 14:14:59,377 INFO namenode.FSNamesystem: isStoragePolicyEnabled = true
2022-04-09 14:14:59,377 INFO namenode.FSNamesystem: HA Enabled: false
2022-04-09 14:14:59,464 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2022-04-09 14:14:59,482 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2022-04-09 14:14:59,482 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2022-04-09 14:14:59,495 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2022-04-09 14:14:59,495 INFO blockmanagement.BlockManager: The block deletion will start around 2022 四月 09 14:14:59
2022-04-09 14:14:59,498 INFO util.GSet: Computing capacity for map BlocksMap
2022-04-09 14:14:59,498 INFO util.GSet: VM type = 64-bit
2022-04-09 14:14:59,500 INFO util.GSet: 2.0% max memory 442.8 MB = 8.9 MB
2022-04-09 14:14:59,500 INFO util.GSet: capacity = 2^20 = 1048576 entries
2022-04-09 14:14:59,516 INFO blockmanagement.BlockManager: Storage policy satisfier is disabled
2022-04-09 14:14:59,516 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2022-04-09 14:14:59,523 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.999
2022-04-09 14:14:59,523 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2022-04-09 14:14:59,523 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2022-04-09 14:14:59,524 INFO blockmanagement.BlockManager: defaultReplication = 1
2022-04-09 14:14:59,524 INFO blockmanagement.BlockManager: maxReplication = 512
2022-04-09 14:14:59,524 INFO blockmanagement.BlockManager: minReplication = 1
2022-04-09 14:14:59,524 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
2022-04-09 14:14:59,524 INFO blockmanagement.BlockManager: redundancyRecheckInterval = 3000ms
2022-04-09 14:14:59,524 INFO blockmanagement.BlockManager: encryptDataTransfer = false
2022-04-09 14:14:59,525 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2022-04-09 14:14:59,566 INFO namenode.FSDirectory: GLOBAL serial map: bits=29 maxEntries=536870911
2022-04-09 14:14:59,566 INFO namenode.FSDirectory: USER serial map: bits=24 maxEntries=16777215
2022-04-09 14:14:59,566 INFO namenode.FSDirectory: GROUP serial map: bits=24 maxEntries=16777215
2022-04-09 14:14:59,566 INFO namenode.FSDirectory: XATTR serial map: bits=24 maxEntries=16777215
2022-04-09 14:14:59,584 INFO util.GSet: Computing capacity for map INodeMap
2022-04-09 14:14:59,584 INFO util.GSet: VM type = 64-bit
2022-04-09 14:14:59,585 INFO util.GSet: 1.0% max memory 442.8 MB = 4.4 MB
2022-04-09 14:14:59,585 INFO util.GSet: capacity = 2^19 = 524288 entries
2022-04-09 14:14:59,588 INFO namenode.FSDirectory: ACLs enabled? true
2022-04-09 14:14:59,588 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2022-04-09 14:14:59,588 INFO namenode.FSDirectory: XAttrs enabled? true
2022-04-09 14:14:59,588 INFO namenode.NameNode: Caching file names occurring more than 10 times
2022-04-09 14:14:59,595 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2022-04-09 14:14:59,599 INFO snapshot.SnapshotManager: SkipList is disabled
2022-04-09 14:14:59,604 INFO util.GSet: Computing capacity for map cachedBlocks
2022-04-09 14:14:59,604 INFO util.GSet: VM type = 64-bit
2022-04-09 14:14:59,604 INFO util.GSet: 0.25% max memory 442.8 MB = 1.1 MB
2022-04-09 14:14:59,604 INFO util.GSet: capacity = 2^17 = 131072 entries
2022-04-09 14:14:59,617 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2022-04-09 14:14:59,617 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2022-04-09 14:14:59,617 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2022-04-09 14:14:59,629 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2022-04-09 14:14:59,629 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2022-04-09 14:14:59,632 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2022-04-09 14:14:59,632 INFO util.GSet: VM type = 64-bit
2022-04-09 14:14:59,633 INFO util.GSet: 0.029999999329447746% max memory 442.8 MB = 136.0 KB
2022-04-09 14:14:59,633 INFO util.GSet: capacity = 2^14 = 16384 entries
2022-04-09 14:14:59,674 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1282586466-127.0.0.1-1649484899662
2022-04-09 14:14:59,694 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.
2022-04-09 14:14:59,754 INFO namenode.FSImageFormatProtobuf: Saving image file /usr/local/hadoop/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
2022-04-09 14:14:59,967 INFO namenode.FSImageFormatProtobuf: Image file /usr/local/hadoop/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 399 bytes saved in 0 seconds .
2022-04-09 14:15:00,018 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2022-04-09 14:15:00,033 INFO namenode.FSNamesystem: Stopping services started for active state
2022-04-09 14:15:00,033 INFO namenode.FSNamesystem: Stopping services started for standby state
2022-04-09 14:15:00,048 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
2022-04-09 14:15:00,049 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop/127.0.0.1
************************************************************/
Note: if the - in -format gets copied as a typographic dash (–), the option is not recognized and NameNode only prints its usage message:
[root@main ~]# hdfs namenode –format
2022-04-09 14:13:54,320 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hadoop/127.0.0.1
STARTUP_MSG: args = [–format]
STARTUP_MSG: version = 3.3.1
STARTUP_MSG: classpath = /usr/local/hadoop/etc/hadoop:
************************************************************/
2022-04-09 14:13:54,343 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2022-04-09 14:13:54,535 INFO namenode.NameNode: createNameNode [–format]
Usage: hdfs namenode [-backup] |
[-checkpoint] |
[-format [-clusterid cid ] [-force] [-nonInteractive] ] |
[-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] |
[-upgradeOnly [-clusterid cid] [-renameReserved<k-v pairs>] ] |
[-rollback] |
[-rollingUpgrade <rollback|started> ] |
[-importCheckpoint] |
[-initializeSharedEdits] |
[-bootstrapStandby [-force] [-nonInteractive] [-skipSharedEditsCheck] ] |
[-recover [ -force] ] |
[-metadataVersion ]
2022-04-09 14:13:54,602 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop/127.0.0.1
************************************************************/
[root@main ~]# start-all.sh
Starting namenodes on [0.0.0.0]
上一次登录:五 3月 25 12:29:55 CST 2022pts/0 上
Starting datanodes
上一次登录:五 3月 25 12:30:19 CST 2022pts/0 上
Starting secondary namenodes [main]
上一次登录:五 3月 25 12:30:22 CST 2022pts/0 上
Starting resourcemanager
上一次登录:五 3月 25 12:30:32 CST 2022pts/0 上
Starting nodemanagers
上一次登录:五 3月 25 12:30:46 CST 2022pts/0 上
[root@main ~]# jps
206288 Jps
204005 DataNode
205508 NodeManager
203691 NameNode
204558 SecondaryNameNode
205181 ResourceManager
[root@main ~]# hadoop fs -ls /
Found 7 items
drwxr-xr-x - root supergroup 0 2022-03-24 15:24 /hbase
-rw-r--r-- 1 wuhf supergroup 0 2022-03-24 17:54 /jjy.jpg
-rw-r--r-- 1 dr.who supergroup 15986 2022-03-18 18:53 /skeleton.png
[root@main ~]# stop-all.sh
Stopping namenodes on [0.0.0.0]
上一次登录:五 3月 25 12:30:49 CST 2022pts/0 上
Stopping datanodes
上一次登录:五 3月 25 12:31:25 CST 2022pts/0 上
Stopping secondary namenodes [main]
上一次登录:五 3月 25 12:31:27 CST 2022pts/0 上
Stopping nodemanagers
上一次登录:五 3月 25 12:31:30 CST 2022pts/0 上
Stopping resourcemanager
上一次登录:五 3月 25 12:31:35 CST 2022pts/0 上
For any DataNode / NameNode related exception, look under hadoop/logs/*.log and search for the specific error (for example, for a namenode problem, check hadoop/logs/hadoop-root-namenode-main.log).
[hadoop@main ~]$ hadoop fs -ls /
2022-03-18 13:42:49,610 WARN util.NativeCodeLoader:
Unable to load native-hadoop library for your platform...
using builtin-java classes where applicable
Fix: edit the hadoop/etc/hadoop/log4j.properties file and add
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
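A sketch of applying that fix from the shell, assuming HADOOP_HOME=/usr/local/hadoop:
# Silence the NativeCodeLoader warning by raising its log level to ERROR
echo "log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR" >> /usr/local/hadoop/etc/hadoop/log4j.properties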
Reference: 1
start-all.sh error: can only be executed by root
[hadoop@main ~]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [0.0.0.0]
ERROR: namenode can only be executed by root.
Starting datanodes
ERROR: datanode can only be executed by root.
Starting secondary namenodes [main]
ERROR: secondarynamenode can only be executed by root.
Starting resourcemanager
ERROR: resourcemanager can only be executed by root.
Starting nodemanagers
ERROR: nodemanager can only be executed by root.
Fix: as described earlier, set the several ...=root lines in hadoop/etc/hadoop/hadoop-env.sh.
start-all.sh error: unable to write logs
[hadoop@main ~]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [0.0.0.0]
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: ERROR: Unable to write in /usr/local/hadoop/logs. Aborting.
Starting datanodes
localhost: ERROR: Unable to write in /usr/local/hadoop/logs. Aborting.
Starting secondary namenodes [main]
main: Warning: Permanently added 'main,172.17.43.2' (ECDSA) to the list of known hosts.
main: ERROR: Unable to write in /usr/local/hadoop/logs. Aborting.
Starting resourcemanager
ERROR: Unable to write in /usr/local/hadoop/logs. Aborting.
Starting nodemanagers
localhost: ERROR: Unable to write in /usr/local/hadoop/logs. Aborting.
Fix: grant permissions: run sudo chmod -R 777 logs.
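chmod -R 777 works but opens the directory to everyone; a narrower alternative, assuming the daemons run as user hadoop, is to change ownership instead:
# Give the hadoop user ownership of the logs directory
sudo chown -R hadoop:hadoop /usr/local/hadoop/logs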
After start-all.sh, the namenode is not started.
Cause: the namenode was never formatted, or formatting failed.
Fix: format the namenode.
NameNode missing from jps output; the namenode log shows:
java.net.BindException: Problem binding to [whfc.cc:9000] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:913)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:809)
at org.apache.hadoop.ipc.Server.bind(Server.java:640)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1225)
at org.apache.hadoop.ipc.Server.<init>(Server.java:3117)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1062)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server.<init>(ProtobufRpcEngine2.java:464)
at org.apache.hadoop.ipc.ProtobufRpcEngine2.getServer(ProtobufRpcEngine2.java:371)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:853)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:476)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:861)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:767)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1018)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:991)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1767)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1832)
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:438)
at sun.nio.ch.Net.bind(Net.java:430)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:225)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.apache.hadoop.ipc.Server.bind(Server.java:623)
... 13 more
2022-04-02 22:03:39,410 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.net.BindException: Problem binding to [whfc.cc:9000] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
2022-04-02 22:03:39,414 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
Cause: exactly what it says; the whfc.cc address (the hdfs URI in core-site.xml) cannot be reached.
Specific causes (both in effect at the same time):
- core-site.xml sets fs.defaultFS=hdfs://whfc.cc:9000
- /etc/hosts has no 127.0.0.1 whfc.cc entry, so the name resolves to an address the machine cannot bind
Fix: break either one of the two conditions (recommended: add 127.0.0.1 whfc.cc to /etc/hosts).
After start-all.sh, the datanode is not started.
Cause: the format command (hadoop namenode -format) was run more than once.
Fix:
- If the files don't need to be kept: delete hadoop/hdfs/data, then restart hadoop.
- If the files must be kept: make the namenode's and datanode's clusterID identical.
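The clusterIDs live in the VERSION files of the two storage directories; with the paths used in this guide (hadoop.tmp.dir=/usr/local/hadoop/tmp and dfs.data.dir=/usr/local/hadoop/hdfs/data) you can compare them like this:
# namenode clusterID
cat /usr/local/hadoop/tmp/dfs/name/current/VERSION
# datanode clusterID; edit this one to match the namenode's
cat /usr/local/hadoop/hdfs/data/current/VERSION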
References: 1, 2
Cannot open the web UI.
Fix:
1. Hadoop 2.x uses port 50070; Hadoop 3.x uses port 9870.
2. The port is not open: check the server's firewall and, on a cloud server, the security-group rules.
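A quick check from the server itself; if this responds but your browser cannot connect, the problem is the firewall or security group, not Hadoop:
# The NameNode web UI on Hadoop 3.x; expect an HTTP response (200 or a redirect)
curl -I http://localhost:9870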
Uploading files through the web UI fails.
Cause: the address used is the hadoop machine name.
Fix:
1. Upload the files with some other tool.
2. Edit C:\Windows\System32\drivers\etc\hosts on the client machine to map the hostname.
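For example, a hypothetical entry in the Windows hosts file, with 1.2.3.4 standing in for your server's actual public IP:
# Map the Hadoop machine name to the server's address
1.2.3.4 hadoop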
Various errors when using Big Data Tools:
1. HADOOP_HOME environment variable missing: download a stripped-down hadoop, keep only the bin directory, and set the HADOOP_HOME and Path environment variables.
2. winutils.exe (hadoop.dll) won't run: download them and add them to hadoop/bin.
3. dfs.datanode.hostname in hdfs-site.xml is not hadoop.
For 1 & 2 see the references; the link in the references fixes 1 & 2 in one go.
Cause: the client asks the namenode, which then needs the data sent to the datanode (same as 6, but the fix for 6 does not apply here); the datanode is addressed by the hadoop machine name, not a real domain name.
Fix: as follows, for reference only.
// Configuration conf = new Configuration();
// Add the line below to use the real domain name directly
conf.set("dfs.client.use.datanode.hostname", "true");