A fully distributed Hadoop file system is just a normal distributed deployment: the DataNodes run on different hosts. In a pseudo-distributed setup, by contrast, all of the daemons run on a single host. The steps below build a fully distributed file system:
Lab environment:
Host | Role
---|---
server1 | master (NameNode)
server2 | slave (DataNode)
server3 | slave (DataNode)
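The hostnames above are assumed to resolve to the 172.25.24.x addresses used throughout this walkthrough; if no DNS is available, each node would need /etc/hosts entries along these lines (an assumption, adjust to your own addresses):
172.25.24.1 server1
172.25.24.2 server2
172.25.24.3 server3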
Stop the pseudo-distributed services first:
[hadoop@server1 hadoop]$ sbin/stop-dfs.sh
Stopping namenodes on [server1]
Stopping datanodes
Stopping secondary namenodes [server1]
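Optionally, jps can confirm that no HDFS daemons are still running; after the stop it should list nothing but the Jps process itself:
[hadoop@server1 hadoop]$ jps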
On server1, add the two slave nodes to the workers file:
vim workers
172.25.24.2
172.25.24.3
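(In Hadoop 3.x this file is etc/hadoop/workers; in Hadoop 2.x it was called slaves. It lists every host that should run a DataNode.)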
Edit the configuration files under etc/hadoop/: point fs.defaultFS at the master and set the HDFS block replication factor to 2 (one replica on each of the two DataNodes):
vim core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.25.24.1:9000</value>
    </property>
</configuration>
vim hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
Set up passwordless SSH from server1 to server2 and server3.
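If the hadoop user's key pair does not exist yet (it may already have been created for the pseudo-distributed setup), generate one first; a minimal sketch assuming the default key path and an empty passphrase:
[hadoop@server1 hadoop]$ ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
Then copy the public key to each slave: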
[hadoop@server1 hadoop]$ ssh-copy-id hadoop@server2
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'server2 (172.25.24.2)' can't be established.
ECDSA key fingerprint is SHA256:A64CwYCozIJPAetNmgfE2OfM8AhrkdrK7YkKUNXpqPs.
ECDSA key fingerprint is MD5:19:31:69:58:b9:4e:c0:f3:6b:0c:dd:bb:5e:93:23:52.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@server2's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop@server2'"
and check to make sure that only the key(s) you wanted were added.
[hadoop@server1 hadoop]$ ssh-copy-id hadoop@server3
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'server3 (172.25.24.3)' can't be established.
ECDSA key fingerprint is SHA256:+rEpVkBC2vLeyskPoPpQVpScuZ1HTTSTKsJa3rnmItM.
ECDSA key fingerprint is MD5:ad:fa:03:20:6e:47:52:bc:39:9b:75:b3:1d:ab:4c:2c.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@server3's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop@server3'"
and check to make sure that only the key(s) you wanted were added.
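A quick sanity check: logging in to each slave should now work without a password prompt, e.g.
[hadoop@server1 hadoop]$ ssh hadoop@server2 hostname
[hadoop@server1 hadoop]$ ssh hadoop@server3 hostname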
Install nfs-utils on every node so that /home/hadoop can be shared across the cluster:
[root@server1 ~]# yum install nfs-utils -y
[root@server2 ~]# yum install nfs-utils -y
[root@server3 ~]# yum install nfs-utils -y
On the master, edit the exports file:
[root@server1 ~]# vim /etc/exports
/home/hadoop *(rw,anonuid=1000,anongid=1000)
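The anonuid/anongid of 1000 are assumed to match the hadoop user's UID and GID on server1 (check with id hadoop). If the NFS server is already running when the file is edited, re-export with:
[root@server1 ~]# exportfs -rv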
Start the NFS service on server1 and verify the export:
[root@server1 ~]# systemctl start nfs
[root@server1 ~]# showmount -e
Export list for server1:
/home/hadoop *
On server2 and server3, start rpcbind and mount the shared directory onto the local /home/hadoop:
[root@server2 ~]# systemctl start rpcbind
[root@server2 ~]# showmount -e server1
Export list for server1:
/home/hadoop *
[root@server2 ~]# mount 172.25.24.1:/home/hadoop/ /home/hadoop/
[root@server3 ~]# systemctl start rpcbind
[root@server3 ~]# showmount -e server1
Export list for server1:
/home/hadoop *
[root@server3 ~]# mount 172.25.24.1:/home/hadoop/ /home/hadoop/
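df -h /home/hadoop on each slave should now show the NFS export. To make the mount survive a reboot, an /etc/fstab entry along these lines would work (a sketch, not part of the original setup):
172.25.24.1:/home/hadoop  /home/hadoop  nfs  defaults  0 0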
Start the distributed file system:
[hadoop@server1 hadoop]$ sbin/start-dfs.sh
Starting namenodes on [server1]
Starting datanodes
Starting secondary namenodes [server1]
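At this point jps on server1 should show NameNode and SecondaryNameNode, and jps on server2 and server3 should each show a DataNode:
[hadoop@server1 hadoop]$ jps
[hadoop@server2 ~]$ jps
[hadoop@server3 ~]$ jps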
[hadoop@server1 hadoop]$ bin/hdfs dfsadmin -safemode leave
Safe mode is OFF
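HDFS starts in safe mode (read-only) until enough block replicas have been reported, so forcing it off as above is optional; it normally leaves safe mode on its own. The current state can be checked with:
[hadoop@server1 hadoop]$ bin/hdfs dfsadmin -safemode get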
Delete the output directory left over from the pseudo-distributed run:
[hadoop@server1 hadoop]$ bin/hdfs dfs -rm -r output
Deleted output
[hadoop@server1 hadoop]$ ls
bin include libexec logs output sbin
etc lib LICENSE.txt NOTICE.txt README.txt share
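Generate a 500 MB test file and upload it to HDFS to check that data is actually written to the DataNodes on server2 and server3: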
[hadoop@server1 hadoop]$ dd if=/dev/zero of=bigfile bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 1.56441 s, 335 MB/s
[hadoop@server1 hadoop]$ bin/hdfs dfs -put bigfile
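To confirm that the file landed in HDFS and that its blocks were replicated across both DataNodes as configured (dfs.replication=2), the following reports can be used:
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls
[hadoop@server1 hadoop]$ bin/hdfs dfsadmin -report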