Tachyon Basics 08: Running Hadoop MapReduce on Tachyon

I. Modify the Hadoop Configuration Files

1. Edit core-site.xml

Add the following property so that MapReduce jobs can use the Tachyon file system for input and output.

<property>
 <name>fs.tachyon.impl</name>
 <value>tachyon.hadoop.TFS</value>
</property>

2. Configure hadoop-env.sh

At the top of hadoop-env.sh, add an environment variable that points to the Tachyon client jar.

export HADOOP_CLASSPATH=/usr/local/tachyon/client/target/tachyon-client-0.5.0-jar-with-dependencies.jar
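
The export above overwrites any HADOOP_CLASSPATH that is already set. If other entries need to be preserved, a minimal sketch is to append the jar instead. The variable name TACHYON_CLIENT_JAR is introduced here only for illustration and is not part of the original setup.

```shell
# Path to the Tachyon client jar, as used in this article.
# TACHYON_CLIENT_JAR is a hypothetical helper variable, not a Hadoop or Tachyon setting.
TACHYON_CLIENT_JAR=/usr/local/tachyon/client/target/tachyon-client-0.5.0-jar-with-dependencies.jar

# Append the jar to HADOOP_CLASSPATH, keeping any entries that are already set.
# ${VAR:+...} expands to the old value plus a colon only when VAR is non-empty,
# so no stray leading colon is produced when the variable was unset.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+${HADOOP_CLASSPATH}:}${TACHYON_CLIENT_JAR}"
```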

3. Sync the modified configuration files to the other nodes

[root@node1 hadoop]# scp hadoop-env.sh node2:/usr/local/hadoop/etc/hadoop/
hadoop-env.sh                                 100% 3499     3.4KB/s   00:00
[root@node1 hadoop]# scp hadoop-env.sh node3:/usr/local/hadoop/etc/hadoop/
hadoop-env.sh                                 100% 3499     3.4KB/s   00:00
[root@node1 hadoop]# scp core-site.xml node2:/usr/local/hadoop/etc/hadoop/
core-site.xml                                 100% 1421     1.4KB/s   00:00
[root@node1 hadoop]# scp core-site.xml node3:/usr/local/hadoop/etc/hadoop/
core-site.xml                                 100% 1421     1.4KB/s   00:00
[root@node1 hadoop]#
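
The four scp commands above can also be written as a loop. The following is only a sketch: the function name sync_conf and the DRY_RUN switch are hypothetical, while the node names and directory are the ones used in this article.

```shell
# Hypothetical helper: copy the edited files to each of the other nodes.
CONF_DIR=/usr/local/hadoop/etc/hadoop
NODES="node2 node3"
FILES="hadoop-env.sh core-site.xml"

sync_conf() {
  for node in $NODES; do
    for file in $FILES; do
      if [ "${DRY_RUN:-0}" = "1" ]; then
        # With DRY_RUN=1 the scp command is only printed, not executed.
        echo "scp $CONF_DIR/$file $node:$CONF_DIR/"
      else
        # Stop at the first failed copy so a partial sync is noticed.
        scp "$CONF_DIR/$file" "$node:$CONF_DIR/" || return 1
      fi
    done
  done
}

# Example (run on node1): DRY_RUN=1 sync_conf   # preview, then run sync_conf
```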

4. Start all ZooKeeper nodes

[root@node1 hadoop]# zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node1 hadoop]# ssh node2 zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node1 hadoop]# ssh node3 zkServer.sh start
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node1 hadoop]#

5. Start the Hadoop cluster

[root@node1 hadoop]# pwd
/usr/local/hadoop
[root@node1 hadoop]# sbin/start-all.sh

6. Start the Tachyon cluster

[root@node1 hadoop]# tachyon-start.sh all Mount

7. Start the Tachyon Master on node2

[root@node2 ~]# tachyon-start.sh master
Starting master @ node2
[root@node2 ~]#

8. Check the running processes

[root@node1 conf]# jps
21954 QuorumPeerMain
22398 JournalNode
24120 TachyonWorker
22765 NodeManager
22572 DFSZKFailoverController
22663 ResourceManager
24009 TachyonMaster
24354 Jps
22216 DataNode
22115 NameNode
[root@node1 conf]# ssh node2 jps
15524 NameNode
16538 TachyonWorker
15880 NodeManager
15802 DFSZKFailoverController
16650 Jps
15592 DataNode
15456 QuorumPeerMain
15683 JournalNode
16598 TachyonMaster
[root@node1 conf]# ssh node3 jps
9294 DataNode
9231 QuorumPeerMain
9382 JournalNode
10050 Jps
10007 TachyonWorker
9476 NodeManager
[root@node1 conf]#
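
Reading the jps listings by eye works for three nodes. As a sketch of how the same check could be scripted, the hypothetical helper below reads jps output on standard input and reports any expected daemon that is absent. The function name check_daemons is an assumption made for illustration, and the daemon names passed to it should be taken from the listings above.

```shell
# Hypothetical helper: read `jps` output on stdin and report every expected
# daemon name (given as arguments) that does not appear in it.
# Returns non-zero if at least one daemon is missing.
check_daemons() {
  jps_output=$(cat)
  missing=0
  for name in "$@"; do
    # jps prints "<pid> <MainClass>", so match the class name at end of line.
    if ! printf '%s\n' "$jps_output" | grep -q " ${name}\$"; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Example (run on node1):
#   ssh node3 jps | check_daemons QuorumPeerMain JournalNode DataNode NodeManager TachyonWorker
```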

II. Test a MapReduce Job

1. Upload a test file to Tachyon

[root@node1 conf]# tachyon tfs copyFromLocal /etc/passwd /passwd
Copied /etc/passwd to /passwd
[root@node1 conf]# tachyon tfs tail /passwd
tp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin

2. Run the MapReduce job

[root@node1 hadoop]# pwd
/usr/local/hadoop
[root@node1 hadoop]# hadoop jar \
share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount -libjars \
/usr/local/tachyon/client/target/tachyon-client-0.5.0-jar-with-dependencies.jar \
tachyon://node1:19998/passwd tachyon://node1:19998/out
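
The command has four parts: the examples jar, the wordcount program name, the -libjars option that ships the Tachyon client jar with the job, and the input and output locations written as tachyon:// URIs on the master (node1, port 19998). A sketch that assembles the same command from variables may make this easier to reuse. The function wordcount_cmd and the variable names are hypothetical, and the function only prints the command rather than running it.

```shell
# Values taken from this article; the variable names themselves are hypothetical.
TACHYON_MASTER=node1:19998
TACHYON_CLIENT_JAR=/usr/local/tachyon/client/target/tachyon-client-0.5.0-jar-with-dependencies.jar
EXAMPLES_JAR=share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar

# Print the wordcount command for a Tachyon input path ($1) and output path ($2).
# Both paths are absolute paths inside Tachyon, e.g. /passwd and /out.
wordcount_cmd() {
  echo "hadoop jar $EXAMPLES_JAR wordcount -libjars $TACHYON_CLIENT_JAR" \
       "tachyon://$TACHYON_MASTER$1 tachyon://$TACHYON_MASTER$2"
}

# Example (run from /usr/local/hadoop): eval "$(wordcount_cmd /passwd /out)"
```

Note that MapReduce refuses to start a job whose output directory already exists, so a second run needs a different output path or the old one removed first.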

3. View the output after the job completes

[root@node1 hadoop]# tachyon tfs cat /out/part-r-00000
Daemon:/var/cache/rpcbind:/sbin/nologin     1
Daemon:/var/run/pulse:/sbin/nologin    1
IPv4LL       1
NFS  1
SSH:/var/empty/sshd:/sbin/nologin         1
Service     1
Stack:/var/lib/avahi-autoipd:/sbin/nologin      1
System     1
User:/var/ftp:/sbin/nologin     1
User:/var/lib/nfs:/sbin/nologin        2

