Installing Hadoop on CentOS (pseudo-distributed mode)

This was done on a CentOS 5.5 virtual machine running on my local machine.

Software used: JDK 1.6 U26

Hadoop: hadoop-0.20.203.tar.gz


Check and configure SSH

  [root@localhost ~]# ssh-keygen -t rsa
  Generating public/private rsa key pair.
  Enter file in which to save the key (/root/.ssh/id_rsa):
  Created directory '/root/.ssh'.
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /root/.ssh/id_rsa.
  Your public key has been saved in /root/.ssh/id_rsa.pub.
  The key fingerprint is:
  a8:7a:3e:f6:92:85:b8:c7:be:d9:0e:45:9c:d1:36:[email protected]
  [root@localhost ~]#
  [root@localhost ~]# cd ..
  [root@localhost /]# cd root
  [root@localhost ~]# ls
  anaconda-ks.cfg  Desktop  install.log  install.log.syslog
  [root@localhost ~]# cd .ssh
  [root@localhost .ssh]# cat id_rsa.pub > authorized_keys
  [root@localhost .ssh]#
  [root@localhost .ssh]# ssh localhost
  The authenticity of host 'localhost (127.0.0.1)' can't be established.
  RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
  Last login: Tue Jun 21 22:40:31 2011
  [root@localhost ~]#
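
The session above overwrites authorized_keys with `>`, which is fine on a fresh machine but would drop any keys already authorized. A minimal sketch of a more careful variant follows: it appends the key only if it is missing and tightens the permissions, since sshd may ignore an authorized_keys file that is too widely writable. The function name `authorize_key` is my own, not part of OpenSSH.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to authorized_keys only if it is
# not already there, then set restrictive permissions.
# Usage: authorize_key <public-key-file> <ssh-dir>
authorize_key() {
    pub="$1"
    dir="$2"
    mkdir -p "$dir" && chmod 700 "$dir" || return 1
    touch "$dir/authorized_keys" || return 1
    # -x: match the whole line, -F: fixed string (keys contain '+' and '/')
    if ! grep -qxF -- "$(cat "$pub")" "$dir/authorized_keys"; then
        cat "$pub" >> "$dir/authorized_keys"
    fi
    chmod 600 "$dir/authorized_keys"
}
```

For the setup in this post it would be called as `authorize_key /root/.ssh/id_rsa.pub /root/.ssh`, after which `ssh localhost` should log in without asking for a password.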

Install the JDK

  [root@localhost java]# chmod +x jdk-6u26-linux-i586.bin
  [root@localhost java]# ./jdk-6u26-linux-i586.bin
  ......
  ......
  ......
  For more information on what data Registration collects and
  how it is managed and used, see:
  http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html
  Press Enter to continue.....
  Done.

When the installer finishes, it has created the folder jdk1.6.0_26.

Configure environment variables

  [root@localhost java]# vi /etc/profile
  # add the following lines
  # set java environment
  export JAVA_HOME=/usr/java/jdk1.6.0_26
  export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
  # $JAVA_HOME/bin (not lib) holds the JDK tools such as jps
  export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
  export HADOOP_HOME=/usr/local/hadoop/hadoop-0.20.203
  export PATH=$PATH:$HADOOP_HOME/bin
  [root@localhost java]# chmod +x /etc/profile
  [root@localhost java]# source /etc/profile
  [root@localhost java]#
  [root@localhost java]# java -version
  java version "1.6.0_26"
  Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
  Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)
  [root@localhost java]#
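
One side effect of plain `export PATH=$PATH:...` lines is that every extra `source /etc/profile` appends the same directories again. A small sketch of a guard against that is below. The helper name `path_append` is my own invention and not something CentOS provides.

```shell
#!/bin/sh
# Hypothetical helper: append a directory to PATH only if it is not
# already one of its entries.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                      # already present, nothing to do
        *) PATH="${PATH:+$PATH:}$1" ;;    # append, with a colon if PATH is non-empty
    esac
}
```

In /etc/profile this could replace the two PATH lines with `path_append "$JAVA_HOME/bin"`, `path_append "$HADOOP_HOME/bin"` and a final `export PATH`.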

Edit /etc/hosts

  [root@localhost conf]# vi /etc/hosts
  # Do not remove the following line, or various programs
  # that require network functionality will fail.
  127.0.0.1       localhost.localdomain localhost
  ::1             localhost6.localdomain6 localhost6
  127.0.0.1       namenode datanode01
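
Because the Hadoop configuration further down refers to the names namenode and datanode01, it can be worth confirming that both really resolve from the hosts file before starting anything. A rough sketch, with a function name (`host_defined`) that is my own:

```shell
#!/bin/sh
# Hypothetical helper: succeed if <name> appears as a hostname or alias
# (any field after the address) on a line of <hosts-file>.
# Usage: host_defined <name> <hosts-file>
host_defined() {
    awk -v name="$1" '
        {
            sub(/#.*/, "")                 # drop comments
            for (i = 2; i <= NF; i++)      # field 1 is the IP address
                if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$2"
}
```

For example, `host_defined namenode /etc/hosts && host_defined datanode01 /etc/hosts` should succeed once the extra line has been added.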

Unpack and install Hadoop

  [root@localhost hadoop]# tar zxvf hadoop-0.20.203.tar.gz
  ......
  ......
  ......
  hadoop-0.20.203.0/src/contrib/ec2/bin/image/create-hadoop-image-remote
  hadoop-0.20.203.0/src/contrib/ec2/bin/image/ec2-run-user-data
  hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-cluster
  hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-master
  hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-slaves
  hadoop-0.20.203.0/src/contrib/ec2/bin/list-hadoop-clusters
  hadoop-0.20.203.0/src/contrib/ec2/bin/terminate-hadoop-cluster
  [root@localhost hadoop]#

Go into Hadoop's conf directory and edit the configuration files

  ####################################
  [root@localhost conf]# vi hadoop-env.sh
  # add the following
  # set java environment
  export JAVA_HOME=/usr/java/jdk1.6.0_26
  #####################################
  [root@localhost conf]# vi core-site.xml
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode:9000/</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/usr/local/hadoop/hadooptmp</value>
    </property>
  </configuration>
  #######################################
  [root@localhost conf]# vi hdfs-site.xml
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>dfs.name.dir</name>
      <value>/usr/local/hadoop/hdfs/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/usr/local/hadoop/hdfs/data</value>
    </property>
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
  </configuration>
  #########################################
  [root@localhost conf]# vi mapred-site.xml
  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>mapred.job.tracker</name>
      <value>namenode:9001</value>
    </property>
    <property>
      <name>mapred.local.dir</name>
      <value>/usr/local/hadoop/mapred/local</value>
    </property>
    <property>
      <name>mapred.system.dir</name>
      <value>/tmp/hadoop/mapred/system</value>
    </property>
  </configuration>
  #########################################
  [root@localhost conf]# vi masters
  #localhost
  namenode
  #########################################
  [root@localhost conf]# vi slaves
  #localhost
  datanode01
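
Typing these files into vi by hand is easy to get wrong, so here is a sketch of generating one of them from a script instead. It writes only the two core-site.xml properties used above. The function name `write_core_site` and its argument order are my own choices, not anything Hadoop ships.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal core-site.xml into <conf-dir>.
# Usage: write_core_site <conf-dir> <namenode-host> <port> <tmp-dir>
write_core_site() {
    cat > "$1/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$2:$3/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$4</value>
  </property>
</configuration>
EOF
}
```

With the values from this post the call would be `write_core_site "$HADOOP_HOME/conf" namenode 9000 /usr/local/hadoop/hadooptmp`. The same pattern could be repeated for hdfs-site.xml and mapred-site.xml.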

Start Hadoop

  ##################### format the namenode ##############
  [root@localhost bin]# hadoop namenode -format
  11/06/23 00:43:54 INFO namenode.NameNode: STARTUP_MSG:
  /************************************************************
  STARTUP_MSG: Starting NameNode
  STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
  STARTUP_MSG:   args = [-format]
  STARTUP_MSG:   version = 0.20.203.0
  STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
  ************************************************************/
  11/06/23 00:43:55 INFO util.GSet: VM type = 32-bit
  11/06/23 00:43:55 INFO util.GSet: 2% max memory = 19.33375 MB
  11/06/23 00:43:55 INFO util.GSet: capacity = 2^22 = 4194304 entries
  11/06/23 00:43:55 INFO util.GSet: recommended=4194304, actual=4194304
  11/06/23 00:43:56 INFO namenode.FSNamesystem: fsOwner=root
  11/06/23 00:43:56 INFO namenode.FSNamesystem: supergroup=supergroup
  11/06/23 00:43:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
  11/06/23 00:43:56 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
  11/06/23 00:43:56 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
  11/06/23 00:43:56 INFO namenode.NameNode: Caching file names occuring more than 10 times
  11/06/23 00:43:57 INFO common.Storage: Image file of size 110 saved in 0 seconds.
  11/06/23 00:43:57 INFO common.Storage: Storage directory /usr/local/hadoop/hdfs/name has been successfully formatted.
  11/06/23 00:43:57 INFO namenode.NameNode: SHUTDOWN_MSG:
  /************************************************************
  SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
  ************************************************************/
  [root@localhost bin]#
  ###########################################
  [root@localhost bin]# ./start-all.sh
  starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
  datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
  namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
  starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
  datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
  [root@localhost bin]# jps
  11971 TaskTracker
  11807 SecondaryNameNode
  11599 NameNode
  12022 Jps
  11710 DataNode
  11877 JobTracker
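
A healthy pseudo-distributed start shows five daemons in the jps listing: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. Rather than eyeballing the list, the check can be scripted. The sketch below reads jps output on standard input and names whatever is missing. The function name `missing_daemons` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `jps` output on stdin and print every expected
# Hadoop daemon that is not listed. Exit status is 0 only if none is missing.
missing_daemons() {
    awk '
        { seen[$2] = 1 }     # column 2 of jps output is the class name
        END {
            n = split("NameNode DataNode SecondaryNameNode JobTracker TaskTracker", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) { print want[i]; bad = 1 }
            exit bad + 0
        }
    '
}
```

Typical use would be `jps | missing_daemons`, which prints nothing and returns success when all five processes are up.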

Check the cluster status

  [root@localhost bin]# hadoop dfsadmin -report
  Configured Capacity: 4055396352 (3.78 GB)
  Present Capacity: 464142351 (442.64 MB)
  DFS Remaining: 464089088 (442.59 MB)
  DFS Used: 53263 (52.01 KB)
  DFS Used%: 0.01%
  Under replicated blocks: 0
  Blocks with corrupt replicas: 0
  Missing blocks: 0
  -------------------------------------------------
  Datanodes available: 1 (1 total, 0 dead)
  Name: 127.0.0.1:50010
  Decommission Status : Normal
  Configured Capacity: 4055396352 (3.78 GB)
  DFS Used: 53263 (52.01 KB)
  Non DFS Used: 3591254001 (3.34 GB)
  DFS Remaining: 464089088 (442.59 MB)
  DFS Used%: 0%
  DFS Remaining%: 11.44%
  Last contact: Thu Jun 23 01:11:15 PDT 2011
  [root@localhost bin]#
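
The line that matters most in this report is "Datanodes available", which should say 1 in a pseudo-distributed setup. A small sketch for pulling that number out of the report, assuming the line keeps the format shown above. The function name `live_datanodes` is my own.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop dfsadmin -report` output on stdin and
# print the number from the "Datanodes available: N (...)" line.
live_datanodes() {
    sed -n 's/^Datanodes available: *\([0-9][0-9]*\).*/\1/p'
}
```

It would be used as `hadoop dfsadmin -report | live_datanodes`. An empty result or a 0 suggests the DataNode did not register with the NameNode.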

Other issues:

1. Error on startup

  #################### error on startup ##########
  [root@localhost bin]# ./start-all.sh
  starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
  The authenticity of host 'datanode01 (127.0.0.1)' can't be established.
  RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
  Are you sure you want to continue connecting (yes/no)? y
  Please type 'yes' or 'no': yes
  datanode01: Warning: Permanently added 'datanode01' (RSA) to the list of known hosts.
  datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
  datanode01: Unrecognized option: -jvm
  datanode01: Could not create the Java virtual machine.
  namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
  starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
  datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
  [root@localhost bin]# jps
  10442 JobTracker
  10533 TaskTracker
  10386 SecondaryNameNode
  10201 NameNode
  10658 Jps
  ################################################
  [root@localhost bin]# vi hadoop
  elif [ "$COMMAND" = "datanode" ] ; then
    CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
    if [[ $EUID -eq 0 ]]; then
      HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
    else
      HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
    fi
  # http://javoft.net/2011/06/hadoop-unrecognized-option-jvm-could-not-create-the-java-virtual-machine/
  # change it to
  elif [ "$COMMAND" = "datanode" ] ; then
    CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
    #if [[ $EUID -eq 0 ]]; then
    #  HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
    #else
      HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
    #fi
  # or start Hadoop as a non-root user instead
  # after this change the DataNode starts successfully
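
The same fix can be applied without opening an editor. The sketch below swaps the `-jvm server` option for `-server` with sed and leaves a .bak copy next to the script. Unlike the manual edit it keeps the if/else in place, but both branches then pass `-server`, so the effect should be the same. The function name `patch_jvm_option` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: replace the root-only "-jvm server" option with
# "-server" in the given launcher script, keeping a backup as <file>.bak.
patch_jvm_option() {
    sed -i.bak 's/-jvm server/-server/' "$1"
}
```

For the installation in this post the call would be `patch_jvm_option "$HADOOP_HOME/bin/hadoop"`. If anything goes wrong, the original can be restored from the .bak file.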

2. Turn off the firewall before starting Hadoop.

Check that everything is running:

http://localhost:50070

  NameNode 'localhost.localdomain:9000'
  Started: Thu Jun 23 01:07:18 PDT 2011
  Version: 0.20.203.0, r1099333
  Compiled: Wed May 4 07:57:50 PDT 2011 by oom
  Upgrades: There are no upgrades in progress.
  Browse the filesystem
  Namenode Logs
  Cluster Summary
  6 files and directories, 1 blocks = 7 total. Heap Size is 31.38 MB / 966.69 MB (3%)
  Configured Capacity: 3.78 GB
  DFS Used: 52.01 KB
  Non DFS Used: 3.34 GB
  DFS Remaining: 442.38 MB
  DFS Used%: 0 %
  DFS Remaining%: 11.44 %
  Live Nodes: 1
  Dead Nodes: 0
  Decommissioning Nodes: 0
  Number of Under-Replicated Blocks: 0
  NameNode Storage:
  Storage Directory              Type             State
  /usr/local/hadoop/hdfs/name    IMAGE_AND_EDITS  Active

http://localhost:50030

  namenode Hadoop Map/Reduce Administration
  Quick Links
    * Scheduling Info
    * Running Jobs
    * Retired Jobs
    * Local Logs
  State: RUNNING
  Started: Thu Jun 23 01:07:30 PDT 2011
  Version: 0.20.203.0, r1099333
  Compiled: Wed May 4 07:57:50 PDT 2011 by oom
  Identifier: 201106230107
  Cluster Summary (Heap Size is 15.31 MB/966.69 MB)
    Running Map Tasks: 0
    Running Reduce Tasks: 0
    Total Submissions: 0
    Nodes: 1
    Occupied Map Slots: 0
    Occupied Reduce Slots: 0
    Reserved Map Slots: 0
    Reserved Reduce Slots: 0
    Map Task Capacity: 2
    Reduce Task Capacity: 2
    Avg. Tasks/Node: 4.00
    Blacklisted Nodes: 0
    Graylisted Nodes: 0
    Excluded Nodes: 0
  Scheduling Information
    Queue Name    State      Scheduling Information
    default       running    N/A
  Filter (Jobid, Priority, User, Name)
  Example: 'user:smith 3200' will filter by 'smith' only in the user field and '3200' in all fields
  Running Jobs
    none
  Retired Jobs
    none
  Local Logs
    Log directory, Job Tracker History
  This is Apache Hadoop release 0.20.203.0

Test:

  ########## create a directory ##########
  [root@localhost bin]# hadoop fs -mkdir testFolder
  ############### copy a file into the directory
  [root@localhost local]# ls
  bin  etc  games  hadoop  include  lib  libexec  sbin  share  src  SSH_key_file
  [root@localhost local]# hadoop fs -copyFromLocal SSH_key_file testFolder
  # the uploaded file can now be seen in the web UI

Reference: http://bxyzzy.blog.51cto.com/854497/352692

Appendix: set up FTP with yum install vsftpd (handy for transferring files; unrelated to Hadoop)

Turn off the firewall: service iptables stop

Start FTP: service vsftpd start

