Hadoop In-Depth Practice

1  Setting Up the Linux Virtual Environment

1.1  Install VMware

After the installation you should see two virtual network adapters, vmnet1 and vmnet8.

1.2  Install the Linux virtual machine

Once installed, check that the VM can reach the external network.

1.3  Configure the Linux virtual machine

1.3.1  Log in as root

First change the root password with sudo passwd root,

then, in the system's "Login Window" settings, allow the local administrator to log in.

Ubuntu 9.10 does not need the Login Window setting; the version I am using now does not need it either.

1.3.2  Enable the SSH service

To provide SSH access on this machine, install openssh-server:

sudo apt-get install openssh-server

Then confirm that the SSH server is running:
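The original note does not show the command used for this check; a common way (assuming a stock Ubuntu install) is:

$ ps -e | grep ssh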

If you see sshd in the output, the SSH server is already running.

If not, start it with: sudo /etc/init.d/ssh start

1.3.3  Enable the FTP service

sudo apt-get install vsftpd

Edit the /etc/vsftpd.conf configuration file:

write_enable=YES               # allow uploads

anon_upload_enable=YES         # allow anonymous uploads

anon_mkdir_write_enable=YES    # allow anonymous users to create directories

local_umask=022                # permissions of uploaded files: 777 - 022 = 755

Restart the service:

sudo /etc/init.d/vsftpd restart
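To confirm that vsftpd is actually serving, a quick check is sketched below (it assumes net-tools and a command-line ftp client are installed; any FTP client will do):

$ sudo netstat -tlnp | grep :21    # vsftpd should be listening on port 21
$ ftp localhost                    # try an interactive login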

1.3.4  Enable the telnet service

1) sudo apt-get install xinetd telnetd

2) sudo vi /etc/inetd.conf and add the following line:

   telnet stream tcp nowait telnetd /usr/sbin/tcpd /usr/sbin/in.telnetd

3) sudo vi /etc/xinetd.conf and add the following content:

    #Simple configuration file for xinetd

    #

    #Some defaults, and include /etc/xinetd.d/

   defaults

    {

    #Please note that you need a log_type line to be able to use log_on_success

    #and log_on_failure. The default is the following :

    # log_type = SYSLOG daemon info

 

    instances = 60

    log_type = SYSLOG authpriv

    log_on_success = HOST PID

   log_on_failure = HOST

   cps = 25 30

    }

   includedir /etc/xinetd.d

4) sudo vi /etc/xinetd.d/telnet and add the following content:

    #default: on

    #description: The telnet server serves telnet sessions; it uses \

    #unencrypted username/password pairs for authentication.

   service telnet

    {

   disable = no

   flags = REUSE

   socket_type = stream

   wait = no

   user = root

   server = /usr/sbin/in.telnetd

   log_on_failure += USERID

    }

5) Reboot the machine, or just restart the service: sudo /etc/init.d/xinetd restart

6) Allow root to log in

   Run mv /etc/securetty /etc/securetty.bak; root can then log in. Alternatively,

   edit /etc/pam.d/login and simply comment out the following line:

#auth required lib/security/pam_securetty.so
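A quick way to confirm the telnet service and the root-login change work (a sketch; run from the VM itself, or from any machine that can reach it):

$ telnet localhost                 # expect a login: prompt; logging in as root confirms step 6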

2  Installing and Configuring Hadoop on Linux

2.1  Install the JDK

2.1.1  Check the current JDK version

root@ubuntu:~# java -version

2.1.2  Download and install the JDK

Option 1: install the JRE directly (this does not include the development environment; the method further below is recommended instead):

sudo add-apt-repository "deb http://us.archive.ubuntu.com/ubuntu/ hardy multiverse"
sudo apt-get update
sudo apt-get install sun-java6-jdk

 

Note: this installation must be performed at the machine's graphical terminal; it cannot be run from a remote-login tool.

Option 2: install a JDK with the full development environment

1. Download the .tar.gz package

http://www.oracle.com/technetwork/java/javase/downloads/index.html

2. Install

sudo tar zxvf ./jdk-7-linux-i586.tar.gz  -C /usr/lib/jvm

cd /usr/lib/jvm

sudo mv jdk1.7.0/ java-7-sun

 

vi /etc/profile

export JAVA_HOME=/usr/lib/jvm/java-7-sun

export JRE_HOME=${JAVA_HOME}/jre

export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib

export PATH=${JAVA_HOME}/bin:$PATH

2.1.3  Configure environment variables

vi /etc/profile

#set java env

export JAVA_HOME="/usr/lib/jvm/java-7-sun"

export JRE_HOME="$JAVA_HOME/jre"

export CLASSPATH=".:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH"

export PATH="$JAVA_HOME/bin:$PATH"
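After editing /etc/profile, reload it and confirm the new JDK is picked up (assuming the paths used above):

$ source /etc/profile
$ java -version                    # should now report the JDK installed above
$ echo $JAVA_HOME                  # /usr/lib/jvm/java-7-sun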

2.2  Configure passwordless SSH login

2.2.1  Check the current SSH version

ssh -V

2.2.2  Install SSH

sudo apt-get install ssh

2.2.3  Configure passwordless login to the local machine

ls -a ~

mkdir ~/.ssh

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

ls ~/.ssh

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
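If ssh localhost still prompts for a password after this, the usual culprit is file permissions (sshd rejects group- or world-writable key files when StrictModes is on); tightening them is a safe extra step:

$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys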

2.2.4  Verify that SSH works

ssh -V

ssh localhost

2.3  Create the hadoop user

2.3.1  Create the user

root@ubuntu:~# useradd -m -d /home/hadoop hadoop

root@ubuntu:~# passwd hadoop

 

Fix:

1. vi /etc/passwd

hadoop:x:1001:1001::/home/hadoop:/bin/bash

This changes the login shell to bash.

2.3.2  Configure environment variables

export HADOOP_HOME="/home/hadoop/hadoop"

export HADOOP_VERSION="0.20.2"

2.3.3  Convenience settings

- vi ~/.bashrc and add:

alias p='ps -fu hadoop'

Then make it take effect:

source ~/.bashrc

2.4  Install Hadoop in pseudo-distributed mode

2.4.1  Install Hadoop

Download the package: http://labs.renren.com/apache-mirror/hadoop/core/

Upload hadoop-0.20.2.tar.gz to the hadoop user's home directory.

$ tar -xf hadoop-0.20.2.tar.gz

$ mv hadoop-0.20.2 hadoop

2.4.2  Configure Hadoop

- $ vi hadoop-env.sh

export JAVA_HOME="/usr/lib/jvm/java-6-sun"

- core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/data/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

 

- hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

      

 

- mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

      

2.4.3  Format HDFS

$ ./hadoop namenode -format

12/03/21 19:07:45 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
12/03/21 19:07:46 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
12/03/21 19:07:46 INFO namenode.FSNamesystem: supergroup=supergroup
12/03/21 19:07:46 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/03/21 19:07:47 INFO common.Storage: Image file of size 96 saved in 0 seconds.
12/03/21 19:07:47 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/03/21 19:07:47 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/

2.4.4  Start HDFS and MapReduce

./start-all.sh

2.4.5  Verify the installation

http://192.168.190.129:50070   -- HDFS monitoring page

http://192.168.190.129:50030   -- MapReduce (JobTracker) monitoring page
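Another quick check, sketched here on the assumption that the JDK's jps tool is on the PATH, is to list the Java processes; in pseudo-distributed mode all five daemons run on this one machine:

$ jps                              # expect NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker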

2.5  Install Hadoop as a cluster

2.5.1  Cluster layout

192.168.190.132 -- master: NameNode, JobTracker (hostname master)

192.168.190.133 -- slave: DataNode, TaskTracker (hostname slave1)

192.168.190.134 -- slave: DataNode, TaskTracker (hostname slave2)

2.5.2  Edit the common configuration files

- Back up the pseudo-distributed configuration files into a localhost directory

$ mkdir localhost

$ cp hadoop-env.sh hdfs-site.xml core-site.xml mapred-site.xml slaves masters ./localhost

- Create the cluster configuration files

$ mkdir claster

$ cp hadoop-env.sh hdfs-site.xml core-site.xml mapred-site.xml slaves masters ./claster

$ ls claster

core-site.xml  hadoop-env.sh hdfs-site.xml mapred-site.xml  masters  slaves

- hadoop-env.sh

Same as in pseudo-distributed mode; no change needed.

- core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

      

- hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

      

- mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>

      

- masters

List the backup (secondary) master node here; since this setup has no backup master, the file can be left as is.

- slaves

slave1

slave2

- /etc/hosts

192.168.190.132 master  # Added by NetworkManager

192.168.190.133 slave1

192.168.190.134 slave2

127.0.0.1       localhost.localdomain   localhost

::1     ubuntu localhost6.localdomain6 localhost6

127.0.0.1       master

2.5.3  Clone the nodes

Clone slave1 and slave2 from the master node.

2.5.4  Edit the per-node configuration

- /etc/hostname

Set the hostname to master, slave1 and slave2 respectively.

Reboot the hosts.

2.5.5  Configure passwordless SSH login

The goal of this step is to let the master host log in to the slave hosts without a password, so all that is needed is to copy the master's ~/.ssh/authorized_keys file to the same location on each slave. Because the slaves here were cloned from the master, no copying is required.
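For reference, if the slaves had not been cloned, the copy could be done roughly as follows (a sketch; it assumes the hadoop user and ~/.ssh already exist on each slave and will prompt for that user's password once per host):

$ scp ~/.ssh/authorized_keys hadoop@slave1:~/.ssh/authorized_keys
$ scp ~/.ssh/authorized_keys hadoop@slave2:~/.ssh/authorized_keys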

Verify that it works. On the master node, run:

ssh slave1

ssh slave2

2.5.6  Apply the cluster configuration

$ cd hadoop/conf

$ cp ./claster/* .

$ ls -lrt

total 64

-rw-r--r-- 1 hadoop hadoop 1195 2010-02-18 23:55 ssl-server.xml.example
-rw-r--r-- 1 hadoop hadoop 1243 2010-02-18 23:55 ssl-client.xml.example
-rw-r--r-- 1 hadoop hadoop 2815 2010-02-18 23:55 log4j.properties
-rw-r--r-- 1 hadoop hadoop 4190 2010-02-18 23:55 hadoop-policy.xml
-rw-r--r-- 1 hadoop hadoop 1245 2010-02-18 23:55 hadoop-metrics.properties
-rw-r--r-- 1 hadoop hadoop  535 2010-02-18 23:55 configuration.xsl
-rw-r--r-- 1 hadoop hadoop 3936 2010-02-18 23:55 capacity-scheduler.xml
drwxr-xr-x 2 hadoop hadoop 4096 2012-03-26 00:57 localhost
drwxr-xr-x 2 hadoop hadoop 4096 2012-03-26 01:00 claster
-rw-r--r-- 1 hadoop hadoop  269 2012-03-26 02:07 core-site.xml
-rw-r--r-- 1 hadoop hadoop 2280 2012-03-26 02:07 hadoop-env.sh
-rw-r--r-- 1 hadoop hadoop  251 2012-03-26 02:07 hdfs-site.xml
-rw-r--r-- 1 hadoop hadoop  264 2012-03-26 02:07 mapred-site.xml
-rw-r--r-- 1 hadoop hadoop   14 2012-03-26 02:07 slaves
-rw-r--r-- 1 hadoop hadoop   10 2012-03-26 02:07 masters

2.5.7  Format HDFS

$ hadoop namenode -format

12/03/26 02:09:31 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
Re-format filesystem in /tmp/hadoop-hadoop/dfs/name ? (Y or N) Y
12/03/26 02:09:39 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
12/03/26 02:09:39 INFO namenode.FSNamesystem: supergroup=supergroup
12/03/26 02:09:39 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/03/26 02:09:40 INFO common.Storage: Image file of size 96 saved in 0 seconds.
12/03/26 02:09:40 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/03/26 02:09:40 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/127.0.1.1
************************************************************/

2.5.8  Start the Hadoop cluster

$ start-all.sh

starting namenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-namenode-master.out
slave1: starting datanode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-slave1.out
slave2: starting datanode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-slave2.out
localhost: starting secondarynamenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-jobtracker-master.out
slave1: starting tasktracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-slave1.out
slave2: starting tasktracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-slave2.out

2.5.9  Check the startup result

Run ps -fu hadoop to check that the expected processes exist;

then open the web UIs to confirm that the cluster status is correct.
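HDFS itself can also report whether both DataNodes registered; a sketch using the stock 0.20 admin command, run on the master:

$ hadoop dfsadmin -report          # "Datanodes available: 2" means both slaves joined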

3  A First Taste of MapReduce

3.1  Try out the HDFS file system

$ ./hadoop fs -mkdir heyi

$ ./hadoop fs -ls

Found 1 items

drwxr-xr-x   - hadoop supergroup          0 2012-03-21 20:13 /user/hadoop/heyi

$ ./hadoop fs -put testf.txt heyi/

$ ./hadoop fs -ls heyi

Found 1 items

-rw-r--r--   1 hadoop supergroup         54 2012-03-21 20:17 /user/hadoop/heyi/testf.txt

$ ./hadoop fs -cat heyi/testf.txt

aaaaaaaaaaaaaaaaaa

bbbbbbbbbbbbbbbb

bbbccccccccccccc

3.2  The first Hello World program

- Create the source code file in the home directory

WordCount.java

package org.myorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    // Mapper: emit (word, 1) for every token in the input line (old mapred API)
    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sum the counts for each word
    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

 

$ ls|grep -v sun|grep -v jdk

Desktop

Documents

Downloads

examples.desktop

hadoop

hadoop-0.20.2.tar.gz

Music

Pictures

Public

Templates

Videos

wordcount_classes

wordcount.jar

WordCount.java

$

- Compile

javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d wordcount_classes WordCount.java

- Package

$ jar -cvf wordcount.jar -C wordcount_classes/ .

added manifest
adding: WordCount$Map.class (in = 1938) (out = 802) (deflated 58%)
adding: WordCount.class (in = 1546) (out = 750) (deflated 51%)
adding: WordCount$Reduce.class (in = 1611) (out = 648) (deflated 59%)

- Create the input directory on HDFS

$ ./hadoop fs -mkdir input
$ ./hadoop fs -ls
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2012-03-22 00:53 /user/hadoop/input
$ echo "Hello word bye word" > file01
$ echo "Hello hadoop bye hadoop" > file02
$ ./hadoop fs -put file01 input/
$ ./hadoop fs -put file02 input/
$ ./hadoop fs -ls input
Found 2 items
-rw-r--r--   1 hadoop supergroup         20 2012-03-22 00:55 /user/hadoop/input/file01
-rw-r--r--   1 hadoop supergroup         24 2012-03-22 00:55 /user/hadoop/input/file02

- Create the output directory on HDFS

$ ./hadoop fs -mkdir output
$ ./hadoop fs -ls
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2012-03-22 00:55 /user/hadoop/input
drwxr-xr-x   - hadoop supergroup          0 2012-03-22 00:56 /user/hadoop/output

- Run the program

$ ./hadoop/bin/hadoop jar ./wordcount.jar org.myorg.WordCount input output/word_count

12/03/22 01:34:07 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/03/22 01:34:07 INFO mapred.FileInputFormat: Total input paths to process : 2
12/03/22 01:34:09 INFO mapred.JobClient: Running job: job_201203220026_0002
12/03/22 01:34:10 INFO mapred.JobClient:  map 0% reduce 0%
12/03/22 01:34:56 INFO mapred.JobClient:  map 100% reduce 0%
12/03/22 01:35:24 INFO mapred.JobClient:  map 100% reduce 100%
12/03/22 01:35:29 INFO mapred.JobClient: Job complete: job_201203220026_0002
12/03/22 01:35:29 INFO mapred.JobClient: Counters: 18
12/03/22 01:35:29 INFO mapred.JobClient:   Job Counters
12/03/22 01:35:29 INFO mapred.JobClient:     Launched reduce tasks=1
12/03/22 01:35:29 INFO mapred.JobClient:     Launched map tasks=2
12/03/22 01:35:29 INFO mapred.JobClient:     Data-local map tasks=2
12/03/22 01:35:29 INFO mapred.JobClient:   FileSystemCounters
12/03/22 01:35:29 INFO mapred.JobClient:     FILE_BYTES_READ=74
12/03/22 01:35:29 INFO mapred.JobClient:     HDFS_BYTES_READ=44
12/03/22 01:35:29 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=218
12/03/22 01:35:29 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=30
12/03/22 01:35:29 INFO mapred.JobClient:   Map-Reduce Framework
12/03/22 01:35:29 INFO mapred.JobClient:     Reduce input groups=4
12/03/22 01:35:29 INFO mapred.JobClient:     Combine output records=6
12/03/22 01:35:29 INFO mapred.JobClient:     Map input records=2
12/03/22 01:35:29 INFO mapred.JobClient:     Reduce shuffle bytes=80
12/03/22 01:35:29 INFO mapred.JobClient:     Reduce output records=4
12/03/22 01:35:29 INFO mapred.JobClient:     Spilled Records=12
12/03/22 01:35:29 INFO mapred.JobClient:     Map output bytes=76
12/03/22 01:35:29 INFO mapred.JobClient:     Map input bytes=44
12/03/22 01:35:29 INFO mapred.JobClient:     Combine input records=8
12/03/22 01:35:29 INFO mapred.JobClient:     Map output records=8
12/03/22 01:35:29 INFO mapred.JobClient:     Reduce input records=6

 

- Check the result

$ ./hadoop/bin/hadoop fs -cat output/word_count/part-00000

Hello   2

bye     2

hadoop  2

word    2

3.3  A single-table join example

- Code

package org.joinorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
//import org.apache.commons.cli.Options;

public class STjoin {
    public static int times = 0;

    // Mapper: for every "child parent" line emit two records keyed by parent and by child,
    // tagged "1" (child side) or "2" (parent side), so the reducer can join them
    public static class Map extends Mapper<Object, Text, Text, Text> {

        public void map(Object key1, Text value1, Context context) throws IOException,
                InterruptedException {
            String childname = new String();
            String parentname = new String();
            String relationtype = new String();
            String line = value1.toString();
            int i = 0;
            while (line.charAt(i) != ' ') {
                i++;
            }
            String[] values = {line.substring(0, i), line.substring(i + 1)};
            if (values[0].compareTo("child") != 0) {
                childname = values[0];
                parentname = values[1];
                relationtype = "1";
                context.write(new Text(values[1]), new Text(relationtype + "+" +
                        childname + "+" + parentname));
                relationtype = "2";
                context.write(new Text(values[0]), new Text(relationtype + "+" +
                        childname + "+" + parentname));
            }
        }
    }

    // Reducer: for each person, cross-join the grandchildren (tag 1) with the grandparents (tag 2)
    public static class Reduce extends Reducer<Text, Text, Text, Text> {

        public void reduce(Text key2, Iterable<Text> value2, Context context) throws
                IOException, InterruptedException {
            if (times == 0) {
                context.write(new Text("grandchild"), new Text("grandparent"));
                times++;
            }

            int grandchildnum = 0;
            String grandchild[] = new String[10];
            int grandparentnum = 0;
            String grandparent[] = new String[10];
            Iterator<Text> ite = value2.iterator();

            while (ite.hasNext()) {
                String record = ite.next().toString();
                int len = record.length();
                int i = 2;
                if (len == 0) continue;
                char relationtype = record.charAt(0);
                String childname = new String();
                String parentname = new String();
                while (record.charAt(i) != '+') {
                    childname = childname + record.charAt(i);
                    i++;
                }
                i += 1;
                while (i < len) {
                    parentname += record.charAt(i);
                    i++;
                }

                if (relationtype == '1') {
                    grandchild[grandchildnum] = childname;
                    grandchildnum++;
                }
                else {
                    grandparent[grandparentnum] = parentname;
                    grandparentnum++;
                }
            }
            if (grandparentnum != 0 && grandchildnum != 0) {
                for (int m = 0; m < grandchildnum; m++) {
                    for (int n = 0; n < grandparentnum; n++) {
                        context.write(new Text(grandchild[m]), new Text(grandparent[n]));
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }

        Job job = new Job(conf, "single table join");
        job.setJarByClass(STjoin.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

- Compile and package

javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar:${HADOOP_HOME}/lib/commons-cli-1.2.jar -d STjoin_class STjoin.java

 

 

$ jar -cvf STjoin_class.jar -C STjoin_class/ .
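Before running the job below, the inputjoin directory must exist in HDFS and hold a child-parent table whose two columns are separated by a single space, which is what the mapper parses. The file name and rows here are made up purely to illustrate the format (the original input data is not shown in these notes):

$ cat > child_parent.txt << 'EOF'
child parent
aaa bbb
bbb ccc
EOF
$ hadoop fs -mkdir inputjoin
$ hadoop fs -put child_parent.txt inputjoin/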

- Run

$ hadoop jar ./STjoin_class.jar org.joinorg.STjoin inputjoin output/STjoin

 

$ hadoop fs -cat output/STjoin/*

grandchild      grandparent

aaa    ccc

bbb    ddd

ccc    eee

ddd    fff

eee    ggg

4  Installing Hive on Linux

Popular Hive blog posts: http://www.iteye.com/blogs/tag/hive

4.1  Download and unpack

wget http://mirror.bjtu.edu.cn/apache/hive/hive-0.7.1/hive-0.7.1.tar.gz

hadoop@master:~/hadoop$ tar -xzf hive-0.7.1.tar.gz
hadoop@master:~/hadoop$ cd hive-0.7.1
hadoop@master:~/hadoop/hive-0.7.1$ ls
bin  conf  docs  examples  lib  LICENSE  NOTICE  README.txt  RELEASE_NOTES.txt  scripts  src
hadoop@master:~/hadoop$ mv hive-0.7.1 hive

4.2  Set the Hive environment variables

hadoop@master:~/hadoop/hive/bin$ vi hive-config.sh

Add:

 

export HADOOP=/home/hadoop/hadoop

export HIVE_HOME=/home/hadoop/hadoop/hive

 

vi .profile

 

export HADOOP_HOME="/home/hadoop/hadoop"

export HIVE_HOME="/home/hadoop/hadoop/hive"   # added

export HADOOP_VERSION="0.20.2"

export PATH="$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH"   # modified

4.3  Check the installation

hadoop@master:~$ hive

Hive history file=/tmp/hadoop/hive_job_log_hadoop_201203272119_2011178563.txt

hive> create table tt(id int, name string) row format delimited fields terminated by ',' collection items terminated by "\n" stored as textfile;

OK

Time taken: 36.15 seconds

hive> select * from tt;

OK

Time taken: 1.503 seconds

hive>

Output like the above means the installation succeeded.

4.4  Configure hive-site.xml

<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>                <!-- data directory on HDFS -->
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>                        <!-- temporary directory on HDFS -->
    <value>/tmp/hive-${user.name}</value>
    <description>Scratch space for Hive jobs</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>         <!-- metastore DB user name -->
    <value>APP</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>         <!-- metastore DB password -->
    <value>mine</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>

4.5  Start Hive

 

5  A First Taste of Hive

Help and introductory material for Apache Hive is available at:

https://cwiki.apache.org/confluence/display/Hive/Home

5.1  Create an internal (managed) table

hadoop@master:~$ hive

Hive history file=/tmp/hadoop/hive_job_log_hadoop_201203272213_1746810339.txt

hive> create table hive_table1(name string, age int) row format delimited fields terminated by ',' stored as textfile;

OK

Time taken: 0.402 seconds

hive>

 

hadoop@master:~/hadoop/conf$ hadoop fs -ls/user/hive/warehouse

Found 2 items

drwxr-xr-x  - hadoop supergroup          0 2012-03-27 22:25 /user/hive/warehouse/hive_table1
drwxr-xr-x  - hadoop supergroup          0 2012-03-27 21:21 /user/hive/warehouse/tt

hadoop@master:~/hadoop/conf$

5.2  Load data into the table

hive>

   > load data LOCAL inpath '/home/hadoop/hadoop/hive/hive_table1.dat'into table  hive_table1;

Copying data from file:/home/hadoop/hadoop/hive/hive_table1.dat
Copying file: file:/home/hadoop/hadoop/hive/hive_table1.dat

Loading data to table default.hive_table1

OK

Time taken: 2.129 seconds

hive>

 

hadoop@master:~/hadoop/conf$ hadoop fs -ls /user/hive/warehouse/hive_table1

Found 1 items

-rw-r--r--  1 hadoop supergroup         49 2012-03-27 22:40 /user/hive/warehouse/hive_table1/hive_table1.dat

hadoop@master:~/hadoop/conf$ hadoop fs -cat /user/hive/warehouse/hive_table1/hive_table1.dat

Heyi,30

Hljk,29

lajdlf,30

alh,29

allj,27

lsjk,33

hadoop@master:~/hadoop/conf$

 

5.3  Query the results

hive>

   > select * from hive_table1;

OK

Heyi   30

Hljk   29

lajdlf 30

alh    29

allj   27

lsjk   33

Time taken: 1.601 seconds

hive>

5.4  Accessing Hive through the JDBC driver

5.4.1  Start the remote service (Thrift) interface

hadoop@master:~$ hive --service hiveserver

 

Starting Hive Thrift Server

5.4.2  Write the JDBC client code

//package com.javabloger.hive;

import java.sql.Connection;

import java.sql.DriverManager;

import java.sql.ResultSet;

import java.sql.Statement;

//import org.apache.hadoop.hive.jdbc.HiveDriver;

 

public class HiveTestCase {

   public static void main(String[] args) throws  Exception {

       Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");

       //

       String dropSQL="drop table javabloger";

        String createSQL="create table javabloger (key int, value string) row format delimited fields terminated by ',' ";

        String insterSQL="LOAD DATA LOCAL INPATH '/home/hadoop/data/kv1.txt' OVERWRITE INTO TABLE javabloger";

       String querySQL="SELECT a.* FROM javabloger a";

       

       //Connection con = DriverManager.getConnection("jdbc:derby://localhost:3338/default;databaseName=metastore_db;create=true","APP", "mine");

        Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");

       

       Statement stmt = con.createStatement();

       stmt.executeQuery(dropSQL);

       stmt.executeQuery(createSQL);

       stmt.executeQuery(insterSQL);

       ResultSet res = stmt.executeQuery(querySQL);

       

         while (res.next()) {

           System.out.println("Result: key:"+res.getString(1)+"  ->  value:" +res.getString(2));

       }

       System.out.println("ok");

    }

}

5.4.3  Compile

javac HiveTestCase.java

5.4.4  Run

Run it with a script:

#hivetest.sh

#!/bin/bash

 

echo "100,aaa" >/home/hadoop/data/kv1.txt

echo "102,aab" >>/home/hadoop/data/kv1.txt

echo "103,aac" >>/home/hadoop/data/kv1.txt

echo "104,aad" >>/home/hadoop/data/kv1.txt

echo "105,aae" >> /home/hadoop/data/kv1.txt

echo "106,aaf" >>/home/hadoop/data/kv1.txt

 

HADOOP_CORE=`ls $HADOOP_HOME/hadoop-*-core.jar`

CLASSPATH=.:$HADOOP_CORE:$HIVE_HOME/conf

 

for i in ${HIVE_HOME}/lib/*.jar ; do

   CLASSPATH=$CLASSPATH:$i

done

 

 

java -cp $CLASSPATH HiveTestCase

 

 

hadoop@master:~$ ./hivetest.sh

12/03/28 19:18:05 INFOjdbc.HiveQueryResultSet: Column names: key,value

12/03/28 19:18:05 INFOjdbc.HiveQueryResultSet: Column types: int,string

Result: key:100  -> value:aaa

Result: key:102  -> value:aab

Result: key:103  -> value:aac

Result: key:104  -> value:aad

Result: key:105  -> value:aae

Result: key:106  -> value:aaf

ok

hadoop@master:~$

6  Installing HBase on Linux

6.1  Download and unpack

wget http://mirror.bjtu.edu.cn/apache/hbase/hbase-0.90.5/hbase-0.90.5.tar.gz

hadoop@master:~/hadoop$ tar xzf hbase-0.90.5.tar.gz
hadoop@master:~/hadoop$ mv hbase-0.90.5 hbase

hadoop@master:~/hadoop$ cd hbase

hadoop@master:~/hadoop/hbase$ ls

bin         conf  hbase-0.90.5.jar        hbase-webapps  LICENSE.txt pom.xml     src

CHANGES.txt docs  hbase-0.90.5-tests.jar  lib           NOTICE.txt   README.txt

hadoop@master:~/hadoop/hbase$

6.2  Replace the hadoop-core jar

Replace hbase/lib/hadoop-core-0.20-append-r1056497.jar with

$HADOOP_HOME/hadoop-0.20.2-core.jar,

and also copy $HADOOP_HOME/hadoop-0.20.2-test.jar into the hbase/lib/ directory.

If the jar is not replaced, HMaster fails at startup because the Hadoop and HBase client protocol versions do not match.
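Concretely, the replacement amounts to something like the following (a sketch; it assumes HADOOP_HOME and HBASE_HOME are set as in sections 2.3.2 and 6.3):

$ rm $HBASE_HOME/lib/hadoop-core-0.20-append-r1056497.jar
$ cp $HADOOP_HOME/hadoop-0.20.2-core.jar $HBASE_HOME/lib/
$ cp $HADOOP_HOME/hadoop-0.20.2-test.jar $HBASE_HOME/lib/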

6.3  Update the environment variables

- vi .profile

 

export HADOOP_HOME="/home/hadoop/hadoop"

export HIVE_HOME="/home/hadoop/hadoop/hive"

export HBASE_HOME="/home/hadoop/hadoop/hbase"

export HADOOP_VERSION="0.20.2"

export PATH="$HADOOP_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$PATH"

 

- hbase-env.sh

export HBASE_MANAGES_ZK=true

HBase depends on ZooKeeper, and hbase-0.90.x ships with its own copy, so you can let HBase manage it: set export HBASE_MANAGES_ZK=true in conf/hbase-env.sh (true means use the bundled ZooKeeper). If you would rather download and install ZooKeeper yourself, set this option to false. When running your own ZooKeeper, the start/stop order is: start Hadoop, then the ZooKeeper ensemble, then HBase; to shut down, stop HBase first, then ZooKeeper, then Hadoop.
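With a standalone ZooKeeper, that ordering looks roughly like this (a sketch using the start/stop scripts installed elsewhere in this document):

$ start-all.sh              # 1. start Hadoop (HDFS + MapReduce)
$ zkServer.sh start         # 2. start the ZooKeeper ensemble
$ start-hbase.sh            # 3. start HBase
# ... use the cluster ...
$ stop-hbase.sh             # 4. stop HBase first
$ zkServer.sh stop          # 5. then ZooKeeper
$ stop-all.sh               # 6. finally Hadoop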

6.4  Pseudo-distributed mode

6.4.1  Configure hbase-site.xml

 

      

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

          

            

      

 

6.4.2  Start HBase

Before starting HBase, make sure HDFS is already running and that ZooKeeper is available, otherwise startup will fail with errors.

start-hbase.sh

6.4.3  Verify it is running

hadoop@master:~/hadoop/hbase/logs$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.5, r1212209, Fri Dec  9 05:40:36 UTC 2011

 

hbase(main):001:0> list

TABLE                                                                                                                              

0 row(s) in 3.0690 seconds

 

hbase(main):002:0> create 'test','person', 'address'

0 row(s) in 1.6310 seconds

 

hbase(main):003:0>  put 'test', 'hing', 'person:name', 'hing'

0 row(s) in 0.6720 seconds

 

hbase(main):004:0> put 'test', 'hing','person:age', '28'  

0 row(s) in 0.0440 seconds

 

hbase(main):005:0> put 'test', 'hing','address:position', 'haidian'

0 row(s) in 0.0420 seconds

 

hbase(main):006:0> put 'test', 'hing','address:zipcode', '100085' 

0 row(s) in 0.0340 seconds

 

hbase(main):007:0> put 'test','forward', 'person:name', 'forward'

0 row(s) in 0.0700 seconds

 

hbase(main):008:0> put 'test','forward', 'person:age', '27'

0 row(s) in 0.0630 seconds

 

hbase(main):009:0> put 'test','forward', 'address:position', 'xicheng'

0 row(s) in 0.0320 seconds

 

hbase(main):010:0> scan 'test'

ROW                               COLUMN+CELL                                                                                     

 forward                          column=address:position, timestamp=1333007851171, value=xicheng                                 

 forward                           column=person:age,timestamp=1333007842973, value=27                                            

 forward                           column=person:name,timestamp=1333007835784, value=forward                                      

 hing                             column=address:position, timestamp=1333007819916, value=haidian                                 

 hing                              column=address:zipcode,timestamp=1333007826558, value=100085                                   

 hing                             column=person:age, timestamp=1333007813753, value=28                                            

 hing                              column=person:name,timestamp=1333007790586, value=hing                                         

2 row(s) in 0.3380 seconds

 

hbase(main):011:0> get 'test', 'hing'

COLUMN                             CELL                                                                                             

 address:position                  timestamp=1333007819916,value=haidian                                                          

 address:zipcode                   timestamp=1333007826558,value=100085                                                            

 person:age                       timestamp=1333007813753, value=28                                                               

 person:name                       timestamp=1333007790586,value=hing                                                             

4 row(s) in 0.1200 seconds

 

hbase(main):012:0>

6.4.4  Stop HBase

stop-hbase.sh

7  A First Taste of HBase

The HBase API reference documentation is available at:

http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html

7.1  Write a program with the HBase Java API

import java.io.IOException;

import java.io.ByteArrayOutputStream;

import java.io.DataOutputStream;

import java.io.ByteArrayInputStream;

import java.io.DataInputStream;

import java.util.Map;

import java.util.ArrayList;

import java.util.List;

 

import org.apache.hadoop.io.Writable;

import org.apache.hadoop.io.IntWritable;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Delete;

import org.apache.hadoop.hbase.util.*;
import org.apache.hadoop.hbase.KeyValue;

import org.apache.hadoop.hbase.util.Writables;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.MasterNotRunningException;

//import org.apache.hadoop.hbase.ZooKeeperConnectionException;

 

 

 

 

public class HBaseHandler {

 

   //private static HBaseConfiguration conf = null;

   private static Configuration conf = null;  

   /**

    * init config

    */

   static {

      //conf = HBaseConfiguration.create();

      //  conf = newHBaseConfiguration();

      //       conf.addResource("hbase-site.xml");

       Configuration HBASE_CONFIG = new Configuration();    

       HBASE_CONFIG.set("hbase.zookeeper.quorum", "localhost");   

       HBASE_CONFIG.set("hbase.zookeeper.property.clientPort","2181");   

       conf = HBaseConfiguration.create(HBASE_CONFIG);   

 

       

    }

   /**

    * @param args

    * @throws IOException

    */

   public static void main(String[] args) throws IOException {

       // TODO Auto-generated method stub

       System.out.println("Helloworld");

       String[] cfs;

       cfs = new String[1];

       cfs[0] = "Hello";

       createTable("Hello_test",cfs);

    }

   

 

   

   /**

    * create table

    * @throws IOException

    */

    public static void createTable(String tablename, String[] cfs) throws IOException {

       HBaseAdmin admin = new HBaseAdmin(conf);

       if (admin.tableExists(tablename)) {

            System.out.println("table exists");

       }

       else {

           HTableDescriptor tableDesc = new HTableDescriptor(tablename);

           for (int i = 0; i < cfs.length; i++) {

                tableDesc.addFamily(newHColumnDescriptor(cfs[i]));

           }

           admin.createTable(tableDesc);

           System.out.println("create table success");

       }

   }   

 

   /**

    * delete table

    * @param tablename

    * @throws IOException

    */

   public static void deleteTable(String tablename) throws IOException

    {

       try {

           HBaseAdmin admin = new HBaseAdmin(conf);

           admin.disableTable(tablename);

           admin.deleteTable(tablename);

           System.out.println("delete table success");

       }

       catch (MasterNotRunningException e)

       {

           e.printStackTrace();

       }

    }

   

   /**

    * insert one record

    * @param tablename

    * @param cfs

    */

   public static void writeRow(String tablename, String[] cfs) {

       try {

           HTable table = new HTable(conf, tablename);

           Put put = new Put(Bytes.toBytes("rows1"));

           for (int j = 0; j < cfs.length; j++) {

                put.add(Bytes.toBytes(cfs[j]),

                        Bytes.toBytes(String.valueOf(1)),

                       Bytes.toBytes("value_1"));

                table.put(put);

           }

       } catch (IOException e) {

           e.printStackTrace();

       }

    }

   

   /**

    * delete one record

    * @param tablename

    * @param rowkey

    * @throws IOException

    */

    public static void deleteRow(String tablename, String rowkey) throws IOException {

       HTable table = new HTable(conf, tablename);

        List<Delete> list = new ArrayList<Delete>();

       Delete d1 = new Delete(rowkey.getBytes());

       list.add(d1);

       table.delete(d1);

       System.out.println("delete row success");

    }

   /**

    * query one record

    * @param tablename

    * @param rowkey

    */

   public static void selectRow(String tablename, String rowKey)

           throws IOException {

       HTable table = new HTable(conf, tablename);

       Get g = new Get(rowKey.getBytes());

       Result rs = table.get(g);

       for (KeyValue kv : rs.raw()) {

            System.out.print(new String(kv.getRow()) + "  ");
            System.out.print(new String(kv.getFamily()) + ":");
            System.out.print(new String(kv.getQualifier()) + "  ");
            System.out.print(kv.getTimestamp() + "  ");
            System.out.println(new String(kv.getValue()));

       }

    }

   

   

   /**

    * select all recored from one table

    * @param tablename

    */

   public static void scaner(String tablename) {

       try {

           HTable table = new HTable(conf, tablename);

           Scan s = new Scan();

           ResultScanner rs = table.getScanner(s);

           for (Result r : rs) {

                KeyValue[] kv = r.raw();

                for (int i = 0; i < kv.length; i++) {
                    System.out.print(new String(kv[i].getRow()) + "  ");
                    System.out.print(new String(kv[i].getFamily()) + ":");
                    System.out.print(new String(kv[i].getQualifier()) + "  ");
                    System.out.print(kv[i].getTimestamp() + "  ");
                    System.out.println(new String(kv[i].getValue()));

                }

           }

       } catch (IOException e)

       {

           e.printStackTrace();

       }

    }

   

}

7.2  Compile

#compile.sh

#!/bin/bash

 

 

HADOOP_CORE=`ls $HADOOP_HOME/hadoop-*-core.jar`

CLASSPATH=.:$HADOOP_CORE:$HBASE_HOME/conf

 

for i in ${HBASE_HOME}/lib/*.jar; do

   CLASSPATH=$CLASSPATH:$i

done

 

 

javac -cp $CLASSPATH $1

 

Note: the CLASSPATH loop above must use HBASE_HOME/lib. In my first attempt I copied the Hive script and left it pointing at HIVE_HOME, which caused a version-mismatch error at runtime.

7.3  Run

#execjava.sh

#!/bin/bash

 

 

HADOOP_CORE=`ls $HADOOP_HOME/hadoop-*-core.jar`

CLASSPATH=.:$HADOOP_CORE:$HBASE_HOME/conf

 

for i in ${HBASE_HOME}/lib/*.jar ; do

   CLASSPATH=$CLASSPATH:$i

done

 

 

java -cp $CLASSPATH $1
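Both scripts take a single argument ($1). A typical session, assuming the source from section 7.1 is saved as HBaseHandler.java in the current directory, looks like:

$ chmod +x compile.sh execjava.sh
$ ./compile.sh HBaseHandler.java    # produces HBaseHandler.class in the current directory
$ ./execjava.sh HBaseHandler        # runs main(), which creates the Hello_test table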

 

8  Installing ZooKeeper on Linux

8.1  Download and unpack

wget http://mirror.bjtu.edu.cn/apache/zookeeper/zookeeper-3.3.3/zookeeper-3.3.3.tar.gz

hadoop@master:~/hadoop$ tar -xzf zookeeper-3.3.3.tar.gz

hadoop@master:~/hadoop$ mv zookeeper-3.3.3 zookeeper

hadoop@master:~/hadoop$ cd zookeeper

hadoop@master:~/hadoop/zookeeper$ ls

bin         conf        docs             lib          README.txt  zookeeper-3.3.3.jar      zookeeper-3.3.3.jar.sha1

build.xml   contrib     ivysettings.xml  LICENSE.txt recipes     zookeeper-3.3.3.jar.asc

CHANGES.txt dist-maven  ivy.xml          NOTICE.txt   src        zookeeper-3.3.3.jar.md5

hadoop@master:~/hadoop/zookeeper$

8.2  Update the environment variables

vi .profile

export HADOOP_HOME="/home/hadoop/hadoop"

export HIVE_HOME="/home/hadoop/hadoop/hive"

export HBASE_HOME="/home/hadoop/hadoop/hbase"

export ZOOKEEPER_HOME="/home/hadoop/hadoop/zookeeper"

export HADOOP_VERSION="0.20.2"

export PATH="$HADOOP_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH"

8.3  Standalone ZooKeeper setup

8.3.1  Configure zoo.cfg

Add the following:

tickTime=2000

dataDir=/data/zookeeper/

clientPort=2181
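ZooKeeper does not create dataDir on its own; if it is missing, zkServer.sh cannot write its pid file (the "Directory nonexistent" warning visible in the startup output below). Create whichever directory you put in zoo.cfg first, e.g.:

$ mkdir -p /data/zookeeper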

8.3.2  Start ZooKeeper

hadoop@master:~/hadoop/zookeeper/bin$ zkServer.sh start

JMX enabled by default

Using config:/home/hadoop/hadoop/zookeeper/bin/../conf/zoo.cfg

Starting zookeeper ...

./zkServer.sh: 120: cannot create /home/hadoop/data/zookeeper/zookeeper_server.pid: Directory nonexistent

STARTED

hadoop@master:~/hadoop/zookeeper/bin$ 2012-03-28 22:23:24,755 - INFO  [main:QuorumPeerConfig@90] - Reading configuration from: /home/hadoop/hadoop/zookeeper/bin/../conf/zoo.cfg
2012-03-28 22:23:24,773 - WARN  [main:QuorumPeerMain@105] - Either no config or no quorum defined in config, running in standalone mode
2012-03-28 22:23:24,898 - INFO  [main:QuorumPeerConfig@90] - Reading configuration from: /home/hadoop/hadoop/zookeeper/bin/../conf/zoo.cfg
2012-03-28 22:23:24,903 - INFO  [main:ZooKeeperServerMain@94] - Starting server
2012-03-28 22:23:24,986 - INFO  [main:Environment@97] - Server environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2012-03-28 22:23:24,987 - INFO  [main:Environment@97] - Server environment:host.name=master
2012-03-28 22:23:24,989 - INFO  [main:Environment@97] - Server environment:java.version=1.7.0_03
2012-03-28 22:23:24,991 - INFO  [main:Environment@97] - Server environment:java.vendor=Oracle Corporation
2012-03-28 22:23:24,992 - INFO  [main:Environment@97] - Server environment:java.home=/usr/lib/jvm/java-7-sun/jre
2012-03-28 22:23:24,992 - INFO  [main:Environment@97] - Server environment:java.class.path=/home/hadoop/hadoop/zookeeper/bin/../build/classes:/home/hadoop/hadoop/zookeeper/bin/../build/lib/*.jar:/home/hadoop/hadoop/zookeeper/bin/../zookeeper-3.3.3.jar:/home/hadoop/hadoop/zookeeper/bin/../lib/log4j-1.2.15.jar:/home/hadoop/hadoop/zookeeper/bin/../lib/jline-0.9.94.jar:/home/hadoop/hadoop/zookeeper/bin/../src/java/lib/*.jar:/home/hadoop/hadoop/zookeeper/bin/../conf:.:/usr/lib/jvm/java-7-sun/lib:/usr/lib/jvm/java-7-sun/jre/lib:
2012-03-28 22:23:24,996 - INFO  [main:Environment@97] - Server environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib
2012-03-28 22:23:25,006 - INFO  [main:Environment@97] - Server environment:java.io.tmpdir=/tmp
2012-03-28 22:23:25,008 - INFO  [main:Environment@97] - Server environment:java.compiler=<NA>
2012-03-28 22:23:25,009 - INFO  [main:Environment@97] - Server environment:os.name=Linux
2012-03-28 22:23:25,017 - INFO  [main:Environment@97] - Server environment:os.arch=i386
2012-03-28 22:23:25,018 - INFO  [main:Environment@97] - Server environment:os.version=2.6.35-22-generic
2012-03-28 22:23:25,019 - INFO  [main:Environment@97] - Server environment:user.name=hadoop
2012-03-28 22:23:25,020 - INFO  [main:Environment@97] - Server environment:user.home=/home/hadoop
2012-03-28 22:23:25,021 - INFO  [main:Environment@97] - Server environment:user.dir=/home/hadoop/hadoop/zookeeper/bin
2012-03-28 22:23:25,110 - INFO  [main:ZooKeeperServer@663] - tickTime set to 2000
2012-03-28 22:23:25,111 - INFO  [main:ZooKeeperServer@672] - minSessionTimeout set to -1
2012-03-28 22:23:25,112 - INFO  [main:ZooKeeperServer@681] - maxSessionTimeout set to -1
2012-03-28 22:23:25,217 - INFO  [main:NIOServerCnxn$Factory@143] - binding to port 0.0.0.0/0.0.0.0:2181
2012-03-28 22:23:25,318 - INFO  [main:FileSnap@82] - Reading snapshot /home/hadoop/data/zookeeper/version-2/snapshot.0
2012-03-28 22:23:25,361 - INFO  [main:FileTxnSnapLog@208] - Snapshotting: 0

 

8.3.3  Verify it started

hadoop@master:~/hadoop/zookeeper/bin$ zkCli.sh -server localhost:2181

9  Common Problems

9.1  Version compatibility

Hadoop, HBase and the other components have version-compatibility constraints; you must pick matching versions for them to work together.

The versions used in my tests are hadoop-0.20.2 + hbase-0.90.0.

Download link: http://archive.apache.org/dist/

9.2  Leave safe mode

./hadoop dfsadmin -safemode leave

9.3  File /user/hadoop could only be replicated to 0 nodes, instead of 1

hadoop@master:~$ hadoop fs -put file01 .

12/03/29 20:24:19 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop could only be replicated to 0 nodes, instead of 1

 

The default hadoop.tmp.dir is /tmp/hadoop-${user.name}, and on my Linux system /tmp sits on a file system type that Hadoop does not handle well, so hadoop.tmp.dir needs to point somewhere else. Whatever location you choose, keep the trailing .../hadoop-${user.name} part of the path.
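A minimal recovery sequence, assuming hadoop.tmp.dir is moved to /home/hadoop/data/hadoop-${user.name} as in the core-site.xml of section 2.4.2 (note that re-formatting wipes any existing HDFS data):

$ stop-all.sh
$ mkdir -p /home/hadoop/data
$ hadoop namenode -format
$ start-all.sh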
