After the installation you should see two virtual NICs, vmnet1 and vmnet8.
Once the system is installed, check that it can reach the Internet.
First change the root password with sudo passwd root,
then allow local administrator login in the system's "Login Window" settings.
Ubuntu 9.10 does not need the login-window change; the version I am using now does not require it either.
To open the SSH service on this machine, install openssh-server:
sudo apt-get install openssh-server
Then confirm that the SSH server has started:
If you see sshd in the process list, the SSH server is already running.
If not, start it with: sudo /etc/init.d/ssh start
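A quick way to do that check from the shell (illustrative; exact output varies by release):

ps -e | grep sshd          # an "sshd" entry means the server is running
sudo /etc/init.d/ssh status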
sudo apt-get install vsftpd
Edit the /etc/vsftpd.conf configuration file:
write_enable=YES              # allow uploads
anon_upload_enable=YES        # allow anonymous uploads
anon_mkdir_write_enable=YES   # allow anonymous users to create directories
local_umask=022               # permissions of uploaded files: 777-022=755
Restart the service:
sudo /etc/init.d/vsftpd restart
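To verify the FTP service after the restart, a local login attempt is enough (illustrative; assumes an ftp client is installed and the settings above are active):

ftp localhost
# log in as "anonymous" with an empty password, or as a local user, then try "put somefile"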
1) sudo apt-get install xinetd telnetd
2) sudo vi /etc/inetd.conf and add the following line:
telnet stream tcp nowait telnetd /usr/sbin/tcpd /usr/sbin/in.telnetd
3) sudo vi /etc/xinetd.conf and add the following:
#Simple configuration file for xinetd
#
#Some defaults, and include /etc/xinetd.d/
defaults
{
#Please note that you need a log_type line to be able to use log_on_success
#and log_on_failure. The default is the following :
# log_type = SYSLOG daemon info
instances = 60
log_type = SYSLOG authpriv
log_on_success = HOST PID
log_on_failure = HOST
cps = 25 30
}
includedir /etc/xinetd.d
4) sudo vi /etc/xinetd.d/telnet and add the following:
# default: on
# description: The telnet server serves telnet sessions; it uses \
#   unencrypted username/password pairs for authentication.
service telnet
{
disable = no
flags = REUSE
socket_type = stream
wait = no
user = root
server = /usr/sbin/in.telnetd
log_on_failure += USERID
}
5) Reboot the machine, or restart the service: sudo /etc/init.d/xinetd restart
7) To log in as root:
mv /etc/securetty /etc/securetty.bak — after this root can log in. Alternatively:
edit /etc/pam.d/login and comment out the following line:
#auth required lib/security/pam_securetty.so
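To confirm the telnet service is reachable (illustrative):

telnet localhost
# or check that something is listening on port 23:
netstat -ant | grep :23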
root@ubuntu:~# java -version
Installing the JRE directly (this does not include the development environment; the method further below is recommended instead):
sudo add-apt-repository "deb http://us.archive.ubuntu.com/ubuntu/ hardy multiverse"
sudo apt-get update
sudo apt-get install sun-java6-jdk
Note: this installation must be performed on the machine's graphical terminal; it cannot be run through a remote login tool.
Installing the full JDK (with the compiler)
1. Download the .tar.gz package
http://www.oracle.com/technetwork/java/javase/downloads/index.html
2. Install
sudo tar zxvf ./jdk-7-linux-i586.tar.gz -C /usr/lib/jvm
cd /usr/lib/jvm
sudo mv jdk1.7.0/ java-7-sun
vi /etc/profile
export JAVA_HOME=/usr/lib/jvm/java-7-sun
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
vi /etc/profile
#set java env
export JAVA_HOME="/usr/lib/jvm/java-7-sun"
export JRE_HOME="$JAVA_HOME/jre"
export CLASSPATH=".:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"
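To make the new variables take effect in the current shell and confirm the JDK is picked up (illustrative):

source /etc/profile
java -version      # should now report the JDK under /usr/lib/jvm/java-7-sun
echo $JAVA_HOME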
ssh -V              # check whether ssh is installed
sudo apt-get install ssh
ls -a ~
mkdir ~/.ssh
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
ls ~/.ssh
cat ~/.ssh/id_dsa.pub >>~/.ssh/authorized_keys
ssh -V
ssh localhost
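If ssh localhost still asks for a password, overly loose permissions on the key files are a common cause; tightening them usually helps (illustrative):

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys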
root@ubuntu:~# useradd -m -d /home/hadoop hadoop
root@ubuntu:~# passwd hadoop
Fix:
1. vi /etc/passwd
hadoop:x:1001:1001::/home/hadoop:/bin/bash
Change the login shell to bash as above.
export HADOOP_HOME="/home/hadoop/hadoop"
export HADOOP_VERSION="0.20.2"
• vi ~/.bashrc and add:
alias p='ps -fu hadoop'
Then run:
source ~/.bashrc
Download the package: http://labs.renren.com/apache-mirror/hadoop/core/
Upload hadoop-0.20.2.tar.gz to the hadoop user's home directory.
$ tar -xf hadoop-0.20.2.tar.gz
$ mv hadoop-0.20.2 hadoop
• $ vi hadoop-env.sh
export JAVA_HOME="/usr/lib/jvm/java-6-sun"
• core-site.xml
• hdfs-site.xml
• mapred-site.xml
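The contents of these three files are not reproduced in the original notes; the following is only a minimal pseudo-distributed sketch. The property names are the standard Hadoop 0.20 ones, but the localhost:9000/9001 addresses and the replication factor of 1 are assumptions.

<!-- core-site.xml (sketch) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml (sketch) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

<!-- mapred-site.xml (sketch) -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>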
$ ./hadoop namenode -format
12/03/21 19:07:45 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
12/03/21 19:07:46 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
12/03/21 19:07:46 INFO namenode.FSNamesystem: supergroup=supergroup
12/03/21 19:07:46 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/03/21 19:07:47 INFO common.Storage: Image file of size 96 saved in 0 seconds.
12/03/21 19:07:47 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/03/21 19:07:47 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
./start-all.sh
http://192.168.190.129:50070 -- HDFS monitoring page
http://192.168.190.129:50030 -- MapReduce (JobTracker) monitoring page
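Besides the two web pages, the cluster state can also be checked from the command line (illustrative; run from the hadoop bin directory):

./hadoop dfsadmin -report    # capacity and the list of live datanodes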
192.168.190.132 - master: NameNode, JobTracker (hostname master)
192.168.190.133 - slave: DataNode, TaskTracker (hostname slave1)
192.168.190.134 - slave: DataNode, TaskTracker (hostname slave2)
• Back up the pseudo-distributed configuration files into a localhost directory
$ mkdir localhost
$ cp hadoop-env.sh hdfs-site.xml core-site.xml mapred-site.xml slaves masters ./localhost
• Create the cluster configuration files
$ mkdir claster
$ cp hadoop-env.sh hdfs-site.xml core-site.xml mapred-site.xml slaves masters ./claster
$ ls claster
core-site.xml hadoop-env.sh hdfs-site.xml mapred-site.xml masters slaves
• hadoop-env.sh
Same as in pseudo-distributed mode; no change needed.
• core-site.xml
• hdfs-site.xml
• mapred-site.xml
(cluster versions of these three files are sketched right after the slaves list below)
• masters
List the backup (secondary) master node here; since this setup has no backup master, it does not need to be added.
• slaves
slave1
slave2
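As referenced above, here is a minimal sketch of the cluster versions of core-site.xml, hdfs-site.xml and mapred-site.xml. The property names are the standard Hadoop 0.20 ones; pointing them at the master host on ports 9000/9001 and using a replication factor of 2 (two datanodes) are assumptions, since the original files are not shown.

<!-- core-site.xml (cluster sketch) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml (cluster sketch) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

<!-- mapred-site.xml (cluster sketch) -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>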
• /etc/hosts
192.168.190.132 master # Added by NetworkManager
192.168.190.133 slave1
192.168.190.134 slave2
127.0.0.1 localhost.localdomain localhost
::1 ubuntu localhost6.localdomain6 localhost6
127.0.0.1 master
Clone slave1 and slave2 from the master node.
• /etc/hostname
Change the hostname to master, slave1 and slave2 respectively.
Reboot the hosts.
The goal of this step is to let the master host log in to the slave hosts without a password, so normally it is enough to copy the master's ~/.ssh/authorized_keys file to the same directory on each slave; since the slaves here were cloned from the master, no copying is needed.
Check that it works:
On the master node, run:
ssh slave1
ssh slave2
$ cd hadoop/conf
$ cp ./claster/* .
$ ls -lrt
total 64
-rw-r--r-- 1 hadoop hadoop 1195 2010-02-18 23:55 ssl-server.xml.example
-rw-r--r-- 1 hadoop hadoop 1243 2010-02-18 23:55 ssl-client.xml.example
-rw-r--r-- 1 hadoop hadoop 2815 2010-02-18 23:55 log4j.properties
-rw-r--r-- 1 hadoop hadoop 4190 2010-02-18 23:55 hadoop-policy.xml
-rw-r--r-- 1 hadoop hadoop 1245 2010-02-18 23:55 hadoop-metrics.properties
-rw-r--r-- 1 hadoop hadoop  535 2010-02-18 23:55 configuration.xsl
-rw-r--r-- 1 hadoop hadoop 3936 2010-02-18 23:55 capacity-scheduler.xml
drwxr-xr-x 2 hadoop hadoop 4096 2012-03-26 00:57 localhost
drwxr-xr-x 2 hadoop hadoop 4096 2012-03-26 01:00 claster
-rw-r--r-- 1 hadoop hadoop  269 2012-03-26 02:07 core-site.xml
-rw-r--r-- 1 hadoop hadoop 2280 2012-03-26 02:07 hadoop-env.sh
-rw-r--r-- 1 hadoop hadoop  251 2012-03-26 02:07 hdfs-site.xml
-rw-r--r-- 1 hadoop hadoop  264 2012-03-26 02:07 mapred-site.xml
-rw-r--r-- 1 hadoop hadoop   14 2012-03-26 02:07 slaves
-rw-r--r-- 1 hadoop hadoop   10 2012-03-26 02:07 masters
$ hadoop namenode -format
12/03/26 02:09:31 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
Re-format filesystem in /tmp/hadoop-hadoop/dfs/name ? (Y or N) Y
12/03/26 02:09:39 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
12/03/26 02:09:39 INFO namenode.FSNamesystem: supergroup=supergroup
12/03/26 02:09:39 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/03/26 02:09:40 INFO common.Storage: Image file of size 96 saved in 0 seconds.
12/03/26 02:09:40 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
12/03/26 02:09:40 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/127.0.1.1
************************************************************/
$ start-all.sh
starting namenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-namenode-master.out
slave1: starting datanode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-slave1.out
slave2: starting datanode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-datanode-slave2.out
localhost: starting secondarynamenode, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-jobtracker-master.out
slave1: starting tasktracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-slave1.out
slave2: starting tasktracker, logging to /home/hadoop/hadoop/bin/../logs/hadoop-hadoop-tasktracker-slave2.out
ps -fu hadoop to check that the expected processes are running;
open the monitoring URLs and check that the reported status is correct.
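jps (shipped with the JDK) gives a quick per-node view; assuming a clean start, something like the following is expected:

jps
# on master:  NameNode, SecondaryNameNode, JobTracker, Jps
# on slaves:  DataNode, TaskTracker, Jps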
$ ./hadoop fs -mkdir heyi
$ ./hadoop fs -ls
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2012-03-21 20:13 /user/hadoop/heyi
$ ./hadoop fs -put testf.txt heyi/
$ ./hadoop fs -ls heyi
Found 1 items
-rw-r--r-- 1 hadoop supergroup 54 2012-03-21 20:17 /user/hadoop/heyi/testf.txt
$ ./hadoop fs -cat heyi/testf.txt
aaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbb
bbbccccccccccccc
• Create the source file in the home directory
WordCount.java
package org.myorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
$ ls|grep -v sun|grep -v jdk
Desktop
Documents
Downloads
examples.desktop
hadoop
hadoop-0.20.2.tar.gz
Music
Pictures
Public
Templates
Videos
wordcount_classes
wordcount.jar
WordCount.java
$
• Compile
javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d wordcount_classes WordCount.java
• Package
$ jar -cvf wordcount.jar -C wordcount_classes/ .
added manifest
adding: WordCount$Map.class (in = 1938) (out= 802) (deflated 58%)
adding: WordCount.class (in = 1546) (out= 750) (deflated 51%)
adding: WordCount$Reduce.class (in = 1611) (out= 648) (deflated 59%)
• Create the input on HDFS
$ ./hadoop fs -mkdir input
$ ./hadoop fs -ls
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2012-03-22 00:53 /user/hadoop/input
$ echo "Hello word bye word" > file01
$ echo "Hello hadoop bye hadoop" > file02
$ ./hadoop fs -put file01 input/
$ ./hadoop fs -put file02 input/
$ ./hadoop fs -ls input
Found 2 items
-rw-r--r-- 1 hadoop supergroup 20 2012-03-22 00:55 /user/hadoop/input/file01
-rw-r--r-- 1 hadoop supergroup 24 2012-03-22 00:55 /user/hadoop/input/file02
• Create the output directory on HDFS
$ ./hadoop fs -mkdir output
$ ./hadoop fs -ls
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2012-03-22 00:55 /user/hadoop/input
drwxr-xr-x - hadoop supergroup 0 2012-03-22 00:56 /user/hadoop/output
• Run the program
$ ./hadoop/bin/hadoop jar ./wordcount.jar org.myorg.WordCount input output/word_count
12/03/22 01:34:07 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/03/22 01:34:07 INFO mapred.FileInputFormat: Total input paths to process : 2
12/03/22 01:34:09 INFO mapred.JobClient: Running job: job_201203220026_0002
12/03/22 01:34:10 INFO mapred.JobClient: map 0% reduce 0%
12/03/22 01:34:56 INFO mapred.JobClient: map 100% reduce 0%
12/03/22 01:35:24 INFO mapred.JobClient: map 100% reduce 100%
12/03/22 01:35:29 INFO mapred.JobClient: Job complete: job_201203220026_0002
12/03/22 01:35:29 INFO mapred.JobClient: Counters: 18
12/03/22 01:35:29 INFO mapred.JobClient:   Job Counters
12/03/22 01:35:29 INFO mapred.JobClient:     Launched reduce tasks=1
12/03/22 01:35:29 INFO mapred.JobClient:     Launched map tasks=2
12/03/22 01:35:29 INFO mapred.JobClient:     Data-local map tasks=2
12/03/22 01:35:29 INFO mapred.JobClient:   FileSystemCounters
12/03/22 01:35:29 INFO mapred.JobClient:     FILE_BYTES_READ=74
12/03/22 01:35:29 INFO mapred.JobClient:     HDFS_BYTES_READ=44
12/03/22 01:35:29 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=218
12/03/22 01:35:29 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=30
12/03/22 01:35:29 INFO mapred.JobClient:   Map-Reduce Framework
12/03/22 01:35:29 INFO mapred.JobClient:     Reduce input groups=4
12/03/22 01:35:29 INFO mapred.JobClient:     Combine output records=6
12/03/22 01:35:29 INFO mapred.JobClient:     Map input records=2
12/03/22 01:35:29 INFO mapred.JobClient:     Reduce shuffle bytes=80
12/03/22 01:35:29 INFO mapred.JobClient:     Reduce output records=4
12/03/22 01:35:29 INFO mapred.JobClient:     Spilled Records=12
12/03/22 01:35:29 INFO mapred.JobClient:     Map output bytes=76
12/03/22 01:35:29 INFO mapred.JobClient:     Map input bytes=44
12/03/22 01:35:29 INFO mapred.JobClient:     Combine input records=8
12/03/22 01:35:29 INFO mapred.JobClient:     Map output records=8
12/03/22 01:35:29 INFO mapred.JobClient:     Reduce input records=6
• Check the result
$ ./hadoop/bin/hadoop fs -cat output/word_count/part-00000
Hello 2
bye 2
hadoop 2
word 2
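To rerun the job, remove the old output directory first; Hadoop refuses to write into an existing output path (illustrative, using the 0.20-era shell):

$ ./hadoop fs -rmr output/word_count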
• Code: STjoin.java (single-table join example)
package org.joinorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
//import org.apache.commons.cli.Options;

public class STjoin {

    public static int times = 0;

    public static class Map extends Mapper<Object, Text, Text, Text> {
        public void map(Object key1, Text value1, Context context)
                throws IOException, InterruptedException {
            String childname = new String();
            String parentname = new String();
            String relationtype = new String();
            String line = value1.toString();
            int i = 0;
            // the child/parent columns are assumed to be separated by a single space
            while (line.charAt(i) != ' ') {
                i++;
            }
            String[] values = { line.substring(0, i), line.substring(i + 1) };
            if (values[0].compareTo("child") != 0) {   // skip the header line
                childname = values[0];
                parentname = values[1];
                relationtype = "1";
                context.write(new Text(values[1]),
                        new Text(relationtype + "+" + childname + "+" + parentname));
                relationtype = "2";
                context.write(new Text(values[0]),
                        new Text(relationtype + "+" + childname + "+" + parentname));
            }
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {
        public void reduce(Text key2, Iterable<Text> value2, Context context)
                throws IOException, InterruptedException {
            if (times == 0) {
                context.write(new Text("grandchild"), new Text("grandparent"));
                times++;
            }
            int grandchildnum = 0;
            String grandchild[] = new String[10];
            int grandparentnum = 0;
            String grandparent[] = new String[10];
            Iterator ite = value2.iterator();
            while (ite.hasNext()) {
                String record = ite.next().toString();
                int len = record.length();
                int i = 2;
                if (len == 0) continue;
                char relationtype = record.charAt(0);
                String childname = new String();
                String parentname = new String();
                while (record.charAt(i) != '+') {
                    childname = childname + record.charAt(i);
                    i++;
                }
                i += 1;
                while (i < len) {
                    parentname += record.charAt(i);
                    i++;
                }
                if (relationtype == '1') {
                    grandchild[grandchildnum] = childname;
                    grandchildnum++;
                } else {
                    grandparent[grandparentnum] = parentname;
                    grandparentnum++;
                }
            }
            if (grandparentnum != 0 && grandchildnum != 0) {
                for (int m = 0; m < grandchildnum; m++) {
                    for (int n = 0; n < grandparentnum; n++) {
                        context.write(new Text(grandchild[m]), new Text(grandparent[n]));
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        //String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: STjoin <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "single table join");
        job.setJarByClass(STjoin.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
• Compile and package
javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar:${HADOOP_HOME}/lib/commons-cli-1.2.jar -d STjoin_class STjoin.java
$ jar -cvf STjoin_class.jar -C STjoin_class/ .
• Run
$ hadoop jar ./STjoin_class.jar org.joinorg.STjoin inputjoin output/STjoin
$ hadoop fs -cat output/STjoin/*
grandchild grandparent
aaa ccc
bbb ddd
ccc eee
ddd fff
eee ggg
Popular Hive blogs: http://www.iteye.com/blogs/tag/hive
wget http://mirror.bjtu.edu.cn/apache/hive/hive-0.7.1/hive-0.7.1.tar.gz
hadoop@master:~/hadoop$ tar -xzf hive-0.7.1.tar.gz
hadoop@master:~/hadoop$ cd hive-0.7.1
hadoop@master:~/hadoop/hive-0.7.1$ ls
bin conf docs examples lib LICENSE NOTICE README.txt RELEASE_NOTES.txt scripts src
hadoop@master:~/hadoop$ mv hive-0.7.1 hive
hadoop@master:~/hadoop/hive/bin$ vi hive-config.sh
Add:
export HADOOP=/home/hadoop/hadoop
export HIVE_HOME=/home/hadoop/hadoop/hive
vi .profile
export HADOOP_HOME="/home/hadoop/hadoop"
export HIVE_HOME="/home/hadoop/hadoop/hive"    # new
export HADOOP_VERSION="0.20.2"
export PATH="$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH"    # modified
hadoop@master:~$ hive
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201203272119_2011178563.txt
hive> create table tt(id int, name string) row format delimited fields terminated by ',' collection items terminated by "\n" stored as textfile;
OK
Time taken: 36.15 seconds
hive> select * from tt;
OK
Time taken: 1.503 seconds
hive>
Output like the above means the installation succeeded.
Apache Hive help and introductory material is available at
https://cwiki.apache.org/confluence/display/Hive/Home
hadoop@master:~$ hive
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201203272213_1746810339.txt
hive> create table hive_table1(name string, age int) row format delimited fields terminated by ',' stored as textfile;
OK
Time taken: 0.402 seconds
hive>
hadoop@master:~/hadoop/conf$ hadoop fs -ls /user/hive/warehouse
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2012-03-27 22:25 /user/hive/warehouse/hive_table1
drwxr-xr-x - hadoop supergroup 0 2012-03-27 21:21 /user/hive/warehouse/tt
hadoop@master:~/hadoop/conf$
hive> > load data LOCAL inpath '/home/hadoop/hadoop/hive/hive_table1.dat' into table hive_table1;
Copying data from file:/home/hadoop/hadoop/hive/hive_table1.dat
Copying file: file:/home/hadoop/hadoop/hive/hive_table1.dat
Loading data to table default.hive_table1
OK
Time taken: 2.129 seconds
hive>
hadoop@master:~/hadoop/conf$ hadoop fs -ls /user/hive/warehouse/hive_table1
Found 1 items
-rw-r--r-- 1 hadoop supergroup 49 2012-03-27 22:40 /user/hive/warehouse/hive_table1/hive_table1.dat
hadoop@master:~/hadoop/conf$
hadoop fs -cat /user/hive/warehouse/hive_table1/hive_table1.dat
Heyi,30
Hljk,29
lajdlf,30
alh,29
allj,27
lsjk,33
hadoop@master:~/hadoop/conf$
hive> > select * from hive_table1;
OK
Heyi 30
Hljk 29
lajdlf 30
alh 29
allj 27
lsjk 33
Time taken: 1.601 seconds
hive>
hadoop@master:~$ hive --service hiveserver
Starting Hive Thrift Server
//package com.javabloger.hive;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
//import org.apache.hadoop.hive.jdbc.HiveDriver;

public class HiveTestCase {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
        String dropSQL = "drop table javabloger";
        String createSQL = "create table javabloger (key int, value string) row format delimited fields terminated by ',' ";
        String insterSQL = "LOAD DATA LOCAL INPATH '/home/hadoop/data/kv1.txt' OVERWRITE INTO TABLE javabloger";
        String querySQL = "SELECT a.* FROM javabloger a";

        //Connection con = DriverManager.getConnection("jdbc:derby://localhost:3338/default;databaseName=metastore_db;create=true", "APP", "mine");
        Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
        Statement stmt = con.createStatement();
        stmt.executeQuery(dropSQL);
        stmt.executeQuery(createSQL);
        stmt.executeQuery(insterSQL);
        ResultSet res = stmt.executeQuery(querySQL);
        while (res.next()) {
            System.out.println("Result: key:" + res.getString(1) + " -> value:" + res.getString(2));
        }
        System.out.println("ok");
    }
}
javac HiveTestCase.java
Run it with a script:
#hivetest.sh
#!/bin/bash
echo "100,aaa" > /home/hadoop/data/kv1.txt
echo "102,aab" >> /home/hadoop/data/kv1.txt
echo "103,aac" >> /home/hadoop/data/kv1.txt
echo "104,aad" >> /home/hadoop/data/kv1.txt
echo "105,aae" >> /home/hadoop/data/kv1.txt
echo "106,aaf" >> /home/hadoop/data/kv1.txt
HADOOP_CORE=`ls $HADOOP_HOME/hadoop-*-core.jar`
CLASSPATH=.:$HADOOP_CORE:$HIVE_HOME/conf
for i in ${HIVE_HOME}/lib/*.jar ; do
    CLASSPATH=$CLASSPATH:$i
done
java -cp $CLASSPATH HiveTestCase
hadoop@master:~$ ./hivetest.sh
12/03/28 19:18:05 INFO jdbc.HiveQueryResultSet: Column names: key,value
12/03/28 19:18:05 INFO jdbc.HiveQueryResultSet: Column types: int,string
Result: key:100 -> value:aaa
Result: key:102 -> value:aab
Result: key:103 -> value:aac
Result: key:104 -> value:aad
Result: key:105 -> value:aae
Result: key:106 -> value:aaf
ok
hadoop@master:~$ wget http://mirror.bjtu.edu.cn/apache/hbase/hbase-0.90.5/hbase-0.90.5.tar.gz
hadoop@master:~/hadoop$ tar xzf hbase-0.90.5.tar.gz
hadoop@master:~/hadoop$ mv hbase-0.90.5 hbase
hadoop@master:~/hadoop$ cd hbase
hadoop@master:~/hadoop/hbase$ ls
bin         conf  hbase-0.90.5.jar        hbase-webapps  LICENSE.txt  pom.xml     src
CHANGES.txt docs  hbase-0.90.5-tests.jar  lib            NOTICE.txt   README.txt
hadoop@master:~/hadoop/hbase$
Replace hbase/lib/hadoop-core-0.20-append-r1056497.jar with $HADOOP_HOME/hadoop-0.20.2-core.jar, and copy $HADOOP_HOME/hadoop-0.20.2-test.jar into hbase/lib/.
If the jar is not replaced, HMaster fails to start when HBase comes up, because the Hadoop and HBase client protocols do not match, and an error is reported.
• vi .profile
export HADOOP_HOME="/home/hadoop/hadoop"
export HIVE_HOME="/home/hadoop/hadoop/hive"
export HBASE_HOME="/home/hadoop/hadoop/hbase"
export HADOOP_VERSION="0.20.2"
export PATH="$HADOOP_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$PATH"
• hbase-env.sh
export HBASE_MANAGES_ZK=true
HBase needs ZooKeeper to run, and hbase-0.90.x ships with its own ZooKeeper, so the bundled one can be used: in conf/hbase-env.sh, export HBASE_MANAGES_ZK=true means HBase manages its bundled ZooKeeper; if you prefer to download and install ZooKeeper yourself, set this to false.
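These notes do not show conf/hbase-site.xml, which a pseudo-distributed HBase running on HDFS also needs; the following is only a sketch. hbase.rootdir and hbase.cluster.distributed are standard HBase 0.90 properties, but the hdfs://master:9000 address is an assumption and must match fs.default.name in core-site.xml.

<!-- conf/hbase-site.xml (sketch) -->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>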
If you do install ZooKeeper yourself, the start/stop order is: start Hadoop -> start the ZooKeeper cluster -> start HBase, and to shut down: stop HBase -> stop the ZooKeeper cluster -> stop Hadoop.
Before starting HBase, make sure HDFS is already running and ZooKeeper is installed, otherwise startup will report errors.
start-hbase.sh
hadoop@master:~/hadoop/hbase/logs$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011
hbase(main):001:0> list
TABLE
0 row(s) in 3.0690 seconds
hbase(main):002:0> create 'test', 'person', 'address'
0 row(s) in 1.6310 seconds
hbase(main):003:0> put 'test', 'hing', 'person:name', 'hing'
0 row(s) in 0.6720 seconds
hbase(main):004:0> put 'test', 'hing', 'person:age', '28'
0 row(s) in 0.0440 seconds
hbase(main):005:0> put 'test', 'hing', 'address:position', 'haidian'
0 row(s) in 0.0420 seconds
hbase(main):006:0> put 'test', 'hing', 'address:zipcode', '100085'
0 row(s) in 0.0340 seconds
hbase(main):007:0> put 'test', 'forward', 'person:name', 'forward'
0 row(s) in 0.0700 seconds
hbase(main):008:0> put 'test', 'forward', 'person:age', '27'
0 row(s) in 0.0630 seconds
hbase(main):009:0> put 'test', 'forward', 'address:position', 'xicheng'
0 row(s) in 0.0320 seconds
hbase(main):010:0> scan 'test'
ROW        COLUMN+CELL
forward    column=address:position, timestamp=1333007851171, value=xicheng
forward    column=person:age, timestamp=1333007842973, value=27
forward    column=person:name, timestamp=1333007835784, value=forward
hing       column=address:position, timestamp=1333007819916, value=haidian
hing       column=address:zipcode, timestamp=1333007826558, value=100085
hing       column=person:age, timestamp=1333007813753, value=28
hing       column=person:name, timestamp=1333007790586, value=hing
2 row(s) in 0.3380 seconds
hbase(main):011:0> get 'test', 'hing'
COLUMN              CELL
address:position    timestamp=1333007819916, value=haidian
address:zipcode     timestamp=1333007826558, value=100085
person:age          timestamp=1333007813753, value=28
person:name         timestamp=1333007790586, value=hing
4 row(s) in 0.1200 seconds
hbase(main):012:0>
stop-hbase.sh
HBase API reference documentation is available at:
http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html
import java.io.IOException;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.util.Map;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.util.*;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Writables;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.MasterNotRunningException;
//import org.apache.hadoop.hbase.ZooKeeperConnectionException;

public class HBaseHandler {

    //private static HBaseConfiguration conf = null;
    private static Configuration conf = null;

    /**
     * init config
     */
    static {
        //conf = HBaseConfiguration.create();
        // conf = new HBaseConfiguration();
        // conf.addResource("hbase-site.xml");
        Configuration HBASE_CONFIG = new Configuration();
        HBASE_CONFIG.set("hbase.zookeeper.quorum", "localhost");
        HBASE_CONFIG.set("hbase.zookeeper.property.clientPort", "2181");
        conf = HBaseConfiguration.create(HBASE_CONFIG);
    }

    /**
     * @param args
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        // TODO Auto-generated method stub
        System.out.println("Helloworld");
        String[] cfs;
        cfs = new String[1];
        cfs[0] = "Hello";
        createTable("Hello_test", cfs);
    }

    /**
     * create table
     * @throws IOException
     */
    public static void createTable(String tablename, String[] cfs) throws IOException {
        HBaseAdmin admin = new HBaseAdmin(conf);
        if (admin.tableExists(tablename)) {
            System.out.println("table is exists");
        } else {
            HTableDescriptor tableDesc = new HTableDescriptor(tablename);
            for (int i = 0; i < cfs.length; i++) {
                tableDesc.addFamily(new HColumnDescriptor(cfs[i]));
            }
            admin.createTable(tableDesc);
            System.out.println("create table success");
        }
    }

    /**
     * delete table
     * @param tablename
     * @throws IOException
     */
    public static void deleteTable(String tablename) throws IOException {
        try {
            HBaseAdmin admin = new HBaseAdmin(conf);
            admin.disableTable(tablename);
            admin.deleteTable(tablename);
            System.out.println("delete table success");
        } catch (MasterNotRunningException e) {
            e.printStackTrace();
        }
    }

    /**
     * insert one record
     * @param tablename
     * @param cfs
     */
    public static void writeRow(String tablename, String[] cfs) {
        try {
            HTable table = new HTable(conf, tablename);
            Put put = new Put(Bytes.toBytes("rows1"));
            for (int j = 0; j < cfs.length; j++) {
                put.add(Bytes.toBytes(cfs[j]),
                        Bytes.toBytes(String.valueOf(1)),
                        Bytes.toBytes("value_1"));
                table.put(put);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * delete one record
     * @param tablename
     * @param rowkey
     * @throws IOException
     */
    public static void deleteRow(String tablename, String rowkey) throws IOException {
        HTable table = new HTable(conf, tablename);
        List list = new ArrayList();
        Delete d1 = new Delete(rowkey.getBytes());
        list.add(d1);
        table.delete(d1);
        System.out.println("delete row success");
    }

    /**
     * query one record
     * @param tablename
     * @param rowkey
     */
    public static void selectRow(String tablename, String rowKey) throws IOException {
        HTable table = new HTable(conf, tablename);
        Get g = new Get(rowKey.getBytes());
        Result rs = table.get(g);
        for (KeyValue kv : rs.raw()) {
            System.out.print(new String(kv.getRow()) + " ");
            System.out.print(new String(kv.getFamily()) + ":");
            System.out.print(new String(kv.getQualifier()) + " ");
            System.out.print(kv.getTimestamp() + " ");
            System.out.println(new String(kv.getValue()));
        }
    }

    /**
     * select all records from one table
     * @param tablename
     */
    public static void scaner(String tablename) {
        try {
            HTable table = new HTable(conf, tablename);
            Scan s = new Scan();
            ResultScanner rs = table.getScanner(s);
            for (Result r : rs) {
                KeyValue[] kv = r.raw();
                for (int i = 0; i < kv.length; i++) {
                    System.out.print(new String(kv[i].getRow()) + " ");
                    System.out.print(new String(kv[i].getFamily()) + ":");
                    System.out.print(new String(kv[i].getQualifier()) + " ");
                    System.out.print(kv[i].getTimestamp() + " ");
                    System.out.println(new String(kv[i].getValue()));
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
#compile.sh
#!/bin/bash
HADOOP_CORE=`ls $HADOOP_HOME/hadoop-*-core.jar`
CLASSPATH=.:$HADOOP_CORE:$HBASE_HOME/conf
for i in ${HBASE_HOME}/lib/*.jar; do
    CLASSPATH=$CLASSPATH:$i
done
javac $1
Note: the classpath must use HBASE_HOME's lib directory; in my test I copied the script and left HIVE_HOME in it, which produced a version-mismatch error at runtime.
#execjava.sh
#!/bin/bash
HADOOP_CORE=`ls $HADOOP_HOME/hadoop-*-core.jar`
CLASSPATH=.:$HADOOP_CORE:$HBASE_HOME/conf
for i in ${HBASE_HOME}/lib/*.jar ; do
    CLASSPATH=$CLASSPATH:$i
done
java -cp $CLASSPATH $1
wget http://mirror.bjtu.edu.cn/apache/zookeeper/zookeeper-3.3.3/zookeeper-3.3.3.tar.gz
hadoop@master:~/hadoop$ tar -xzf zookeeper-3.3.3.tar.gz
hadoop@master:~/hadoop$ mv zookeeper-3.3.3 zookeeper
hadoop@master:~/hadoop$ cd zookeeper
hadoop@master:~/hadoop/zookeeper$ ls
bin          conf        docs             lib          README.txt  zookeeper-3.3.3.jar      zookeeper-3.3.3.jar.sha1
build.xml    contrib     ivysettings.xml  LICENSE.txt  recipes     zookeeper-3.3.3.jar.asc
CHANGES.txt  dist-maven  ivy.xml          NOTICE.txt   src         zookeeper-3.3.3.jar.md5
hadoop@master:~/hadoop/zookeeper$
vi .profile
export HADOOP_HOME="/home/hadoop/hadoop"
export HIVE_HOME="/home/hadoop/hadoop/hive"
export HBASE_HOME="/home/hadoop/hadoop/hbase"
export ZOOKEEPER_HOME="/home/hadoop/hadoop/zookeeper"
export HADOOP_VERSION="0.20.2"
export PATH="$HADOOP_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH"
Add the following to conf/zoo.cfg:
tickTime=2000
dataDir=/data/zookeeper/
clientPort=2181
hadoop@master:~/hadoop/zookeeper/bin$ zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/hadoop/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... /zookeeper_server.pid: Directory nonexistenth: 120: cannot create /home/hadoop/data/zookeeper/
STARTED
hadoop@master:~/hadoop/zookeeper/bin$ 2012-03-28 22:23:24,755 - INFO [main:QuorumPeerConfig@90] - Reading configuration from: /home/hadoop/hadoop/zookeeper/bin/../conf/zoo.cfg
2012-03-28 22:23:24,773 - WARN [main:QuorumPeerMain@105] - Either no config or no quorum defined in config, running in standalone mode
2012-03-28 22:23:24,898 - INFO [main:QuorumPeerConfig@90] - Reading configuration from: /home/hadoop/hadoop/zookeeper/bin/../conf/zoo.cfg
2012-03-28 22:23:24,903 - INFO [main:ZooKeeperServerMain@94] - Starting server
2012-03-28 22:23:24,986 - INFO [main:Environment@97] - Server environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2012-03-28 22:23:24,987 - INFO [main:Environment@97] - Server environment:host.name=master
2012-03-28 22:23:24,989 - INFO [main:Environment@97] - Server environment:java.version=1.7.0_03
2012-03-28 22:23:24,991 - INFO [main:Environment@97] - Server environment:java.vendor=Oracle Corporation
2012-03-28 22:23:24,992 - INFO [main:Environment@97] - Server environment:java.home=/usr/lib/jvm/java-7-sun/jre
2012-03-28 22:23:24,992 - INFO [main:Environment@97] - Server environment:java.class.path=/home/hadoop/hadoop/zookeeper/bin/../build/classes:/home/hadoop/hadoop/zookeeper/bin/../build/lib/*.jar:/home/hadoop/hadoop/zookeeper/bin/../zookeeper-3.3.3.jar:/home/hadoop/hadoop/zookeeper/bin/../lib/log4j-1.2.15.jar:/home/hadoop/hadoop/zookeeper/bin/../lib/jline-0.9.94.jar:/home/hadoop/hadoop/zookeeper/bin/../src/java/lib/*.jar:/home/hadoop/hadoop/zookeeper/bin/../conf:.:/usr/lib/jvm/java-7-sun/lib:/usr/lib/jvm/java-7-sun/jre/lib:
2012-03-28 22:23:24,996 - INFO [main:Environment@97] - Server environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib
2012-03-28 22:23:25,006 - INFO [main:Environment@97] - Server environment:java.io.tmpdir=/tmp
2012-03-28 22:23:25,008 - INFO [main:Environment@97] - Server environment:java.compiler=
2012-03-28 22:23:25,009 - INFO [main:Environment@97] - Server environment:os.name=Linux
2012-03-28 22:23:25,017 - INFO [main:Environment@97] - Server environment:os.arch=i386
2012-03-28 22:23:25,018 - INFO [main:Environment@97] - Server environment:os.version=2.6.35-22-generic
2012-03-28 22:23:25,019 - INFO [main:Environment@97] - Server environment:user.name=hadoop
2012-03-28 22:23:25,020 - INFO [main:Environment@97] - Server environment:user.home=/home/hadoop
2012-03-28 22:23:25,021 - INFO [main:Environment@97] - Server environment:user.dir=/home/hadoop/hadoop/zookeeper/bin
2012-03-28 22:23:25,110 - INFO [main:ZooKeeperServer@663] - tickTime set to 2000
2012-03-28 22:23:25,111 - INFO [main:ZooKeeperServer@672] - minSessionTimeout set to -1
2012-03-28 22:23:25,112 - INFO [main:ZooKeeperServer@681] - maxSessionTimeout set to -1
2012-03-28 22:23:25,217 - INFO [main:NIOServerCnxn$Factory@143] - binding to port 0.0.0.0/0.0.0.0:2181
2012-03-28 22:23:25,318 - INFO [main:FileSnap@82] - Reading snapshot /home/hadoop/data/zookeeper/version-2/snapshot.0
2012-03-28 22:23:25,361 - INFO [main:FileTxnSnapLog@208] - Snapshotting: 0
hadoop@master:~/hadoop/zookeeper/bin$ zkCli.sh -server localhost:2181
Hadoop, HBase and the other components have version-compatibility requirements; matching versions must be chosen or they will not run together.
The versions used in my experiments are hadoop-0.20.2 + hbase-0.90.0.
Download link: http://archive.apache.org/dist/
./hadoop dfsadmin -safemode leave
hadoop@master:~$ hadoop fs -put file01 .
12/03/29 20:24:19 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop could only be replicated to 0 nodes, instead of 1
The default hadoop.tmp.dir is /tmp/hadoop-${user.name}, and the filesystem type of /tmp on my Linux system is often one that Hadoop does not support, so hadoop.tmp.dir has to be changed to some other location; note that it must keep the form ???//hadoop-${user.name}.
Outline of sections 4-9:
4 Installing Hive on Linux
4.1 Download and unpack
4.2 Set the Hive environment variables
4.3 Check the installation
4.4 Configure hive-site.xml
4.5 Start Hive
5 Trying out Hive
5.1 Create an internal table
5.2 Load data into the table
5.3 Query the results
5.4 Operate Hive through the JDBC driver
5.4.1 Start the remote service interface
5.4.2 Write the JDBC client code
5.4.3 Compile
5.4.4 Run
6 Installing HBase on Linux
6.1 Download and unpack
6.2 Replace the hadoop-core jar
6.3 Set the related environment variables
6.4 Pseudo-distributed mode
6.4.1 Configure hbase-site.xml
6.4.2 Start HBase
6.4.3 Verify that it is running
6.4.4 Stop HBase
7 Trying out HBase
7.1 Write an HBase Java API program
7.2 Compile
7.3 Run
8 Installing ZooKeeper on Linux
8.1 Download and unpack
8.2 Set the related environment variables
8.3 Standalone ZooKeeper installation
8.3.1 Configure zoo.cfg
8.3.2 Start ZooKeeper
8.3.3 Verify startup
9 Common problems
9.1 Version matching
9.2 Leaving safe mode
9.3 File /user/hadoop could only be replicated to 0 nodes, instead of 1