使用 Sqoop 使HDFS Informix 数据可以互相写入

1.下载sqoop-1.3.0-cdh3u6.tar.gz,实验中所有hadoop相关子项目都是cdh版,解压到指定目录。

[hadoop@master software]$ tar -xvf sqoop-1.3.0-cdh3u6.tar.gz

 2.复制informix驱动到sqoop lib目录下。

[hadoop@master software]$ cp ifxjdbc.jar ../hbase/sqoop-1.3.0-cdh3u6/lib/

3.编辑hdfs数据导出到informix数据库脚本:

(3.1)informix数据导入hdfs文件的脚本:

[hadoop@master software]$ vi informixtohdfs.sh
#连接字符串
driverUrl="jdbc:informix-sqli://10.132.30.9:9088/XXXXX:informixserver=XXXXX;user=XXXXX;password=XXXXX"
#要导出的表名
informixTableName="customermanager_user"
#要保存的位置
hdfsPath=/user/hadoop/test/customermanager_user
#Informix 用户:Sqoop 将单词计数的结果写入 Informix
./sqoop import --append --connect $driverUrl --driver com.informix.jdbc.IfxDriver --target-dir $hdfsPath --table $informixTableName --fields-terminated-by '|'
  
(3.1.1)执行脚本
[hadoop@master bin]$ sh hdfstoinformix.sh 
temp table:.
13/07/03 18:08:03 INFO manager.SqlManager: Using default fetchSize of 1000
13/07/03 18:08:03 INFO tool.CodeGenTool: Beginning code generation
13/07/03 18:08:12 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM zhoupeng AS t WHERE 1=0
13/07/03 18:08:12 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM zhoupeng AS t WHERE 1=0
13/07/03 18:08:12 INFO orm.CompilationManager: HADOOP_HOME is /opt/hadoop/hadoop-0.20.2-cdh3u6
13/07/03 18:08:14 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/bdafd8b074d8de2ecdc2da071f06b3ee/zhoupeng.jar
13/07/03 18:08:14 INFO mapreduce.ExportJobBase: Beginning export of zhoupeng
13/07/03 18:08:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM zhoupeng AS t WHERE 1=0
13/07/03 18:08:16 INFO input.FileInputFormat: Total input paths to process : 1
13/07/03 18:08:16 INFO input.FileInputFormat: Total input paths to process : 1
13/07/03 18:08:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/07/03 18:08:16 WARN snappy.LoadSnappy: Snappy native library not loaded
13/07/03 18:08:16 INFO mapred.JobClient: Running job: job_201307030829_0020
13/07/03 18:08:17 INFO mapred.JobClient:  map 0% reduce 0%
13/07/03 18:08:58 INFO mapred.JobClient:  map 25% reduce 0%
13/07/03 18:08:59 INFO mapred.JobClient:  map 100% reduce 0%
13/07/03 18:08:59 INFO mapred.JobClient: Job complete: job_201307030829_0020
13/07/03 18:08:59 INFO mapred.JobClient: Counters: 17
13/07/03 18:08:59 INFO mapred.JobClient:   Job Counters 
13/07/03 18:08:59 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=159574
13/07/03 18:08:59 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/03 18:08:59 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/07/03 18:08:59 INFO mapred.JobClient:     Rack-local map tasks=2
13/07/03 18:08:59 INFO mapred.JobClient:     Launched map tasks=4
13/07/03 18:08:59 INFO mapred.JobClient:     Data-local map tasks=2
13/07/03 18:08:59 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/07/03 18:08:59 INFO mapred.JobClient:   FileSystemCounters
13/07/03 18:08:59 INFO mapred.JobClient:     HDFS_BYTES_READ=62373
13/07/03 18:08:59 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=259358
13/07/03 18:08:59 INFO mapred.JobClient:   Map-Reduce Framework
13/07/03 18:08:59 INFO mapred.JobClient:     Map input records=198
13/07/03 18:08:59 INFO mapred.JobClient:     Physical memory (bytes) snapshot=147419136
13/07/03 18:08:59 INFO mapred.JobClient:     Spilled Records=0
13/07/03 18:08:59 INFO mapred.JobClient:     CPU time spent (ms)=1120
13/07/03 18:08:59 INFO mapred.JobClient:     Total committed heap usage (bytes)=52822016
13/07/03 18:08:59 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1377308672
13/07/03 18:08:59 INFO mapred.JobClient:     Map output records=198
13/07/03 18:08:59 INFO mapred.JobClient:     SPLIT_RAW_BYTES=536
13/07/03 18:08:59 INFO mapreduce.ExportJobBase: Transferred 60.9111 KB in 44.9075 seconds (1.3564 KB/sec)
13/07/03 18:08:59 INFO mapreduce.ExportJobBase: Exported 198 records.


(3.2)hdfs数据导入到informix数据库中:本次测试导入的数据是建立在(3.1)的基础上。
[hadoop@master software]$ vi hdfs2toinformix.sh
#连接字符串
driverUrl="jdbc:informix-sqli://10.132.30.9:9088/XXXXX:informixserver=XXXXX;user=XXXXX;password=XXXXX"
#要导出的表名
informixTableName="customermanager_user"
#要保存的位置
hdfsPath=/user/hadoop/test/customermanager_user/
#Informix 用户:Sqoop 将hdfs中的数据导入Informix
./qoop export -D sqoop.export.records.per.statement=1 \
--driver com.informix.jdbc.IfxDriver \
--connect $driverUrl \
--table $informixTableName --export-dir $hdfsPath

[hadoop@master bin]$ sh informixtohdfs.sh 
13/07/03 18:11:28 INFO manager.SqlManager: Using default fetchSize of 1000
13/07/03 18:11:28 INFO tool.CodeGenTool: Beginning code generation
13/07/03 18:11:37 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM zhoupeng AS t WHERE 1=0
13/07/03 18:11:37 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM zhoupeng AS t WHERE 1=0
13/07/03 18:11:38 INFO orm.CompilationManager: HADOOP_HOME is /opt/hadoop/hadoop-0.20.2-cdh3u6
13/07/03 18:11:39 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/e49a8928b471a2bfeb2c5beca5ca1194/zhoupeng.jar
13/07/03 18:11:39 INFO mapreduce.ImportJobBase: Beginning import of zhoupeng
13/07/03 18:11:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM zhoupeng AS t WHERE 1=0
13/07/03 18:11:51 INFO mapred.JobClient: Running job: job_201307030829_0022
13/07/03 18:11:52 INFO mapred.JobClient:  map 0% reduce 0%
13/07/03 18:12:06 INFO mapred.JobClient:  map 100% reduce 0%
13/07/03 18:12:06 INFO mapred.JobClient: Job complete: job_201307030829_0022
13/07/03 18:12:06 INFO mapred.JobClient: Counters: 16
13/07/03 18:12:06 INFO mapred.JobClient:   Job Counters 
13/07/03 18:12:06 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=13439
13/07/03 18:12:06 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/03 18:12:06 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/07/03 18:12:06 INFO mapred.JobClient:     Launched map tasks=1
13/07/03 18:12:06 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/07/03 18:12:06 INFO mapred.JobClient:   FileSystemCounters
13/07/03 18:12:06 INFO mapred.JobClient:     HDFS_BYTES_READ=87
13/07/03 18:12:06 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=64797
13/07/03 18:12:06 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=152064
13/07/03 18:12:06 INFO mapred.JobClient:   Map-Reduce Framework
13/07/03 18:12:06 INFO mapred.JobClient:     Map input records=594
13/07/03 18:12:06 INFO mapred.JobClient:     Physical memory (bytes) snapshot=37494784
13/07/03 18:12:06 INFO mapred.JobClient:     Spilled Records=0
13/07/03 18:12:06 INFO mapred.JobClient:     CPU time spent (ms)=300
13/07/03 18:12:06 INFO mapred.JobClient:     Total committed heap usage (bytes)=15007744
13/07/03 18:12:06 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=344674304
13/07/03 18:12:06 INFO mapred.JobClient:     Map output records=594
13/07/03 18:12:06 INFO mapred.JobClient:     SPLIT_RAW_BYTES=87
13/07/03 18:12:06 INFO mapreduce.ImportJobBase: Transferred 148.5 KB in 26.4893 seconds (5.606 KB/sec)
13/07/03 18:12:06 INFO mapreduce.ImportJobBase: Retrieved 594 records.


你可能感兴趣的:(sqoop)