With the connection in place, the next step is to create a job.
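If you want to double-check the connection first, the shell can print it back (the --xid value assumes connection 1 from the previous step):
sqoop:000> show connection --xid 1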
sqoop:000> create job --xid 1 --type import
Creating job for connection with id 1
Please fill following values to create new job object
Name: dimDate
Database configuration
Schema name: dbo
Table name: dimDate
Table SQL statement:
Table column names:
Partition column name:
Nulls in partition column:
Boundary query:
Output configuration
Storage type:
0 : HDFS
Choose: 0
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 0
Compression format:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
Choose: 0
Output directory: /home/dimDate
Throttling resources
Extractors:
Loaders:
New job was successfully created with validation status FINE and persistent id 1
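The saved job can be reviewed at any time, and re-edited if a value was mistyped; both commands below assume the persistent id 1 reported above:
sqoop:000> show job --jid 1
sqoop:000> update job --jid 1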
-------------------------------------------------------------
Run the job:
sqoop:000> start job --jid 1
Submission details
Job ID: 1
Server URL: http://localhost:12000/sqoop/
Created by: root
Creation date: 2014-03-20 07:01:12 PDT
Lastly updated by: root
External ID: job_1395223907193_0001
http://localhost.localdomain:8088/proxy/application_1395223907193_0001/
2014-03-20 07:01:12 PDT: BOOTING - Progress is not available
Open http://XXXXXXX:8088/proxy/application_1395223907193_0001/ and you can see the job running.
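If you prefer the command line to the web UI, the YARN client can report the same thing (this assumes a Hadoop 2.x client on the PATH, using the External ID printed above):
yarn application -status application_1395223907193_0001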
The status can also be checked from the Sqoop shell:
sqoop:000> status job --jid 1
Submission details
Job ID: 1
Server URL: http://localhost:12000/sqoop/
Created by: root
Creation date: 2014-03-20 07:01:12 PDT
Lastly updated by: root
External ID: job_1395223907193_0001
http://localhost.localdomain:8088/proxy/application_1395223907193_0001/
2014-03-20 07:03:31 PDT: RUNNING - 0.00 %
Check again a little later:
sqoop:000> status job --jid 1
Submission details
Job ID: 1
Server URL: http://localhost:12000/sqoop/
Created by: root
Creation date: 2014-03-20 07:01:12 PDT
Lastly updated by: root
External ID: job_1395223907193_0001
http://localhost.localdomain:8088/proxy/application_1395223907193_0001/
2014-03-20 07:04:46 PDT: SUCCEEDED
Counters:
org.apache.hadoop.mapreduce.JobCounter
SLOTS_MILLIS_MAPS: 772203
MB_MILLIS_MAPS: 790735872
TOTAL_LAUNCHED_MAPS: 10
MILLIS_MAPS: 772203
VCORES_MILLIS_MAPS: 772203
SLOTS_MILLIS_REDUCES: 0
OTHER_LOCAL_MAPS: 10
org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter
BYTES_WRITTEN: 129332
org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
BYTES_READ: 0
org.apache.hadoop.mapreduce.TaskCounter
MAP_INPUT_RECORDS: 0
MERGED_MAP_OUTPUTS: 0
PHYSICAL_MEMORY_BYTES: 612765696
SPILLED_RECORDS: 0
COMMITTED_HEAP_BYTES: 161021952
CPU_MILLISECONDS: 8390
FAILED_SHUFFLE: 0
VIRTUAL_MEMORY_BYTES: 3890085888
SPLIT_RAW_BYTES: 1391
MAP_OUTPUT_RECORDS: 1188
GC_TIME_MILLIS: 2962
org.apache.hadoop.mapreduce.FileSystemCounter
FILE_WRITE_OPS: 0
FILE_READ_OPS: 0
FILE_LARGE_READ_OPS: 0
FILE_BYTES_READ: 0
HDFS_BYTES_READ: 1391
FILE_BYTES_WRITTEN: 934750
HDFS_LARGE_READ_OPS: 0
HDFS_WRITE_OPS: 20
HDFS_READ_OPS: 40
HDFS_BYTES_WRITTEN: 129332
org.apache.sqoop.submission.counter.SqoopCounters
ROWS_READ: 1188
Job executed successfully
It should have succeeded. Note that SqoopCounters reports ROWS_READ: 1188, matching MAP_OUTPUT_RECORDS, so all 1188 rows ought to be in HDFS; let's check:
[root@localhost /]# hadoop fs -ls /home/
14/03/20 08:17:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
drwxr-xr-x   - root supergroup          0 2014-03-20 07:04 /home/dimDate
-rw-r--r--   1 root supergroup        511 2014-03-20 08:04 /home/dimDate.sql
[root@localhost /]# hdfs dfs -ls /home/dimDate/
14/03/20 08:17:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 11 items
-rw-r--r--   1 root supergroup          0 2014-03-20 07:04 /home/dimDate/_SUCCESS
-rw-r--r--   1 root supergroup      20748 2014-03-20 07:04 /home/dimDate/part-m-00000
-rw-r--r--   1 root supergroup      22248 2014-03-20 07:04 /home/dimDate/part-m-00001
-rw-r--r--   1 root supergroup      17461 2014-03-20 07:04 /home/dimDate/part-m-00002
-rw-r--r--   1 root supergroup      25573 2014-03-20 07:04 /home/dimDate/part-m-00003
-rw-r--r--   1 root supergroup      14132 2014-03-20 07:04 /home/dimDate/part-m-00004
-rw-r--r--   1 root supergroup      25693 2014-03-20 07:04 /home/dimDate/part-m-00005
-rw-r--r--   1 root supergroup          0 2014-03-20 07:04 /home/dimDate/part-m-00006
-rw-r--r--   1 root supergroup          0 2014-03-20 07:04 /home/dimDate/part-m-00007
-rw-r--r--   1 root supergroup          0 2014-03-20 07:04 /home/dimDate/part-m-00008
-rw-r--r--   1 root supergroup       3477 2014-03-20 07:04 /home/dimDate/part-m-00009
OK: dimDate was split into a number of part files. To see the contents, just cat one of them:
hdfs dfs -cat /home/dimDate/part-m-00001
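If you would rather have the table back as a single local file instead of ten splits, getmerge will concatenate the parts (the local target path here is just an example):
hdfs dfs -getmerge /home/dimDate /tmp/dimDate.txt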
As for incremental imports: after some digging, it seems they simply can't be done yet. See this reply from the developers:
Sqoop 2 currently can't do incremental imports, however implementing this feature is definitely in our plan!
So for now the only option is to wait.
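One manual workaround, if it fits your case: leave Table name empty when creating the job and put a filtered query into the Table SQL statement prompt instead. Sqoop 2 expects the ${CONDITIONS} placeholder in such a statement, and the DateKey filter below is only an illustration that you would have to advance by hand between runs:
Table SQL statement: SELECT * FROM dbo.dimDate WHERE DateKey > 20140101 AND ${CONDITIONS}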
Checking /home/sqoop-1.99.3/@LOGDIR@/sqoop.log turned up an error:
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From localhost.admin/127.0.0.1 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
    at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
    ... 12 more
This means one of the Hadoop 2.x service addresses hasn't been wired up yet: port 10020 is the MapReduce JobHistory server, so it needs to be configured in mapred-site.xml:
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>master:10020</value>
</property>
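After adding the property, make sure the JobHistory server is actually running, then restart the Sqoop server so the change is picked up; the command below assumes a standard Hadoop 2.x sbin layout:
sbin/mr-jobhistory-daemon.sh start historyserver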