Hadoop study notes: a simple Hive example

hive> create table dumprecord (line string);
OK
Time taken: 3.813 seconds
hive> load data local inpath '/home/userkkk/dump20gfile/DumpFileDemo.out'
    > overwrite into table dumprecord;
Copying data from file:/home/userkkk/dump20gfile/DumpFileDemo.out
Copying file: file:/home/userkkk/dump20gfile/DumpFileDemo.out
Loading data to table default.dumprecord
Deleted file:/user/hive/warehouse/dumprecord
OK
Time taken: 11.331 seconds
hive> ! wc -l /user/hive/warehouse/dumprecord/DumpFileDemo.out;
26396370 /user/hive/warehouse/dumprecord/DumpFileDemo.out
hive> select count(*) from dumprecord;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Execution log at: /tmp/root/root_20120313234141_497d50d6-f993-4db3-b550-4c4b5650ddeb.log
Job running in-process (local Hadoop)
2012-03-13 23:41:38,801 null map = 0%,  reduce = 0%
2012-03-13 23:42:00,855 null map = 100%,  reduce = 100%
Ended Job = job_local_0001
OK
26396370
Time taken: 25.635 seconds
[root@vm-6d71-fcfa hadoop]# grep 'The automatic failover chain feature does not currently work when using multiple masters.' /user/hive/warehouse/dumprecord/DumpFileDemo.out | wc -l
225225
hive> select count(*) from dumprecord where line like '%The automatic failover chain feature does not currently work when using multiple masters.%';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Execution log at: /tmp/root/root_20120313234444_c2752641-4083-4dd6-9e47-830f1f4bf26c.log
Job running in-process (local Hadoop)
2012-03-13 23:44:49,518 null map = 0%,  reduce = 0%
2012-03-13 23:45:48,664 null map = 100%,  reduce = 100%
Ended Job = job_local_0001
OK
225225
Time taken: 62.416 seconds
hive>
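
The grep cross-check in the session above treats its pattern as a regular expression, so the trailing '.' matches any character, whereas Hive's LIKE treats '.' literally (only '%' and '_' are wildcards). The two counts agree on this data set, but a stricter comparison could look like the sketch below. It is only an illustration: the temporary sample file and its three lines are made up to stand in for DumpFileDemo.out, and the variable names are hypothetical.

```shell
# Hypothetical three-line sample standing in for DumpFileDemo.out.
sample=$(mktemp)
printf '%s\n' \
  'INFO The automatic failover chain feature does not currently work when using multiple masters.' \
  'INFO something unrelated' \
  'WARN The automatic failover chain feature does not currently work when using multiple masters. Retrying' \
  > "$sample"

pattern='The automatic failover chain feature does not currently work when using multiple masters.'

# -F matches the pattern as a fixed string, so '.' is literal, like in LIKE '%...%'.
# -c prints the number of matching lines, so no pipe to wc -l is needed.
count=$(grep -F -c -- "$pattern" "$sample")
echo "$count"

rm -f "$sample"
```

Against the real file, the same grep -F -c invocation would be pointed at /user/hive/warehouse/dumprecord/DumpFileDemo.out instead of the temporary sample.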
