grunt> cat t.txt
kw1 2
kw3 1
kw2 4
kw1 5
kw2 2
$ cat test.pig
A = LOAD '/user/input/t.txt' as (k:chararray,c:int);
B = group A BY k;
C = foreach B generate group,SUM(A.c);
-- DUMP C;
store C into 'test.output';
$ pig -e 'illustrate -script test.pig'
2014-05-03 17:11:25,182 [main] INFO org.apache.pig.Main - Logging error messages to: /opt/dataset/pig_1399108285179.log
2014-05-03 17:11:25,330 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000
2014-05-03 17:11:25,514 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001
2014-05-03 17:11:26,103 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000
2014-05-03 17:11:26,104 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001
2014-05-03 17:11:26,291 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,305 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,306 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,315 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,330 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,474 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-05-03 17:11:26,475 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-05-03 17:11:26,513 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,520 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,521 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,521 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,522 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,523 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,531 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,534 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,534 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,597 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,599 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,599 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,600 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,601 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,601 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,608 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,611 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,611 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,639 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,641 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,642 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,642 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,643 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,643 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,650 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,677 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,679 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,679 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,680 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,680 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,681 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,686 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,688 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,688 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,710 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,712 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,712 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,713 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,714 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,714 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,721 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,724 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,724 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,744 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,746 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,746 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,747 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,747 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,748 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,754 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,757 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,757 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,772 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,774 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,774 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,775 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,775 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,776 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,782 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,784 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,784 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,804 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,806 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,806 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,807 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,807 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,808 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,812 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,821 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,821 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
(kw1,2)
2014-05-03 17:11:26,840 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,842 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,842 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,842 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,843 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,843 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,846 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,849 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,849 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,862 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,863 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,863 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,864 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,864 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,865 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,868 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,870 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,870 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,882 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,884 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,884 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,884 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,885 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,885 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,887 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,889 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,890 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,901 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,903 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,903 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,903 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,904 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,904 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,906 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,908 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,908 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,919 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,920 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,920 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,921 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,921 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,922 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,924 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,926 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,926 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,937 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,938 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,938 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,938 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,939 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,939 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,941 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,943 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,943 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,954 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,955 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,955 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,956 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,956 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,956 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,959 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,961 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,961 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2014-05-03 17:11:26,973 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:11:26,974 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:11:26,974 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-05-03 17:11:26,974 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AccumulatorOptimizer - Reducer is to run in accumulative mode.
2014-05-03 17:11:26,975 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-05-03 17:11:26,975 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-05-03 17:11:26,978 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-05-03 17:11:26,980 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=30
2014-05-03 17:11:26,980 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
-------------------------------------
| A | k:chararray | c:int |
-------------------------------------
| | kw1 | 2 |
| | kw1 | 5 |
-------------------------------------
-----------------------------------------------------------------------------
| B | group:chararray | A:bag{:tuple(k:chararray,c:int)} |
-----------------------------------------------------------------------------
| | kw1 | {(kw1, 2), (kw1, 5)} |
-----------------------------------------------------------------------------
-----------------------------------------
| C | group:chararray | :long |
-----------------------------------------
| | kw1 | 7 |
-----------------------------------------
-------------------------------------------------
| Store : C | group:chararray | :long |
-------------------------------------------------
| | kw1 | 7 |
-------------------------------------------------
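
The illustrate tables above show only the kw1 sample (2 + 5 = 7). As a cross-check, here is a minimal Python sketch of what test.pig computes over all five rows of t.txt. It is not Pig code; the names ROWS and group_and_sum are hypothetical, and the rows are copied from the `cat t.txt` listing.

```python
from collections import defaultdict

# Rows of t.txt as (k, c) tuples, matching the schema (k:chararray, c:int).
ROWS = [("kw1", 2), ("kw3", 1), ("kw2", 4), ("kw1", 5), ("kw2", 2)]


def group_and_sum(rows):
    """Mirror of: B = group A BY k; C = foreach B generate group, SUM(A.c);"""
    bags = defaultdict(list)
    for k, c in rows:
        # Each group key collects a bag of the original (k, c) tuples.
        bags[k].append((k, c))
    # SUM(A.c) adds up the c field of every tuple in the bag.
    return {k: sum(c for _, c in bag) for k, bag in bags.items()}


if __name__ == "__main__":
    for key, total in sorted(group_and_sum(ROWS).items()):
        print(f"({key},{total})")
```

With the five input rows this yields 7 for kw1, 6 for kw2 and 1 for kw3, which agrees with the kw1 row in the C and Store tables.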
$ pig -e 'explain -script test.pig'
2014-05-03 17:19:59,359 [main] INFO org.apache.pig.Main - Logging error messages to: /opt/dataset/pig_1399108799355.log
2014-05-03 17:19:59,497 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.0.3.142:9000
2014-05-03 17:19:59,685 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.0.3.142:9001
#-----------------------------------------------
# New Logical Plan:
#-----------------------------------------------
C: (Name: LOStore Schema: group#19:chararray,#34:long)
|
|---C: (Name: LOForEach Schema: group#19:chararray,#34:long)
| |
| (Name: LOGenerate[false,false] Schema: group#19:chararray,#34:long)ColumnPrune:InputUids=[19, 30]ColumnPrune:OutputUids=[34, 19]
| | |
| | group:(Name: Project Type: chararray Uid: 19 Input: 0 Column: (*))
| | |
| | (Name: UserFunc(org.apache.pig.builtin.IntSum) Type: long Uid: 34)
| | |
| | |---(Name: Dereference Type: bag Uid: 33 Column:[1])
| | |
| | |---A:(Name: Project Type: bag Uid: 30 Input: 1 Column: (*))
| |
| |---(Name: LOInnerLoad[0] Schema: group#19:chararray)
| |
| |---A: (Name: LOInnerLoad[1] Schema: k#19:chararray,c#20:int)
|
|---B: (Name: LOCogroup Schema: group#19:chararray,A#30:bag{#37:tuple(k#19:chararray,c#20:int)})
| |
| k:(Name: Project Type: chararray Uid: 19 Input: 0 Column: 0)
|
|---A: (Name: LOForEach Schema: k#19:chararray,c#20:int)
| |
| (Name: LOGenerate[false,false] Schema: k#19:chararray,c#20:int)ColumnPrune:InputUids=[19, 20]ColumnPrune:OutputUids=[19, 20]
| | |
| | (Name: Cast Type: chararray Uid: 19)
| | |
| | |---k:(Name: Project Type: bytearray Uid: 19 Input: 0 Column: (*))
| | |
| | (Name: Cast Type: int Uid: 20)
| | |
| | |---c:(Name: Project Type: bytearray Uid: 20 Input: 1 Column: (*))
| |
| |---(Name: LOInnerLoad[0] Schema: k#19:bytearray)
| |
| |---(Name: LOInnerLoad[1] Schema: c#20:bytearray)
|
|---A: (Name: LOLoad Schema: k#19:bytearray,c#20:bytearray)RequiredFields:null
#-----------------------------------------------
# Physical Plan:
#-----------------------------------------------
C: Store(hdfs://namenode:9000/user/deve_test_user/test.output:org.apache.pig.builtin.PigStorage) - scope-19
|
|---C: New For Each(false,false)[bag] - scope-18
| |
| Project[chararray][0] - scope-12
| |
| POUserFunc(org.apache.pig.builtin.IntSum)[long] - scope-16
| |
| |---Project[bag][1] - scope-15
| |
| |---Project[bag][1] - scope-14
|
|---B: Package[tuple]{chararray} - scope-9
|
|---B: Global Rearrange[tuple] - scope-8
|
|---B: Local Rearrange[tuple]{chararray}(false) - scope-10
| |
| Project[chararray][0] - scope-11
|
|---A: New For Each(false,false)[bag] - scope-7
| |
| Cast[chararray] - scope-2
| |
| |---Project[bytearray][0] - scope-1
| |
| Cast[int] - scope-5
| |
| |---Project[bytearray][1] - scope-4
|
|---A: Load(/user/input/t.txt:org.apache.pig.builtin.PigStorage) - scope-0
2014-05-03 17:20:00,316 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-05-03 17:20:00,326 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
2014-05-03 17:20:00,349 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-05-03 17:20:00,349 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-20
Map Plan
B: Local Rearrange[tuple]{chararray}(false) - scope-33
| |
| Project[chararray][0] - scope-34
|
|---C: New For Each(false,false)[bag] - scope-21
| |
| Project[chararray][0] - scope-22
| |
| POUserFunc(org.apache.pig.builtin.IntSum$Initial)[tuple] - scope-23
| |
| |---Project[bag][1] - scope-24
| |
| |---Project[bag][1] - scope-25
|
|---Pre Combiner Local Rearrange[tuple]{Unknown} - scope-35
|
|---A: New For Each(false,false)[bag] - scope-7
| |
| Cast[chararray] - scope-2
| |
| |---Project[bytearray][0] - scope-1
| |
| Cast[int] - scope-5
| |
| |---Project[bytearray][1] - scope-4
|
|---A: Load(/user/input/t.txt:org.apache.pig.builtin.PigStorage) - scope-0--------
Combine Plan
B: Local Rearrange[tuple]{chararray}(false) - scope-37
| |
| Project[chararray][0] - scope-38
|
|---C: New For Each(false,false)[bag] - scope-26
| |
| Project[chararray][0] - scope-27
| |
| POUserFunc(org.apache.pig.builtin.IntSum$Intermediate)[tuple] - scope-28
| |
| |---Project[bag][1] - scope-29
|
|---POCombinerPackage[tuple]{chararray} - scope-31--------
Reduce Plan
C: Store(hdfs://namenode:9000/user/deve_test_user/test.output:org.apache.pig.builtin.PigStorage) - scope-19
|
|---C: New For Each(false,false)[bag] - scope-18
| |
| Project[chararray][0] - scope-12
| |
| POUserFunc(org.apache.pig.builtin.IntSum$Final)[long] - scope-16
| |
| |---Project[bag][1] - scope-30
|
|---POCombinerPackage[tuple]{chararray} - scope-39--------
Global sort: false
----------------
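
The Map, Combine and Reduce plans above call IntSum$Initial, IntSum$Intermediate and IntSum$Final in turn, which is how an algebraic function is split across the combiner. The following Python sketch imitates that three-stage flow. It is only an illustration under stated assumptions: the function names and the two-way split of the input are hypothetical (the real job reads a single 30-byte input path), and it does not reproduce Pig's internal tuple handling.

```python
from collections import defaultdict

ROWS = [("kw1", 2), ("kw3", 1), ("kw2", 4), ("kw1", 5), ("kw2", 2)]


def initial(value):
    # Map side: a single value is already a partial sum of itself.
    return value


def intermediate(partials):
    # Combine side: merge the partial sums seen by one map task.
    return sum(partials)


def final(partials):
    # Reduce side: merge the combined partial sums into the result.
    return sum(partials)


def run(splits):
    combined = []
    for split in splits:  # one iteration per hypothetical map task
        per_key = defaultdict(list)
        for k, c in split:
            per_key[k].append(initial(c))
        combined.append({k: intermediate(v) for k, v in per_key.items()})

    shuffled = defaultdict(list)  # group combiner output by key
    for part in combined:
        for k, partial in part.items():
            shuffled[k].append(partial)
    return {k: final(v) for k, v in shuffled.items()}


if __name__ == "__main__":
    # Hypothetical split of the input across two map tasks.
    print(run([ROWS[:3], ROWS[3:]]))
```

Because addition is associative and commutative, the result does not depend on how the rows are split, which is the property that lets the CombinerOptimizer move the foreach into the combiner.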