I tried running EXPLAIN on Q9 from TPC-H; the output is as follows:
hive> explain insert overwrite table q9_product_type_profit
> select> order by nation, o_year desc;
OK
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME orders) o) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME part) p) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME partsupp) ps) (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_TABREF (TOK_TABNAME nation) n) (TOK_TABREF (TOK_TABNAME supplier) s) (= (. (TOK_TABLE_OR_COL n) n_nationkey) (. (TOK_TABLE_OR_COL s) s_nationkey)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL s_suppkey)) (TOK_SELEXPR (TOK_TABLE_OR_COL n_name))))) s1) (TOK_TABREF (TOK_TABNAME lineitem) l) (= (. (TOK_TABLE_OR_COL s1) s_suppkey) (. (TOK_TABLE_OR_COL l) l_suppkey)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL l_suppkey)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_extendedprice)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_discount)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_quantity)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_partkey)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_orderkey)) (TOK_SELEXPR (TOK_TABLE_OR_COL n_name))))) l1) (and (= (. (TOK_TABLE_OR_COL ps) ps_suppkey) (. (TOK_TABLE_OR_COL l1) l_suppkey)) (= (. (TOK_TABLE_OR_COL ps) ps_partkey) (. (TOK_TABLE_OR_COL l1) l_partkey))))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL l_extendedprice)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_discount)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_quantity)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_partkey)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_orderkey)) (TOK_SELEXPR (TOK_TABLE_OR_COL n_name)) (TOK_SELEXPR (TOK_TABLE_OR_COL ps_supplycost))))) l2) (and (like (. (TOK_TABLE_OR_COL p) p_name) '%green%') (= (. (TOK_TABLE_OR_COL p) p_partkey) (. 
(TOK_TABLE_OR_COL l2) l_partkey))))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL l_extendedprice)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_discount)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_quantity)) (TOK_SELEXPR (TOK_TABLE_OR_COL l_orderkey)) (TOK_SELEXPR (TOK_TABLE_OR_COL n_name)) (TOK_SELEXPR (TOK_TABLE_OR_COL ps_supplycost))))) l3) (= (. (TOK_TABLE_OR_COL o) o_orderkey) (. (TOK_TABLE_OR_COL l3) l_orderkey)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL n_name) nation) (TOK_SELEXPR (TOK_FUNCTION year (TOK_TABLE_OR_COL o_orderdate)) o_year) (TOK_SELEXPR (- (* (TOK_TABLE_OR_COL l_extendedprice) (- 1 (TOK_TABLE_OR_COL l_discount))) (* (TOK_TABLE_OR_COL ps_supplycost) (TOK_TABLE_OR_COL l_quantity))) amount)))) profit)) (TOK_INSERT (TOK_DESTINATION (TOK_TAB (TOK_TABNAME q9_product_type_profit))) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL nation)) (TOK_SELEXPR (TOK_TABLE_OR_COL o_year)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL amount)) sum_profit)) (TOK_GROUPBY (TOK_TABLE_OR_COL nation) (TOK_TABLE_OR_COL o_year)) (TOK_ORDERBY (TOK_TABSORTCOLNAMEASC (TOK_TABLE_OR_COL nation)) (TOK_TABSORTCOLNAMEDESC (TOK_TABLE_OR_COL o_year)))))
STAGE DEPENDENCIES:
Stage-7 is a root stage
Stage-8 depends on stages: Stage-7
Stage-1 depends on stages: Stage-8
Stage-2 depends on stages: Stage-1
Stage-3 depends on stages: Stage-2
Stage-4 depends on stages: Stage-3
Stage-5 depends on stages: Stage-4
Stage-0 depends on stages: Stage-5
Stage-6 depends on stages: Stage-0
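
The dependency list above forms a single chain, so the stages can only run one after another. As a minimal sketch (not part of Hive; the helper names `parse_dependencies` and `execution_order` are made up for illustration), the following Python reads those lines and derives the run order with a plain topological sort:

```python
import re
from collections import defaultdict, deque

# The STAGE DEPENDENCIES lines, copied from the EXPLAIN output above.
EXPLAIN_DEPENDENCIES = """\
Stage-7 is a root stage
Stage-8 depends on stages: Stage-7
Stage-1 depends on stages: Stage-8
Stage-2 depends on stages: Stage-1
Stage-3 depends on stages: Stage-2
Stage-4 depends on stages: Stage-3
Stage-5 depends on stages: Stage-4
Stage-0 depends on stages: Stage-5
Stage-6 depends on stages: Stage-0
"""


def parse_dependencies(text):
    """Map each stage name to the list of stages it depends on."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        root = re.fullmatch(r"(Stage-\d+) is a root stage", line)
        if root:
            deps[root.group(1)] = []
            continue
        dep = re.fullmatch(r"(Stage-\d+) depends on stages: (.+)", line)
        if dep:
            deps[dep.group(1)] = [s.strip() for s in dep.group(2).split(",")]
    return deps


def execution_order(deps):
    """Return the stages in an order that respects every dependency."""
    remaining = {stage: len(parents) for stage, parents in deps.items()}
    children = defaultdict(list)
    for stage, parents in deps.items():
        for parent in parents:
            children[parent].append(stage)
    ready = deque(stage for stage, count in remaining.items() if count == 0)
    order = []
    while ready:
        stage = ready.popleft()
        order.append(stage)
        for child in children[stage]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    return order


order = execution_order(parse_dependencies(EXPLAIN_DEPENDENCIES))
print(" -> ".join(order))
```

For this plan the order is Stage-7, Stage-8, Stage-1, Stage-2, Stage-3, Stage-4, Stage-5, Stage-0, Stage-6: five join jobs, then the group-by merge, then the sort, then the move into the target table and the stats step.
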
STAGE PLANS:
Stage: Stage-7
Map Reduce
Alias -> Map Operator Tree:
profit:l3:l2:l1:s1:n
TableScan
alias: n
Reduce Output Operator
key expressions:
expr: n_nationkey
type: int
sort order: +
Map-reduce partition columns:
expr: n_nationkey
type: int
tag: 0
value expressions:
expr: n_name
type: string
profit:l3:l2:l1:s1:s
TableScan
alias: s
Reduce Output Operator
key expressions:
expr: s_nationkey
type: int
sort order: +
Map-reduce partition columns:
expr: s_nationkey
type: int
tag: 1
value expressions:
expr: s_suppkey
type: int
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {VALUE._col1}
1 {VALUE._col0}
handleSkewJoin: false
outputColumnNames: _col1, _col6
Select Operator
expressions:
expr: _col6
type: int
expr: _col1
type: string
outputColumnNames: _col0, _col1
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Stage: Stage-8
Map Reduce
Alias -> Map Operator Tree:
$INTNAME
Reduce Output Operator
key expressions:
expr: _col0
type: int
sort order: +
Map-reduce partition columns:
expr: _col0
type: int
tag: 0
value expressions:
expr: _col1
type: string
profit:l3:l2:l1:l
TableScan
alias: l
Reduce Output Operator
key expressions:
expr: l_suppkey
type: int
sort order: +
Map-reduce partition columns:
expr: l_suppkey
type: int
tag: 1
value expressions:
expr: l_orderkey
type: int
expr: l_partkey
type: int
expr: l_suppkey
type: int
expr: l_quantity
type: double
expr: l_extendedprice
type: double
expr: l_discount
type: double
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {VALUE._col1}
1 {VALUE._col0} {VALUE._col1} {VALUE._col2} {VALUE._col4} {VALUE._col5} {VALUE._col6}
handleSkewJoin: false
outputColumnNames: _col1, _col2, _col3, _col4, _col6, _col7, _col8
Select Operator
expressions:
expr: _col4
type: int
expr: _col7
type: double
expr: _col8
type: double
expr: _col6
type: double
expr: _col3
type: int
expr: _col2
type: int
expr: _col1
type: string
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Stage: Stage-1
Map Reduce
Alias -> Map Operator Tree:
$INTNAME
Reduce Output Operator
key expressions:
expr: _col0
type: int
expr: _col4
type: int
sort order: ++
Map-reduce partition columns:
expr: _col0
type: int
expr: _col4
type: int
tag: 1
value expressions:
expr: _col1
type: double
expr: _col2
type: double
expr: _col3
type: double
expr: _col4
type: int
expr: _col5
type: int
expr: _col6
type: string
profit:l3:l2:ps
TableScan
alias: ps
Reduce Output Operator
key expressions:
expr: ps_suppkey
type: int
expr: ps_partkey
type: int
sort order: ++
Map-reduce partition columns:
expr: ps_suppkey
type: int
expr: ps_partkey
type: int
tag: 0
value expressions:
expr: ps_supplycost
type: double
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {VALUE._col3}
1 {VALUE._col1} {VALUE._col2} {VALUE._col3} {VALUE._col4} {VALUE._col5} {VALUE._col6}
handleSkewJoin: false
outputColumnNames: _col3, _col8, _col9, _col10, _col11, _col12, _col13
Select Operator
expressions:
expr: _col8
type: double
expr: _col9
type: double
expr: _col10
type: double
expr: _col11
type: int
expr: _col12
type: int
expr: _col13
type: string
expr: _col3
type: double
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Stage: Stage-2
Map Reduce
Alias -> Map Operator Tree:
$INTNAME
Reduce Output Operator
key expressions:
expr: _col3
type: int
sort order: +
Map-reduce partition columns:
expr: _col3
type: int
tag: 1
value expressions:
expr: _col0
type: double
expr: _col1
type: double
expr: _col2
type: double
expr: _col4
type: int
expr: _col5
type: string
expr: _col6
type: double
profit:l3:p
TableScan
alias: p
Filter Operator
predicate:
expr: (p_name like '%green%')
type: boolean
Reduce Output Operator
key expressions:
expr: p_partkey
type: int
sort order: +
Map-reduce partition columns:
expr: p_partkey
type: int
tag: 0
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0
1 {VALUE._col0} {VALUE._col1} {VALUE._col2} {VALUE._col4} {VALUE._col5} {VALUE._col6}
handleSkewJoin: false
outputColumnNames: _col11, _col12, _col13, _col15, _col16, _col17
Select Operator
expressions:
expr: _col11
type: double
expr: _col12
type: double
expr: _col13
type: double
expr: _col15
type: int
expr: _col16
type: string
expr: _col17
type: double
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Stage: Stage-3
Map Reduce
Alias -> Map Operator Tree:
$INTNAME
Reduce Output Operator
key expressions:
expr: _col3
type: int
sort order: +
Map-reduce partition columns:
expr: _col3
type: int
tag: 1
value expressions:
expr: _col0
type: double
expr: _col1
type: double
expr: _col2
type: double
expr: _col4
type: string
expr: _col5
type: double
profit:o
TableScan
alias: o
Reduce Output Operator
key expressions:
expr: o_orderkey
type: int
sort order: +
Map-reduce partition columns:
expr: o_orderkey
type: int
tag: 0
value expressions:
expr: o_orderdate
type: string
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {VALUE._col4}
1 {VALUE._col0} {VALUE._col1} {VALUE._col2} {VALUE._col4} {VALUE._col5}
handleSkewJoin: false
outputColumnNames: _col4, _col11, _col12, _col13, _col15, _col16
Select Operator
expressions:
expr: _col15
type: string
expr: year(_col4)
type: int
expr: ((_col11 * (1 - _col12)) - (_col16 * _col13))
type: double
outputColumnNames: _col0, _col1, _col2
Select Operator
expressions:
expr: _col0
type: string
expr: _col1
type: int
expr: _col2
type: double
outputColumnNames: _col0, _col1, _col2
Group By Operator
aggregations:
expr: sum(_col2)
bucketGroup: false
keys:
expr: _col0
type: string
expr: _col1
type: int
mode: hash
outputColumnNames: _col0, _col1, _col2
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Stage: Stage-4
Map Reduce
Alias -> Map Operator Tree:
hdfs://master:9000/tmp/hive-hadoop/hive_2013-11-10_17-59-57_971_4458270102831923970/-mr-10004
Reduce Output Operator
key expressions:
expr: _col0
type: string
expr: _col1
type: int
sort order: ++
Map-reduce partition columns:
expr: _col0
type: string
expr: _col1
type: int
tag: -1
value expressions:
expr: _col2
type: double
Reduce Operator Tree:
Group By Operator
aggregations:
expr: sum(VALUE._col0)
bucketGroup: false
keys:
expr: KEY._col0
type: string
expr: KEY._col1
type: int
mode: mergepartial
outputColumnNames: _col0, _col1, _col2
Select Operator
expressions:
expr: _col0
type: string
expr: _col1
type: int
expr: _col2
type: double
outputColumnNames: _col0, _col1, _col2
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Stage: Stage-5
Map Reduce
Alias -> Map Operator Tree:
hdfs://master:9000/tmp/hive-hadoop/hive_2013-11-10_17-59-57_971_4458270102831923970/-mr-10005
Reduce Output Operator
key expressions:
expr: _col0
type: string
expr: _col1
type: int
sort order: +-
tag: -1
value expressions:
expr: _col0
type: string
expr: _col1
type: int
expr: _col2
type: double
Reduce Operator Tree:
Extract
File Output Operator
compressed: false
GlobalTableId: 1
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
name: default.q9_product_type_profit
Stage: Stage-0
Move Operator
tables:
replace: true
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
name: default.q9_product_type_profit
Stage: Stage-6
Stats-Aggr Operator
Time taken: 0.444 seconds
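
To make the last part of the plan easier to follow, here is a rough Python sketch of what Stage-3 through Stage-5 compute once the joins are done: the per-row expression `(l_extendedprice * (1 - l_discount)) - (ps_supplycost * l_quantity)`, a `sum` grouped by nation and `year(o_orderdate)`, and a final sort with order `+-` (nation ascending, year descending). The sample rows are invented for illustration, and the sketch assumes `o_orderdate` is a `YYYY-MM-DD` string; none of this is Hive code.

```python
from collections import defaultdict

# Hypothetical already-joined rows (made-up values, not TPC-H data):
# (n_name, o_orderdate, l_extendedprice, l_discount, ps_supplycost, l_quantity)
rows = [
    ("FRANCE", "1995-03-01", 1000.0, 0.5, 10.0, 20.0),
    ("FRANCE", "1995-07-15", 400.0, 0.25, 5.0, 10.0),
    ("FRANCE", "1996-01-10", 200.0, 0.0, 2.0, 50.0),
    ("CHINA", "1995-05-05", 800.0, 0.5, 10.0, 10.0),
]


def amount(extendedprice, discount, supplycost, quantity):
    """Per-row profit, as in the Stage-3 Select Operator expression."""
    return extendedprice * (1 - discount) - supplycost * quantity


def profit_by_nation_year(joined_rows):
    """Group by (nation, o_year), sum the amount, then sort as Stage-5 does."""
    totals = defaultdict(float)
    for n_name, o_orderdate, price, discount, cost, quantity in joined_rows:
        o_year = int(o_orderdate[:4])  # assumed stand-in for year(o_orderdate)
        totals[(n_name, o_year)] += amount(price, discount, cost, quantity)
    # sort order "+-": nation ascending, o_year descending
    return sorted(
        ((nation, year, total) for (nation, year), total in totals.items()),
        key=lambda row: (row[0], -row[1]),
    )


result = profit_by_nation_year(rows)
for nation, o_year, sum_profit in result:
    print(nation, o_year, sum_profit)
```

In the real plan the grouping is split in two: Stage-3 does a partial hash aggregation (`mode: hash`) on the reduce side of the last join, and Stage-4 merges those partial sums (`mode: mergepartial`). The sketch collapses both into one dictionary.
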