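AntDB 3.1 introduces the Cluster Plan. Unlike the original PGXC execution plan, it uses Reduce Plan nodes to redistribute data dynamically at run time, which optimizes plans that PGXC could not push down to the Datanodes. The execution load is spread across the Datanode nodes, which both relieves the performance pressure on the Coordinator node and improves SQL execution efficiency.
Depending on the actual tables involved, a Cluster Plan is not necessarily the optimal execution plan. This article lists only a few cases to show how the Cluster Plan differs from the Remote Query Plan. For each query below, the first QUERY PLAN output is the Cluster Plan (it contains Cluster Gather and Cluster Reduce nodes) and the second is the Remote Query Plan (it contains Data Node Scan nodes).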
SELECT unique2
    , (
        SELECT COUNT(1)
        FROM onek
        WHERE odd > 10
    ) AS cnt
FROM tenk1;
QUERY PLAN
--------------------------------------------------------------------------------------------
Cluster Gather (cost=974.46..4089.21 rows=10000 width=12)
Plan id: 0
-> Seq Scan on tenk1 (cost=0.00..114.75 rows=2500 width=12)
Plan id: 1
InitPlan 1 (returns $0)
-> Cluster Reduce (cost=10.25..11.76 rows=1 width=8)
Plan id: 2
-> Finalize Aggregate (cost=18.51..18.52 rows=1 width=8)
Plan id: 3
-> Cluster Reduce (cost=17.89..18.50 rows=4 width=8)
Plan id: 4
-> Partial Aggregate (cost=21.11..21.12 rows=1 width=8)
Plan id: 5
-> Seq Scan on onek (cost=0.00..20.88 rows=95 width=0)
Plan id: 6
Filter: (odd > 10)
(16 rows)
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Result (cost=974.46..10433.46 rows=10000 width=12)
Plan id: 0
InitPlan 1 (returns $0)
-> Aggregate (cost=974.45..974.46 rows=1 width=8)
Plan id: 2
-> Data Node Scan on onek "_REMOTE_TABLE_QUERY__1" (cost=0.00..973.50 rows=379 width=0)
Plan id: 3
Node/s: dn1, dn2, dn3, dn4
-> Data Node Scan on tenk1 "_REMOTE_TABLE_QUERY_" (cost=0.00..9359.00 rows=10000 width=4)
Plan id: 1
Node/s: dn1, dn2, dn3, dn4
(11 rows)
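The two plans above differ in where the COUNT is computed: the Cluster Plan shows Partial Aggregate, Cluster Reduce, and Finalize Aggregate nodes, while the Remote Query Plan shows a single Aggregate above a Data Node Scan. The following is a minimal Python sketch of that difference, not AntDB code. The four node names come from the plans; the synthetic rows, the round-robin sharding, and the helper names `make_shards`, `coordinator_count`, and `two_phase_count` are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
Shards = Dict[str, List[Row]]


def make_shards(n_rows: int, nodes: List[str]) -> Shards:
    """Spread synthetic rows round-robin over the nodes (illustrative data only)."""
    shards: Shards = {name: [] for name in nodes}
    for i in range(n_rows):
        shards[nodes[i % len(nodes)]].append({"odd": i % 100})
    return shards


def coordinator_count(shards: Shards, pred: Callable[[Row], bool]) -> Tuple[int, int]:
    """Remote-query style: matching rows are sent to one place and counted there.

    Returns (count, number of rows shipped off the nodes).
    """
    shipped = [row for rows in shards.values() for row in rows if pred(row)]
    return len(shipped), len(shipped)


def two_phase_count(shards: Shards, pred: Callable[[Row], bool]) -> Tuple[int, int]:
    """Cluster-plan style: each node computes a partial count, then the partials are summed.

    Returns (count, number of rows shipped off the nodes).
    """
    partials = [sum(1 for row in rows if pred(row)) for rows in shards.values()]
    return sum(partials), len(partials)


if __name__ == "__main__":
    shards = make_shards(1000, ["dn1", "dn2", "dn3", "dn4"])
    odd_gt_10 = lambda row: row["odd"] > 10
    print("coordinator:", coordinator_count(shards, odd_gt_10))
    print("two-phase:  ", two_phase_count(shards, odd_gt_10))
```

Both functions return the same count; only the number of rows that leave the nodes differs, which is the effect the `rows=4` estimate on the inner Cluster Reduce suggests.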
SELECT unique2
    , (
        SELECT COUNT(1)
        FROM onek
        WHERE odd > 10
    ) AS cnt
FROM tenk1
WHERE even NOT IN (
    SELECT COUNT(1)
    FROM tenk2
    WHERE even < 100
);
QUERY PLAN
--------------------------------------------------------------------------------------------
Cluster Gather (cost=975.64..7458.39 rows=20000 width=12)
Plan id: 0
-> Seq Scan on tenk1 (cost=1.18..483.93 rows=5000 width=12)
Plan id: 1
Filter: (NOT (hashed SubPlan 2))
InitPlan 1 (returns $0)
-> Cluster Reduce (cost=10.25..11.76 rows=1 width=8)
Plan id: 2
-> Finalize Aggregate (cost=18.51..18.52 rows=1 width=8)
Plan id: 3
-> Cluster Reduce (cost=17.89..18.50 rows=4 width=8)
Plan id: 4
-> Partial Aggregate (cost=21.11..21.12 rows=1 width=8)
Plan id: 5
-> Seq Scan on onek (cost=0.00..20.88 rows=95 width=0)
Plan id: 6
Filter: (odd > 10)
SubPlan 2
-> Cluster Reduce (cost=3.21..4.71 rows=1 width=8)
Plan id: 7
-> Finalize Aggregate (cost=4.42..4.43 rows=1 width=8)
Plan id: 8
-> Cluster Reduce (cost=3.80..4.41 rows=4 width=8)
Plan id: 9
-> Partial Aggregate (cost=3.50..3.51 rows=1 width=8)
Plan id: 10
-> Seq Scan on tenk2 (cost=0.00..3.44 rows=25 width=0)
Plan id: 11
Filter: (even < 100)
(29 rows)
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Result (cost=974.46..38605.46 rows=20000 width=12)
Plan id: 0
InitPlan 1 (returns $0)
-> Aggregate (cost=974.45..974.46 rows=1 width=8)
Plan id: 2
-> Data Node Scan on onek "_REMOTE_TABLE_QUERY__1" (cost=0.00..973.50 rows=379 width=0)
Plan id: 3
Node/s: dn1, dn2, dn3, dn4
-> Data Node Scan on tenk1 "_REMOTE_TABLE_QUERY_" (cost=0.00..37431.00 rows=20000 width=4)
Plan id: 1
Node/s: dn1, dn2, dn3, dn4
Coordinator quals: (NOT (hashed SubPlan 2))
SubPlan 2
-> Aggregate (cost=281.00..281.01 rows=1 width=8)
Plan id: 4
-> Data Node Scan on tenk2 "_REMOTE_TABLE_QUERY__2" (cost=0.00..280.75 rows=100 width=0)
Plan id: 5
Node/s: dn1, dn2, dn3, dn4
(18 rows)
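In this pair of plans the `NOT (hashed SubPlan 2)` condition appears as a `Filter` on the Seq Scan in the Cluster Plan, but as `Coordinator quals` in the Remote Query Plan. A rough Python sketch of what that placement means for data movement follows. It is an illustration under stated assumptions, not AntDB's implementation: the shard contents and the function `scan_with_filter` are hypothetical, and NULL handling of `NOT IN` is ignored.

```python
from typing import Dict, List, Set, Tuple


def scan_with_filter(
    shards: Dict[str, List[int]],
    excluded: Set[int],
    filter_on_datanode: bool,
) -> Tuple[List[int], int]:
    """Return (sorted surviving values, rows shipped off the nodes).

    filter_on_datanode=True  models a filter evaluated next to the scan.
    filter_on_datanode=False models a filter evaluated after all rows are shipped.
    """
    result: List[int] = []
    shipped = 0
    for rows in shards.values():
        if filter_on_datanode:
            local = [v for v in rows if v not in excluded]
            shipped += len(local)      # only surviving rows leave the node
            result.extend(local)
        else:
            shipped += len(rows)       # every row leaves the node first
            result.extend(v for v in rows if v not in excluded)
    return sorted(result), shipped


if __name__ == "__main__":
    # Hypothetical values of the "even" column on each node.
    shards = {"dn1": [0, 2, 4], "dn2": [2, 2, 6], "dn3": [8, 2], "dn4": [2, 10]}
    print(scan_with_filter(shards, {2}, filter_on_datanode=True))
    print(scan_with_filter(shards, {2}, filter_on_datanode=False))
```

The result set is identical in both modes. How many rows are saved depends entirely on how selective the filter is; a `NOT IN` against a single value may remove very few rows.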
SELECT COUNT(1) AS c
FROM (
    WITH t_tenk2 AS (
        SELECT *
        FROM tenk2
        WHERE even < 100
    )
    SELECT *
    FROM (
        WITH t1 AS (
            SELECT *
            FROM onek
            WHERE odd < 1000
        )
        SELECT *
        FROM t1
            LEFT JOIN tenk1 t2 ON 1 = 1
        WHERE t1.unique1 = t2.unique1
            AND t1.unique2 IN (
                SELECT COUNT(1)
                FROM t_tenk2
                WHERE unique2 < 100
            )
    ) mm
) nn;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Cluster Gather (cost=1655.02..1655.33 rows=1 width=8)
Plan id: 0
-> Aggregate (cost=655.02..655.03 rows=1 width=8)
Plan id: 1
-> Hash Join (cost=565.77..610.02 rows=2000 width=488)
Plan id: 4
Hash Cond: (t1.unique2 = (count(1)))
CTE t_tenk2
-> Seq Scan on tenk2 (cost=0.00..118.00 rows=1260 width=244)
Plan id: 16
Filter: (even < 100)
CTE t1
-> Seq Scan on onek (cost=0.00..20.88 rows=250 width=244)
Plan id: 17
Filter: (odd < 1000)
-> Cluster Reduce (cost=467.20..496.25 rows=4000 width=4)
Plan id: 5
-> Hash Join (cost=582.75..607.06 rows=1000 width=4)
Plan id: 6
Hash Cond: (t1.unique1 = t2.unique1)
-> CTE Scan on t1 (cost=0.00..20.88 rows=250 width=8)
Plan id: 7
-> Hash (cost=457.75..457.75 rows=10000 width=4)
Plan id: 8
-> Seq Scan on tenk1 t2 (cost=0.00..457.75 rows=10000 width=4)
Plan id: 9
-> Hash (cost=98.56..98.56 rows=1 width=8)
Plan id: 10
-> Finalize Aggregate (cost=98.54..98.55 rows=1 width=8)
Plan id: 12
-> Cluster Reduce (cost=97.92..98.53 rows=4 width=8)
Plan id: 13
-> Partial Aggregate (cost=121.15..121.16 rows=1 width=8)
Plan id: 14
-> CTE Scan on t_tenk2 (cost=0.00..118.00 rows=1260 width=0)
Plan id: 15
Filter: (unique2 < 100)
(37 rows)
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Aggregate (cost=48486.73..48486.74 rows=1 width=8)
Plan id: 0
-> Hash Join (cost=39022.18..39069.73 rows=2000 width=488)
Plan id: 3
Hash Cond: (t1.unique1 = unique1)
CTE t_tenk2
-> Data Node Scan on tenk2 "_REMOTE_TABLE_QUERY_" (cost=0.00..9372.00 rows=5042 width=244)
Plan id: 12
Node/s: dn1, dn2, dn3, dn4
CTE t1
-> Data Node Scan on onek "_REMOTE_TABLE_QUERY_" (cost=0.00..973.50 rows=1000 width=244)
Plan id: 13
Node/s: dn1, dn2, dn3, dn4
-> Hash Join (cost=117.68..141.48 rows=500 width=4)
Plan id: 4
Hash Cond: (t1.unique2 = (count(1)))
-> CTE Scan on t1 (cost=0.00..20.00 rows=1000 width=8)
Plan id: 5
-> Hash (cost=117.67..117.67 rows=1 width=8)
Plan id: 6
-> Aggregate (cost=117.65..117.66 rows=1 width=8)
Plan id: 8
-> CTE Scan on t_tenk2 (cost=0.00..113.44 rows=1681 width=0)
Plan id: 9
Filter: (unique2 < 100)
-> Hash (cost=37431.00..37431.00 rows=40000 width=4)
Plan id: 10
-> Data Node Scan on tenk1 "_REMOTE_TABLE_QUERY_" (cost=0.00..37431.00 rows=40000 width=4)
Plan id: 11
Node/s: dn1, dn2, dn3, dn4
(30 rows)
SELECT COUNT(1) AS c
FROM (
    WITH t_tenk2 AS (
        SELECT *
        FROM tenk2
        WHERE even < 100
    )
    SELECT *
    FROM (
        WITH t1 AS (
            SELECT *
            FROM onek
            WHERE odd < 1000
        )
        SELECT *
        FROM t1
            LEFT JOIN tenk1 t2 ON 1 = 1
        WHERE t1.unique1 = t2.unique1
            AND t1.unique2 NOT IN (
                SELECT COUNT(1)
                FROM t_tenk2
                WHERE unique2 < 100
            )
    ) mm
) nn;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=653.28..653.29 rows=1 width=8)
Plan id: 0
-> Cluster Gather (cost=652.06..653.27 rows=4 width=8)
Plan id: 1
-> Partial Aggregate (cost=652.06..652.07 rows=1 width=8)
Plan id: 2
-> Hash Join (cost=582.75..607.06 rows=500 width=488)
Plan id: 5
Hash Cond: (t1.unique1 = t2.unique1)
CTE t_tenk2
-> Seq Scan on tenk2 (cost=0.00..118.00 rows=1260 width=244)
Plan id: 9
Filter: (even < 100)
CTE t1
-> Seq Scan on onek (cost=0.00..20.88 rows=250 width=244)
Plan id: 10
Filter: (odd < 1000)
-> CTE Scan on t1 (cost=0.00..20.88 rows=250 width=4)
Plan id: 6
Filter: (NOT (hashed SubPlan 3))
SubPlan 3
-> Cluster Reduce (cost=50.27..51.77 rows=1 width=8)
Plan id: 11
-> Finalize Aggregate (cost=98.54..98.55 rows=1 width=8)
Plan id: 12
-> Cluster Reduce (cost=97.92..98.53 rows=4 width=8)
Plan id: 13
-> Partial Aggregate (cost=121.15..121.16 rows=1 width=8)
Plan id: 14
-> CTE Scan on t_tenk2 (cost=0.00..118.00 rows=1260 width=0)
Plan id: 15
Filter: (unique2 < 100)
-> Hash (cost=457.75..457.75 rows=10000 width=4)
Plan id: 7
-> Seq Scan on tenk1 t2 (cost=0.00..457.75 rows=10000 width=4)
Plan id: 8
(36 rows)
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Aggregate (cost=48365.25..48365.26 rows=1 width=8)
Plan id: 0
-> Hash Join (cost=38904.50..38948.25 rows=2000 width=488)
Plan id: 3
Hash Cond: (t1.unique1 = unique1)
CTE t_tenk2
-> Data Node Scan on tenk2 "_REMOTE_TABLE_QUERY_" (cost=0.00..9372.00 rows=5042 width=244)
Plan id: 7
Node/s: dn1, dn2, dn3, dn4
CTE t1
-> Data Node Scan on onek "_REMOTE_TABLE_QUERY_" (cost=0.00..973.50 rows=1000 width=244)
Plan id: 8
Node/s: dn1, dn2, dn3, dn4
-> CTE Scan on t1 (cost=0.00..20.00 rows=500 width=4)
Plan id: 4
Filter: (NOT (hashed SubPlan 3))
SubPlan 3
-> Aggregate (cost=117.65..117.66 rows=1 width=8)
Plan id: 9
-> CTE Scan on t_tenk2 (cost=0.00..113.44 rows=1681 width=0)
Plan id: 10
Filter: (unique2 < 100)
-> Hash (cost=37431.00..37431.00 rows=40000 width=4)
Plan id: 5
-> Data Node Scan on tenk1 "_REMOTE_TABLE_QUERY_" (cost=0.00..37431.00 rows=40000 width=4)
Plan id: 6
Node/s: dn1, dn2, dn3, dn4
(27 rows)
SELECT COUNT(1) AS c
FROM (
    WITH t_onek AS (
        SELECT *
        FROM onek
        WHERE even < 100
    )
    SELECT *, 'xxx' AS x
    FROM (
        WITH t1 AS (
            SELECT *
            FROM tenk1
            WHERE odd < 1000
        ),
        t2 AS (
            SELECT *
            FROM tenk2
            WHERE odd > 1000
                AND odd < 2000
        )
        SELECT *
        FROM t1
            LEFT JOIN t2 ON 1 = 1
        WHERE t1.unique1 = t2.unique1
            AND t1.unique2 IN (
                SELECT COUNT(1)
                FROM t_onek
                WHERE unique2 < 100
            )
    ) mm
) nn;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Cluster Gather (cost=1534.06..1534.37 rows=1 width=8)
Plan id: 0
-> Aggregate (cost=534.06..534.07 rows=1 width=8)
Plan id: 1
-> Subquery Scan on mm (cost=19.85..532.81 rows=100 width=520)
Plan id: 3
CTE t_onek
-> Seq Scan on onek (cost=0.00..20.88 rows=250 width=244)
Plan id: 15
Filter: (even < 100)
-> Hash Join (cost=19.85..531.81 rows=100 width=488)
Plan id: 4
Hash Cond: (t1.unique2 = (count(1)))
CTE t1
-> Seq Scan on tenk1 (cost=0.00..482.75 rows=10000 width=244)
Plan id: 16
Filter: (odd < 1000)
CTE t2
-> Seq Scan on tenk2 (cost=0.00..124.25 rows=0 width=244)
Plan id: 17
Filter: ((odd > 1000) AND (odd < 2000))
-> Cluster Reduce (cost=1.00..512.20 rows=200 width=4)
Plan id: 5
-> Nested Loop (cost=0.00..638.25 rows=50 width=4)
Plan id: 6
Join Filter: (t1.unique1 = t2.unique1)
-> CTE Scan on t2 (cost=0.00..124.25 rows=0 width=4)
Plan id: 7
-> CTE Scan on t1 (cost=0.00..482.75 rows=10000 width=8)
Plan id: 8
-> Hash (cost=18.84..18.84 rows=1 width=8)
Plan id: 9
-> Finalize Aggregate (cost=18.82..18.83 rows=1 width=8)
Plan id: 11
-> Cluster Reduce (cost=18.20..18.81 rows=4 width=8)
Plan id: 12
-> Partial Aggregate (cost=21.50..21.51 rows=1 width=8)
Plan id: 13
-> CTE Scan on t_onek (cost=0.00..20.88 rows=250 width=0)
Plan id: 14
Filter: (unique2 < 100)
(41 rows)
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Aggregate (cost=48879.91..48879.92 rows=1 width=8)
Plan id: 0
-> Subquery Scan on mm (cost=47924.90..48878.66 rows=100 width=520)
Plan id: 2
CTE t_onek
-> Data Node Scan on onek "_REMOTE_TABLE_QUERY_" (cost=0.00..973.50 rows=1000 width=244)
Plan id: 12
Node/s: dn1, dn2, dn3, dn4
-> Hash Join (cost=46951.40..47904.16 rows=100 width=488)
Plan id: 3
Hash Cond: (t1.unique2 = (count(1)))
CTE t1
-> Data Node Scan on tenk1 "_REMOTE_TABLE_QUERY_" (cost=0.00..37531.00 rows=40000 width=244)
Plan id: 13
Node/s: dn1, dn2, dn3, dn4
CTE t2
-> Data Node Scan on tenk2 "_REMOTE_TABLE_QUERY_" (cost=0.00..9397.00 rows=1 width=244)
Plan id: 14
Node/s: dn1, dn2, dn3, dn4
-> Hash Join (cost=0.03..952.03 rows=200 width=4)
Plan id: 4
Hash Cond: (t1.unique1 = t2.unique1)
-> CTE Scan on t1 (cost=0.00..800.00 rows=40000 width=8)
Plan id: 5
-> Hash (cost=0.02..0.02 rows=1 width=4)
Plan id: 6
-> CTE Scan on t2 (cost=0.00..0.02 rows=1 width=4)
Plan id: 7
-> Hash (cost=23.35..23.35 rows=1 width=8)
Plan id: 8
-> Aggregate (cost=23.33..23.34 rows=1 width=8)
Plan id: 10
-> CTE Scan on t_onek (cost=0.00..22.50 rows=333 width=0)
Plan id: 11
Filter: (unique2 < 100)
(35 rows)
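To compare the five plan pairs at a glance, the estimated total cost can be read from the root node of each plan. The Python sketch below does this for the root lines copied from the outputs above; the helper names `total_cost` and `cost_ratios` are hypothetical. Note the assumptions: these are planner estimates, not measured run times, and since a Cluster Plan is not always the best choice, the ratios are only indicative.

```python
import re
from typing import List, Tuple

# Matches the "cost=startup..total" part of an EXPLAIN node line.
COST_RE = re.compile(r"cost=(\d+(?:\.\d+)?)\.\.(\d+(?:\.\d+)?)")


def total_cost(plan_line: str) -> float:
    """Extract the estimated total cost (the number after '..') from a plan line."""
    match = COST_RE.search(plan_line)
    if match is None:
        raise ValueError(f"no cost found in: {plan_line!r}")
    return float(match.group(2))


# (Cluster Plan root, Remote Query Plan root) for each query, in document order.
PLAN_ROOTS: List[Tuple[str, str]] = [
    ("Cluster Gather (cost=974.46..4089.21 rows=10000 width=12)",
     "Result (cost=974.46..10433.46 rows=10000 width=12)"),
    ("Cluster Gather (cost=975.64..7458.39 rows=20000 width=12)",
     "Result (cost=974.46..38605.46 rows=20000 width=12)"),
    ("Cluster Gather (cost=1655.02..1655.33 rows=1 width=8)",
     "Aggregate (cost=48486.73..48486.74 rows=1 width=8)"),
    ("Finalize Aggregate (cost=653.28..653.29 rows=1 width=8)",
     "Aggregate (cost=48365.25..48365.26 rows=1 width=8)"),
    ("Cluster Gather (cost=1534.06..1534.37 rows=1 width=8)",
     "Aggregate (cost=48879.91..48879.92 rows=1 width=8)"),
]


def cost_ratios(pairs: List[Tuple[str, str]]) -> List[float]:
    """Remote Query Plan total cost divided by Cluster Plan total cost, per query."""
    return [total_cost(remote) / total_cost(cluster) for cluster, remote in pairs]


if __name__ == "__main__":
    for number, ratio in enumerate(cost_ratios(PLAN_ROOTS), start=1):
        print(f"query {number}: remote/cluster estimated cost ratio = {ratio:.1f}")
```

In every pair listed here the Cluster Plan has the lower estimated total cost; whether that carries over to actual run time depends on the data and the cluster.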