PostgreSQL在何处处理 sql查询之二十七

接前面,仔细看这个 :这 add_base_rels_to_query 是个递归调用嘛。

想像一下: select  * from tst01  where id in (select sid from tst02)  or id in (select sid from tst03) 之类的,此函数将层层深入,构造一个二叉树式样的语法树。

void

add_base_rels_to_query(PlannerInfo *root, Node *jtnode)

{

    if (jtnode == NULL)

        return;

    if (IsA(jtnode, RangeTblRef))

    {

        int    varno = ((RangeTblRef *) jtnode)->rtindex;

        (void) build_simple_rel(root, varno, RELOPT_BASEREL);

    }

    else if (IsA(jtnode, FromExpr))

    {

        FromExpr   *f = (FromExpr *) jtnode;

        ListCell   *l;



        foreach(l, f->fromlist)

            add_base_rels_to_query(root, lfirst(l));

    }

    else if (IsA(jtnode, JoinExpr))

    {

        JoinExpr   *j = (JoinExpr *) jtnode;



        add_base_rels_to_query(root, j->larg);

        add_base_rels_to_query(root, j->rarg);

    }

    else

        elog(ERROR, "unrecognized node type: %d",

             (int) nodeTag(jtnode));

}

 这个时候,我想到一个问题,在PostgreSQL中,有没有并行查询的可能呢?

那么,再上溯到更高的层面,再次梳理一下:

exec_simple_query中调用 了 plan_tree_list,从语法分析树得到了计划树...执行 PortalRun。

而上面的 add_base_rels_to_query 是plan_tree_list 里的一部分。

调用关系如下:

plan_tree_list -->pg_plan_queries-->pg_plan_query-->planner

planner-->subquery_planner-->groupingplanner

groupingplanner-->query_planner-->add_base_rels_to_query

/*

  * exec_simple_query

  *

  * Execute a "simple Query" protocol message.

  */

 static void

 exec_simple_query(const char *query_string)

 {

     ...

     isTopLevel = (list_length(parsetree_list) == 1);

     ...

     foreach(parsetree_item, parsetree_list)

     {

         ...

         querytree_list = pg_analyze_and_rewrite(parsetree, query_string,

                                                 NULL, 0);

 

         plantree_list = pg_plan_queries(querytree_list, 0, NULL);

         ...

 

         /*

          * Create unnamed portal to run the query or queries in. If there

          * already is one, silently drop it.

          */

         portal = CreatePortal("", true, true);

         /* Don't display the portal in pg_cursors */

         portal->visible = false;

 

         /*

          * We don't have to copy anything into the portal, because everything

          * we are passing here is in MessageContext, which will outlive the

          * portal anyway.

          */

         PortalDefineQuery(portal,

                           NULL,

                           query_string,

                           commandTag,

                           plantree_list,

                           NULL);

 

         /*

          * Start the portal.

          *

          * If we took a snapshot for parsing/planning, the portal may be able

          * to reuse it for the execution phase.  Currently, this will only

          * happen in PORTAL_ONE_SELECT mode.  But even if PortalStart doesn't

          * end up being able to do this, keeping the parse/plan snapshot

          * around until after we start the portal doesn't cost much.

          */

         PortalStart(portal, NULL, 0, snapshot_set);

         ...

 

         /*

          * Run the portal to completion, and then drop it (and the receiver).

          */

         (void) PortalRun(portal,

                          FETCH_ALL,

                          isTopLevel,

                          receiver,

                          receiver,

                          completionTag);

 

         (*receiver->rDestroy) (receiver);

 

         PortalDrop(portal, false);

 

         ...

     }                            /* end loop over parsetrees */

 

     /*

      * Close down transaction statement, if one is open.

      */

     finish_xact_command();

     ...

 }

 加入调试信息,看看一个带子查询的SQL,能否有并行查询的可能。

[postgres@lex pgsql]$ ./bin/psql

psql (9.2.1)

Type "help" for help.



postgres=# select * from tst01 where id IN (select sid from tst02) or id IN (select sid from tst03);

 id  

-----

 100

 200

(2 rows)



postgres=# 

看后台的信息,知道planner调用了三次,PortalRun只调用了一次。

[postgres@lex pgsql]$ ./bin/pg_ctl -D ./data start

server starting

[postgres@lex pgsql]$ LOG:  database system was shut down at 2013-05-29 10:12:02 CST

LOG:  autovacuum launcher started

LOG:  database system is ready to accept connections

In query_planner.... by process 11796

In get_relation_info ...oid of table is: 16387 ...by process 11796

In query_planner.... by process 11796

In query_planner.... by process 11796

In get_relation_info ...oid of table is: 16390 ...by process 11796

In query_planner.... by process 11796

In query_planner.... by process 11796

In get_relation_info ...oid of table is: 16384 ...by process 11796

In query_planner.... by process 11796

In PortalRun ... by 11796 

也就是说,PostgreSQL 没有子查询的并行查询能力了?!

至少目前,它还非常地不完整:

http://postgresql.1045698.n5.nabble.com/Parallel-query-execution-td5740431.html

http://wiki.postgresql.org/wiki/Parallel_Query_Execution

各种集群方案也作不到这点:

http://www.postgresql.org/docs/current/static/different-replication-solutions.html

Many of the above solutions allow multiple servers to handle multiple queries, but none allow a single query to use multiple servers to complete faster.

对比下Oracle的:

http://miracle.blog.51cto.com/255044/147058/

你可能感兴趣的:(PostgreSQL)