PostgreSQL在何处处理 sql查询之五十一

继续分析 query_planner:

/*

 * query_planner

 *      Generate a path (that is, a simplified plan) for a basic query,

 *      which may involve joins but not any fancier features.

 *

 * Since query_planner does not handle the toplevel processing (grouping,

 * sorting, etc) it cannot select the best path by itself.    It selects

 * two paths: the cheapest path that produces all the required tuples,

 * independent of any ordering considerations, and the cheapest path that

 * produces the expected fraction of the required tuples in the required

 * ordering, if there is a path that is cheaper for this than just sorting

 * the output of the cheapest overall path.  The caller (grouping_planner)

 * will make the final decision about which to use.

 *

 * Input parameters:

 * root describes the query to plan

 * tlist is the target list the query should produce

 *        (this is NOT necessarily root->parse->targetList!)

 * tuple_fraction is the fraction of tuples we expect will be retrieved

 * limit_tuples is a hard limit on number of tuples to retrieve,

 *        or -1 if no limit

 *

 * Output parameters:

 * *cheapest_path receives the overall-cheapest path for the query

 * *sorted_path receives the cheapest presorted path for the query,

 *                if any (NULL if there is no useful presorted path)

 * *num_groups receives the estimated number of groups, or 1 if query

 *                does not use grouping

 *

 * Note: the PlannerInfo node also includes a query_pathkeys field, which is

 * both an input and an output of query_planner().    The input value signals

 * query_planner that the indicated sort order is wanted in the final output

 * plan.  But this value has not yet been "canonicalized", since the needed

 * info does not get computed until we scan the qual clauses.  We canonicalize

 * it as soon as that task is done.  (The main reason query_pathkeys is a

 * PlannerInfo field and not a passed parameter is that the low-level routines

 * in indxpath.c need to see it.)

 *

 * Note: the PlannerInfo node includes other pathkeys fields besides

 * query_pathkeys, all of which need to be canonicalized once the info is

 * available.  See canonicalize_all_pathkeys.

 *

 * tuple_fraction is interpreted as follows:

 *      0: expect all tuples to be retrieved (normal case)

 *      0 < tuple_fraction < 1: expect the given fraction of tuples available

 *        from the plan to be retrieved

 *      tuple_fraction >= 1: tuple_fraction is the absolute number of tuples

 *        expected to be retrieved (ie, a LIMIT specification)

 * Note that a nonzero tuple_fraction could come from outer context; it is

 * therefore not redundant with limit_tuples.  We use limit_tuples to determine

 * whether a bounded sort can be used at runtime.

 */

void

query_planner(PlannerInfo *root, List *tlist,

              double tuple_fraction, double limit_tuples,

              Path **cheapest_path, Path **sorted_path,

              double *num_groups)

{

    Query       *parse = root->parse;

    List       *joinlist;

    RelOptInfo *final_rel;

    Path       *cheapestpath;

    Path       *sortedpath;

    Index        rti;

    double        total_pages;



    /* Make tuple_fraction, limit_tuples accessible to lower-level routines */

    root->tuple_fraction = tuple_fraction;

    root->limit_tuples = limit_tuples;



    *num_groups = 1;            /* default result */



    /*

     * If the query has an empty join tree, then it's something easy like

     * "SELECT 2+2;" or "INSERT ... VALUES()".    Fall through quickly.

     */

    if (parse->jointree->fromlist == NIL)

    {

        /* We need a trivial path result */

        *cheapest_path = (Path *)

            create_result_path((List *) parse->jointree->quals);

        *sorted_path = NULL;



        /*

         * We still are required to canonicalize any pathkeys, in case it's

         * something like "SELECT 2+2 ORDER BY 1".

         */

        root->canon_pathkeys = NIL;

        canonicalize_all_pathkeys(root);

        return;

    }



    /*

     * Init planner lists to empty.

     *

     * NOTE: append_rel_list was set up by subquery_planner, so do not touch

     * here; eq_classes and minmax_aggs may contain data already, too.

     */

    root->join_rel_list = NIL;

    root->join_rel_hash = NULL;

    root->join_rel_level = NULL;

    root->join_cur_level = 0;

    root->canon_pathkeys = NIL;

    root->left_join_clauses = NIL;

    root->right_join_clauses = NIL;

    root->full_join_clauses = NIL;

    root->join_info_list = NIL;

    root->placeholder_list = NIL;

    root->initial_rels = NIL;



    /*

     * Make a flattened version of the rangetable for faster access (this is

     * OK because the rangetable won't change any more), and set up an empty

     * array for indexing base relations.

     */

    setup_simple_rel_arrays(root);



    /*

     * Construct RelOptInfo nodes for all base relations in query, and

     * indirectly for all appendrel member relations ("other rels").  This

     * will give us a RelOptInfo for every "simple" (non-join) rel involved in

     * the query.

     *

     * Note: the reason we find the rels by searching the jointree and

     * appendrel list, rather than just scanning the rangetable, is that the

     * rangetable may contain RTEs for rels not actively part of the query,

     * for example views.  We don't want to make RelOptInfos for them.

     */

    add_base_rels_to_query(root, (Node *) parse->jointree);



    /*

     * Examine the targetlist and join tree, adding entries to baserel

     * targetlists for all referenced Vars, and generating PlaceHolderInfo

     * entries for all referenced PlaceHolderVars.    Restrict and join clauses

     * are added to appropriate lists belonging to the mentioned relations. We

     * also build EquivalenceClasses for provably equivalent expressions. The

     * SpecialJoinInfo list is also built to hold information about join order

     * restrictions.  Finally, we form a target joinlist for make_one_rel() to

     * work from.

     */

    build_base_rel_tlists(root, tlist);



    find_placeholders_in_jointree(root);



    joinlist = deconstruct_jointree(root);



    /*

     * Reconsider any postponed outer-join quals now that we have built up

     * equivalence classes.  (This could result in further additions or

     * mergings of classes.)

     */

    reconsider_outer_join_clauses(root);



    /*

     * If we formed any equivalence classes, generate additional restriction

     * clauses as appropriate.    (Implied join clauses are formed on-the-fly

     * later.)

     */

    generate_base_implied_equalities(root);



    /*

     * We have completed merging equivalence sets, so it's now possible to

     * convert previously generated pathkeys (in particular, the requested

     * query_pathkeys) to canonical form.

     */

    canonicalize_all_pathkeys(root);



    /*

     * Examine any "placeholder" expressions generated during subquery pullup.

     * Make sure that the Vars they need are marked as needed at the relevant

     * join level.    This must be done before join removal because it might

     * cause Vars or placeholders to be needed above a join when they weren't

     * so marked before.

     */

    fix_placeholder_input_needed_levels(root);



    /*

     * Remove any useless outer joins.    Ideally this would be done during

     * jointree preprocessing, but the necessary information isn't available

     * until we've built baserel data structures and classified qual clauses.

     */

    joinlist = remove_useless_joins(root, joinlist);



    /*

     * Now distribute "placeholders" to base rels as needed.  This has to be

     * done after join removal because removal could change whether a

     * placeholder is evaluatable at a base rel.

     */

    add_placeholders_to_base_rels(root);



    /*

     * We should now have size estimates for every actual table involved in

     * the query, and we also know which if any have been deleted from the

     * query by join removal; so we can compute total_table_pages.

     *

     * Note that appendrels are not double-counted here, even though we don't

     * bother to distinguish RelOptInfos for appendrel parents, because the

     * parents will still have size zero.

     *

     * XXX if a table is self-joined, we will count it once per appearance,

     * which perhaps is the wrong thing ... but that's not completely clear,

     * and detecting self-joins here is difficult, so ignore it for now.

     */

    total_pages = 0;

    for (rti = 1; rti < root->simple_rel_array_size; rti++)

    {

        RelOptInfo *brel = root->simple_rel_array[rti];



        if (brel == NULL)

            continue;



        Assert(brel->relid == rti);        /* sanity check on array */



        if (brel->reloptkind == RELOPT_BASEREL ||

            brel->reloptkind == RELOPT_OTHER_MEMBER_REL)

            total_pages += (double) brel->pages;

    }

    root->total_table_pages = total_pages;



    /*

     * Ready to do the primary planning.

     */

    final_rel = make_one_rel(root, joinlist);



    if (!final_rel || !final_rel->cheapest_total_path)

        elog(ERROR, "failed to construct the join relation");



    /*

     * If there's grouping going on, estimate the number of result groups. We

     * couldn't do this any earlier because it depends on relation size

     * estimates that were set up above.

     *

     * Then convert tuple_fraction to fractional form if it is absolute, and

     * adjust it based on the knowledge that grouping_planner will be doing

     * grouping or aggregation work with our result.

     *

     * This introduces some undesirable coupling between this code and

     * grouping_planner, but the alternatives seem even uglier; we couldn't

     * pass back completed paths without making these decisions here.

     */

    if (parse->groupClause)

    {

        List       *groupExprs;



        groupExprs = get_sortgrouplist_exprs(parse->groupClause,

                                             parse->targetList);

        *num_groups = estimate_num_groups(root,

                                          groupExprs,

                                          final_rel->rows);



        /*

         * In GROUP BY mode, an absolute LIMIT is relative to the number of

         * groups not the number of tuples.  If the caller gave us a fraction,

         * keep it as-is.  (In both cases, we are effectively assuming that

         * all the groups are about the same size.)

         */

        if (tuple_fraction >= 1.0)

            tuple_fraction /= *num_groups;



        /*

         * If both GROUP BY and ORDER BY are specified, we will need two

         * levels of sort --- and, therefore, certainly need to read all the

         * tuples --- unless ORDER BY is a subset of GROUP BY.    Likewise if we

         * have both DISTINCT and GROUP BY, or if we have a window

         * specification not compatible with the GROUP BY.

         */

        if (!pathkeys_contained_in(root->sort_pathkeys, root->group_pathkeys) ||

            !pathkeys_contained_in(root->distinct_pathkeys, root->group_pathkeys) ||

         !pathkeys_contained_in(root->window_pathkeys, root->group_pathkeys))

            tuple_fraction = 0.0;



        /* In any case, limit_tuples shouldn't be specified here */

        Assert(limit_tuples < 0);

    }

    else if (parse->hasAggs || root->hasHavingQual)

    {

        /*

         * Ungrouped aggregate will certainly want to read all the tuples, and

         * it will deliver a single result row (so leave *num_groups 1).

         */

        tuple_fraction = 0.0;



        /* limit_tuples shouldn't be specified here */

        Assert(limit_tuples < 0);

    }

    else if (parse->distinctClause)

    {

        /*

         * Since there was no grouping or aggregation, it's reasonable to

         * assume the UNIQUE filter has effects comparable to GROUP BY. Return

         * the estimated number of output rows for use by caller. (If DISTINCT

         * is used with grouping, we ignore its effects for rowcount

         * estimation purposes; this amounts to assuming the grouped rows are

         * distinct already.)

         */

        List       *distinctExprs;



        distinctExprs = get_sortgrouplist_exprs(parse->distinctClause,

                                                parse->targetList);

        *num_groups = estimate_num_groups(root,

                                          distinctExprs,

                                          final_rel->rows);



        /*

         * Adjust tuple_fraction the same way as for GROUP BY, too.

         */

        if (tuple_fraction >= 1.0)

            tuple_fraction /= *num_groups;



        /* limit_tuples shouldn't be specified here */

        Assert(limit_tuples < 0);

    }

    else

    {

        /*

         * Plain non-grouped, non-aggregated query: an absolute tuple fraction

         * can be divided by the number of tuples.

         */

        if (tuple_fraction >= 1.0)

            tuple_fraction /= final_rel->rows;

    }



    /*

     * Pick out the cheapest-total path and the cheapest presorted path for

     * the requested pathkeys (if there is one).  We should take the tuple

     * fraction into account when selecting the cheapest presorted path, but

     * not when selecting the cheapest-total path, since if we have to sort

     * then we'll have to fetch all the tuples.  (But there's a special case:

     * if query_pathkeys is NIL, meaning order doesn't matter, then the

     * "cheapest presorted" path will be the cheapest overall for the tuple

     * fraction.)

     *

     * The cheapest-total path is also the one to use if grouping_planner

     * decides to use hashed aggregation, so we return it separately even if

     * this routine thinks the presorted path is the winner.

     */

    cheapestpath = final_rel->cheapest_total_path;



    sortedpath =

        get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,

                                                  root->query_pathkeys,

                                                  NULL,

                                                  tuple_fraction);



    /* Don't return same path in both guises; just wastes effort */

    if (sortedpath == cheapestpath)

        sortedpath = NULL;



    /*

     * Forget about the presorted path if it would be cheaper to sort the

     * cheapest-total path.  Here we need consider only the behavior at the

     * tuple fraction point.

     */

    if (sortedpath)

    {

        Path        sort_path;    /* dummy for result of cost_sort */



        if (root->query_pathkeys == NIL ||

            pathkeys_contained_in(root->query_pathkeys,

                                  cheapestpath->pathkeys))

        {

            /* No sort needed for cheapest path */

            sort_path.startup_cost = cheapestpath->startup_cost;

            sort_path.total_cost = cheapestpath->total_cost;

        }

        else

        {

            /* Figure cost for sorting */

            cost_sort(&sort_path, root, root->query_pathkeys,

                      cheapestpath->total_cost,

                      final_rel->rows, final_rel->width,

                      0.0, work_mem, limit_tuples);

        }



        if (compare_fractional_path_costs(sortedpath, &sort_path,

                                          tuple_fraction) > 0)

        {

            /* Presorted path is a loser */

            sortedpath = NULL;

        }

    }



    *cheapest_path = cheapestpath;

    *sorted_path = sortedpath;

}

这个就是对基本的查询的基本分析了。

下面,慢慢进行:

我的简单查询里面,有 from 子句,所以不满足如下这段,不会直接退出。

    /*

     * If the query has an empty join tree, then it's something easy like

     * "SELECT 2+2;" or "INSERT ... VALUES()".    Fall through quickly.

     */

    if (parse->jointree->fromlist == NIL)

    {

        /* We need a trivial path result */

        *cheapest_path = (Path *)

            create_result_path((List *) parse->jointree->quals);

        *sorted_path = NULL;



        /*

         * We still are required to canonicalize any pathkeys, in case it's

         * something like "SELECT 2+2 ORDER BY 1".

         */

        root->canon_pathkeys = NIL;

        canonicalize_all_pathkeys(root);

        return;

    }

下面是初始化动作:

    /*

     * Init planner lists to empty.

     *

     * NOTE: append_rel_list was set up by subquery_planner, so do not touch

     * here; eq_classes and minmax_aggs may contain data already, too.

     */

    root->join_rel_list = NIL;

    root->join_rel_hash = NULL;

    root->join_rel_level = NULL;

    root->join_cur_level = 0;

    root->canon_pathkeys = NIL;

    root->left_join_clauses = NIL;

    root->right_join_clauses = NIL;

    root->full_join_clauses = NIL;

    root->join_info_list = NIL;

    root->placeholder_list = NIL;

    root->initial_rels = NIL;

下面这一段,也先把它当作一个简单的初始化理解:

    /*

     * Make a flattened version of the rangetable for faster access (this is

     * OK because the rangetable won't change any more), and set up an empty

     * array for indexing base relations.

     */

    setup_simple_rel_arrays(root);

接着,这是把 from 和 where 条件的 FromExpr型指针,变换为Node型指针,挂到 root上。

    /*

     * Construct RelOptInfo nodes for all base relations in query, and

     * indirectly for all appendrel member relations ("other rels").  This

     * will give us a RelOptInfo for every "simple" (non-join) rel involved in

     * the query.

     *

     * Note: the reason we find the rels by searching the jointree and

     * appendrel list, rather than just scanning the rangetable, is that the

     * rangetable may contain RTEs for rels not actively part of the query,

     * for example views.  We don't want to make RelOptInfos for them.

     */

    add_base_rels_to_query(root, (Node *) parse->jointree);

将其展开:

/*****************************************************************************

 *

 *     JOIN TREES

 *

 *****************************************************************************/



/*

 * add_base_rels_to_query

 *

 *      Scan the query's jointree and create baserel RelOptInfos for all

 *      the base relations (ie, table, subquery, and function RTEs)

 *      appearing in the jointree.

 *

 * The initial invocation must pass root->parse->jointree as the value of

 * jtnode.    Internally, the function recurses through the jointree.

 *

 * At the end of this process, there should be one baserel RelOptInfo for

 * every non-join RTE that is used in the query.  Therefore, this routine

 * is the only place that should call build_simple_rel with reloptkind

 * RELOPT_BASEREL.    (Note: build_simple_rel recurses internally to build

 * "other rel" RelOptInfos for the members of any appendrels we find here.)

 */

void

add_base_rels_to_query(PlannerInfo *root, Node *jtnode)

{

    if (jtnode == NULL)

        return;

    if (IsA(jtnode, RangeTblRef))

    {

        int            varno = ((RangeTblRef *) jtnode)->rtindex;



        (void) build_simple_rel(root, varno, RELOPT_BASEREL);

    }

    else if (IsA(jtnode, FromExpr))

    {

        FromExpr   *f = (FromExpr *) jtnode;

        ListCell   *l;



        foreach(l, f->fromlist)

            add_base_rels_to_query(root, lfirst(l));

    }

    else if (IsA(jtnode, JoinExpr))

    {

        JoinExpr   *j = (JoinExpr *) jtnode;



        add_base_rels_to_query(root, j->larg);

        add_base_rels_to_query(root, j->rarg);

    }

    else

        elog(ERROR, "unrecognized node type: %d",

             (int) nodeTag(jtnode));

}

这里面,有递归调用。

主要是因为这个:就是把from 里面每个表明,都挂在root上。

        foreach(l, f->fromlist)

            add_base_rels_to_query(root, lfirst(l));

上述的递归调用时,进了  if (IsA(jtnode, RangeTblRef)) 的判断。会调用 build_simple_rel函数,展开了看:

/*

 * build_simple_rel

 *      Construct a new RelOptInfo for a base relation or 'other' relation.

 */

RelOptInfo *

build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)

{

    RelOptInfo *rel;

    RangeTblEntry *rte;



    /* Rel should not exist already */

    Assert(relid > 0 && relid < root->simple_rel_array_size);

    if (root->simple_rel_array[relid] != NULL)

        elog(ERROR, "rel %d already exists", relid);



    /* Fetch RTE for relation */

    rte = root->simple_rte_array[relid];

    Assert(rte != NULL);



    rel = makeNode(RelOptInfo);

    rel->reloptkind = reloptkind;

    rel->relids = bms_make_singleton(relid);

    rel->rows = 0;

    rel->width = 0;

    rel->reltargetlist = NIL;

    rel->pathlist = NIL;

    rel->ppilist = NIL;

    rel->cheapest_startup_path = NULL;

    rel->cheapest_total_path = NULL;

    rel->cheapest_unique_path = NULL;

    rel->cheapest_parameterized_paths = NIL;

    rel->relid = relid;

    rel->rtekind = rte->rtekind;

    /* min_attr, max_attr, attr_needed, attr_widths are set below */

    rel->indexlist = NIL;

    rel->pages = 0;

    rel->tuples = 0;

    rel->allvisfrac = 0;

    rel->subplan = NULL;

    rel->subroot = NULL;

    rel->fdwroutine = NULL;

    rel->fdw_private = NULL;

    rel->baserestrictinfo = NIL;

    rel->baserestrictcost.startup = 0;

    rel->baserestrictcost.per_tuple = 0;

    rel->joininfo = NIL;

    rel->has_eclass_joins = false;



    /* Check type of rtable entry */

    switch (rte->rtekind)

    {

        case RTE_RELATION:

            /* Table --- retrieve statistics from the system catalogs */

            get_relation_info(root, rte->relid, rte->inh, rel);

            break;

        case RTE_SUBQUERY:

        case RTE_FUNCTION:

        case RTE_VALUES:

        case RTE_CTE:



            /*

             * Subquery, function, or values list --- set up attr range and

             * arrays

             *

             * Note: 0 is included in range to support whole-row Vars

             */

            rel->min_attr = 0;

            rel->max_attr = list_length(rte->eref->colnames);

            rel->attr_needed = (Relids *)

                palloc0((rel->max_attr - rel->min_attr + 1) * sizeof(Relids));

            rel->attr_widths = (int32 *)

                palloc0((rel->max_attr - rel->min_attr + 1) * sizeof(int32));

            break;

        default:

            elog(ERROR, "unrecognized RTE kind: %d",

                 (int) rte->rtekind);

            break;

    }



    /* Save the finished struct in the query's simple_rel_array */

    root->simple_rel_array[relid] = rel;



    /*

     * If this rel is an appendrel parent, recurse to build "other rel"

     * RelOptInfos for its children.  They are "other rels" because they are

     * not in the main join tree, but we will need RelOptInfos to plan access

     * to them.

     */

    if (rte->inh)

    {

        ListCell   *l;



        foreach(l, root->append_rel_list)

        {

            AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);



            /* append_rel_list contains all append rels; ignore others */

            if (appinfo->parent_relid != relid)

                continue;



            (void) build_simple_rel(root, appinfo->child_relid,

                                    RELOPT_OTHER_MEMBER_REL);

        }

    }



    return rel;

}

核心是这个:

    /* Save the finished struct in the query's simple_rel_array */

    root->simple_rel_array[relid] = rel;

回到 query_planner 这个层面。

    /*

     * Examine the targetlist and join tree, adding entries to baserel

     * targetlists for all referenced Vars, and generating PlaceHolderInfo

     * entries for all referenced PlaceHolderVars.    Restrict and join clauses

     * are added to appropriate lists belonging to the mentioned relations. We

     * also build EquivalenceClasses for provably equivalent expressions. The

     * SpecialJoinInfo list is also built to hold information about join order

     * restrictions.  Finally, we form a target joinlist for make_one_rel() to

     * work from.

     */

    build_base_rel_tlists(root, tlist);

展开:

这个和我的简单查询关系也不大,因为我的简单查询没有什么表达式要计算:

/*

 * build_base_rel_tlists

 *      Add targetlist entries for each var needed in the query's final tlist

 *      to the appropriate base relations.

 *

 * We mark such vars as needed by "relation 0" to ensure that they will

 * propagate up through all join plan steps.

 */

void

build_base_rel_tlists(PlannerInfo *root, List *final_tlist)

{

    List       *tlist_vars = pull_var_clause((Node *) final_tlist,

                                             PVC_RECURSE_AGGREGATES,

                                             PVC_INCLUDE_PLACEHOLDERS);



    if (tlist_vars != NIL)

    {

        add_vars_to_targetlist(root, tlist_vars, bms_make_singleton(0), true);

        list_free(tlist_vars);

    }

}

那么下面这一段:

find_placeholders_in_jointree(root);

展开后可以看到,对于我的简单查询,没有 placeholderz,所以会直接返回:

/*

 * find_placeholders_in_jointree

 *        Search the jointree for PlaceHolderVars, and build PlaceHolderInfos

 *

 * We don't need to look at the targetlist because build_base_rel_tlists()

 * will already have made entries for any PHVs in the tlist.

 */

void

find_placeholders_in_jointree(PlannerInfo *root)

{



    /* We need do nothing if the query contains no PlaceHolderVars */

    if (root->glob->lastPHId != 0)

    {

        /* Start recursion at top of jointree */

        Assert(root->parse->jointree != NULL &&

               IsA(root->parse->jointree, FromExpr));

        (void) find_placeholders_recurse(root, (Node *) root->parse->jointree);

    }

}

接着往下看,这个先暂时无视。

    joinlist = deconstruct_jointree(root);

接着:由于我的是简单查询,如下的两段,可以被无视。

    /*

     * Reconsider any postponed outer-join quals now that we have built up

     * equivalence classes.  (This could result in further additions or

     * mergings of classes.)

     */

    reconsider_outer_join_clauses(root);



    /*

     * If we formed any equivalence classes, generate additional restriction

     * clauses as appropriate.    (Implied join clauses are formed on-the-fly

     * later.)

     */

    generate_base_implied_equalities(root);

再接着看下面这一段好了:

    /*

     * We have completed merging equivalence sets, so it's now possible to

     * convert previously generated pathkeys (in particular, the requested

     * query_pathkeys) to canonical form.

     */

    canonicalize_all_pathkeys(root);

展开:

/*

 * canonicalize_all_pathkeys

 *        Canonicalize all pathkeys that were generated before entering

 *        query_planner and then stashed in PlannerInfo.

 */

static void

canonicalize_all_pathkeys(PlannerInfo *root)

{

    root->query_pathkeys = canonicalize_pathkeys(root, root->query_pathkeys);

    root->group_pathkeys = canonicalize_pathkeys(root, root->group_pathkeys);

    root->window_pathkeys = canonicalize_pathkeys(root, root->window_pathkeys);

    root->distinct_pathkeys = canonicalize_pathkeys(root, root->distinct_pathkeys);

    root->sort_pathkeys = canonicalize_pathkeys(root, root->sort_pathkeys);

}
/*

 * canonicalize_pathkeys

 *       Convert a not-necessarily-canonical pathkeys list to canonical form.

 *

 * Note that this function must not be used until after we have completed

 * merging EquivalenceClasses.

 */

List *

canonicalize_pathkeys(PlannerInfo *root, List *pathkeys)

{

    List       *new_pathkeys = NIL;

    ListCell   *l;



    foreach(l, pathkeys)

    {

        PathKey    *pathkey = (PathKey *) lfirst(l);

        EquivalenceClass *eclass;

        PathKey    *cpathkey;



        /* Find the canonical (merged) EquivalenceClass */

        eclass = pathkey->pk_eclass;

        while (eclass->ec_merged)

            eclass = eclass->ec_merged;



        /*

         * If we can tell it's redundant just from the EC, skip.

         * pathkey_is_redundant would notice that, but we needn't even bother

         * constructing the node...

         */

        if (EC_MUST_BE_REDUNDANT(eclass))

            continue;



        /* OK, build a canonicalized PathKey struct */

        cpathkey = make_canonical_pathkey(root,

                                          eclass,

                                          pathkey->pk_opfamily,

                                          pathkey->pk_strategy,

                                          pathkey->pk_nulls_first);



        /* Add to list unless redundant */

        if (!pathkey_is_redundant(cpathkey, new_pathkeys))

            new_pathkeys = lappend(new_pathkeys, cpathkey);

    }

    return new_pathkeys;

}

到这一步,心里有个感触,是否我需要把所有的 sql文再复习一遍?

由于我的简单查询 root->query_pathkeys 是NG, 所以就直接返回了。

接着再分析下一个调用:

这下面三段,都可以无视:

    /*

     * Examine any "placeholder" expressions generated during subquery pullup.

     * Make sure that the Vars they need are marked as needed at the relevant

     * join level.    This must be done before join removal because it might

     * cause Vars or placeholders to be needed above a join when they weren't

     * so marked before.

     */

    fix_placeholder_input_needed_levels(root);



    /*

     * Remove any useless outer joins.    Ideally this would be done during

     * jointree preprocessing, but the necessary information isn't available

     * until we've built baserel data structures and classified qual clauses.

     */

    joinlist = remove_useless_joins(root, joinlist);



    /*

     * Now distribute "placeholders" to base rels as needed.  This has to be

     * done after join removal because removal could change whether a

     * placeholder is evaluatable at a base rel.

     */

    add_placeholders_to_base_rels(root);

再接着分析:

    /*

     * We should now have size estimates for every actual table involved in

     * the query, and we also know which if any have been deleted from the

     * query by join removal; so we can compute total_table_pages.

     *

     * Note that appendrels are not double-counted here, even though we don't

     * bother to distinguish RelOptInfos for appendrel parents, because the

     * parents will still have size zero.

     *

     * XXX if a table is self-joined, we will count it once per appearance,

     * which perhaps is the wrong thing ... but that's not completely clear,

     * and detecting self-joins here is difficult, so ignore it for now.

     */

    total_pages = 0;

    for (rti = 1; rti < root->simple_rel_array_size; rti++)

    {

        RelOptInfo *brel = root->simple_rel_array[rti];



        if (brel == NULL)

            continue;



        Assert(brel->relid == rti);        /* sanity check on array */



        if (brel->reloptkind == RELOPT_BASEREL ||

            brel->reloptkind == RELOPT_OTHER_MEMBER_REL)

            total_pages += (double) brel->pages;

    }

    root->total_table_pages = total_pages;

这是说,要得到总共的 页面数,好作为计划的依据。

但是我执行到此处,令我惊诧的是,得到的 root->total_table_pages 居然是0!

接下来的应该算是戏肉吧:

    /*

     * Ready to do the primary planning.

     */

    final_rel = make_one_rel(root, joinlist);

 

 

你可能感兴趣的:(PostgreSQL)