cuichao1900

PostgreSQL 源码解读（102）- 分区表#8（数据查询路由#5-构建APPEND访问路径）

本节介绍了PG如何为append relation(分区表)构建访问路径，主要的实现逻辑在函数set_rel_pathlist中实现，该函数首先判断RTE是否分区表，如是则调用set_append_rel_pathlist函数对仍“存活”的子关系构建访问路径，然后把这些访问路径Merge到APPEND访问路径中。

一、数据结构

AppendRelInfo
当我们将可继承表(分区表)或UNION-ALL子查询展开为“追加关系”(本质上是子RTE的链表)时，为每个子RTE构建一个AppendRelInfo。

/*
 * Append-relation info.
 * Append-relation信息.
 * 
 * When we expand an inheritable table or a UNION-ALL subselect into an
 * "append relation" (essentially, a list of child RTEs), we build an
 * AppendRelInfo for each child RTE.  The list of AppendRelInfos indicates
 * which child RTEs must be included when expanding the parent, and each node
 * carries information needed to translate Vars referencing the parent into
 * Vars referencing that child.
 * 当我们将可继承表(分区表)或UNION-ALL子查询展开为“追加关系”(本质上是子RTE的链表)时，
 *   为每个子RTE构建一个AppendRelInfo。
 * AppendRelInfos链表指示在展开父节点时必须包含哪些子rte，
 *   每个节点具有将引用父节点的Vars转换为引用该子节点的Vars所需的所有信息。
 * 
 * These structs are kept in the PlannerInfo node's append_rel_list.
 * Note that we just throw all the structs into one list, and scan the
 * whole list when desiring to expand any one parent.  We could have used
 * a more complex data structure (eg, one list per parent), but this would
 * be harder to update during operations such as pulling up subqueries,
 * and not really any easier to scan.  Considering that typical queries
 * will not have many different append parents, it doesn't seem worthwhile
 * to complicate things.
 * 这些结构体保存在PlannerInfo节点的append_rel_list中。
 * 注意，只是将所有的结构体放入一个链表中，并在希望展开任何父类时扫描整个链表。
 * 本可以使用更复杂的数据结构(例如，每个父节点一个列表)，
 *   但是在提取子查询之类的操作中更新它会更困难，
 *   而且实际上也不会更容易扫描。
 * 考虑到典型的查询不会有很多不同的附加项，因此似乎不值得将事情复杂化。
 * 
 * Note: after completion of the planner prep phase, any given RTE is an
 * append parent having entries in append_rel_list if and only if its
 * "inh" flag is set.  We clear "inh" for plain tables that turn out not
 * to have inheritance children, and (in an abuse of the original meaning
 * of the flag) we set "inh" for subquery RTEs that turn out to be
 * flattenable UNION ALL queries.  This lets us avoid useless searches
 * of append_rel_list.
 * 注意:计划准备阶段完成后,
 *   当且仅当它的“inh”标志已设置时,给定的RTE是一个append parent在append_rel_list中的一个条目。
 * 我们为没有child的平面表清除“inh”标记,
 *   同时(有滥用标记的嫌疑)为UNION ALL查询中的子查询RTEs设置“inh”标记。
 * 这样可以避免对append_rel_list进行无用的搜索。
 * 
 * Note: the data structure assumes that append-rel members are single
 * baserels.  This is OK for inheritance, but it prevents us from pulling
 * up a UNION ALL member subquery if it contains a join.  While that could
 * be fixed with a more complex data structure, at present there's not much
 * point because no improvement in the plan could result.
 * 注意:数据结构假定附加的rel成员是独立的baserels。
 * 这对于继承来说是可以的，但是如果UNION ALL member子查询包含一个join，
 *   那么它将阻止我们提取UNION ALL member子查询。
 * 虽然可以用更复杂的数据结构解决这个问题，但目前没有太大意义，因为该计划可能不会有任何改进。
 */

typedef struct AppendRelInfo
{
    NodeTag     type;

    /*
     * These fields uniquely identify this append relationship.  There can be
     * (in fact, always should be) multiple AppendRelInfos for the same
     * parent_relid, but never more than one per child_relid, since a given
     * RTE cannot be a child of more than one append parent.
     * 这些字段惟一地标识这个append relationship。
     * 对于同一个parent_relid可以有(实际上应该总是)多个AppendRelInfos，
     *   但是每个child_relid不能有多个AppendRelInfos，
     *   因为给定的RTE不能是多个append parent的子节点。
     */
    Index       parent_relid;   /* parent rel的RT索引;RT index of append parent rel */
    Index       child_relid;    /* child rel的RT索引;RT index of append child rel */

    /*
     * For an inheritance appendrel, the parent and child are both regular
     * relations, and we store their rowtype OIDs here for use in translating
     * whole-row Vars.  For a UNION-ALL appendrel, the parent and child are
     * both subqueries with no named rowtype, and we store InvalidOid here.
     * 对于继承appendrel，父类和子类都是普通关系，
     *   我们将它们的rowtype OIDs存储在这里，用于转换whole-row Vars。
     * 对于UNION-ALL appendrel，父查询和子查询都是没有指定行类型的子查询，
     * 我们在这里存储InvalidOid。
     */
    Oid         parent_reltype; /* OID of parent's composite type */
    Oid         child_reltype;  /* OID of child's composite type */

    /*
     * The N'th element of this list is a Var or expression representing the
     * child column corresponding to the N'th column of the parent. This is
     * used to translate Vars referencing the parent rel into references to
     * the child.  A list element is NULL if it corresponds to a dropped
     * column of the parent (this is only possible for inheritance cases, not
     * UNION ALL).  The list elements are always simple Vars for inheritance
     * cases, but can be arbitrary expressions in UNION ALL cases.
     * 这个列表的第N个元素是一个Var或表达式，表示与父元素的第N列对应的子列。
     * 这用于将引用parent rel的Vars转换为对子rel的引用。
     * 如果链表元素与父元素的已删除列相对应，则该元素为NULL
     *   (这只适用于继承情况，而不是UNION ALL)。
     * 对于继承情况，链表元素总是简单的变量，但是可以是UNION ALL情况下的任意表达式。
     *
     * Notice we only store entries for user columns (attno > 0).  Whole-row
     * Vars are special-cased, and system columns (attno < 0) need no special
     * translation since their attnos are the same for all tables.
     * 注意，我们只存储用户列的条目(attno > 0)。
     * Whole-row Vars是大小写敏感的，系统列(attno < 0)不需要特别的转换，
     *   因为它们的attno对所有表都是相同的。
     *
     * Caution: the Vars have varlevelsup = 0.  Be careful to adjust as needed
     * when copying into a subquery.
     * 注意:Vars的varlevelsup = 0。
     * 在将数据复制到子查询时，要注意根据需要进行调整。
     */
    //child's Vars中的表达式
    List       *translated_vars;    /* Expressions in the child's Vars */

    /*
     * We store the parent table's OID here for inheritance, or InvalidOid for
     * UNION ALL.  This is only needed to help in generating error messages if
     * an attempt is made to reference a dropped parent column.
     * 我们将父表的OID存储在这里用于继承，
     *   如为UNION ALL,则这里存储的是InvalidOid。
     * 只有在试图引用已删除的父列时，才需要这样做来帮助生成错误消息。
     */
    Oid         parent_reloid;  /* OID of parent relation */
} AppendRelInfo;

RelOptInfo
规划器/优化器使用的关系信息结构体
参见PostgreSQL 源码解读（99）- 分区表#5（数据查询路由#2-RelOptInfo数据结构）

二、源码解读

set_rel_pathlist函数为基础关系构建访问路径.


//是否DUMMY访问路径
#define IS_DUMMY_PATH(p) \
    (IsA((p), AppendPath) && ((AppendPath *) (p))->subpaths == NIL)

/* A relation that's been proven empty will have one path that is dummy */
//已被证明为空的关系会含有一个虚拟dummy的访问路径
#define IS_DUMMY_REL(r) \
    ((r)->cheapest_total_path != NULL && \
     IS_DUMMY_PATH((r)->cheapest_total_path))


/*
 * set_rel_pathlist
 *    Build access paths for a base relation
 *    为基础关系构建访问路径
 */
static void
set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                 Index rti, RangeTblEntry *rte)
{
    if (IS_DUMMY_REL(rel))
    {
        /* We already proved the relation empty, so nothing more to do */
        //不需要做任何处理
    }
    else if (rte->inh)
    {
        /* It's an "append relation", process accordingly */
        //append relation,调用set_append_rel_pathlist处理
        set_append_rel_pathlist(root, rel, rti, rte);
    }
    else
    {//其他类型的关系
        switch (rel->rtekind)
        {
            case RTE_RELATION://基础关系
                if (rte->relkind == RELKIND_FOREIGN_TABLE)
                {
                    /* Foreign table */
                    //外部表
                    set_foreign_pathlist(root, rel, rte);
                }
                else if (rte->tablesample != NULL)
                {
                    /* Sampled relation */
                    //数据表采样
                    set_tablesample_rel_pathlist(root, rel, rte);
                }
                else
                {
                    /* Plain relation */
                    //常规关系
                    set_plain_rel_pathlist(root, rel, rte);
                }
                break;
            case RTE_SUBQUERY:
                /* Subquery --- fully handled during set_rel_size */
                //子查询
                break;
            case RTE_FUNCTION:
                /* RangeFunction */
                set_function_pathlist(root, rel, rte);
                break;
            case RTE_TABLEFUNC:
                /* Table Function */
                set_tablefunc_pathlist(root, rel, rte);
                break;
            case RTE_VALUES:
                /* Values list */
                set_values_pathlist(root, rel, rte);
                break;
            case RTE_CTE:
                /* CTE reference --- fully handled during set_rel_size */
                break;
            case RTE_NAMEDTUPLESTORE:
                /* tuplestore reference --- fully handled during set_rel_size */
                break;
            default:
                elog(ERROR, "unexpected rtekind: %d", (int) rel->rtekind);
                break;
        }
    }

    /*
     * If this is a baserel, we should normally consider gathering any partial
     * paths we may have created for it.
     * 如为基础关系,通常来说需要考虑聚集gathering所有的部分路径(先前已构建)
     *
     * However, if this is an inheritance child, skip it.  Otherwise, we could
     * end up with a very large number of gather nodes, each trying to grab
     * its own pool of workers.  Instead, we'll consider gathering partial
     * paths for the parent appendrel.
     * 但是,如果这是一个继承关系中的子表,忽略它.
     * 否则的话,我们最好可能会有很大量的Gather节点,每一个都尝试grab它自己worker的输出.
     * 相反我们的处理是为父append relation生成gathering部分访问路径.
     *
     * Also, if this is the topmost scan/join rel (that is, the only baserel),
     * we postpone this until the final scan/join targelist is available (see
     * grouping_planner).
     * 另外,如果这是最高层的scan/join关系(基础关系),
     *   在完成了最后的scan/join投影列处理后才进行相应的处理.
     */
    if (rel->reloptkind == RELOPT_BASEREL &&
        bms_membership(root->all_baserels) != BMS_SINGLETON)
        generate_gather_paths(root, rel, false);

    /*
     * Allow a plugin to editorialize on the set of Paths for this base
     * relation.  It could add new paths (such as CustomPaths) by calling
     * add_path(), or delete or modify paths added by the core code.
     * 调用钩子函数.
     * 插件可以通过调用add_path添加新的访问路径(比如CustomPaths等),
     *   或者删除/修改核心代码生成的访问路径.
     */
    if (set_rel_pathlist_hook)
        (*set_rel_pathlist_hook) (root, rel, rti, rte);

    /* Now find the cheapest of the paths for this rel */
    //为rel找到成本最低的访问路径
    set_cheapest(rel);

#ifdef OPTIMIZER_DEBUG
    debug_print_rel(root, rel);
#endif
}


 /*
 * set_append_rel_pathlist
 *    Build access paths for an "append relation"
 *    为append relation构建访问路径
 */
static void
set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                        Index rti, RangeTblEntry *rte)
{
    int         parentRTindex = rti;
    List       *live_childrels = NIL;
    ListCell   *l;

    /*
     * Generate access paths for each member relation, and remember the
     * non-dummy children.
     * 为每个成员关系生成访问路径，并"记住"非虚拟子节点。
     */
    foreach(l, root->append_rel_list)//遍历链表
    {
        AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);//获取AppendRelInfo
        int         childRTindex;
        RangeTblEntry *childRTE;
        RelOptInfo *childrel;

        /* append_rel_list contains all append rels; ignore others */
        //append_rel_list含有所有的append relations,忽略其他的rels
        if (appinfo->parent_relid != parentRTindex)
            continue;

        /* Re-locate the child RTE and RelOptInfo */
        //重新定位子RTE和RelOptInfo
        childRTindex = appinfo->child_relid;
        childRTE = root->simple_rte_array[childRTindex];
        childrel = root->simple_rel_array[childRTindex];

        /*
         * If set_append_rel_size() decided the parent appendrel was
         * parallel-unsafe at some point after visiting this child rel, we
         * need to propagate the unsafety marking down to the child, so that
         * we don't generate useless partial paths for it.
         * 如果set_append_rel_size()函数在访问了子关系后,
         *   在某些点上断定父append relation是非并行安全的,
         *   我们需要分发不安全的标记到子关系中,以免产生无用的部分访问路径.
         */
        if (!rel->consider_parallel)
            childrel->consider_parallel = false;

        /*
         * Compute the child's access paths.
         * "计算"子关系的访问路径
         */
        set_rel_pathlist(root, childrel, childRTindex, childRTE);

        /*
         * If child is dummy, ignore it.
         * 如为虚拟关系,忽略之
         */
        if (IS_DUMMY_REL(childrel))
            continue;

        /* Bubble up childrel's partitioned children. */
        //
        if (rel->part_scheme)
            rel->partitioned_child_rels =
                list_concat(rel->partitioned_child_rels,
                            list_copy(childrel->partitioned_child_rels));

        /*
         * Child is live, so add it to the live_childrels list for use below.
         * 添加到live_childrels链表中
         */
        live_childrels = lappend(live_childrels, childrel);
    }

    /* Add paths to the append relation. */
    //添加访问路径到append relation中
    add_paths_to_append_rel(root, rel, live_childrels);
}


/*
 * add_paths_to_append_rel
 *      Generate paths for the given append relation given the set of non-dummy
 *      child rels.
 *      基于非虚拟子关系集合为给定的append relation生成访问路径
 *
 * The function collects all parameterizations and orderings supported by the
 * non-dummy children. For every such parameterization or ordering, it creates
 * an append path collecting one path from each non-dummy child with given
 * parameterization or ordering. Similarly it collects partial paths from
 * non-dummy children to create partial append paths.
 * 该函数收集所有的非虚拟关系支持的参数化和排序访问路径.
 * 对于每一个参数化或排序的访问路径,它创建一个附加路径，
 *   从每个具有给定参数化或排序的非虚拟子节点收集相关信息。
 * 类似的,从非虚拟子关系中收集部分访问路径用以创建部分append路径.
 */
void
add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
                        List *live_childrels)
{
    List       *subpaths = NIL;
    bool        subpaths_valid = true;
    List       *partial_subpaths = NIL;
    List       *pa_partial_subpaths = NIL;
    List       *pa_nonpartial_subpaths = NIL;
    bool        partial_subpaths_valid = true;
    bool        pa_subpaths_valid;
    List       *all_child_pathkeys = NIL;
    List       *all_child_outers = NIL;
    ListCell   *l;
    List       *partitioned_rels = NIL;
    double      partial_rows = -1;

    /* If appropriate, consider parallel append */
    //如可以,考虑并行append
    pa_subpaths_valid = enable_parallel_append && rel->consider_parallel;

    /*
     * AppendPath generated for partitioned tables must record the RT indexes
     * of partitioned tables that are direct or indirect children of this
     * Append rel.
     * 为分区表生成的AppendPath必须记录属于该分区表(Append Rel)的直接或间接子关系的RT索引
     *
     * AppendPath may be for a sub-query RTE (UNION ALL), in which case, 'rel'
     * itself does not represent a partitioned relation, but the child sub-
     * queries may contain references to partitioned relations.  The loop
     * below will look for such children and collect them in a list to be
     * passed to the path creation function.  (This assumes that we don't need
     * to look through multiple levels of subquery RTEs; if we ever do, we
     * could consider stuffing the list we generate here into sub-query RTE's
     * RelOptInfo, just like we do for partitioned rels, which would be used
     * when populating our parent rel with paths.  For the present, that
     * appears to be unnecessary.)
     * AppendPath可能是子查询RTE(UNION ALL),在这种情况下,rel自身并不代表分区表,
     *   但child sub-queries可能含有分区表的依赖.
     * 以下的循环会寻找这样的子关系,存储在链表中,并作为参数传给访问路径创建函数.
     * (这是假设我们不需要遍历多个层次的子查询RTEs,如果我们曾经这样做了,
     *  这好比分区表的做法,可以考虑把生成的链表放到子查询的RTE's RelOptInfo结构体中,
     *  用于使用访问路径填充父关系.不过目前来说,这样做是不需要的.)
     */
    if (rel->part_scheme != NULL)
    {
        if (IS_SIMPLE_REL(rel))
            partitioned_rels = list_make1(rel->partitioned_child_rels);
        else if (IS_JOIN_REL(rel))
        {
            int         relid = -1;
            List       *partrels = NIL;

            /*
             * For a partitioned joinrel, concatenate the component rels'
             * partitioned_child_rels lists.
             * 对于分区连接rel,连接rels的partitioned_child_rels链表
             *
             */
            while ((relid = bms_next_member(rel->relids, relid)) >= 0)
            {
                RelOptInfo *component;

                Assert(relid >= 1 && relid < root->simple_rel_array_size);
                component = root->simple_rel_array[relid];
                Assert(component->part_scheme != NULL);
                Assert(list_length(component->partitioned_child_rels) >= 1);
                partrels =
                    list_concat(partrels,
                                list_copy(component->partitioned_child_rels));
            }

            partitioned_rels = list_make1(partrels);
        }

        Assert(list_length(partitioned_rels) >= 1);
    }

    /*
     * For every non-dummy child, remember the cheapest path.  Also, identify
     * all pathkeys (orderings) and parameterizations (required_outer sets)
     * available for the non-dummy member relations.
     * 对于每一个非虚拟子关系,记录成本最低的访问路径.
     * 同时,为每一个非虚拟成员关系标识所有的路径键(排序)和可用的参数化信息(required_outer集合)
     */
    foreach(l, live_childrels)//遍历
    {
        RelOptInfo *childrel = lfirst(l);
        ListCell   *lcp;
        Path       *cheapest_partial_path = NULL;

        /*
         * For UNION ALLs with non-empty partitioned_child_rels, accumulate
         * the Lists of child relations.
         * 对于包含非空的partitioned_child_rels的UNION ALLs操作,
         *   累积子关系链表
         */
        if (rel->rtekind == RTE_SUBQUERY && childrel->partitioned_child_rels != NIL)
            partitioned_rels = lappend(partitioned_rels,
                                       childrel->partitioned_child_rels);

        /*
         * If child has an unparameterized cheapest-total path, add that to
         * the unparameterized Append path we are constructing for the parent.
         * If not, there's no workable unparameterized path.
         * 如果子关系存在非参数化的总成本最低的访问路径,
         *   添加此路径到我们为父关系构建的非参数化的Append访问路径中.
         *
         * With partitionwise aggregates, the child rel's pathlist may be
         * empty, so don't assume that a path exists here.
         * 使用partitionwise聚合,子关系的访问路径链表可能是空的,不能假设其中存在访问路径
         */
        if (childrel->pathlist != NIL &&
            childrel->cheapest_total_path->param_info == NULL)
            accumulate_append_subpath(childrel->cheapest_total_path,
                                      &subpaths, NULL);
        else
            subpaths_valid = false;

        /* Same idea, but for a partial plan. */
        //同样的思路,处理并行处理中的部分计划
        if (childrel->partial_pathlist != NIL)
        {
            cheapest_partial_path = linitial(childrel->partial_pathlist);
            accumulate_append_subpath(cheapest_partial_path,
                                      &partial_subpaths, NULL);
        }
        else
            partial_subpaths_valid = false;

        /*
         * Same idea, but for a parallel append mixing partial and non-partial
         * paths.
         * 同样的,处理并行append混合并行/非并行访问路径
         */
        if (pa_subpaths_valid)
        {
            Path       *nppath = NULL;

            nppath =
                get_cheapest_parallel_safe_total_inner(childrel->pathlist);

            if (cheapest_partial_path == NULL && nppath == NULL)
            {
                /* Neither a partial nor a parallel-safe path?  Forget it. */
                //不是部分路径,也不是并行安全的路径,跳过
                pa_subpaths_valid = false;
            }
            else if (nppath == NULL ||
                     (cheapest_partial_path != NULL &&
                      cheapest_partial_path->total_cost < nppath->total_cost))
            {
                /* Partial path is cheaper or the only option. */
                //部分路径成本更低或者是唯一的选项
                Assert(cheapest_partial_path != NULL);
                accumulate_append_subpath(cheapest_partial_path,
                                          &pa_partial_subpaths,
                                          &pa_nonpartial_subpaths);

            }
            else
            {
                /*
                 * Either we've got only a non-partial path, or we think that
                 * a single backend can execute the best non-partial path
                 * faster than all the parallel backends working together can
                 * execute the best partial path.
                 * 这时候,要么得到了一个非并行访问路径,或者我们认为一个单独的后台进程
                 *   执行最好的非并行访问路径会比索引的并行进程一起执行最好的部分路径还要好.
                 *
                 * It might make sense to be more aggressive here.  Even if
                 * the best non-partial path is more expensive than the best
                 * partial path, it could still be better to choose the
                 * non-partial path if there are several such paths that can
                 * be given to different workers.  For now, we don't try to
                 * figure that out.
                 * 在这里,采取更积极的态度是有道理的.
                 * 甚至最好的非部分路径比最好的并行部分路径成本更高,仍然需要选择非并行路径,
                 *   如果多个这样的路径可能会分派到不同的worker上.
                 * 现在不需要指出这一点.
                 */
                accumulate_append_subpath(nppath,
                                          &pa_nonpartial_subpaths,
                                          NULL);
            }
        }

        /*
         * Collect lists of all the available path orderings and
         * parameterizations for all the children.  We use these as a
         * heuristic to indicate which sort orderings and parameterizations we
         * should build Append and MergeAppend paths for.
         * 收集子关系所有可用的排序和参数化路径链表.
         * 我们采用启发式的规则判断Append和MergeAppend访问路径使用哪个排序和参数化信息
         */
        foreach(lcp, childrel->pathlist)
        {
            Path       *childpath = (Path *) lfirst(lcp);
            List       *childkeys = childpath->pathkeys;
            Relids      childouter = PATH_REQ_OUTER(childpath);

            /* Unsorted paths don't contribute to pathkey list */
            //未排序的访问路径,不需要分发到路径键链表中
            if (childkeys != NIL)
            {
                ListCell   *lpk;
                bool        found = false;

                /* Have we already seen this ordering? */
                foreach(lpk, all_child_pathkeys)
                {
                    List       *existing_pathkeys = (List *) lfirst(lpk);

                    if (compare_pathkeys(existing_pathkeys,
                                         childkeys) == PATHKEYS_EQUAL)
                    {
                        found = true;
                        break;
                    }
                }
                if (!found)
                {
                    /* No, so add it to all_child_pathkeys */
                    all_child_pathkeys = lappend(all_child_pathkeys,
                                                 childkeys);
                }
            }

            /* Unparameterized paths don't contribute to param-set list */
            //非参数访问路径无需分发到参数化集合链表中
            if (childouter)
            {
                ListCell   *lco;
                bool        found = false;

                /* Have we already seen this param set? */
                foreach(lco, all_child_outers)
                {
                    Relids      existing_outers = (Relids) lfirst(lco);

                    if (bms_equal(existing_outers, childouter))
                    {
                        found = true;
                        break;
                    }
                }
                if (!found)
                {
                    /* No, so add it to all_child_outers */
                    all_child_outers = lappend(all_child_outers,
                                               childouter);
                }
            }
        }
    }

    /*
     * If we found unparameterized paths for all children, build an unordered,
     * unparameterized Append path for the rel.  (Note: this is correct even
     * if we have zero or one live subpath due to constraint exclusion.)
     * 如存在子关系的非参数化访问路径,构建未排序/未参数化的Append访问路径.
     * (注意:如果存在约束排除,我们只剩下有0或1个存活的subpath,这样的处理也说OK的)
     */
    if (subpaths_valid)
        add_path(rel, (Path *) create_append_path(root, rel, subpaths, NIL,
                                                  NULL, 0, false,
                                                  partitioned_rels, -1));

    /*
     * Consider an append of unordered, unparameterized partial paths.  Make
     * it parallel-aware if possible.
     * 尝试未排序/未参数化的部分Append访问路径.
     * 如可能,构建parallel-aware访问路径.
     */
    if (partial_subpaths_valid)
    {
        AppendPath *appendpath;
        ListCell   *lc;
        int         parallel_workers = 0;

        /* Find the highest number of workers requested for any subpath. */
        //为子访问路径寻找最多数量的wokers
        foreach(lc, partial_subpaths)
        {
            Path       *path = lfirst(lc);

            parallel_workers = Max(parallel_workers, path->parallel_workers);
        }
        Assert(parallel_workers > 0);

        /*
         * If the use of parallel append is permitted, always request at least
         * log2(# of children) workers.  We assume it can be useful to have
         * extra workers in this case because they will be spread out across
         * the children.  The precise formula is just a guess, but we don't
         * want to end up with a radically different answer for a table with N
         * partitions vs. an unpartitioned table with the same data, so the
         * use of some kind of log-scaling here seems to make some sense.
         * 如允许使用并行append,那么至少需要log2(子关系个数)个workers.
         * 经过扩展后,假定可以使用额外的workers.
         * 并不存在精确的计算公式,目前只是猜测而已,但是对于有相同数据的N个分区的分区表和非分区表来说,
         *   答案是不一样的,因此在这里使用对数的计算方法是OK的.
         */
        if (enable_parallel_append)
        {
            parallel_workers = Max(parallel_workers,
                                   fls(list_length(live_childrels)));//上限值
            parallel_workers = Min(parallel_workers,
                                   max_parallel_workers_per_gather);//下限值
        }
        Assert(parallel_workers > 0);

        /* Generate a partial append path. */
        //生成并行部分append访问路径
        appendpath = create_append_path(root, rel, NIL, partial_subpaths,
                                        NULL, parallel_workers,
                                        enable_parallel_append,
                                        partitioned_rels, -1);

        /*
         * Make sure any subsequent partial paths use the same row count
         * estimate.
         * 确保所有的子部分路径使用相同的行数估算
         */
        partial_rows = appendpath->path.rows;

        /* Add the path. */
        //添加路径
        add_partial_path(rel, (Path *) appendpath);
    }

    /*
     * Consider a parallel-aware append using a mix of partial and non-partial
     * paths.  (This only makes sense if there's at least one child which has
     * a non-partial path that is substantially cheaper than any partial path;
     * otherwise, we should use the append path added in the previous step.)
     * 使用混合的部分和非部分并行的append.
     */
    if (pa_subpaths_valid && pa_nonpartial_subpaths != NIL)
    {
        AppendPath *appendpath;
        ListCell   *lc;
        int         parallel_workers = 0;

        /*
         * Find the highest number of workers requested for any partial
         * subpath.
         */
        foreach(lc, pa_partial_subpaths)
        {
            Path       *path = lfirst(lc);

            parallel_workers = Max(parallel_workers, path->parallel_workers);
        }

        /*
         * Same formula here as above.  It's even more important in this
         * instance because the non-partial paths won't contribute anything to
         * the planned number of parallel workers.
         */
        parallel_workers = Max(parallel_workers,
                               fls(list_length(live_childrels)));
        parallel_workers = Min(parallel_workers,
                               max_parallel_workers_per_gather);
        Assert(parallel_workers > 0);

        appendpath = create_append_path(root, rel, pa_nonpartial_subpaths,
                                        pa_partial_subpaths,
                                        NULL, parallel_workers, true,
                                        partitioned_rels, partial_rows);
        add_partial_path(rel, (Path *) appendpath);
    }

    /*
     * Also build unparameterized MergeAppend paths based on the collected
     * list of child pathkeys.
     * 基于收集的子路径键,构建非参数化的MergeAppend访问路径
     */
    if (subpaths_valid)
        generate_mergeappend_paths(root, rel, live_childrels,
                                   all_child_pathkeys,
                                   partitioned_rels);

    /*
     * Build Append paths for each parameterization seen among the child rels.
     * (This may look pretty expensive, but in most cases of practical
     * interest, the child rels will expose mostly the same parameterizations,
     * so that not that many cases actually get considered here.)
     * 为每一个参数化的子关系构建Append访问路径.
     * (看起来成本客观,但在大多数情况下,子关系使用同样的参数化信息,因此实际上并不是经常发现)
     *
     * The Append node itself cannot enforce quals, so all qual checking must
     * be done in the child paths.  This means that to have a parameterized
     * Append path, we must have the exact same parameterization for each
     * child path; otherwise some children might be failing to check the
     * moved-down quals.  To make them match up, we can try to increase the
     * parameterization of lesser-parameterized paths.
     */
    foreach(l, all_child_outers)
    {
        Relids      required_outer = (Relids) lfirst(l);
        ListCell   *lcr;

        /* Select the child paths for an Append with this parameterization */
        subpaths = NIL;
        subpaths_valid = true;
        foreach(lcr, live_childrels)
        {
            RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
            Path       *subpath;

            if (childrel->pathlist == NIL)
            {
                /* failed to make a suitable path for this child */
                subpaths_valid = false;
                break;
            }

            subpath = get_cheapest_parameterized_child_path(root,
                                                            childrel,
                                                            required_outer);
            if (subpath == NULL)
            {
                /* failed to make a suitable path for this child */
                subpaths_valid = false;
                break;
            }
            accumulate_append_subpath(subpath, &subpaths, NULL);
        }

        if (subpaths_valid)
            add_path(rel, (Path *)
                     create_append_path(root, rel, subpaths, NIL,
                                        required_outer, 0, false,
                                        partitioned_rels, -1));
    }
}

三、跟踪分析

测试脚本如下

testdb=# explain verbose select * from t_hash_partition where c1 = 1 OR c1 = 2;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 Append  (cost=0.00..30.53 rows=6 width=200)
   ->  Seq Scan on public.t_hash_partition_1  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_1.c1, t_hash_partition_1.c2, t_hash_partition_1.c3
         Filter: ((t_hash_partition_1.c1 = 1) OR (t_hash_partition_1.c1 = 2))
   ->  Seq Scan on public.t_hash_partition_3  (cost=0.00..15.25 rows=3 width=200)
         Output: t_hash_partition_3.c1, t_hash_partition_3.c2, t_hash_partition_3.c3
         Filter: ((t_hash_partition_3.c1 = 1) OR (t_hash_partition_3.c1 = 2))
(7 rows)

启动gdb,设置断点

(gdb) b set_rel_pathlist
Breakpoint 1 at 0x796823: file allpaths.c, line 425.
(gdb) c
Continuing.

Breakpoint 1, set_rel_pathlist (root=0x1f1e400, rel=0x1efaba0, rti=1, rte=0x1efa3d0) at allpaths.c:425
425     if (IS_DUMMY_REL(rel))
(gdb)

通过rte->inh判断是否分区表或者UNION ALL

(gdb) p rte->inh
$1 = true
(gdb)

进入set_append_rel_pathlist函数

(gdb) n
429     else if (rte->inh)
(gdb) 
432         set_append_rel_pathlist(root, rel, rti, rte);
(gdb) step
set_append_rel_pathlist (root=0x1f1e400, rel=0x1efaba0, rti=1, rte=0x1efa3d0) at allpaths.c:1296
1296        int         parentRTindex = rti;

遍历子关系

(gdb) n
1297        List       *live_childrels = NIL;
(gdb) 
1304        foreach(l, root->append_rel_list)
(gdb) 
(gdb) p *root->append_rel_list
$2 = {type = T_List, length = 6, head = 0x1fc1f98, tail = 0x1fc2ae8}

获取AppendRelInfo,判断父关系是否正在处理的父关系

(gdb) n
1306            AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
(gdb) 
1312            if (appinfo->parent_relid != parentRTindex)
(gdb) p *appinfo
$3 = {type = T_AppendRelInfo, parent_relid = 1, child_relid = 3, parent_reltype = 16988, child_reltype = 16991, 
  translated_vars = 0x1fc1e60, parent_reloid = 16986}
(gdb)

获取子关系的相关信息,递归调用set_rel_pathlist

(gdb) n
1316            childRTindex = appinfo->child_relid;
(gdb) 
1317            childRTE = root->simple_rte_array[childRTindex];
(gdb) 
1318            childrel = root->simple_rel_array[childRTindex];
(gdb) 
1326            if (!rel->consider_parallel)
(gdb) 
1332            set_rel_pathlist(root, childrel, childRTindex, childRTE);
(gdb) 
(gdb) 

Breakpoint 1, set_rel_pathlist (root=0x1f1e400, rel=0x1f1c8a0, rti=3, rte=0x1efa658) at allpaths.c:425
425     if (IS_DUMMY_REL(rel))
(gdb) finish
Run till exit from #0  set_rel_pathlist (root=0x1f1e400, rel=0x1f1c8a0, rti=3, rte=0x1efa658) at allpaths.c:425
set_append_rel_pathlist (root=0x1f1e400, rel=0x1efaba0, rti=1, rte=0x1efa3d0) at allpaths.c:1337

如为虚拟关系,则忽略之

1337            if (IS_DUMMY_REL(childrel))

该子关系不是虚拟关系,继续处理,加入到rel->partitioned_child_rels和live_childrels链表中

(gdb) n
1341            if (rel->part_scheme)
(gdb) 
1344                                list_copy(childrel->partitioned_child_rels));
(gdb) 
1343                    list_concat(rel->partitioned_child_rels,
(gdb) 
1342                rel->partitioned_child_rels =
(gdb) 
1349            live_childrels = lappend(live_childrels, childrel);
(gdb) p *rel->partitioned_child_rels
$4 = {type = T_IntList, length = 1, head = 0x1fc4d78, tail = 0x1fc4d78}
(gdb) p rel->partitioned_child_rels->head->data.int_value
$6 = 1
(gdb) 
(gdb) n
1304        foreach(l, root->append_rel_list)
(gdb) p live_childrels
$7 = (List *) 0x1fd0a60
(gdb) p *live_childrels
$8 = {type = T_List, length = 1, head = 0x1fd0a38, tail = 0x1fd0a38}
(gdb) p *(Node *)live_childrels->head->data.ptr_value
$9 = {type = T_RelOptInfo}
(gdb) p *(RelOptInfo *)live_childrels->head->data.ptr_value
$10 = {type = T_RelOptInfo, reloptkind = RELOPT_OTHER_MEMBER_REL, relids = 0x1fc3590, rows = 3, consider_startup = false, 
  consider_param_startup = false, consider_parallel = true, reltarget = 0x1fc35b0, pathlist = 0x1fd0940, ppilist = 0x0, 
  partial_pathlist = 0x1fd09a0, cheapest_startup_path = 0x1fc44f8, cheapest_total_path = 0x1fc44f8, 
  cheapest_unique_path = 0x0, cheapest_parameterized_paths = 0x1fd0a00, direct_lateral_relids = 0x0, lateral_relids = 0x0, 
  relid = 3, reltablespace = 0, rtekind = RTE_RELATION, min_attr = -7, max_attr = 3, attr_needed = 0x1fc2e38, 
  attr_widths = 0x1fc3628, lateral_vars = 0x0, lateral_referencers = 0x0, indexlist = 0x0, statlist = 0x0, pages = 10, 
  tuples = 350, allvisfrac = 0, subroot = 0x0, subplan_params = 0x0, rel_parallel_workers = -1, serverid = 0, userid = 0, 
  useridiscurrent = false, fdwroutine = 0x0, fdw_private = 0x0, unique_for_rels = 0x0, non_unique_for_rels = 0x0, 
  baserestrictinfo = 0x1fc68d8, baserestrictcost = {startup = 0, per_tuple = 0.0050000000000000001}, 
  baserestrict_min_security = 0, joininfo = 0x0, has_eclass_joins = false, consider_partitionwise_join = false, 
  top_parent_relids = 0x1fc3608, part_scheme = 0x0, nparts = 0, boundinfo = 0x0, partition_qual = 0x0, part_rels = 0x0, 
  partexprs = 0x0, nullable_partexprs = 0x0, partitioned_child_rels = 0x0}

对于虚拟子关系(上一节介绍的被裁剪的分区),直接跳过

(gdb) n
1306            AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(l);
(gdb) 
1312            if (appinfo->parent_relid != parentRTindex)
(gdb) 
1316            childRTindex = appinfo->child_relid;
(gdb) 
1317            childRTE = root->simple_rte_array[childRTindex];
(gdb) 
1318            childrel = root->simple_rel_array[childRTindex];
(gdb) 
1326            if (!rel->consider_parallel)
(gdb) 
1332            set_rel_pathlist(root, childrel, childRTindex, childRTE);
(gdb) 
1337            if (IS_DUMMY_REL(childrel))
(gdb) 
1338                continue;

设置断点,进入add_paths_to_append_rel函数

(gdb) b add_paths_to_append_rel
Breakpoint 2 at 0x797d88: file allpaths.c, line 1372.
(gdb) c
Continuing.

Breakpoint 2, add_paths_to_append_rel (root=0x1f1cdb8, rel=0x1fc1800, live_childrels=0x1fcfb10) at allpaths.c:1372
1372        List       *subpaths = NIL;
(gdb)

输入参数,其中rel是父关系,live_childrels是经裁剪后仍存活的分区(子关系)

(gdb) n
1373        bool        subpaths_valid = true;
(gdb) p *rel
$11 = {type = T_RelOptInfo, reloptkind = RELOPT_BASEREL, relids = 0x1fc1a18, rows = 6, consider_startup = false, 
  consider_param_startup = false, consider_parallel = true, reltarget = 0x1fc1a38, pathlist = 0x0, ppilist = 0x0, 
  partial_pathlist = 0x0, cheapest_startup_path = 0x0, cheapest_total_path = 0x0, cheapest_unique_path = 0x0, 
  cheapest_parameterized_paths = 0x0, direct_lateral_relids = 0x0, lateral_relids = 0x0, relid = 1, reltablespace = 0, 
  rtekind = RTE_RELATION, min_attr = -7, max_attr = 3, attr_needed = 0x1fc1a90, attr_widths = 0x1fc1b28, 
  lateral_vars = 0x0, lateral_referencers = 0x0, indexlist = 0x0, statlist = 0x0, pages = 0, tuples = 6, allvisfrac = 0, 
  subroot = 0x0, subplan_params = 0x0, rel_parallel_workers = -1, serverid = 0, userid = 0, useridiscurrent = false, 
  fdwroutine = 0x0, fdw_private = 0x0, unique_for_rels = 0x0, non_unique_for_rels = 0x0, baserestrictinfo = 0x1fc3e20, 
  baserestrictcost = {startup = 0, per_tuple = 0}, baserestrict_min_security = 0, joininfo = 0x0, has_eclass_joins = false, 
  consider_partitionwise_join = false, top_parent_relids = 0x0, part_scheme = 0x1fc1b80, nparts = 6, boundinfo = 0x1fc1d30, 
  partition_qual = 0x0, part_rels = 0x1fc2000, partexprs = 0x1fc1f08, nullable_partexprs = 0x1fc1fe0, 
  partitioned_child_rels = 0x1fc3ea0}

初始化变量

(gdb) n
1374        List       *partial_subpaths = NIL;
(gdb) 
1375        List       *pa_partial_subpaths = NIL;
(gdb) 
1376        List       *pa_nonpartial_subpaths = NIL;
(gdb) 
1377        bool        partial_subpaths_valid = true;
(gdb) 
1379        List       *all_child_pathkeys = NIL;
(gdb) 
1380        List       *all_child_outers = NIL;
(gdb) 
1382        List       *partitioned_rels = NIL;
(gdb) 
1383        double      partial_rows = -1;
(gdb) 
1386        pa_subpaths_valid = enable_parallel_append && rel->consider_parallel;
(gdb) 
1404        if (rel->part_scheme != NULL)
(gdb) p pa_subpaths_valid
$17 = true

构建partitioned_rels链表

(gdb) n
1406            if (IS_SIMPLE_REL(rel))
(gdb) 
1407                partitioned_rels = list_make1(rel->partitioned_child_rels);
(gdb) 
1433            Assert(list_length(partitioned_rels) >= 1);
(gdb) 
1441        foreach(l, live_childrels)
(gdb) p *partitioned_rels
$18 = {type = T_List, length = 1, head = 0x1fcff40, tail = 0x1fcff40}

开始遍历live_childrels,对于每一个非虚拟子关系,记录成本最低的访问路径.
如果子关系存在非参数化的总成本最低的访问路径,添加此路径到我们为父关系构建的非参数化的Append访问路径中.

(gdb) n
1443            RelOptInfo *childrel = lfirst(l);
(gdb) 
1445            Path       *cheapest_partial_path = NULL;
(gdb) 
1451            if (rel->rtekind == RTE_SUBQUERY && childrel->partitioned_child_rels != NIL)
(gdb) 
1463            if (childrel->pathlist != NIL &&
(gdb) 
1464                childrel->cheapest_total_path->param_info == NULL)
(gdb) 
1463            if (childrel->pathlist != NIL &&
(gdb) 
1465                accumulate_append_subpath(childrel->cheapest_total_path,

同样的思路,处理并行处理中的部分计划

(gdb) n
1471            if (childrel->partial_pathlist != NIL)
(gdb) 
1473                cheapest_partial_path = linitial(childrel->partial_pathlist);
(gdb) 
1474                accumulate_append_subpath(cheapest_partial_path,

同样的,处理并行append混合并行/非并行访问路径

(gdb) n
1486                Path       *nppath = NULL;
(gdb) 
1489                    get_cheapest_parallel_safe_total_inner(childrel->pathlist);
(gdb) 
1488                nppath =
(gdb) 
1491                if (cheapest_partial_path == NULL && nppath == NULL)
(gdb) 
1496                else if (nppath == NULL ||
(gdb) 
1498                          cheapest_partial_path->total_cost < nppath->total_cost))
(gdb) 
1497                         (cheapest_partial_path != NULL &&
(gdb) 
1501                    Assert(cheapest_partial_path != NULL);
(gdb) 
1502                    accumulate_append_subpath(cheapest_partial_path,

收集子关系所有可用的排序和参数化路径链表.

(gdb) 
1534            foreach(lcp, childrel->pathlist)
(gdb) 
(gdb) n
1536                Path       *childpath = (Path *) lfirst(lcp);
(gdb) 
1537                List       *childkeys = childpath->pathkeys;
(gdb) 
1538                Relids      childouter = PATH_REQ_OUTER(childpath);
(gdb) 
1541                if (childkeys != NIL)
(gdb) 
1567                if (childouter)
(gdb) 
1534            foreach(lcp, childrel->pathlist)

继续下一个子关系,完成处理

...
(gdb) 
1441        foreach(l, live_childrels)
(gdb) 
1598        if (subpaths_valid)

如存在子关系的非参数化访问路径,构建未排序/未参数化的Append访问路径.

(gdb) n
1599            add_path(rel, (Path *) create_append_path(root, rel, subpaths, NIL,
    (gdb) p *rel->pathlist
Cannot access memory at address 0x0
(gdb) n
1607        if (partial_subpaths_valid)
(gdb) p *rel->pathlist
$22 = {type = T_List, length = 1, head = 0x1fd0230, tail = 0x1fd0230}

尝试未排序/未参数化的部分Append访问路径.如可能,构建parallel-aware访问路径.

...
(gdb) 
1641            appendpath = create_append_path(root, rel, NIL, partial_subpaths,
(gdb) 
1650            partial_rows = appendpath->path.rows;
(gdb) 
1653            add_partial_path(rel, (Path *) appendpath);
(gdb)

使用混合的部分和非部分并行的append.

1662        if (pa_subpaths_valid && pa_nonpartial_subpaths != NIL)
(gdb)

基于收集的子路径键,构建非参数化的MergeAppend访问路径

1701        if (subpaths_valid)
(gdb) 
1702            generate_mergeappend_paths(root, rel, live_childrels,

完成调用

(gdb) 
1719        foreach(l, all_child_outers)
(gdb) 
1757    }
(gdb) 
set_append_rel_pathlist (root=0x1f1cdb8, rel=0x1fc1800, rti=1, rte=0x1efa3d0) at allpaths.c:1354
1354    }
(gdb) p *rel->pathlist
$23 = {type = T_List, length = 1, head = 0x1fd0230, tail = 0x1fd0230}
(gdb) 
(gdb) p *(Node *)rel->pathlist->head->data.ptr_value
$24 = {type = T_AppendPath}
(gdb) p *(AppendPath *)rel->pathlist->head->data.ptr_value
$25 = {path = {type = T_AppendPath, pathtype = T_Append, parent = 0x1fc1800, pathtarget = 0x1fc1a38, param_info = 0x0, 
    parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 6, startup_cost = 0, 
    total_cost = 30.530000000000001, pathkeys = 0x0}, partitioned_rels = 0x1fd01f8, subpaths = 0x1fcffc8, 
  first_partial_path = 2}

结束调用

(gdb) n
set_rel_pathlist (root=0x1f1cdb8, rel=0x1fc1800, rti=1, rte=0x1efa3d0) at allpaths.c:495
495     if (rel->reloptkind == RELOPT_BASEREL &&
(gdb) 
496         bms_membership(root->all_baserels) != BMS_SINGLETON)
(gdb) 
495     if (rel->reloptkind == RELOPT_BASEREL &&
(gdb) 
504     if (set_rel_pathlist_hook)
(gdb) 
508     set_cheapest(rel);
(gdb) 
513 }
(gdb)

DONE!

四、参考资料

Parallel Append implementation
Partition Elimination in PostgreSQL 11

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/6906/viewspace-2374788/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/6906/viewspace-2374788/

你可能感兴趣的:(PostgreSQL 源码解读（102）- 分区表#8（数据查询路由#5-构建APPEND访问路径）)

【硬核拆解】英伟达Blackwell芯片架构如何重构AI算力边界？ HeartException 人工智能
前言前些天发现了一个巨牛的人工智能免费学习网站，通俗易懂，风趣幽默，忍不住分享一下给大家。点击跳转到网站一、Blackwell诞生的算力危机（2025现状）graphTDA[2025年AI算力需求]-->B[千亿参数模型训练能耗>20GWh]A-->C[10万亿参数模型涌现]A-->D[传统架构内存墙：数据搬运耗能占68%]行业拐点事件：2025年3月：OpenAI宣布训练125万亿参数MoE模型
wpf打包一个独立的库 null_null999 windows
https://www.google.com/search?q=wpf+%E6%89%93%E5%8C%85%E4%B8%80%E4%B8%AA%E7%8B%AC%E7%AB%8B%E5%BA%93&newwindow=1&sca_esv=32f9ae821a1b1a5d&sxsrf=AE3TifNo_KqCzke3ZkSz6zdxZGXDQv6lWA%3A1751356705342&ei=IZV
长尾形分布论文速览三十篇【60-89】木木阳 Long-tailed 人工智能
长尾形分布速览（60-89）这些研究展示了LLMs在长尾数据分布、持续学习、异常检测、联邦学习、对比学习、知识图谱、推荐系统、多目标跟踪、标签修复、对象检测、医疗生物医学以及其他应用中的广泛应用。通过优化和创新，LLMs在这些领域展现了卓越的性能，并为解决长尾问题提供了有效的工具和方法。1.长尾持续学习与对抗学习长尾持续学习(Paper60):通过优化器状态重用来减少遗忘，提高在长尾任务中的持续学
长尾形分布论文速览【80-119】木木阳 Long-tailed 人工智能
为便于理解和应用，以下将30篇关于长尾分布的研究文献按主题进行分类整理。每一大类包含相应的工作，帮助我们从整体上把握各方向的研究进展。1.长尾半监督学习与伪标签优化Paper90:Uncertainty-awareSamplingforLong-tailedSemi-supervisedLearning提出了一种动态阈值选择方法（UDTS），能有效改善尾部分类性能，适用于不平衡类别的半监督学习。P
MyBatis SQL 执行过程原理分析（附源码）代理层：Mapper 接口动态代理路由层：MapperMethod 分发核心引擎：SqlSession 执行夜雨hiyeyu.com mybatis sql 数据库数据库架构 java spring boot db
MyBatisSQL执行过程原理分析（附源码）1.代理层：Mapper接口动态代理2.路由层：MapperMethod分发3.核心引擎：SqlSession执行4.执行器：Executor调度5.处理器层：StatementHandler执行6.结果映射：ResultSetHandler转换核心执行流程图关键设计亮点性能优化建议MyBatis的SQL执行过程可以分为6个核心阶段，我们将通过源码逐层
【Maven】Maven核心机制的万字深度解析夜雨hiyeyu.com maven java spring spring boot mvc 系统架构后端
Maven核心机制的万字深度解析一、依赖管理机制全解（工业级依赖治理方案）1.坐标体系的本质与设计哲学2.依赖传递与仲裁算法的工程实现**冲突仲裁核心算法**企业级仲裁策略3.Scope作用域的类加载隔离原理4.多级仓库体系架构设计二、构建生命周期底层原理（工业级流水线解析）1.生命周期模型架构2.Default生命周期核心阶段详解3.插件执行机制内核剖析三、企业级工程化实践（千亿级项目的解决方案
Maven的概念与核心配置代码如疯. maven java
安装步骤：1、安装jdk2、从官网中下载对应的版本3、解压安装，然后配置环境变量，将bin目录添加到path路径下4、在命令行中输入mvn-v,看到版本信息表示安装成功3、maven的基本常识3.1maven如何获取jar包maven通过坐标的方式来获取jar包，坐标组成为：公司/组织（groupId）+项目名（artifactId）+版本（version）--GAV组成，可以从互联网，本地等多种
GlobalFilter、Filter关系 m0_63486540 java java
维度GlobalFilterFilter技术体系SpringCloudGateway+WebFluxJavaServletAPI编程模型响应式(Reactive)阻塞式(Imperative)作用范围全局（所有路由）可配置路径模式执行效率更高（基于事件循环）较低（线程池模型）配置方式SpringBean自动注册web.xml或@WebFilter如何选择？如果你正在开发API网关或微服务入口，使用
VUE 2+3 一只鱼^_ Web vue.js 前端 javascript 数据结构算法前端框架
Vue是什么？概念：Vue是一个用于构建用户界面的渐进式框架构建用户界面：基于数据动态渲染页面Vue的两种使用方式：1.Vue核心包开发场景：局部模块改造2.Vue核心包&Vue插件工程化开发场景：整站开发注：Vuejs核心包(声明式渲染，组件系统)优点：大大提升开发效率----------------------------------------------------------------
# 国产高性能VPX6U模块：飞腾4核/8核处理器助力数据处理与通信
#产品概述今天为大家介绍一款高性能国产VPX6U模块——飞腾4核/8核VPX6U模块。这款产品采用国产飞腾处理器，具备强大的数据处理能力和丰富的接口配置，是军工、通信、存储等领域的理想选择。##核心特点###1.国产飞腾处理器，性能灵活可选-模块兼容FT2000-4或D2000-8两种处理器-用户可根据实际性能需求自由选择-完全国产化方案，安全可控###2.丰富接口配置，多功能应用-万兆以太网、千
【Maven】Maven 新手全面入门指南，核心概念 maven安装配置优化，项目创建与项目结构介绍核心Maven命令夜雨hiyeyu.com java maven java spring boot 后端 gradle 系统架构软件构建
Maven新手全面入门指南一、Maven简介Mavenvs其他构建工具二、核心概念1.POM（ProjectObjectModel）2.坐标系统（GAV）3.依赖管理4.仓库（Repository）5.构建生命周期三、Maven安装与配置1.安装步骤2.配置优化（settings.xml）四、项目创建与结构1.创建新项目2.标准项目结构五、核心Maven命令基本命令进阶命令六、完整pom.xml示
【Maven 】＜resources＞配置中排除 fonts/** 目录无效，可能是由于以下原因及解决方案： ladymorgana 日常工作总结 maven java
如果Maven的配置中排除fonts/**目录无效，可能是由于以下原因及解决方案：总结：用方法一即可1.检查资源过滤是否生效确保部分正确配置了resources插件：src/main/resourcesfonts/**false2.验证目录结构确认fonts文件夹的物理路径是否正确：src/└──main/└──resources/└──fonts/#确保这是要排除的目录├──font1.ttf└
Paper Reading《SoK: Prudent Evaluation Practices for Fuzzing》小苑同学安全性测试网络安全
论文链接：https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10646824IEEESSP20241Introduction（背景意义）模糊测试是发现软件漏洞的高效方法，被Meta、Google等企业广泛应用，且大量学术研究持续改进其技术（如覆盖反馈、领域扩展）。过去六年（2018-2023）中，顶级安全与软件工程会议上发表了超280
QT中翻译文件生成步骤小小码农Come on Qt qt
1、配置工具环境变量设置path：D:\Qt\Qt5.15\5.15.2\msvc2019_64\binD:\Qt\Qt5.15\5.15.2\msvc2019\bin主要使用工具：lupdate、linguist、lrelease都在如上路径的bin目录下2、提取翻译字符串(lupdate)用来生成/更新.ts翻译文件进入目录D:\Code\ds-cmifinaldetect\plugins\p
【翻译】Part4: Texture samplers.
AtripthroughtheGraphicsPipeline2011,part4|Therygblog欢迎回来。上一部分讲的是顶点着色器，还大致介绍了通用的GPU着色器单元。总的来说，它们只是向量处理器，但它们可以访问一种在其他向量处理架构中不存在的资源：纹理采样器。纹理采样器是GPU流水线不可或缺的一部分，其复杂程度（以及趣味性！）足以单独写一篇文章来介绍，那接下来就开始吧。纹理状态在开始实际
为 Agentic AI 的黎明构建地基人工智能
在技术领域，我们常常被那些闪耀的、可见的成果所吸引。今天，这个焦点无疑是大语言模型技术。它们的流畅对话、惊人的创造力，让我们得以一窥未来的轮廓。然而，作为在企业一线构建、部署和维护复杂系统的实践者，我们深知，一个卓越的模型，本身并不能构成一个成功的企业级解决方案。它就像一座精心设计的摩天大楼的塔尖，倘若没有深植于地下的坚实地基，再璀璨的光芒也终将是昙花一现。真正的挑战，也是真正的价值所在，在于构建
论文学习_SoK: An Essential Guide For Using Malware Sandboxes In Security Applications: Challenges, Pitfa kitsch0x97 学习
0.文章概述恶意软件沙箱尽管在安全应用程序中带来许多优势，但其复杂的选择、配置和使用过程常让新用户不知所措，甚至可能导致错误的部署，进而对安全分析结果产生负面影响。目前，缺乏系统化的指导来帮助用户正确选择和应用沙箱工具，这种知识空白阻碍了沙箱在不同研究领域中的有效应用。为了填补这一知识空白，研究团队系统分析了84篇关于x86/64恶意软件沙箱的学术论文，并提出了一种新颖的框架，以简化沙箱组件和操作
CentOS 6操作系统安装
【版本】选【CentOS664位】【此虚拟机内存】4096【最大磁盘大小】100G上传映像文件CentOS-6.10-x86_64-bin-DVD1中文安装、美国英语式键盘安装将使用哪种设备？【基本存储设备】设置主机名→【配置网络】→【编辑】→勾选【自动连接】→【IPv4设置】手动设置→【应用】【创建自定义布局】→创建【swap】（指定空间大小8192，方法16G法同上）和【/】（文件系统类型ex
Prompt Engineering 指南教程班磊闯Andrea
PromptEngineering指南教程Prompt-Engineering-Guidedair-ai/Prompt-Engineering-Guide:是一个用于指导对话人工智能开发的文档。适合用于学习对话人工智能开发和自然语言处理。特点是提供了详细的指南和参考资料，涵盖了多种对话人工智能技术和算法，并且可以自定义学习路径和行为。项目地址:https://gitcode.com/gh_mirr
Redis 集群与分布式实现：从原理到实战一切皆有迹可循 redis redis 分布式数据库后端缓存
前言在大数据与高并发场景下，单节点Redis的容量与可用性已无法满足需求。Redis通过集群与分布式技术，实现了数据的分片存储与高可用部署，成为分布式系统的核心组件。本文将深入解析Redis集群的底层原理、架构模式与实战经验，结合代码示例与最佳实践，帮助开发者构建高性能、高可用的分布式缓存系统。一、集群基础架构与核心原理1.数据分片机制Redis集群采用哈希槽（HashSlot）实现数据分片，共有
JavaScript代码审计工具叶梓诺 javascript 开发语言 ecmascript 前端
我整理的一些关于【Java】的项目学习资料（附讲解～～）和大家一起分享、学习一下：https://d.51cto.com/bLN8S1如何实现一个JavaScript代码审计工具作为一名刚入行的小白，你可能对如何创建一个JavaScript代码审计工具感到困惑。在这篇文章中，我将引导你完成整个流程，并提供具体的代码示例和说明。我们将采取结构化的步骤来确保你能够理解每个阶段。流程概述创建JavaSc
Python私有属性：隐藏数据的秘密武器有奇妙能力吗知识分享 Python python 开发语言
Python私有属性详解：为什么我们需要“隐藏”对象的数据？一、引言在面向对象编程中，封装（Encapsulation）是三大基本特性之一（另外两个是继承和多态）。而“私有属性”就是实现封装的重要手段之一。在Python中虽然不像Java或C++那样严格区分访问权限，但依然提供了一种机制来限制对类内部属性的直接访问。本文将带你深入了解：什么是私有属性？如何定义私有属性？私有属性的原理与注意事项使用
解密闭包：函数如何记住外部变量有奇妙能力吗知识分享 Python python 开发语言
什么是闭包？闭包是一个函数对象，它不仅记住它的代码逻辑，还记住了定义它时的自由变量（即非全局也非局部，但被内部函数引用的变量）。即使外部函数已经执行完毕，这些自由变量的值仍然保存在内存中，可以通过闭包访问和使用。简单来说：当一个嵌套的函数引用了其外部函数中的变量，并且这个嵌套函数可以在其外部函数之外被调用时，就形成了一个闭包。✅闭包的基本结构def outer_function(x): de
输入hadoop version时，解决Cannot execute /home/hadoop/libexec/hadoop-config.sh.的方法有奇妙能力吗 ubuntu hadoop hdfs linux 大数据分布式
在ubuntu用hadoopversion遇到了一个错误：Cannotexecute/home/hadoop/hadoop2.8/libexec/hadoop-config.sh.解决方法：在/etc/profile中找到了这个HADOOP_HOME全局变量，将其删除运行source/etc/profile输入vim.bashrc命令，在最后一行输入unsetHADOOP_HOMEsource.b
E函数.js 是紫焅呢 javascript 开发语言 ecmascript 青少年编程前端 visual studio code
前言：函数是构建强大而灵活应用程序的基石。它们不仅是代码执行的基本单元，更是实现模块化、可复用性和简洁性的关键。目录一、函数是个啥？简单来说就是“代码小跟班”二、定义函数给小跟班安排任务三、调用函数指挥小跟班干活儿四、参数给小跟班的“任务提示卡”五、返回值小跟班的“任务成果”六、函数的实际应用小跟班大显身手总结函数是你的编程小跟班一、函数是个啥？简单来说就是“代码小跟班”函数，说白了就是一段能完成
CLIP之后，多模态模型将如何进化？三大技术路径解析老周聊AI AI大模型人工智能 MCP 机器学习神经网络深度学习 AI大模型大模型训练框架
多模态学习的革命：CLIP技术深度解析关注老周不迷路本文较长，建议点赞收藏以免遗失。由于文章篇幅有限，更多涨薪知识点，也可在主页查看最新AI大模型应用开发学习资料免费领取引言：多模态学习的时代来临在人工智能领域，多模态学习正成为最具前景的研究方向之一。传统AI系统通常专注于单一模态（如纯文本或纯图像），而人类认知的本质却是多模态的——我们通过视觉、听觉、触觉等多种感官协同理解世界。OpenAI于2
【Python 中的几类运算符】
文章目录文章目录一、算术运算符二、比较运算符三、赋值运算符四、逻辑运算符附加知识：五、其他运算符1.位运算符2.成员运算符3.身份运算符总结一、算术运算符加法（+）：用于两个数值相加。例如，a=5，b=3，a+b的结果为8。也可以用于字符串拼接，如"Hello,"+"World"的结果为"Hello,World"。示例：a=5b=3result=a+bprint("求和",result)a="He
Windows PowerShell中无法将"python"项识别为cmdlet、函数、脚本文件或可运行程序的名称 xqhrs232 ROS系统/Python
原文地址::https://blog.csdn.net/Blateyang/article/details/86421594相关文章1、如何在Powershell中运行python程序?----https://cloud.tencent.com/developer/ask/1426072、Windows下如何方便的运行py脚本----https://blog.csdn.net/Naisu_kun/
嵌入模型 vs 大语言模型：语义理解能力的本质区别与应用场景 chenkangck50 AI大模型语言模型人工智能机器学习
嵌入模型vs大语言模型：语义理解能力的本质区别与应用场景（实战视角）一句话总结嵌入模型的“理解”是向量表示和相似性匹配，适合做召回；大语言模型的“理解”是上下文+逻辑+世界知识综合判断，适合做分析与生成。重点是可以结合prompt和本身具有的知识两类模型的本质区别能力项嵌入模型（如BGE、SBERT）大语言模型（如GPT、GLM、DeepSeek）输出形式向量（如768维）自然语言文本（如答案、解
现代Unity架构指南：以ECS为核心的应用程序中构建OOP抽象层 NocturnalSky Unity unity 架构游戏引擎
文章目录现代Unity架构指南：以ECS为核心的应用程序中构建OOP抽象层为什么需要混合架构抽象层设计原则性能关键优化技术典型应用场景实现混合架构性能分析（Unity2023.2实测）关键陷阱与解决方案架构演进策略Unity2023新特性整合现代Unity架构指南：以ECS为核心的应用程序中构建OOP抽象层为什么需要混合架构ECS的核心优势：数据局部性、缓存友好性、高效的并行处理能力，特别适合高性
Nginx负载均衡 510888780 nginx 应用服务器
Nginx负载均衡一些基础知识: nginx 的 upstream目前支持 4 种方式的分配 1)、轮询（默认）每个请求按时间顺序逐一分配到不同的后端服务器，如果后端服务器down掉，能自动剔除。 2)、weight 指定轮询几率，weight和访问比率成正比
RedHat 6.4 安装 rabbitmq bylijinnan erlang rabbitmq redhat
在 linux 下安装软件就是折腾，首先是测试机不能上外网要找运维开通，开通后发现测试机的 yum 不能使用于是又要配置 yum 源，最后安装 rabbitmq 时也尝试了两种方法最后才安装成功机器版本： [root@redhat1 rabbitmq]# lsb_release LSB Version: :base-4.0-amd64:base-4.0-noarch:core
FilenameUtils工具类 eksliang FilenameUtils common-io
转载请出自出处：http://eksliang.iteye.com/blog/2217081 一、概述这是一个Java操作文件的常用库，是Apache对java的IO包的封装，这里面有两个非常核心的类FilenameUtils跟FileUtils，其中FilenameUtils是对文件名操作的封装;FileUtils是文件封装，开发中对文件的操作，几乎都可以在这个框架里面找到。非常的好用。
xml文件解析SAX 不懂事的小屁孩 xml
xml文件解析:xml文件解析有四种方式， 1.DOM生成和解析XML文档(SAX是基于事件流的解析) 2.SAX生成和解析XML文档(基于XML文档树结构的解析) 3.DOM4J生成和解析XML文档 4.JDOM生成和解析XML 本文章用第一种方法进行解析，使用android常用的DefaultHandler import org.xml.sax.Attributes;
通过定时任务执行mysql的定期删除和新建分区，此处是按日分区酷的飞上天空 mysql
使用python脚本作为命令脚本，linux的定时任务来每天定时执行 #!/usr/bin/python # -*- coding: utf8 -*- import pymysql import datetime import calendar #要分区的表 table_name = 'my_table' #连接数据库的信息 host,user,passwd,db =
如何搭建数据湖架构？听听专家的意见蓝儿唯美架构
Edo Interactive在几年前遇到一个大问题：公司使用交易数据来帮助零售商和餐馆进行个性化促销，但其数据仓库没有足够时间去处理所有的信用卡和借记卡交易数据 “我们要花费27小时来处理每日的数据量，”Edo主管基础设施和信息系统的高级副总裁Tim Garnto说道：“所以在2013年，我们放弃了现有的基于PostgreSQL的关系型数据库系统，使用了Hadoop集群作为公司的数
spring学习——控制反转与依赖注入 a-john spring
控制反转（Inversion of Control，英文缩写为IoC）是一个重要的面向对象编程的法则来削减计算机程序的耦合问题，也是轻量级的Spring框架的核心。控制反转一般分为两种类型，依赖注入（Dependency Injection，简称DI）和依赖查找（Dependency Lookup）。依赖注入应用比较广泛。
用spool+unixshell生成文本文件的方法 aijuans xshell
例如我们把scott.dept表生成文本文件的语句写成dept.sql,内容如下: 　　set pages 50000; 　　set lines 200; 　　set trims on; 　　set heading off; 　　spool /oracle_backup/log/test/dept.lst; 　　select deptno||','||dname||','||loc
1、基础--名词解析(OOA/OOD/OOP) asia007 学习基础知识
OOA:Object-Oriented Analysis（面向对象分析方法）是在一个系统的开发过程中进行了系统业务调查以后，按照面向对象的思想来分析问题。OOA与结构化分析有较大的区别。OOA所强调的是在系统调查资料的基础上，针对OO方法所需要的素材进行的归类分析和整理，而不是对管理业务现状和方法的分析。　　OOA（面向对象的分析）模型由5个层次（主题层、对象类层、结构层、属性层和服务层）
浅谈java转成json编码格式技术百合不是茶 json编码 java转成json编码
json编码;是一个轻量级的数据存储和传输的语言在java中需要引入json相关的包,引包方式在工程的lib下就可以了 JSON与JAVA数据的转换（JSON 即 JavaScript Object Natation，它是一种轻量级的数据交换格式，非常适合于服务器与 JavaScript 之间的数据的交
web.xml之Spring配置(基于Spring+Struts+Ibatis) bijian1013 java web.xml SSI spring配置
指定Spring配置文件位置 <context-param> <param-name>contextConfigLocation</param-name> <param-value> /WEB-INF/spring-dao-bean.xml,/WEB-INF/spring-resources.xml, /WEB-INF/
Installing SonarQube（Fail to download libraries from server） sunjing Install Sonar
1. Download and unzip the SonarQube distribution 2. Starting the Web Server The default port is "9000" and the context path is "/". These values can be changed in &l
【MongoDB学习笔记十一】Mongo副本集基本的增删查 bit1129 mongodb
一、创建复本集假设mongod,mongo已经配置在系统路径变量上，启动三个命令行窗口，分别执行如下命令： mongod --port 27017 --dbpath data1 --replSet rs0 mongod --port 27018 --dbpath data2 --replSet rs0 mongod --port 27019 -
Anychart图表系列二之执行Flash和HTML5渲染白糖_ Flash
今天介绍Anychart的Flash和HTML5渲染功能 HTML5 Anychart从6.0第一个版本起，已经逐渐开始支持各种图的HTML5渲染效果了，也就是说即使你没有安装Flash插件，只要浏览器支持HTML5，也能看到Anychart的图形（不过这些是需要做一些配置的）。这里要提醒下大家，Anychart6.0版本对HTML5的支持还不算很成熟，目前还处于
Laravel版本更新异常4.2.8-> 4.2.9 Declaration of ... CompilerEngine ... should be compa bozch laravel
昨天在为了把laravel升级到最新的版本，突然之间就出现了如下错误： ErrorException thrown with message "Declaration of Illuminate\View\Engines\CompilerEngine::handleViewException() should be compatible with Illuminate\View\Eng
编程之美-NIM游戏分析-石头总数为奇数时如何保证先动手者必胜 bylijinnan 编程之美
import java.util.Arrays; import java.util.Random; public class Nim { /**编程之美 NIM游戏分析问题：有N块石头和两个玩家A和B，玩家A先将石头随机分成若干堆，然后按照BABA...的顺序不断轮流取石头，能将剩下的石头一次取光的玩家获胜，每次取石头时，每个玩家只能从若干堆石头中任选一堆，
lunce创建索引及简单查询 chengxuyuancsdn 查询创建索引 lunce
import java.io.File; import java.io.IOException; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.standard.StandardAnalyzer; import org.apache.lucene.document.Docume
[IT与投资]坚持独立自主的研究核心技术 comsci it
和别人合作开发某项产品....如果互相之间的技术水平不同,那么这种合作很难进行,一般都会成为强者控制弱者的方法和手段..... 所以弱者,在遇到技术难题的时候,最好不要一开始就去寻求强者的帮助,因为在我们这颗星球上,生物都有一种控制其
flashback transaction闪回事务查询 daizj oracle sql 闪回事务
闪回事务查询有别于闪回查询的特点有以下3个：（1）其正常工作不但需要利用撤销数据，还需要事先启用最小补充日志。（2）返回的结果不是以前的“旧”数据，而是能够将当前数据修改为以前的样子的撤销SQL（Undo SQL）语句。（3）集中地在名为flashback_transaction_query表上查询，而不是在各个表上通过“as of”或“vers
Java I/O之FilenameFilter类列举出指定路径下某个扩展名的文件游其是你 FilenameFilter
这是一个FilenameFilter类用法的例子，实现的列举出“c:\\folder“路径下所有以“.jpg”扩展名的文件。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
C语言学习五函数，函数的前置声明以及如何在软件开发中合理的设计函数来解决实际问题 dcj3sjt126com c
# include <stdio.h> int f(void) //括号中的void表示该函数不能接受数据，int表示返回的类型为int类型 { return 10; //向主调函数返回10 } void g(void) //函数名前面的void表示该函数没有返回值 { //return 10; //error 与第8行行首的void相矛盾 } in
今天在测试环境使用yum安装，遇到一个问题： Error: Cannot retrieve metalink for repository: epel. Pl dcj3sjt126com centos
今天在测试环境使用yum安装，遇到一个问题： Error: Cannot retrieve metalink for repository: epel. Please verify its path and try again 处理很简单，修改文件“/etc/yum.repos.d/epel.repo”，将baseurl的注释取消， mirrorlist注释掉。即可。 &n
单例模式 shuizhaosi888 单例模式
单例模式懒汉式 public class RunMain { /** * 私有构造 */ private RunMain() { } /** * 内部类，用于占位，只有 */ private static class SingletonRunMain { priv
Spring Security（09）——Filter 234390216 Spring Security
Filter 目录 1.1 Filter顺序 1.2 添加Filter到FilterChain 1.3 DelegatingFilterProxy 1.4 FilterChainProxy 1.5
公司项目NODEJS实践0.1 逐行分析JS源代码 mongodb nginx ubuntu nodejs
一、前言前端如何独立用nodeJs实现一个简单的注册、登录功能，是不是只用nodejs+sql就可以了？其实是可以实现，但离实际应用还有距离，那要怎么做才是实际可用的。网上有很多nod
java.lang.Math liuhaibo_ljf java Math lang
System.out.println(Math.PI); System.out.println(Math.abs(1.2)); System.out.println(Math.abs(1.2)); System.out.println(Math.abs(1)); System.out.println(Math.abs(111111111)); System.out.println(Mat
linux下时间同步 nonobaba ntp
今天在linux下做hbase集群的时候，发现hmaster启动成功了，但是用hbase命令进入shell的时候报了一个错误 PleaseHoldException: Master is initializing，查看了日志，大致意思是说master和slave时间不同步，没办法，只好找一种手动同步一下，后来发现一共部署了10来台机器，手动同步偏差又比较大，所以还是从网上找现成的解决方
ZooKeeper3.4.6的集群部署 roadrunners zookeeper 集群部署
ZooKeeper是Apache的一个开源项目，在分布式服务中应用比较广泛。它主要用来解决分布式应用中经常遇到的一些数据管理问题，如：统一命名服务、状态同步、集群管理、配置文件管理、同步锁、队列等。这里主要讲集群中ZooKeeper的部署。 1、准备工作我们准备3台机器做ZooKeeper集群，分别在3台机器上创建ZooKeeper需要的目录。数据存储目录
Java高效读取大文件 tomcat_oracle java
　　读取文件行的标准方式是在内存中读取，Guava 和Apache Commons IO都提供了如下所示快速读取文件行的方法：　　Files.readLines(new File(path), Charsets.UTF_8); 　　FileUtils.readLines(new File(path)); 　　这种方法带来的问题是文件的所有行都被存放在内存中，当文件足够大时很快就会导致
微信支付api返回的xml转换为Map的方法 xu3508620 xml map 微信api
举例如下： <xml> <return_code><![CDATA[SUCCESS]]></return_code> <return_msg><![CDATA[OK]]></return_msg> <appid><