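- Column: PostgreSQL kernel source code analysis
- Homepage: my homepage
- Motto: As heaven moves with vigor, the noble person strives ceaselessly to improve; as the earth is broad and receptive, the noble person carries the world with generous virtue.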
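Contents
Preface
Overview of the query execution stage
Query optimization strategies
The five execution strategies
Summary of the analysis above: five processing strategies
Strategy implementation
Strategy execution
Initialization
Execution
Cleanup
Layers of query execution
Conclusion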
This article analyzes the PostgreSQL 15 source code; the demonstrations were run on CentOS 8.
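During the query execution stage, the Portal structure records the relevant information and is carried through the whole process. Queries fall into two kinds, those with a plan tree and those without, and depending on the kind they are executed by the Executor module (ExecutorRun) or by ProcessUtility.
The Executor module mainly runs DML. Its execution logic is uniform: every statement is handled as a SELECT, or as a SELECT plus some extra processing. The code is mainly under src/backend/executor.
The ProcessUtility module mainly handles DDL and other commands. Their processing flows differ widely (for example the definitions of users, cursors, and roles), so each kind of statement is handled by its own routine. The code is mainly under src/backend/commands; the ProcessUtility entry point itself is in src/backend/tcop/utility.c.
There are also some special-purpose sub-modules, mainly functions that are called repeatedly, such as the auxiliary subsystems, expression evaluation, projection, and tuple operations.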
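As a rough illustration of this split, the sketch below routes a statement to one of the two modules by checking whether it carries a utility statement. It loosely follows the `utilityStmt == NULL` test that PostgreSQL applies to a PlannedStmt. ToyStmt, ToyModule, and toy_route_statement are hypothetical names invented for this example, not PostgreSQL APIs.

```c
#include <stddef.h>

/* Toy stand-in for a planned statement (assumption: not the real PlannedStmt). */
typedef struct ToyStmt
{
    const void *utilityStmt;    /* non-NULL when the statement is a utility command */
} ToyStmt;

/* The two execution modules described in the text. */
typedef enum ToyModule
{
    TOY_MODULE_EXECUTOR,        /* plan-tree statements: SELECT and other DML */
    TOY_MODULE_UTILITY          /* statements without a plan tree: DDL and commands */
} ToyModule;

/* Decide which module would run the statement. */
ToyModule
toy_route_statement(const ToyStmt *stmt)
{
    if (stmt->utilityStmt == NULL)
        return TOY_MODULE_EXECUTOR;
    return TOY_MODULE_UTILITY;
}
```

In this toy model a SELECT has utilityStmt set to NULL and goes to the executor, while something like CREATE TABLE carries a utility node and goes to the utility path.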
The analysis below proceeds along the following lines.
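A simple SQL statement produces either an optimizable plan tree or a utility operation with no plan tree, while a complex statement may contain both DML and DDL and therefore produce both. That gives three processing directions.
Beyond these, some SQL statements are converted into a single atomic operation but must buffer their results during execution and return them only after the statement has finished. Why buffer? If rows were sent out while the statement was still executing, an error on the output side could interrupt the atomic operation partway through. Buffering lets the operation run to completion first.
Buffering is needed mainly in the following two cases:
(1) An optimizable statement that must return the modified tuples (INSERT/UPDATE/DELETE with a RETURNING clause).
(2) A non-optimizable statement that must return a result (such as SHOW or EXPLAIN), which therefore also needs a buffer structure to hold the result.
In addition, a CTE that contains INSERT/UPDATE/DELETE modifies data inside the CTE. This differs from ordinary CTE processing and also needs special handling.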
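The "run to completion, then return" idea can be sketched with a small in-memory buffer. This is only a toy model: ToyHoldStore, toy_run_to_completion, and toy_fetch are hypothetical names, the "table" is an int array, and the modification is a simple increment standing in for an UPDATE ... RETURNING. PostgreSQL itself uses a tuplestore (the holdStore field of the portal) for this purpose.

```c
#include <stddef.h>

#define TOY_STORE_CAPACITY 16

/* Toy result buffer (assumption: stands in for the portal's holdStore). */
typedef struct ToyHoldStore
{
    int     rows[TOY_STORE_CAPACITY];
    size_t  nrows;              /* number of buffered rows */
    size_t  readpos;            /* next buffered row to hand out */
} ToyHoldStore;

/*
 * Modify every row of the toy table and buffer the "RETURNING" values.
 * Nothing is handed to the caller until the whole statement has run.
 * Returns the number of rows buffered; tables larger than the buffer are
 * rejected up front (returns 0) so the operation is never left half done.
 */
size_t
toy_run_to_completion(int *table, size_t ntable, ToyHoldStore *store)
{
    store->nrows = 0;
    store->readpos = 0;
    if (ntable > TOY_STORE_CAPACITY)
        return 0;
    for (size_t i = 0; i < ntable; i++)
    {
        table[i] += 1;                              /* the data modification */
        store->rows[store->nrows++] = table[i];     /* buffer the returned row */
    }
    return store->nrows;
}

/* Hand out up to 'count' buffered rows; returns how many were copied. */
size_t
toy_fetch(ToyHoldStore *store, int *out, size_t count)
{
    size_t  n = 0;

    while (n < count && store->readpos < store->nrows)
        out[n++] = store->rows[store->readpos++];
    return n;
}
```

Because all modifications finish before the first toy_fetch call, a client that stops reading early cannot leave the statement partly executed.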
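(1) PORTAL_ONE_SELECT
Handles a single SELECT statement; calls the Executor module.
(2) PORTAL_ONE_RETURNING
Handles a single INSERT/UPDATE/DELETE statement with a RETURNING clause; calls the Executor module.
(3) PORTAL_ONE_MOD_WITH
Handles a SELECT statement whose WITH clause contains INSERT/UPDATE/DELETE. The logic is similar to RETURNING; calls the Executor module.
(4) PORTAL_UTIL_SELECT
Handles a single utility statement that returns rows the way a SELECT does (such as EXPLAIN or SHOW); calls the ProcessUtility module.
(5) PORTAL_MULTI_QUERY
Covers all remaining cases. It can be a mix of the four strategies above and can process multiple atomic operations.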
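The selection among these five strategies can be sketched as a small function. The version below is a simplified model that loosely follows the shape of ChoosePortalStrategy. The Toy* types, their fields, and toy_choose_strategy are assumptions made for this example; the real function works on lists of PlannedStmt or Query nodes and has more cases.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum ToyCmd
{
    TOY_CMD_SELECT, TOY_CMD_INSERT, TOY_CMD_UPDATE, TOY_CMD_DELETE, TOY_CMD_UTILITY
} ToyCmd;

typedef enum ToyStrategy
{
    TOY_ONE_SELECT, TOY_ONE_RETURNING, TOY_ONE_MOD_WITH,
    TOY_UTIL_SELECT, TOY_MULTI_QUERY
} ToyStrategy;

/* Toy statement (assumption: a few flags standing in for PlannedStmt fields). */
typedef struct ToyPlannedStmt
{
    ToyCmd  commandType;
    bool    canSetTag;              /* is this the user's "main" statement? */
    bool    hasReturning;           /* DML with a RETURNING clause */
    bool    hasModifyingCTE;        /* WITH clause contains INSERT/UPDATE/DELETE */
    bool    utilityReturnsTuples;   /* utility command that yields rows */
} ToyPlannedStmt;

ToyStrategy
toy_choose_strategy(const ToyPlannedStmt *stmts, size_t nstmts)
{
    size_t  nSetTag = 0;

    /* A single main statement covers the SELECT and utility cases. */
    if (nstmts == 1 && stmts[0].canSetTag)
    {
        if (stmts[0].commandType == TOY_CMD_SELECT)
            return stmts[0].hasModifyingCTE ? TOY_ONE_MOD_WITH : TOY_ONE_SELECT;
        if (stmts[0].commandType == TOY_CMD_UTILITY)
            return stmts[0].utilityReturnsTuples ? TOY_UTIL_SELECT : TOY_MULTI_QUERY;
    }

    /* ONE_RETURNING needs exactly one main statement, and it must have RETURNING. */
    for (size_t i = 0; i < nstmts; i++)
    {
        if (!stmts[i].canSetTag)
            continue;               /* auxiliary statement, e.g. added by a rule */
        if (++nSetTag > 1)
            return TOY_MULTI_QUERY;
        if (stmts[i].commandType == TOY_CMD_UTILITY || !stmts[i].hasReturning)
            return TOY_MULTI_QUERY;
    }
    return (nSetTag == 1) ? TOY_ONE_RETURNING : TOY_MULTI_QUERY;
}
```

Note that in this model a plain INSERT without RETURNING falls through to TOY_MULTI_QUERY, which matches the "all remaining cases" role of the fifth strategy.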
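The strategy selector ChoosePortalStrategy is called when the portal is started (in PortalStart) and picks one of the five strategies based on the information in the plan trees. Both the list of plan trees and the chosen strategy are stored in the PortalData structure: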
typedef struct PortalData
{
    /* Bookkeeping data */
    const char *name;           /* portal's name */
    const char *prepStmtName;   /* source prepared statement (NULL if none) */
    MemoryContext portalContext;    /* subsidiary memory for portal */
    ResourceOwner resowner;     /* resources owned by portal */
    void        (*cleanup) (Portal portal); /* cleanup hook */

    /*
     * State data for remembering which subtransaction(s) the portal was
     * created or used in.  If the portal is held over from a previous
     * transaction, both subxids are InvalidSubTransactionId.  Otherwise,
     * createSubid is the creating subxact and activeSubid is the last subxact
     * in which we ran the portal.
     */
    SubTransactionId createSubid;   /* the creating subxact */
    SubTransactionId activeSubid;   /* the last subxact with activity */
    int         createLevel;    /* creating subxact's nesting level */

    /* The query or queries the portal will execute */
    const char *sourceText;     /* text of query (as of 8.4, never NULL) */
    CommandTag  commandTag;     /* command tag for original query */
    QueryCompletion qc;         /* command completion data for executed query */
    List       *stmts;          /* list of PlannedStmts */
    CachedPlan *cplan;          /* CachedPlan, if stmts are from one */

    ParamListInfo portalParams; /* params to pass to query */
    QueryEnvironment *queryEnv; /* environment for query */

    /* Features/options */
    PortalStrategy strategy;    /* see above */
    int         cursorOptions;  /* DECLARE CURSOR option bits */
    bool        run_once;       /* portal will only be run once */

    /* Status data */
    PortalStatus status;        /* see above */
    bool        portalPinned;   /* a pinned portal can't be dropped */
    bool        autoHeld;       /* was automatically converted from pinned to
                                 * held (see HoldPinnedPortals()) */

    /* If not NULL, Executor is active; call ExecutorEnd eventually: */
    QueryDesc  *queryDesc;      /* info needed for executor invocation */

    /* If portal returns tuples, this is their tupdesc: */
    TupleDesc   tupDesc;        /* descriptor for result tuples */
    /* and these are the format codes to use for the columns: */
    int16      *formats;        /* a format code for each column */

    /*
     * Outermost ActiveSnapshot for execution of the portal's queries.  For
     * all but a few utility commands, we require such a snapshot to exist.
     * This ensures that TOAST references in query results can be detoasted,
     * and helps to reduce thrashing of the process's exposed xmin.
     */
    Snapshot    portalSnapshot; /* active snapshot, or NULL if none */

    /*
     * Where we store tuples for a held cursor or a PORTAL_ONE_RETURNING or
     * PORTAL_UTIL_SELECT query.  (A cursor held past the end of its
     * transaction no longer has any active executor state.)
     */
    Tuplestorestate *holdStore; /* store for holdable cursors */
    MemoryContext holdContext;  /* memory containing holdStore */

    /*
     * Snapshot under which tuples in the holdStore were read.  We must keep a
     * reference to this snapshot if there is any possibility that the tuples
     * contain TOAST references, because releasing the snapshot could allow
     * recently-dead rows to be vacuumed away, along with any toast data
     * belonging to them.  In the case of a held cursor, we avoid needing to
     * keep such a snapshot by forcibly detoasting the data.
     */
    Snapshot    holdSnapshot;   /* registered snapshot, or NULL if none */

    /*
     * atStart, atEnd and portalPos indicate the current cursor position.
     * portalPos is zero before the first row, N after fetching N'th row of
     * query.  After we run off the end, portalPos = # of rows in query, and
     * atEnd is true.  Note that atStart implies portalPos == 0, but not the
     * reverse: we might have backed up only as far as the first row, not to
     * the start.  Also note that various code inspects atStart and atEnd, but
     * only the portal movement routines should touch portalPos.
     */
    bool        atStart;
    bool        atEnd;
    uint64      portalPos;

    /* Presentation data, primarily used by the pg_cursors system view */
    TimestampTz creation_time;  /* time at which this portal was defined */
    bool        visible;        /* include this portal in pg_cursors? */
}           PortalData;
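In PortalData, stmts is a list of PlannedStmt nodes; ChoosePortalStrategy also accepts a list of Query nodes. Both kinds of node carry a query tree, but a node that carries a query tree is not necessarily a plain SELECT: examples are a cursor declaration (utilityStmt is not NULL) and SELECT INTO (intoClause is not empty).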
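The comment on atStart, atEnd and portalPos in the structure above can be made concrete with a small model of forward fetching. This is a sketch under stated assumptions: ToyCursor, toy_cursor_init, and toy_fetch_forward are hypothetical names, the result set is reduced to a row count, and only forward movement with a positive count is modeled (backward fetches need extra endpoint adjustments that are not shown).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy cursor position (assumption: mirrors only three PortalData fields). */
typedef struct ToyCursor
{
    uint64_t    nrows;      /* total rows the toy query produces */
    bool        atStart;    /* positioned before the first row */
    bool        atEnd;      /* has run off the end of the result */
    uint64_t    portalPos;  /* N after fetching the N'th row */
} ToyCursor;

void
toy_cursor_init(ToyCursor *c, uint64_t nrows)
{
    c->nrows = nrows;
    c->atStart = true;
    c->atEnd = false;
    c->portalPos = 0;
}

/* Fetch up to 'count' rows forward (count > 0); returns rows fetched. */
uint64_t
toy_fetch_forward(ToyCursor *c, uint64_t count)
{
    uint64_t    remaining = c->atEnd ? 0 : c->nrows - c->portalPos;
    uint64_t    nprocessed = (count < remaining) ? count : remaining;

    if (nprocessed > 0)
        c->atStart = false;     /* we have moved off the start */
    if (nprocessed < count)
        c->atEnd = true;        /* fewer rows than requested: ran off the end */
    c->portalPos += nprocessed;
    return nprocessed;
}
```

With a three-row result, fetching 2 rows leaves portalPos at 2 with atEnd still false; asking for 2 more returns 1 row, sets atEnd, and leaves portalPos equal to the number of rows in the query.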
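A "clean" (empty) portal is created first with CreatePortal and then initialized in two steps:
(1) PortalDefineQuery fills in the names (such as prepStmtName), the plan trees, and so on, and sets the portal status to PORTAL_DEFINED.
(2) PortalStart chooses the strategy, performs the remaining setup, and sets the portal status to PORTAL_READY.
PortalRun is then called to execute the portal.
PortalDrop is called to clean up.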
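The create, define, start, run, drop sequence above behaves like a small state machine. The sketch below models it with toy types. The status names mirror PostgreSQL's PortalStatus values, but ToyPortal, ToyRunOutcome, and the toy_* functions are assumptions made for this example. In particular, how the status changes after a run is simplified here; in PostgreSQL it depends on the strategy.

```c
#include <stdbool.h>

typedef enum ToyPortalStatus
{
    TOY_PORTAL_NEW, TOY_PORTAL_DEFINED, TOY_PORTAL_READY,
    TOY_PORTAL_ACTIVE, TOY_PORTAL_DONE, TOY_PORTAL_FAILED
} ToyPortalStatus;

/* Hypothetical outcome of one run, supplied by the caller of the toy model. */
typedef enum ToyRunOutcome
{
    TOY_RUN_SUSPENDED,      /* more rows remain, portal can be run again */
    TOY_RUN_COMPLETED,      /* nothing left to execute */
    TOY_RUN_ERROR           /* execution failed */
} ToyRunOutcome;

typedef struct ToyPortal
{
    ToyPortalStatus status;
    bool            dropped;
} ToyPortal;

void
toy_create_portal(ToyPortal *p)
{
    p->status = TOY_PORTAL_NEW;
    p->dropped = false;
}

/* Step (1): attach the query; only legal on a brand-new portal. */
bool
toy_define_query(ToyPortal *p)
{
    if (p->status != TOY_PORTAL_NEW)
        return false;
    p->status = TOY_PORTAL_DEFINED;
    return true;
}

/* Step (2): choose the strategy and get ready; only legal once defined. */
bool
toy_portal_start(ToyPortal *p)
{
    if (p->status != TOY_PORTAL_DEFINED)
        return false;
    p->status = TOY_PORTAL_READY;
    return true;
}

/* Run the portal; it is ACTIVE while executing and then settles. */
bool
toy_portal_run(ToyPortal *p, ToyRunOutcome outcome)
{
    if (p->status != TOY_PORTAL_READY)
        return false;
    p->status = TOY_PORTAL_ACTIVE;
    if (outcome == TOY_RUN_ERROR)
        p->status = TOY_PORTAL_FAILED;
    else if (outcome == TOY_RUN_COMPLETED)
        p->status = TOY_PORTAL_DONE;
    else
        p->status = TOY_PORTAL_READY;
    return outcome != TOY_RUN_ERROR;
}

/* Clean up; an active portal must not be dropped. */
bool
toy_portal_drop(ToyPortal *p)
{
    if (p->status == TOY_PORTAL_ACTIVE)
        return false;
    p->dropped = true;
    return true;
}
```

Calling toy_portal_start before toy_define_query is rejected in this model, which reflects the fixed order of the two initialization steps.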
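Execution is organized in three layers: the top layer controls the overall flow, the middle layer controls the execution of each plan tree, and the bottom layer controls the execution of each node within a tree.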
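The three layers can be illustrated with a tiny pull-based model. In PostgreSQL the layers correspond roughly to PortalRun at the top, ExecutorRun for each plan tree, and ExecProcNode for each node. The code below does not use those APIs. ToyPlanNode and every toy_* function are hypothetical, rows are plain ints, and the only node types are a scan and a "greater than" filter.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyPlanNode ToyPlanNode;

/* Toy plan node: one struct serves both node types to keep the sketch short. */
struct ToyPlanNode
{
    bool        (*next) (ToyPlanNode *self, int *out);  /* produce one row */
    ToyPlanNode *child;     /* input node (filter only) */
    const int  *rows;       /* source rows (scan only) */
    size_t      nrows;
    size_t      pos;        /* scan position */
    int         threshold;  /* filter keeps rows greater than this */
};

/* Bottom layer: each node returns one row per call, or false when exhausted. */
static bool
toy_scan_next(ToyPlanNode *self, int *out)
{
    if (self->pos >= self->nrows)
        return false;
    *out = self->rows[self->pos++];
    return true;
}

static bool
toy_filter_next(ToyPlanNode *self, int *out)
{
    int         v;

    while (self->child->next(self->child, &v))
    {
        if (v > self->threshold)
        {
            *out = v;
            return true;
        }
    }
    return false;
}

ToyPlanNode
toy_make_scan(const int *rows, size_t nrows)
{
    ToyPlanNode n = {toy_scan_next, NULL, rows, nrows, 0, 0};

    return n;
}

ToyPlanNode
toy_make_filter(ToyPlanNode *child, int threshold)
{
    ToyPlanNode n = {toy_filter_next, child, NULL, 0, 0, threshold};

    return n;
}

/* Middle layer: drive one plan tree by pulling rows from its root. */
size_t
toy_execute_plan(ToyPlanNode *root, int *dest, size_t cap)
{
    size_t      n = 0;
    int         v;

    while (n < cap && root->next(root, &v))
        dest[n++] = v;
    return n;
}

/* Top layer: control the overall flow by running every plan tree in turn. */
size_t
toy_run_portal(ToyPlanNode **plans, size_t nplans, int *dest, size_t cap)
{
    size_t      total = 0;

    for (size_t i = 0; i < nplans; i++)
        total += toy_execute_plan(plans[i], dest + total, cap - total);
    return total;
}
```

A filter over a scan of {1, 5, 2, 8} with threshold 3 yields 5 and 8: the top layer only sees a list of plans, the middle layer only sees a root node, and the per-row work happens inside the nodes.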
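Author's email: [email protected]
If you find any errors or omissions, please point them out so we can learn from each other.
Note: do not reproduce without permission!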