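This section briefly introduces the PostgreSQL walsender process. In a streaming replication setup, walsender is essentially an ordinary backend process running on the master node. When a standby node starts, it sends a connection request to the master; the master's postmaster process accepts the request and starts this process, which connects to the standby's walreceiver process and streams WAL records to it.
After walsender has started, attaching gdb to the process shows the following call stack: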
(gdb) bt
#0 0x00007fb6e6390903 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1 0x000000000088e668 in WaitEventSetWaitBlock (set=0x10ac808, cur_timeout=29999, occurred_events=0x7ffd634441b0,
nevents=1) at latch.c:1048
#2 0x000000000088e543 in WaitEventSetWait (set=0x10ac808, timeout=29999, occurred_events=0x7ffd634441b0, nevents=1,
wait_event_info=83886092) at latch.c:1000
#3 0x000000000088dcec in WaitLatchOrSocket (latch=0x7fb6dcbfc4d4, wakeEvents=27, sock=10, timeout=29999,
wait_event_info=83886092) at latch.c:385
#4 0x000000000085405b in WalSndLoop (send_data=0x8547fe ) at walsender.c:2229
#5 0x0000000000851c93 in StartReplication (cmd=0x10ab750) at walsender.c:684
#6 0x00000000008532f0 in exec_replication_command (cmd_string=0x101dd78 "START_REPLICATION 0/5D000000 TIMELINE 16")
at walsender.c:1539
#7 0x00000000008c0170 in PostgresMain (argc=1, argv=0x1049cb8, dbname=0x1049ba8 "", username=0x1049b80 "replicator")
at postgres.c:4178
#8 0x000000000081e06c in BackendRun (port=0x103fb50) at postmaster.c:4361
#9 0x000000000081d7df in BackendStartup (port=0x103fb50) at postmaster.c:4033
#10 0x0000000000819bd9 in ServerLoop () at postmaster.c:1706
#11 0x000000000081948f in PostmasterMain (argc=1, argv=0x1018a50) at postmaster.c:1379
#12 0x0000000000742931 in main (argc=1, argv=0x1018a50) at main.c:228
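As a reading aid for frame #3, the `wakeEvents=27` argument is a bitmask of wait conditions. The sketch below decodes it, assuming the flag values found in `latch.h` of the PostgreSQL 10/11 source tree (other versions may define the bits differently). `decode_wake_events` is a hypothetical helper written for this note, not a PostgreSQL function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed flag values, as defined in latch.h of PostgreSQL 10/11. */
#define WL_LATCH_SET         (1 << 0)
#define WL_SOCKET_READABLE   (1 << 1)
#define WL_SOCKET_WRITEABLE  (1 << 2)
#define WL_TIMEOUT           (1 << 3)
#define WL_POSTMASTER_DEATH  (1 << 4)

/*
 * Hypothetical helper: write the names of all flags set in 'events'
 * into 'buf', separated by " | ". 'buflen' must be greater than zero.
 */
static const char *
decode_wake_events(int events, char *buf, size_t buflen)
{
    static const struct
    {
        int         bit;
        const char *name;
    } flags[] = {
        {WL_LATCH_SET, "WL_LATCH_SET"},
        {WL_SOCKET_READABLE, "WL_SOCKET_READABLE"},
        {WL_SOCKET_WRITEABLE, "WL_SOCKET_WRITEABLE"},
        {WL_TIMEOUT, "WL_TIMEOUT"},
        {WL_POSTMASTER_DEATH, "WL_POSTMASTER_DEATH"},
    };
    size_t i;

    buf[0] = '\0';
    for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++)
    {
        if (events & flags[i].bit)
        {
            /* Add a separator before every name except the first. */
            if (buf[0] != '\0')
                strncat(buf, " | ", buflen - strlen(buf) - 1);
            strncat(buf, flags[i].name, buflen - strlen(buf) - 1);
        }
    }
    return buf;
}
```

Under those assumed values, calling `decode_wake_events(27, buf, sizeof(buf))` with a 128-byte buffer yields `WL_LATCH_SET | WL_SOCKET_READABLE | WL_TIMEOUT | WL_POSTMASTER_DEATH`: the walsender is waiting for its latch, for input from the standby on socket 10, for the timeout to expire, or for the postmaster to die.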
This section starts with the PostgresMain function from the call stack above.
StringInfo
The StringInfoData struct holds information about an extensible string.
/*-------------------------
 * StringInfoData holds information about an extensible string.
 *      data    is the current buffer for the string (allocated with palloc).
 *      len     is the current string length. There is guaranteed to be
 *              a terminating '\0' at data[len], although this is not very
 *              useful when the string holds binary data rather than text.
 *      maxlen  is the allocated size in bytes of 'data', i.e. the maximum
 *              string size (including the terminating '\0' char) that we can
 *              currently store in 'data' without having to reallocate
 *              more space. We must always have maxlen > len.
 *      cursor  is initialized to zero by makeStringInfo or initStringInfo,
 *              but is not otherwise touched by the stringinfo.c routines.
 *              Some routines use it to scan through a StringInfo.
 *-------------------------
 */
typedef struct StringInfoData
{
    char       *data;
    int         len;
    int         maxlen;
    int         cursor;
} StringInfoData;
typedef StringInfoData *StringInfo;
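To make the `len`/`maxlen` invariant concrete, here is a minimal standalone sketch of an extensible string with the same four fields. It is not PostgreSQL's stringinfo.c: it uses malloc/realloc instead of palloc, the initial size of 16 bytes is arbitrary, integer overflow is not checked, and the names `SketchStringInfo`, `sketch_init` and `sketch_append` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same field layout as StringInfoData; renamed to mark it as a sketch. */
typedef struct SketchStringInfo
{
    char *data;     /* buffer, always '\0'-terminated at data[len] */
    int   len;      /* current string length */
    int   maxlen;   /* allocated size of data, always > len */
    int   cursor;   /* scan position, untouched by the append routine */
} SketchStringInfo;

/* Allocate a small buffer and set up an empty string. */
static void
sketch_init(SketchStringInfo *str)
{
    str->maxlen = 16;           /* arbitrary initial size for this sketch */
    str->data = malloc(str->maxlen);
    if (str->data == NULL)
        abort();
    str->data[0] = '\0';
    str->len = 0;
    str->cursor = 0;
}

/* Append a C string, doubling the buffer until the result fits. */
static void
sketch_append(SketchStringInfo *str, const char *s)
{
    int add = (int) strlen(s);
    int needed = str->len + add + 1;    /* +1 for the terminator */

    if (needed > str->maxlen)
    {
        int   newlen = str->maxlen;
        char *p;

        while (newlen < needed)
            newlen *= 2;
        p = realloc(str->data, newlen);
        if (p == NULL)
            abort();
        str->data = p;
        str->maxlen = newlen;
    }
    memcpy(str->data + str->len, s, add);
    str->len += add;
    str->data[str->len] = '\0';         /* keeps maxlen > len */
}
```

After `sketch_init(&s)`, appending "START_REPLICATION " and then "0/5D000000 TIMELINE 16" leaves `s.data` holding the full command string, with `s.len` equal to its length and `s.maxlen` strictly larger. The caller releases the buffer with `free(s.data)`.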
PostgresMain
PostgresMain is the entry point of the postgres backend main loop: every backend, interactive or otherwise, starts here.
Its main logic is as follows:
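1. Initialize local variables
2. Initialize process information, set the processing mode, and initialize the GUC parameters
3. Parse and validate the command-line arguments
4. For a walsender process, call WalSndSignals to set up signal handling; otherwise install the regular backend signal handlers
5. Initialize BlockSig/UnBlockSig/StartupBlockSig
6. If not running under the postmaster, check the data directory, change into it, create the lock file, and so on
7. Call BaseInit for basic initialization
8. Call InitProcess/InitPostgres to initialize the process
9. Reset the memory context, process library loading, set up frontend/backend message exchange, and so on
10. Initialize the memory context
11. Enter the main loop
11.1 Switch to the MessageContext memory context
11.2 Initialize the input message buffer
11.3 Tell the client that the backend is ready to run a query
11.4 Read a command
11.5 Perform the operation that matches the command type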
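Steps 11.4 and 11.5 amount to reading one message and dispatching on its type byte. The sketch below illustrates only that shape. The 'Q' (simple query) and 'X' (terminate) type bytes come from the PostgreSQL frontend/backend protocol, and the walsender branch mirrors frame #6 of the call stack, where PostgresMain calls exec_replication_command. `SketchAction` and `sketch_dispatch` are hypothetical names, and the real function handles many more message types as well as error recovery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result type: what the loop would do with one message. */
typedef enum SketchAction
{
    SKETCH_SIMPLE_QUERY,        /* run the text as an ordinary SQL query */
    SKETCH_REPLICATION_COMMAND, /* treat the text as a replication command */
    SKETCH_TERMINATE,           /* client asked to quit, or connection closed */
    SKETCH_UNSUPPORTED          /* message type not covered by this sketch */
} SketchAction;

/*
 * Map the message type byte returned by the command reader to an action.
 * Simplification: a walsender is assumed to treat every 'Q' message as a
 * replication command.
 */
static SketchAction
sketch_dispatch(int firstchar, bool am_walsender)
{
    switch (firstchar)
    {
        case 'Q':
            return am_walsender ? SKETCH_REPLICATION_COMMAND
                                : SKETCH_SIMPLE_QUERY;
        case 'X':
        case EOF:
            return SKETCH_TERMINATE;
        default:
            return SKETCH_UNSUPPORTED;
    }
}
```

In the traced session, the standby's `START_REPLICATION 0/5D000000 TIMELINE 16` message corresponds to `sketch_dispatch('Q', true)`, which selects the replication-command path.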
/* ----------------------------------------------------------------
 * PostgresMain
 *     postgres main loop -- all backends, interactive or otherwise start here
 *
 * argc/argv are the command line arguments to be used. (When being forked
 * by the postmaster, these are not the original argv array of the process.)
 * dbname is the name of the database to connect to, or NULL if the database
 * name should be extracted from the command line arguments or defaulted.
 * username is the PostgreSQL user name to be used for the session.
 * ----------------------------------------------------------------
 */
/*
 * Input:
 *   argc/argv - arguments passed in from main
 *   dbname    - database name
 *   username  - user name
 * Output:
 *   none
 */
void
PostgresMain(int argc, char *argv[],
             const char *dbname,
             const char *username)
{
    int         firstchar; // temporary: holds the type of the command that was read
    StringInfoData input_message; // extensible string buffer for the incoming message
    sigjmp_buf  local_sigjmp_buf; // jump buffer used for error recovery
    volatile bool send_ready_for_query = true;
    bool        disable_idle_in_transaction_timeout = false;
    /* Initialize startup process environment if necessary. */
    if (!IsUnderPostmaster) // standalone/bootstrap case: not started by the postmaster
        InitStandaloneProcess(argv[0]); // initialize the standalone process
    SetProcessingMode(InitProcessing); // set the processing mode to InitProcessing
    /*
     * Set default values for command-line options.
     */
    if (!IsUnderPostmaster)
        InitializeGUCOptions(); // initialize GUC parameters (GUC = Grand Unified Configuration)
    /*
     * Parse command-line options.
     */
    process_postgres_switches(argc, argv, PGC_POSTMASTER, &dbname);
    /* Must have gotten a database name, or have a default (the username) */
    if (dbname == NULL) // no database name was supplied
    {
        dbname = username; // fall back to the user name
        if (dbname == NULL) // still NULL: report a fatal error
            ereport(FATAL,
                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                     errmsg("%s: no database nor user name specified",
                            progname)));
    }
    /* Acquire configuration parameters, unless inherited from postmaster */
    if (!IsUnderPostmaster)
    {
        if (!SelectConfigFiles(userDoption, progname)) // read the conf/hba configuration files and locate the data directory
            proc_exit(1);
    }
    /*
     * Set up signal handlers and masks.
     *
     * Note that postmaster blocked all signals before forking child process,
     * so there is no race condition whereby we might receive a signal before
     * we have set up the handler.
     *
     * Also note: it's best not to use any signals that are SIG_IGNored in the
     * postmaster. If such a signal arrives before we are able to change the
     * handler to non-SIG_IGN, it'll get dropped. Instead, make a dummy
     * handler in the postmaster to reserve the signal. (Of course, this isn't
     * an issue for signals that are locally generated, such as SIGALRM and
     * SIGPIPE.)
     */
    if (am_walsender) // is this a walsender process?
        WalSndSignals(); // yes: set up the walsender signal handlers
    else // no: a regular backend
    {
        pqsignal(SIGHUP, PostgresSigHupHandler);    /* set flag to read config
                                                     * file */
        pqsignal(SIGINT, StatementCancelHandler);   /* cancel current query */
        pqsignal(SIGTERM, die); /* cancel current query and exit */
        /*
         * In a standalone backend, SIGQUIT can be generated from the keyboard
         * easily, while SIGTERM cannot, so we make both signals do die()
         * rather than quickdie().
         */
        // IsUnderPostmaster is declared with a default of false (bool IsUnderPostmaster = false)
        if (IsUnderPostmaster)
            pqsignal(SIGQUIT, quickdie);    /* hard crash time */
        else
            pqsignal(SIGQUIT, die); /* cancel current query and exit */
        InitializeTimeouts();   /* establishes SIGALRM handler */
        /*
         * Ignore failure to write to frontend. Note: if frontend closes
         * connection, we will notice it and exit cleanly when control next
         * returns to outer loop. This seems safer than forcing exit in the
         * midst of output during who-knows-what operation...
         */
        pqsignal(SIGPIPE, SIG_IGN);
        pqsignal(SIGUSR1, procsignal_sigusr1_handler);
        pqsignal(SIGUSR2, SIG_IGN);
        pqsignal(SIGFPE, FloatExceptionHandler);
        /*
         * Reset some signals that are accepted by postmaster but not by
         * backend
         */
        pqsignal(SIGCHLD, SIG_DFL); /* system() requires this on some
                                     * platforms */
    }
    // initialize BlockSig/UnBlockSig/StartupBlockSig
    pqinitmask();//In