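For a project at work I needed to study database proxies, and while digging into how MySQL Proxy is implemented I analyzed and summarized several of its features. This article explains how MySQL Proxy implements its daemon and keepalive features.
MySQL Proxy is one implementation of a database proxy; it handles the communication between a MySQL server and MySQL clients. Because MySQL Proxy speaks the MySQL network protocol, it can be used without modification with any MySQL-compatible client that follows that protocol. In its most basic configuration, MySQL Proxy simply places itself between the server and the clients, passing queries from the client to the server and returning the server's responses to the corresponding client. In more advanced configurations, MySQL Proxy can be used to monitor and alter the communication between client and server. Query interception lets you add profiling commands as needed, and the interception can be scripted in Lua.
This article does not discuss the functional or practical merits of MySQL Proxy as a database proxy. It focuses on two features in its source code: the daemon feature and the keepalive feature.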
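Two options come up frequently when starting MySQL Proxy from the command line: --daemon and --keepalive. Each has a short description in the program's help output.
APUE (Advanced Programming in the UNIX Environment) defines a daemon as follows: daemons are processes that live for a long time. They are often started when the system is bootstrapped and terminate only when the system is shut down. Because they have no controlling terminal, they are said to run in the background.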
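First, the basic principles of implementing a daemon. There are some basic rules for coding a daemon, whose purpose is to prevent unwanted interactions (for example, with a terminal). The six coding rules given in APUE are:
1. Call umask to set the file mode creation mask to a known value, usually 0.
2. Call fork and have the parent exit.
3. Call setsid to create a new session.
4. Change the current working directory to the root directory.
5. Close file descriptors that are no longer needed.
6. Open file descriptors 0, 1, and 2 to /dev/null.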
With these rules in mind, compare them with the code in MySQL Proxy:
    /**
     * start the app in the background
     *
     * UNIX-version
     */
    void chassis_unix_daemonize(void) {
    #ifdef _WIN32
        g_assert_not_reached(); /* shouldn't be tried to be called on win32 */
    #else
    #ifdef SIGTTOU
        signal(SIGTTOU, SIG_IGN);
    #endif
    #ifdef SIGTTIN
        signal(SIGTTIN, SIG_IGN);
    #endif
    #ifdef SIGTSTP
        signal(SIGTSTP, SIG_IGN);
    #endif
        if (fork() != 0) exit(0);

        if (setsid() == -1) exit(0);

        signal(SIGHUP, SIG_IGN);

        if (fork() != 0) exit(0);

        chdir("/");

        umask(0);
    #endif
    }
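The implementation above shows the following:
1. The terminal job-control signals SIGTTOU, SIGTTIN, and SIGTSTP are ignored where the platform defines them.
2. The process forks and the parent exits; the child then calls setsid() to create a new session.
3. SIGHUP is ignored, and the process forks a second time so that the surviving process is no longer a session leader.
4. The working directory is changed to "/" and the file mode creation mask is cleared with umask(0).
5. This function does not close inherited file descriptors or redirect descriptors 0, 1, and 2 to /dev/null.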
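The six daemon coding rules above say nothing about signal handling, so what is the handling of SIGHUP for? APUE again:
If the terminal interface detects a disconnect, this signal is sent to the controlling process (the session leader) associated with that terminal. The signal is generated for this condition only if the terminal's CLOCAL flag is not set.
Unlike the signals normally generated by the terminal (interrupt, quit, and suspend), which are always delivered to the foreground process group, SIGHUP can be delivered to a session leader that is running in the background. The default action for SIGHUP is to terminate the process. The signal is commonly used to tell daemons to reread their configuration files; SIGHUP is chosen for this because a daemon has no controlling terminal and would normally never receive it.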
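From this passage, the reason for the extra signal handling is that between setsid() and the second fork() the current child is still a session leader and could be terminated if it received SIGHUP, so the signal is ignored by setting SIG_IGN. The ignored disposition is inherited across fork(), so the second child is protected as well. This matches the reason usually given for the idiom (for example in Stevens' UNIX Network Programming): the second child should not be killed by a SIGHUP generated when the first child, the session leader, exits.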
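At this point, a daemon-mode process is up and running.
Next, the keepalive feature. Put simply, the server model of MySQL Proxy is one daemon parent process plus one worker child process (which can in turn start n worker threads). The keepalive feature requires the daemon process to restart the worker child whenever it finds that the child was terminated abnormally.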
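First, the code that runs in the daemon (parent) process. Its main jobs are:
1. fork() the worker child and remember its PID.
2. Install chassis_unix_signal_forward for SIGINT, SIGTERM, SIGHUP, SIGUSR1, and SIGUSR2, which forwards the signal to the whole process group with kill(0, sig).
3. Block in wait4() (or waitpid()) until the child changes state.
4. If the child exited normally, store its exit code and return; if it died on a signal, sleep for a short time and start a new child.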
    /**
     * forward the signal to the process group, but not us
     */
    static void chassis_unix_signal_forward(int sig) {
    #ifdef _WIN32
        g_assert_not_reached(); /* shouldn't be tried to be called on win32 */
    #else
        signal(sig, SIG_IGN); /* we don't want to create a loop here */

        kill(0, sig);
    #endif
    }

    /**
     * keep the ourself alive
     *
     * if we or the child gets a SIGTERM, we quit too
     * on everything else we restart it
     */
    int chassis_unix_proc_keepalive(int *child_exit_status) {
    #ifdef _WIN32
        g_assert_not_reached(); /* shouldn't be tried to be called on win32 */
        return 0; /* for VC++, to silence a warning */
    #else
        int nprocs = 0;
        pid_t child_pid = -1;

        /* we ignore SIGINT and SIGTERM and just let it be forwarded to the child instead
         * as we want to collect its PID before we shutdown too
         *
         * the child will have to set its own signal handlers for this
         */

        for (;;) {
            /* try to start the children */
            while (nprocs < 1) {
                pid_t pid = fork();

                if (pid == 0) {
                    /* child */
                    g_debug("%s: we are the child: %d", G_STRLOC, getpid());
                    return 0;
                } else if (pid < 0) {
                    /* fork() failed */
                    g_critical("%s: fork() failed: %s (%d)", G_STRLOC, g_strerror(errno), errno);
                    return -1;
                } else {
                    /* we are the angel, let's see what the child did */
                    g_message("%s: [angel] we try to keep PID=%d alive", G_STRLOC, pid);

                    /* forward a few signals that are sent to us to the child instead */
                    signal(SIGINT, chassis_unix_signal_forward);
                    signal(SIGTERM, chassis_unix_signal_forward);
                    signal(SIGHUP, chassis_unix_signal_forward);
                    signal(SIGUSR1, chassis_unix_signal_forward);
                    signal(SIGUSR2, chassis_unix_signal_forward);

                    child_pid = pid;
                    nprocs++;
                }
            }

            if (child_pid != -1) {
                struct rusage rusage;
                int exit_status;
                pid_t exit_pid;

                g_debug("%s: waiting for %d", G_STRLOC, child_pid);
    #ifdef HAVE_WAIT4
                exit_pid = wait4(child_pid, &exit_status, 0, &rusage);
    #else
                memset(&rusage, 0, sizeof(rusage)); /* make sure everything is zero'ed out */
                exit_pid = waitpid(child_pid, &exit_status, 0);
    #endif
                g_debug("%s: %d returned: %d", G_STRLOC, child_pid, exit_pid);

                if (exit_pid == child_pid) {
                    /* our child returned, let's see how it went */
                    if (WIFEXITED(exit_status)) {
                        g_message("%s: [angel] PID=%d exited normally with exit-code = %d (it used %ld kBytes max)",
                                G_STRLOC, child_pid, WEXITSTATUS(exit_status), rusage.ru_maxrss / 1024);
                        if (child_exit_status) *child_exit_status = WEXITSTATUS(exit_status);
                        return 1;
                    } else if (WIFSIGNALED(exit_status)) {
                        int time_towait = 2;
                        /* our child died on a signal
                         *
                         * log it and restart */
                        g_critical("%s: [angel] PID=%d died on signal=%d (it used %ld kBytes max) ... waiting 3min before restart",
                                G_STRLOC, child_pid, WTERMSIG(exit_status), rusage.ru_maxrss / 1024);

                        /**
                         * to make sure we don't loop as fast as we can, sleep a bit between
                         * restarts
                         */
                        signal(SIGINT, SIG_DFL);
                        signal(SIGTERM, SIG_DFL);
                        signal(SIGHUP, SIG_DFL);
                        while (time_towait > 0) time_towait = sleep(time_towait);

                        nprocs--;
                        child_pid = -1;
                    } else if (WIFSTOPPED(exit_status)) {
                    } else {
                        g_assert_not_reached();
                    }
                } else if (-1 == exit_pid) {
                    /* EINTR is ok, all others bad */
                    if (EINTR != errno) {
                        /* how can this happen ? */
                        g_critical("%s: wait4(%d, ...) failed: %s (%d)",
                                G_STRLOC, child_pid, g_strerror(errno), errno);
                        return -1;
                    }
                } else {
                    g_assert_not_reached();
                }
            }
        }
    #endif
    }

Next, the code that runs in the worker child process. Its main job is:
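It registers handlers for SIGTERM, SIGINT, and SIGHUP through the libevent API. Using libevent's signal support means that I/O events, timer events, and signal events are all handled in the same event-driven way. As soon as the worker child detects SIGTERM or SIGINT, it sets the control variable signal_shutdown to 1, which makes the event loop terminate. SIGHUP is handled separately and only triggers log rotation.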
    void chassis_set_shutdown_location(const gchar* location) {
        if (signal_shutdown == 0)
            g_message("Initiating shutdown, requested from %s",
                    (location != NULL ? location : "signal handler"));
        signal_shutdown = 1;
    }

    gboolean chassis_is_shutdown() {
        return signal_shutdown == 1;
    }

    static void sigterm_handler(int G_GNUC_UNUSED fd, short G_GNUC_UNUSED event_type, void G_GNUC_UNUSED *_data) {
        chassis_set_shutdown_location(NULL);
    }

    static void sighup_handler(int G_GNUC_UNUSED fd, short G_GNUC_UNUSED event_type, void *_data) {
        chassis *chas = _data;

        g_message("received a SIGHUP, closing log file"); /* this should go into the old logfile */

        chassis_log_set_logrotate(chas->log);

        g_message("re-opened log file after SIGHUP"); /* ... and this into the new one */
    }
    int chassis_mainloop(void *_chas) {
        chassis *chas = _chas;
        guint i;
        struct event ev_sigterm, ev_sigint;
    #ifdef SIGHUP
        struct event ev_sighup;
    #endif
        chassis_event_thread_t *mainloop_thread;

        /* redirect logging from libevent to glib */
        event_set_log_callback(event_log_use_glib);

        /* add a event-handler for the "main" events */
        mainloop_thread = chassis_event_thread_new();
        chassis_event_threads_init_thread(chas->threads, mainloop_thread, chas);
        chassis_event_threads_add(chas->threads, mainloop_thread);

        chas->event_base = mainloop_thread->event_base; /* all global events go to the 1st thread */

        g_assert(chas->event_base);

        /* setup all plugins all plugins */
        for (i = 0; i < chas->modules->len; i++) {
            chassis_plugin *p = chas->modules->pdata[i];

            g_assert(p->apply_config);
            if (0 != p->apply_config(chas, p->config)) {
                g_critical("%s: applying config of plugin %s failed", G_STRLOC, p->name);
                return -1;
            }
        }

        /*
         * drop root privileges if requested
         */
    #ifndef _WIN32
        if (chas->user) {
            struct passwd *user_info;
            uid_t user_id= geteuid();

            /* Don't bother if we aren't superuser */
            if (user_id) {
                g_critical("can only use the --user switch if running as root");
                return -1;
            }

            if (NULL == (user_info = getpwnam(chas->user))) {
                g_critical("unknown user: %s", chas->user);
                return -1;
            }

            if (chas->log->log_filename) {
                /* chown logfile */
                if (-1 == chown(chas->log->log_filename, user_info->pw_uid, user_info->pw_gid)) {
                    g_critical("%s.%d: chown(%s) failed: %s",
                            __FILE__, __LINE__,
                            chas->log->log_filename,
                            g_strerror(errno) );
                    return -1;
                }
            }

            setgid(user_info->pw_gid);
            setuid(user_info->pw_uid);
            g_debug("now running as user: %s (%d/%d)",
                    chas->user,
                    user_info->pw_uid,
                    user_info->pw_gid );
        }
    #endif

        signal_set(&ev_sigterm, SIGTERM, sigterm_handler, NULL);
        event_base_set(chas->event_base, &ev_sigterm);
        signal_add(&ev_sigterm, NULL);

        signal_set(&ev_sigint, SIGINT, sigterm_handler, NULL);
        event_base_set(chas->event_base, &ev_sigint);
        signal_add(&ev_sigint, NULL);

    #ifdef SIGHUP
        signal_set(&ev_sighup, SIGHUP, sighup_handler, chas);
        event_base_set(chas->event_base, &ev_sighup);
        if (signal_add(&ev_sighup, NULL)) {
            g_critical("%s: signal_add(SIGHUP) failed", G_STRLOC);
        }
    #endif

        if (chas->event_thread_count < 1) chas->event_thread_count = 1;

        /* create the event-threads
         *
         * - dup the async-queue-ping-fds
         * - setup the events notification
         * */
        for (i = 1; i < (guint)chas->event_thread_count; i++) { /* we already have 1 event-thread running, the main-thread */
            chassis_event_thread_t *event_thread;

            event_thread = chassis_event_thread_new();
            chassis_event_threads_init_thread(chas->threads, event_thread, chas);
            chassis_event_threads_add(chas->threads, event_thread);
        }

        /* start the event threads */
        if (chas->event_thread_count > 1) {
            chassis_event_threads_start(chas->threads);
        }

        /**
         * handle signals and all basic events into the main-thread
         *
         * block until we are asked to shutdown
         */
        chassis_event_thread_loop(mainloop_thread);

        signal_del(&ev_sigterm);
        signal_del(&ev_sigint);
    #ifdef SIGHUP
        signal_del(&ev_sighup);
    #endif
        return 0;
    }
With the source analysis done, here are some experiments to verify it.
1. Start mysql-proxy with the keepalive feature enabled.
    [root@Betty data]# mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16766 16765 16765 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16766 16767 16765 16765 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
2. Send the INT signal to the daemon process.
    [root@Betty ~]# kill -INT 16766
3. The MySQL Proxy log shows:
    2013-03-19 19:31:38: (message) Initiating shutdown, requested from signal handler
    2013-03-19 19:31:39: (message) shutting down normally, exit code is: 0
    2013-03-19 19:31:39: (debug) chassis-unix-daemon.c:167: 16767 returned: 16767
    2013-03-19 19:31:39: (message) chassis-unix-daemon.c:176: [angel] PID=16767 exited normally with exit-code = 0 (it used 1 kBytes max)
    2013-03-19 19:31:39: (message) Initiating shutdown, requested from mysql-proxy-cli.c:606
    2013-03-19 19:31:39: (message) shutting down normally, exit code is: 0
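Both the parent and the child exit. The parent's handler, chassis_unix_signal_forward, forwards the signal to the whole process group with kill(0, sig) (both processes share PGID 16765 in the ps output). In the child, the signal handler sets the global variable signal_shutdown to 1, so the child leaves its event loop and exits with code 0. The parent, blocked in wait4()/waitpid(), sees the normal exit, stores 0 in child_exit_status, and returns, so it also exits normally.
4. Repeat the steps above, but send the INT signal to the child process instead.
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16872 16871 16871 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16872 16873 16871 16871 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -INT 16873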
The log output is as follows; apart from the timestamps and PIDs it is identical.
    2013-03-19 20:03:49: (message) Initiating shutdown, requested from signal handler
    2013-03-19 20:03:50: (message) shutting down normally, exit code is: 0
    2013-03-19 20:03:50: (debug) chassis-unix-daemon.c:167: 16873 returned: 16873
    2013-03-19 20:03:50: (message) chassis-unix-daemon.c:176: [angel] PID=16873 exited normally with exit-code = 0 (it used 1 kBytes max)
    2013-03-19 20:03:50: (message) Initiating shutdown, requested from mysql-proxy-cli.c:606
    2013-03-19 20:03:50: (message) shutting down normally, exit code is: 0
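5. The same experiment (once against the child and once against the parent) with the signal changed to -TERM gives exactly the same result, because the code handles these two signals in exactly the same way.
6. The same experiment (once against the child and once against the parent) with the signal changed to -HUP gives the following:
    2013-03-19 20:10:03: (message) received a SIGHUP, closing log file
    2013-03-19 20:10:03: (message) re-opened log file after SIGHUP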
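These messages are printed by the child's HUP signal handler. That function only sets the rotate_logs = true flag on the log and does not set signal_shutdown = 1, so the child does not exit, and neither does the parent.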
7. The same experiment with the signal changed to -KILL, sent to the child process:
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16902 16901 16901 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16902 16903 16901 16901 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -KILL 16903
The log output is:
    2013-03-19 20:09:38: (debug) chassis-unix-daemon.c:121: we are the child: 16903
    2013-03-19 20:09:38: (critical) plugin proxy 0.8.3 started
    2013-03-19 20:09:38: (debug) max open file-descriptors = 1024
    2013-03-19 20:09:38: (message) proxy listening on port 172.16.40.60:4040
    2013-03-19 20:09:38: (message) added read/write backend: 172.16.40.60:12345
    2013-03-19 20:09:38: (message) chassis-unix-daemon.c:136: [angel] we try to keep PID=16903 alive
    2013-03-19 20:09:38: (debug) chassis-unix-daemon.c:157: waiting for 16903
    ... ...
    2013-03-19 20:31:36: (debug) chassis-unix-daemon.c:167: 16903 returned: 16903
    2013-03-19 20:31:36: (critical) chassis-unix-daemon.c:189: [angel] PID=16903 died on signal=9 (it used 1 kBytes max) ... waiting 3min before restart
    2013-03-19 20:31:38: (debug) chassis-unix-daemon.c:121: we are the child: 16947
    2013-03-19 20:31:38: (critical) plugin proxy 0.8.3 started
    2013-03-19 20:31:38: (debug) max open file-descriptors = 1024
    2013-03-19 20:31:38: (message) proxy listening on port 172.16.40.60:4040
    2013-03-19 20:31:38: (message) added read/write backend: 172.16.40.60:12345
    2013-03-19 20:31:38: (message) chassis-unix-daemon.c:136: [angel] we try to keep PID=16947 alive
    2013-03-19 20:31:38: (debug) chassis-unix-daemon.c:157: waiting for 16947
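Both the log and the code explain why: the KILL signal cannot be caught or ignored, so when it is sent to the child, the child is killed and its termination status is "died on signal=9"; the parent then restarts the child. Note that although the log message says "waiting 3min before restart", the code sets time_towait = 2, and the timestamps confirm that the new child starts about 2 seconds later.
Checking the process list again at this point:
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16902 16901 16901 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16902 16947 16901 16901 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf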
If the KILL signal is sent to the parent instead, the parent is killed outright and the child is adopted by init. The init process knows nothing about keeping the child alive, so if KILL is then sent to the child, the child is killed and is not restarted.
8. The same experiment with the signal changed to -STOP, sent first to the child and then to the parent (each followed by -CONT):
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16977 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -STOP 16978
        1 16977 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 T        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -CONT 16978
        1 16977 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -STOP 16977
        1 16977 16976 16976 ?           -1 T        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -CONT 16977
        1 16977 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
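The reason for this result is that the STOP signal also cannot be caught or ignored, and its default action is to suspend the process (visible in the STAT column, which changes from S to T). In the code, the parent calls wait4()/waitpid() with options set to 0, that is, without WUNTRACED, so the call does not return when the child is merely stopped and the parent simply keeps waiting. Even if a stopped status were reported, the WIFSTOPPED branch is empty and the loop would just call the wait function again.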
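【Summary】
The daemon and keepalive features are problems that come up again and again in server development, and this article has shown one way of implementing them. Reading open source code is a chance to see classic solutions to such problems, and understanding them in depth helps fill in and reinforce your own knowledge. To close with a well-known saying: in front of the source code, there are no secrets. Have fun!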
====================================
Here are two more daemonize implementations for comparison (the following one is from memcached-1.4.14):
    int daemonize(int nochdir, int noclose)
    {
        int fd;

        switch (fork()) {
        case -1:
            return (-1);
        case 0:
            break;
        default:
            _exit(EXIT_SUCCESS);
        }

        if (setsid() == -1)
            return (-1);

        if (nochdir == 0) {
            if(chdir("/") != 0) {
                perror("chdir");
                return (-1);
            }
        }

        if (noclose == 0 && (fd = open("/dev/null", O_RDWR, 0)) != -1) {
            if(dup2(fd, STDIN_FILENO) < 0) {
                perror("dup2 stdin");
                return (-1);
            }
            if(dup2(fd, STDOUT_FILENO) < 0) {
                perror("dup2 stdout");
                return (-1);
            }
            if(dup2(fd, STDERR_FILENO) < 0) {
                perror("dup2 stderr");
                return (-1);
            }

            if (fd > STDERR_FILENO) {
                if(close(fd) < 0) {
                    perror("close");
                    return (-1);
                }
            }
        }
        return (0);
    }

(The following code is from Twemproxy.)
    static rstatus_t
    nc_daemonize(int dump_core)
    {
        rstatus_t status;
        pid_t pid, sid;
        int fd;

        pid = fork();
        switch (pid) {
        case -1:
            log_error("fork() failed: %s", strerror(errno));
            return NC_ERROR;

        case 0:
            break;

        default:
            /* parent terminates */
            _exit(0);
        }

        /* 1st child continues and becomes the session leader */

        sid = setsid();
        if (sid < 0) {
            log_error("setsid() failed: %s", strerror(errno));
            return NC_ERROR;
        }

        if (signal(SIGHUP, SIG_IGN) == SIG_ERR) {
            log_error("signal(SIGHUP, SIG_IGN) failed: %s", strerror(errno));
            return NC_ERROR;
        }

        pid = fork();
        switch (pid) {
        case -1:
            log_error("fork() failed: %s", strerror(errno));
            return NC_ERROR;

        case 0:
            break;

        default:
            /* 1st child terminates */
            _exit(0);
        }

        /* 2nd child continues */

        /* change working directory */
        if (dump_core == 0) {
            status = chdir("/");
            if (status < 0) {
                log_error("chdir(\"/\") failed: %s", strerror(errno));
                return NC_ERROR;
            }
        }

        /* clear file mode creation mask */
        umask(0);

        /* redirect stdin, stdout and stderr to "/dev/null" */

        fd = open("/dev/null", O_RDWR);
        if (fd < 0) {
            log_error("open(\"/dev/null\") failed: %s", strerror(errno));
            return NC_ERROR;
        }

        status = dup2(fd, STDIN_FILENO);
        if (status < 0) {
            log_error("dup2(%d, STDIN) failed: %s", fd, strerror(errno));
            close(fd);
            return NC_ERROR;
        }

        status = dup2(fd, STDOUT_FILENO);
        if (status < 0) {
            log_error("dup2(%d, STDOUT) failed: %s", fd, strerror(errno));
            close(fd);
            return NC_ERROR;
        }

        status = dup2(fd, STDERR_FILENO);
        if (status < 0) {
            log_error("dup2(%d, STDERR) failed: %s", fd, strerror(errno));
            close(fd);
            return NC_ERROR;
        }

        if (fd > STDERR_FILENO) {
            status = close(fd);
            if (status < 0) {
                log_error("close(%d) failed: %s", fd, strerror(errno));
                return NC_ERROR;
            }
        }

        return NC_OK;
    }