RGW Source Code Walkthrough
Version: tag v16.0.0
RGW code entry point:
Configuration options:
common->options.cc
radosgw.cc:
int main(int argc, char **argv)
{
return radosgw_Main(argc, const_cast<const char **>(argv));
}
RGW configuration parsing and startup
rgw_main.cc:
int radosgw_Main(int argc, const char **argv)
:
global_pre_init->common_preinit:
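- Initializes the global CephContext (cct), which mainly holds the RGW configuration
- Initializes and starts the log thread. # done in the cct constructor
- Initializes admin_socket, initializes the runtime-changeable options, and registers the commands that change them. # done in the cct constructor
- global_init_set_globals(cct): sets the global variable g_ceph_context
- Parses the configuration sources, such as environment variables and command-line arguments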
global_init:
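- static bool first_run = true: a static variable ensures that global_init runs only once
- int siglist[] = { SIGPIPE, 0 };
block_signals(siglist, NULL);
SIGPIPE has to be blocked explicitly: after a client disconnects abnormally, a write on the server side raises SIGPIPE, which would otherwise terminate the service process.
- Registers signal handlers; every signal handled here produces a core dump by default
install_sighandler(signum, handle_fatal_signal, SA_RESETHAND | SA_NODEFER); see signal_handler.cc
The following signals are handled:
- SIGSEGV: segmentation fault, e.g. buffer overflow, stack overflow, illegal file access; default action is core dump
- SIGABRT: raised by abort(); default action is core dump
- SIGBUS: hardware error, malloc allocation failure, misaligned address, or access to a region with no file content behind it, such as memory past the end of the file; default action is core dump
- SIGILL: illegal instruction; default action is core dump
- SIGFPE: arithmetic exception, such as overflow or division by zero; default action is core dump
- SIGXCPU: CPU time limit exceeded; default action is core dump
- SIGXFSZ: file size limit exceeded; default action is core dump
- SIGSYS: invalid system call; default action is core dump
- SIGKILL and SIGSTOP cannot be caught, blocked, or ignored, so they are not handled
- Sets the user and group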
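install_sighandler installs a signal handler:
flags = SA_RESETHAND | SA_NODEFER; SA_NODEFER: the kernel does not block the signal while its handler is running; SA_RESETHAND: the disposition is reset to the default on entry to the handler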
void install_sighandler(int signum, signal_handler_t handler, int flags)
{
int ret;
struct sigaction oldact;
struct sigaction act;
memset(&act, 0, sizeof(act));
act.sa_handler = handler;
sigemptyset(&act.sa_mask);
act.sa_flags = flags;
ret = sigaction(signum, &act, &oldact);
}
handle_fatal_signal:
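- Writes the log
- raise(signum): re-raises the signal; because SA_RESETHAND has already restored the default disposition, this produces the core dump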
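The effect of SA_RESETHAND can be observed with a harmless signal. The following is a minimal sketch, not Ceph code: `demo_handler`, `install_sighandler_sketch` and `is_default_disposition` are hypothetical names, and the handler only counts calls instead of logging and re-raising as handle_fatal_signal does.

```cpp
#include <cassert>
#include <cstring>
#include <signal.h>

// Hypothetical counter, present only so the behaviour can be observed.
static volatile sig_atomic_t g_handler_calls = 0;

// Stand-in for handle_fatal_signal. The real handler writes the log and then
// calls raise(signum) so that the default action (core dump) runs.
void demo_handler(int /*signum*/) {
  g_handler_calls = g_handler_calls + 1;
}

// Same shape as install_sighandler in the text, but returns sigaction's result.
int install_sighandler_sketch(int signum, void (*handler)(int), int flags) {
  struct sigaction act;
  std::memset(&act, 0, sizeof(act));
  act.sa_handler = handler;
  sigemptyset(&act.sa_mask);
  act.sa_flags = flags;
  return sigaction(signum, &act, nullptr);
}

// True if the current disposition of signum is the default one.
bool is_default_disposition(int signum) {
  struct sigaction cur;
  if (sigaction(signum, nullptr, &cur) != 0) return false;
  return cur.sa_handler == SIG_DFL;
}
```

After one delivery the disposition is back at SIG_DFL, which is why a second raise(signum) inside a fatal-signal handler reaches the default action rather than the handler again.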
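Parsing the HTTP frontend configuration:
multimap<string, RGWFrontendConfig *> fe_map; # key=framework, value=RGWFrontendConfig *
Taking Boost.Beast as an example, rgw_frontends="beast port=7480"
RGWFrontendConfig *config = new RGWFrontendConfig(f); # f="beast port=7480"
config->init(): parses the configuration string into framework=beast, config_map={port:7480};
fe_map.insert(pair<string, RGWFrontendConfig *>(framework, config));
This completes the parsing of the HTTP server configuration.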
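The parsing step can be illustrated with a small stand-alone sketch. It assumes the simple "framework key=value ..." layout of the example string; `FrontendConfigSketch`, `parse_frontend` and `build_fe_map` are hypothetical names, and the real RGWFrontendConfig may handle more cases than this.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for RGWFrontendConfig: a framework name plus options.
struct FrontendConfigSketch {
  std::string framework;
  std::map<std::string, std::string> config_map;
};

// Splits the spec on whitespace. The first token is the framework, every
// following "key=value" token becomes a config_map entry. A token without
// '=' is stored with an empty value.
FrontendConfigSketch parse_frontend(const std::string& spec) {
  FrontendConfigSketch cfg;
  std::istringstream in(spec);
  if (!(in >> cfg.framework)) return cfg;  // empty spec
  std::string token;
  while (in >> token) {
    const auto eq = token.find('=');
    if (eq == std::string::npos) {
      cfg.config_map[token] = "";
    } else {
      cfg.config_map[token.substr(0, eq)] = token.substr(eq + 1);
    }
  }
  return cfg;
}

// Several frontends may use the same framework, hence a multimap keyed by it.
std::multimap<std::string, FrontendConfigSketch> build_fe_map(
    const std::vector<std::string>& specs) {
  std::multimap<std::string, FrontendConfigSketch> fe_map;
  for (const auto& spec : specs) {
    FrontendConfigSketch cfg = parse_frontend(spec);
    fe_map.emplace(cfg.framework, cfg);
  }
  return fe_map;
}
```

A multimap is used so that two entries such as "beast port=7480" and "beast port=8080" can coexist under the same framework key.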
global_init_daemonize(g_ceph_context)
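Controlled by g_conf()->daemonize, which defaults to true.
- global_init_prefork(cct): writes the process pid to the pidfile and stops the log thread
- daemon(1, 1): C library function that runs the process as a daemon; the two arguments keep the working directory and leave the standard streams open;
- global_init_postfork_start(cct): restarts the log thread, writes the pidfile, and calls reopen_as_null(cct, STDIN_FILENO) to redirect standard input to /dev/null
- global_init_postfork_finish(cct): reopen_as_null(cct, STDERR_FILENO) and reopen_as_null(cct, STDOUT_FILENO) redirect standard error and standard output to /dev/null
This part sets RGW up to run as a daemon.
Basic configuration ends here.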
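Redirecting a descriptor to /dev/null is usually done with open plus dup2. The sketch below shows that technique under the assumption that reopen_as_null works along these lines; `reopen_as_null_sketch` is a hypothetical name and takes no CephContext.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Points `fd` at /dev/null. Returns 0 on success and -1 on failure
// (errno is left as set by open or dup2).
int reopen_as_null_sketch(int fd) {
  const int null_fd = open("/dev/null", O_RDWR | O_CLOEXEC);
  if (null_fd < 0) return -1;
  if (null_fd == fd) return 0;  // fd was closed and open() reused its number
  // dup2 closes whatever fd referred to and makes it a copy of null_fd.
  const int r = dup2(null_fd, fd);
  close(null_fd);
  return r < 0 ? -1 : 0;
}
```

Because dup2 replaces the descriptor atomically, code that keeps writing to the old descriptor number continues to work; the data is simply discarded.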
Starting the HTTP server
Set a timeout for RGW initialization; for the design of the timer, see the separate article "Timer".
ceph::mutex mutex = ceph::make_mutex("main");
SafeTimer init_timer(g_ceph_context, mutex);
init_timer.init();
mutex.lock();
init_timer.add_event_after(g_conf()->rgw_init_timeout, new C_InitTimeout);
mutex.unlock();
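The idea behind the init timeout is a one-shot watchdog: if initialization does not finish in time, a callback fires. The sketch below shows that idea with the standard library only. `InitWatchdog` is a hypothetical class and does not reproduce SafeTimer; cancelling the watchdog once initialization has finished is an assumption of this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical one-shot watchdog: runs on_timeout unless cancel() comes first.
class InitWatchdog {
 public:
  InitWatchdog(std::chrono::milliseconds timeout, std::function<void()> on_timeout)
      : worker_([this, timeout, cb = std::move(on_timeout)] {
          std::unique_lock<std::mutex> lock(mutex_);
          // wait_for returns false when the timeout elapsed and cancelled_
          // is still false, i.e. nobody cancelled in time.
          if (!cv_.wait_for(lock, timeout, [this] { return cancelled_; })) {
            lock.unlock();
            cb();
          }
        }) {}

  ~InitWatchdog() {
    cancel();
    if (worker_.joinable()) worker_.join();
  }

  // Called once initialization has finished in time.
  void cancel() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      cancelled_ = true;
    }
    cv_.notify_all();
  }

 private:
  // Declared before worker_ so they exist when the thread starts running.
  std::mutex mutex_;
  std::condition_variable cv_;
  bool cancelled_ = false;
  std::thread worker_;
};
```

In a daemon the callback would typically log an error and exit, so that a hung startup does not leave a half-initialized process behind.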
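With the basic configuration done, a few checks follow, including whether logging has started, starting the service thread, and checking admin_socket:
common_init_finish(g_ceph_context):
1. Registering asynchronous signal handling
Register the asynchronous signal handlers: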
init_async_signal_handler();
register_async_signal_handler(SIGHUP, sighup_handler);
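- Initializes a SignalHandler and starts a new thread that listens on pipefd
- Registers the handler function; handlers run on that separate thread rather than in signal context
Implementation of asynchronous signal handlers
See the article "Asynchronous signal handlers".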
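The pipe-plus-thread approach is commonly known as the self-pipe trick. The following is a minimal sketch of it, assuming a single global pipe and a single registered signal; `AsyncSignalHandler` and `forward_to_pipe` are hypothetical names and the structure is much simpler than Ceph's SignalHandler.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstring>
#include <functional>
#include <signal.h>
#include <thread>
#include <unistd.h>

namespace sketch {

int g_pipe[2] = {-1, -1};  // one global pipe: one instance at a time

// Runs in signal context, so only async-signal-safe calls such as write().
void forward_to_pipe(int signum) {
  const unsigned char b = static_cast<unsigned char>(signum);
  const ssize_t ignored = write(g_pipe[1], &b, 1);
  (void)ignored;
}

class AsyncSignalHandler {
 public:
  AsyncSignalHandler(int signum, std::function<void(int)> callback)
      : signum_(signum) {
    const int r = pipe(g_pipe);
    assert(r == 0);
    (void)r;
    // The worker runs the callback in normal thread context, where locks,
    // allocation and logging are allowed.
    worker_ = std::thread([cb = std::move(callback)] {
      for (;;) {
        unsigned char b = 0;
        const ssize_t n = read(g_pipe[0], &b, 1);
        if (n < 0 && errno == EINTR) continue;
        if (n != 1 || b == 0) break;  // byte 0 is the shutdown sentinel
        cb(b);
      }
    });
    struct sigaction act;
    std::memset(&act, 0, sizeof(act));
    act.sa_handler = forward_to_pipe;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESTART;
    sigaction(signum_, &act, &old_);
  }

  ~AsyncSignalHandler() {
    sigaction(signum_, &old_, nullptr);  // restore the previous disposition
    const unsigned char stop = 0;
    const ssize_t ignored = write(g_pipe[1], &stop, 1);
    (void)ignored;
    worker_.join();
    close(g_pipe[0]);
    close(g_pipe[1]);
  }

 private:
  int signum_;
  struct sigaction old_;
  std::thread worker_;
};

}  // namespace sketch
```

The real signal handler does nothing but write one byte, which keeps it async-signal-safe; all actual work happens on the worker thread.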
Read /etc/mime.types, which is used mainly to set the Content-Type from the file extension
rgw_tools_init(g_ceph_context)
:
/*
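* Set up DNS resolution with the handle rgw_resolver = new RGWResolver(); RGWResolver->DNSResolver
* Set up the RGWCurlHandles handle: static RGWCurlHandles *handles; handles keeps multiple RGWCurlHandle objects in a vector, handing them out and reclaiming them on demand
RGWCurlHandle definition: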
struct RGWCurlHandle {
int uses;
mono_time lastuse;
/*
* The libcurl handle
* Note the libcurl initialization order: curl_global_init is not thread-safe and is called once in the main thread; curl_easy_init then returns a libcurl handle
* std::call_once(curl_init_flag, curl_global_init, CURL_GLOBAL_ALL);
*/
CURL* h;
explicit RGWCurlHandle(CURL* h) : uses(0), h(h) {};
CURL* operator*() {
return this->h;
}
};
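* rgw_http_client_init: starts a thread that handles all curl requests issued by RGW (RGW acts as a client and issues curl requests concurrently)
* For how libcurl manages curl requests, see [libcurl](https://www.jianshu.com/p/2f569a42e049)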
*/
rgw_init_resolver();
rgw::curl::setup_curl(fe_map);
rgw_http_client_init(g_ceph_context);
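The "hand out and reclaim on demand" behaviour of the handle vector can be sketched without libcurl. In the sketch below `FakeHandle` stands in for CURL*, and `PooledHandle` and `HandlePool` are hypothetical names; the reuse policy (most recently released handle first) is an assumption, not taken from RGWCurlHandles.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical handle type; in RGW this slot holds a CURL*.
struct FakeHandle {
  int id;
};

// Mirrors the fields of RGWCurlHandle shown above: use count, last use, handle.
struct PooledHandle {
  int uses = 0;
  std::chrono::steady_clock::time_point lastuse;
  std::unique_ptr<FakeHandle> h;
};

class HandlePool {
 public:
  // Reuses an idle handle if one exists, otherwise creates a new one.
  std::unique_ptr<PooledHandle> get() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!idle_.empty()) {
      std::unique_ptr<PooledHandle> p = std::move(idle_.back());
      idle_.pop_back();
      return p;
    }
    auto p = std::make_unique<PooledHandle>();
    p->h = std::unique_ptr<FakeHandle>(new FakeHandle{next_id_++});
    return p;
  }

  // Returns a handle to the pool and records the usage.
  void release(std::unique_ptr<PooledHandle> p) {
    p->uses++;
    p->lastuse = std::chrono::steady_clock::now();
    std::lock_guard<std::mutex> lock(mutex_);
    idle_.push_back(std::move(p));
  }

  std::size_t idle_count() {
    std::lock_guard<std::mutex> lock(mutex_);
    return idle_.size();
  }

 private:
  std::mutex mutex_;
  std::vector<std::unique_ptr<PooledHandle>> idle_;
  int next_id_ = 0;
};
```

Keeping `uses` and `lastuse` per handle makes it possible to retire handles that have been idle for too long, which this sketch leaves out.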
2. Store initialization
Creates an RGWRadosStore and an RGWRados and registers each with the other.
/*
** rgw_enable_gc_threads/rgw_enable_lc_threads/rgw_enable_quota_threads/rgw_run_sync_thread/rgw_dynamic_resharding/rgw_cache_enabled: true
*/
rgw::sal::RGWRadosStore *store =
RGWStoreManager::get_storage(g_ceph_context,
g_conf()->rgw_enable_gc_threads,
g_conf()->rgw_enable_lc_threads,
g_conf()->rgw_enable_quota_threads,
g_conf()->rgw_run_sync_thread,
g_conf().get_val<bool>("rgw_dynamic_resharding"),
g_conf()->rgw_cache_enabled);
get_storage():
RGWRados *rados = new RGWRados;
RGWRadosStore *store = new RGWRadosStore();
store->setRados(rados);
rados->set_store(store);
(*rados).set_use_cache(use_cache)
.set_run_gc_thread(use_gc_thread)
.set_run_lc_thread(use_lc_thread)
.set_run_quota_threads(quota_threads)
.set_run_sync_thread(run_sync_thread)
.set_run_reshard_thread(run_reshard_thread)
.initialize(cct)
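The chained set_...() calls work because each setter returns a reference to the object itself. Below is a minimal sketch of that pattern together with the mutual registration of store and rados. `RadosSketch`, `StoreSketch` and `get_storage_sketch` are hypothetical names, only two of the flags are modelled, and the ownership via std::unique_ptr is a choice of this sketch.

```cpp
#include <cassert>
#include <memory>

class StoreSketch;

class RadosSketch {
 public:
  // Each setter returns *this so that calls can be chained.
  RadosSketch& set_use_cache(bool v) { use_cache_ = v; return *this; }
  RadosSketch& set_run_gc_thread(bool v) { run_gc_thread_ = v; return *this; }
  void set_store(StoreSketch* s) { store_ = s; }
  int initialize() { initialized_ = true; return 0; }

  bool use_cache() const { return use_cache_; }
  bool run_gc_thread() const { return run_gc_thread_; }
  bool initialized() const { return initialized_; }
  StoreSketch* store() const { return store_; }

 private:
  bool use_cache_ = false;
  bool run_gc_thread_ = false;
  bool initialized_ = false;
  StoreSketch* store_ = nullptr;  // non-owning back pointer
};

class StoreSketch {
 public:
  void set_rados(std::unique_ptr<RadosSketch> r) { rados_ = std::move(r); }
  RadosSketch* rados() const { return rados_.get(); }

 private:
  std::unique_ptr<RadosSketch> rados_;  // the store owns the rados object
};

// Returns nullptr if initialization fails.
std::unique_ptr<StoreSketch> get_storage_sketch(bool use_cache, bool use_gc_thread) {
  auto store = std::make_unique<StoreSketch>();
  auto rados = std::make_unique<RadosSketch>();
  rados->set_store(store.get());
  const int r = rados->set_use_cache(use_cache)
                    .set_run_gc_thread(use_gc_thread)
                    .initialize();
  if (r < 0) return nullptr;
  store->set_rados(std::move(rados));
  return store;
}
```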
RGWRadosStore holds two pointers:
- RGWRados *rados;
- RGWUserCtl *user_ctl;
RGWRados initialization:
use_cache/use_gc_thread/use_lc_thread/quota_threads/run_sync_thread/run_reshard_thread: all set to true;
rados->initialize(cct):
The main steps are:
- init_svc(false)
- init_ctl()
- host_id = svc.zone_utils->gen_host_id()
- init_rados()
- init_complete()
- init_svc(false)
/*
** svc: RGWService
** RGWService::RGWServices_def: RGWServices_def holds a pointer (std::unique_ptr) to each RGWService
** This is where each RGWService is initialized and started
** RGWService: the role of each RGWService is covered in a separate article
*/
svc.init(cct, use_cache, run_sync_thread)
->RGWService::do_init(cct, true, false, true)
->_svc.init(cct, have_cache, raw, run_sync)
The init and start sequence of the individual RGWServices follows, RGWServices_Def::init:
finisher = std::make_unique<RGWSI_Finisher>(cct);
bucket_sobj = std::make_unique<RGWSI_Bucket_SObj>(cct);
bucket_sync_sobj = std::make_unique<RGWSI_Bucket_Sync_SObj>(cct);
bi_rados = std::make_unique<RGWSI_BucketIndex_RADOS>(cct);
bilog_rados = std::make_unique<RGWSI_BILog_RADOS>(cct);
cls = std::make_unique<RGWSI_Cls>(cct);
config_key_rados = std::make_unique<RGWSI_ConfigKey_RADOS>(cct);
datalog_rados = std::make_unique<RGWSI_DataLog_RADOS>(cct);
mdlog = std::make_unique<RGWSI_MDLog>(cct, run_sync);
meta = std::make_unique<RGWSI_Meta>(cct);
meta_be_sobj = std::make_unique<RGWSI_MetaBackend_SObj>(cct);
meta_be_otp = std::make_unique<RGWSI_MetaBackend_OTP>(cct);
notify = std::make_unique<RGWSI_Notify>(cct);
otp = std::make_unique<RGWSI_OTP>(cct);
rados = std::make_unique<RGWSI_RADOS>(cct);
zone = std::make_unique<RGWSI_Zone>(cct);
zone_utils = std::make_unique<RGWSI_ZoneUtils>(cct);
quota = std::make_unique<RGWSI_Quota>(cct);
sync_modules = std::make_unique<RGWSI_SyncModules>(cct);
sysobj = std::make_unique<RGWSI_SysObj>(cct);
sysobj_core = std::make_unique<RGWSI_SysObj_Core>(cct);
user_rados = std::make_unique<RGWSI_User_RADOS>(cct);
sysobj_cache = std::make_unique<RGWSI_SysObj_Cache>(cct);
/*
** Call the init function of each RGWService
*/
vector<RGWSI_MetaBackend *> meta_bes{meta_be_sobj.get(), meta_be_otp.get()};
finisher->init();
bi_rados->init(zone.get(), rados.get(), bilog_rados.get(), datalog_rados.get());
bilog_rados->init(bi_rados.get());
bucket_sobj->init(zone.get(), sysobj.get(), sysobj_cache.get(),
bi_rados.get(), meta.get(), meta_be_sobj.get(),
sync_modules.get(), bucket_sync_sobj.get());
bucket_sync_sobj->init(zone.get(), sysobj.get(), sysobj_cache.get(), bucket_sobj.get());
cls->init(zone.get(), rados.get());
config_key_rados->init(rados.get());
datalog_rados->init(zone.get(), cls.get());
mdlog->init(rados.get(), zone.get(), sysobj.get(), cls.get());
meta->init(sysobj.get(), mdlog.get(), meta_bes);
meta_be_sobj->init(sysobj.get(), mdlog.get());
meta_be_otp->init(sysobj.get(), mdlog.get(), cls.get());
notify->init(zone.get(), rados.get(), finisher.get());
otp->init(zone.get(), meta.get(), meta_be_otp.get());
rados->init();
zone->init(sysobj.get(), rados.get(), sync_modules.get(), bucket_sync_sobj.get());
zone_utils->init(rados.get(), zone.get());
quota->init(zone.get());
sync_modules->init(zone.get());
sysobj_core->core_init(rados.get(), zone.get());
/*
** Use the cache
*/
sysobj_cache->init(rados.get(), zone.get(), notify.get());
sysobj->init(rados.get(), sysobj_cache.get());
user_rados->init(rados.get(), zone.get(), sysobj.get(), sysobj_cache.get(),
meta.get(), meta_be_sobj.get(), sync_modules.get());
can_shutdown = true;
/*
** Start each RGWService below
*/
finisher->start();
notify->start();
rados->start();
r = zone->start();
mdlog->start();
sync_modules->start();
cls->start();
config_key_rados->start();
zone_utils->start();
quota->start();
sysobj_core->start();
/*
** Cache
*/
sysobj_cache->start();
sysobj->start();
datalog_rados->start();
meta_be_sobj->start();
meta->start();
bucket_sobj->start();
bucket_sync_sobj->start();
user_rados->start();
otp->start();
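The listing separates two phases: init() only wires pointers between services, and start() brings them up. A minimal sketch of such a two-phase scheme follows. `ServiceSketch` and `DependentServiceSketch` are hypothetical names; making start() idempotent and letting a service start its dependency first are assumptions of this sketch rather than statements about every RGWService.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class ServiceSketch {
 public:
  ServiceSketch(std::string name, std::vector<std::string>* start_log)
      : name_(std::move(name)), start_log_(start_log) {}
  virtual ~ServiceSketch() = default;

  // Idempotent: only the first call runs do_start().
  int start() {
    if (state_ != State::Init) return 0;
    state_ = State::Starting;
    const int r = do_start();
    if (r < 0) return r;
    state_ = State::Started;
    return 0;
  }

  bool started() const { return state_ == State::Started; }

 protected:
  // Records the start order so that it can be inspected afterwards.
  virtual int do_start() {
    start_log_->push_back(name_);
    return 0;
  }

 private:
  enum class State { Init, Starting, Started };
  State state_ = State::Init;
  std::string name_;
  std::vector<std::string>* start_log_;
};

class DependentServiceSketch : public ServiceSketch {
 public:
  using ServiceSketch::ServiceSketch;

  // Phase 1: wiring only, nothing is started here.
  void init(ServiceSketch* dependency) { dependency_ = dependency; }

 protected:
  // Phase 2: make sure the dependency is up, then start this service.
  int do_start() override {
    assert(dependency_ != nullptr);  // init() must run before start()
    const int r = dependency_->start();
    if (r < 0) return r;
    return ServiceSketch::do_start();
  }

 private:
  ServiceSketch* dependency_ = nullptr;
};
```

With an idempotent start(), the explicit start sequence in the listing stays safe even when a service has already been started on behalf of another one.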
- init_ctl
/*
** Initializes the metadata handlers
** ctl: RGWCtl
** pctl: RGWCtl*, =(&ctl)
**
** RGWCtl:
** CephContext *cct
** RGWServices *svc
** RGWCtlDef _ctl
** RGWUserCtl *user
** RGWBucketCtl *bucket
** RGWOTPCtl *otp
**
** struct _meta {
** RGWMetadataManager *mgr{nullptr};
** RGWMetadataHandler *bucket{nullptr};
** RGWMetadataHandler *bucket_instance{nullptr};
** RGWMetadataHandler *user{nullptr};
** RGWMetadataHandler *otp{nullptr};
** } meta;
**
** RGWCtlDef: likewise wraps smart pointers to each RGWCtl and MetaHandler
** The pointers in RGWCtl and RGWCtlDef refer to the same instances
*/
ctl.init(&svc):
svc = _svc;
cct = svc->cct;
_ctl.init(*svc);
... /* take the raw pointers from the smart pointers in _ctl and assign them to the pointers in RGWCtl */
/*
** attach performs the registration: meta.mgr->register_handler(meta.user);
** RGWMetadataManager: handlers[meta.user->get_type()] = meta.user;
*/
meta.user->attach(meta.mgr);
meta.bucket->attach(meta.mgr);
meta.bucket_instance->attach(meta.mgr);
meta.otp->attach(meta.mgr);
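The attach step amounts to a type-keyed registry. The sketch below illustrates it with hypothetical classes `MetadataManagerSketch` and `MetadataHandlerSketch`; rejecting a duplicate type with -1 is an assumption of the sketch, not the behaviour of RGWMetadataManager.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

class MetadataManagerSketch;

class MetadataHandlerSketch {
 public:
  explicit MetadataHandlerSketch(std::string type) : type_(std::move(type)) {}
  const std::string& get_type() const { return type_; }
  // Registers this handler with the manager under its own type name.
  int attach(MetadataManagerSketch* mgr);

 private:
  std::string type_;
};

class MetadataManagerSketch {
 public:
  // Equivalent of handlers[handler->get_type()] = handler, except that an
  // already registered type is rejected with -1 (hypothetical error code).
  int register_handler(MetadataHandlerSketch* handler) {
    const bool inserted = handlers_.emplace(handler->get_type(), handler).second;
    return inserted ? 0 : -1;
  }

  // Looks a handler up by type, e.g. "user" or "bucket"; nullptr if unknown.
  MetadataHandlerSketch* find(const std::string& type) const {
    const auto it = handlers_.find(type);
    return it == handlers_.end() ? nullptr : it->second;
  }

 private:
  std::map<std::string, MetadataHandlerSketch*> handlers_;  // non-owning
};

inline int MetadataHandlerSketch::attach(MetadataManagerSketch* mgr) {
  return mgr->register_handler(this);
}
```

Once every handler is attached, a metadata request only needs the section name to find the object responsible for it.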
_ctl.init(*svc):
/*
** Initialization of each RGWCtl and MetaHandler
**
*/
/*
** Register the metadata handler for user
*/
meta.mgr.reset(new RGWMetadataManager(svc.meta));
meta.user.reset(RGWUserMetaHandlerAllocator::alloc(svc.user));
/*
** Register the metadata handlers for bucket and bucket_instance; these are created through the meta handler allocators of sync_module
*/
auto sync_module = svc.sync_modules->get_sync_module();
/*
** meta.bucket: new RGWBucketMetadataHandler
** meta.bucket_instance: new RGWBucketInstanceMetadataHandler
** meta.otp: new RGWOTPMetadataHandler
*/
meta.bucket.reset(sync_module->alloc_bucket_meta_handler());
meta.bucket_instance.reset(sync_module->alloc_bucket_instance_meta_handler());
meta.otp.reset(RGWOTPMetaHandlerAllocator::alloc());
user.reset(new RGWUserCtl(svc.zone, svc.user, (RGWUserMetadataHandler *)meta.user.get()));
bucket.reset(new RGWBucketCtl(svc.zone, svc.bucket, svc.bucket_sync, svc.bi));
otp.reset(new RGWOTPCtl(svc.zone, svc.otp));
RGWBucketMetadataHandlerBase *bucket_meta_handler = static_cast<RGWBucketMetadataHandlerBase *>(meta.bucket.get());
RGWBucketInstanceMetadataHandlerBase *bi_meta_handler = static_cast<RGWBucketInstanceMetadataHandlerBase *>(meta.bucket_instance.get());
bucket_meta_handler->init(svc.bucket, bucket.get());
bi_meta_handler->init(svc.zone, svc.bucket, svc.bi);
RGWOTPMetadataHandlerBase *otp_handler = static_cast<RGWOTPMetadataHandlerBase *>(meta.otp.get());
otp_handler->init(svc.zone, svc.meta_be_otp, svc.otp);
user->init(bucket.get());
bucket->init(user.get(), (RGWBucketMetadataHandler *)bucket_meta_handler, (RGWBucketInstanceMetadataHandler *)bi_meta_handler, svc.datalog_rados->get_log());
otp->init((RGWOTPMetadataHandler *)meta.otp.get());
- svc.zone_utils->gen_host_id()
Builds host_id: "instance_id+zone_name+zonegroup_name"
- init_rados
/*
** Initializes librados::Rados and connects to the RADOS cluster
** Registers the cr dump command
*/
rados.init_with_context(cct);
rados.connect();
auto crs = std::unique_ptr<RGWCoroutinesManagerRegistry>{new RGWCoroutinesManagerRegistry(cct)};
crs->hook_to_admin_command("cr dump");
cr_registry = crs.release();
- init_complete
The flow is as follows:
/*
** Creates a sync module instance; a shared_ptr is returned
*/
sync_module = svc.sync_modules->get_sync_module();
/*
** Opens several control-plane pools, mainly binding an ioctx to each pool: rados->ioctx_create(pool.name.c_str(), ioctx);
** A pool that does not exist yet is created automatically
*/
open_root_pool_ctx();
open_gc_pool_ctx();
open_lc_pool_ctx();
open_objexp_pool_ctx();
open_reshard_pool_ctx();
pools_initialized = true;
/*
** Start the GC thread
*/
gc = new RGWGC();
gc->initialize(cct, this);
obj_expirer = new RGWObjectExpirer(this->store);
if (use_gc_thread) {
gc->start_processor();
obj_expirer->start_processor();
}
/*
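** Zone data setup;
** 1. Decide whether sync is needed, depending on whether there is a master zone
** 2. On the master zone, start the RGWMetaNotifier thread
** 3. Start the RGWSyncTraceManager thread, used mainly by the radosgw-admin sync commands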
*/
auto& current_period = svc.zone->get_current_period();
auto& zonegroup = svc.zone->get_zonegroup();
auto& zone_params = svc.zone->get_zone_params();
auto& zone = svc.zone->get_zone();
if (!svc.zone->need_to_sync()) {
run_sync_thread = false;
}
if (svc.zone->is_meta_master()) {
auto md_log = svc.mdlog->get_log(current_period.get_id());
meta_notifier = new RGWMetaNotifier(this, md_log);
meta_notifier->start();
}
/* init it anyway, might run sync through radosgw-admin explicitly */
sync_tracer = new RGWSyncTraceManager(cct, cct->_conf->rgw_sync_trace_history_size);
sync_tracer->init(this);
ret = sync_tracer->hook_to_admin_command();
/*
** multisite-sync: the stage that starts the sync work
*/
...
/*
** Start the RGWDataNotifier thread, which notifies the other zones of data changes
*/
data_notifier = new RGWDataNotifier(this);
data_notifier->start();
/*
** Initialize the cache structure: binfo_cache
*/
binfo_cache = new RGWChainedCacheImpl<bucket_info_entry>;
binfo_cache->init(svc.cache);
/*
** lc thread initialization and startup:
** Mainly implements the RGW lifecycle feature
*/
lc = new RGWLC();
lc->initialize(cct, this->store);
lc->start_processor();
/*
** 1. Set up the quota threads and return the quota handler
** 2. Run the reshard thread in the master zone
*/
quota_handler = RGWQuotaHandler::generate_handler(this->store, quota_threads);
reshard_wait = std::make_shared<RGWReshardWait>();
reshard = new RGWReshard(this->store);
/* only the master zone in the zonegroup reshards buckets */
run_reshard_thread = run_reshard_thread && (zonegroup.master_zone == zone.id);
reshard->start_processor();
/*
** 1. Manages the callbacks for asynchronous Bucket Index reads and writes
** 2. Notification Manager
*/
index_completion_manager = new RGWIndexCompletionManager(this);
ret = index_completion_manager->start();
ret = rgw::notify::init(cct, store);
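Several components in this flow (gc, obj_expirer, lc, reshard) expose a start_processor() that launches a background thread. The sketch below shows one common way to build such a processor with the standard library. `ProcessorSketch` is a hypothetical class, and the loop shape (run the work, then sleep for an interval or until stopped) is an assumption, not the implementation of RGWGC or RGWLC.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

class ProcessorSketch {
 public:
  ProcessorSketch(std::chrono::milliseconds interval, std::function<void()> work)
      : interval_(interval), work_(std::move(work)) {}

  ~ProcessorSketch() { stop_processor(); }

  // Launches the background thread; does nothing if it is already running.
  void start_processor() {
    if (worker_.joinable()) return;
    stopping_ = false;
    worker_ = std::thread([this] {
      std::unique_lock<std::mutex> lock(mutex_);
      while (!stopping_) {
        lock.unlock();
        work_();  // run one pass without holding the lock
        lock.lock();
        // Sleep until the next pass, waking up early when asked to stop.
        cv_.wait_for(lock, interval_, [this] { return stopping_; });
      }
    });
  }

  // Asks the thread to stop and waits for it to finish.
  void stop_processor() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_all();
    if (worker_.joinable()) worker_.join();
  }

 private:
  std::chrono::milliseconds interval_;
  std::function<void()> work_;
  std::mutex mutex_;
  std::condition_variable cv_;
  bool stopping_ = false;
  std::thread worker_;
};
```

Constructing the object and starting the thread are separate steps, which matches the flow above: the object always exists, while a flag such as use_gc_thread decides whether the thread runs.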
3. Basic setup
/*
** 1. Enable the RGW perf counters:
** View the counters: `ceph --admin-daemon /var/run/ceph/ceph-client.rgw.v1501.rgw0.1232.93838507500048.asok perf dump rgw`
** 2. Mapping between HTTP headers and RGW attrs, mapping between HTTP codes and names, DNS hostname parsing, etc.
** 3. Initialize a usage_logger: new UsageLogger(cct, store);
*/
rgw_perf_start(g_ceph_context);
rgw_rest_init(g_ceph_context, store->svc()->zone->get_zonegroup());
rgw_log_usage_init(g_ceph_context, store->getRados());
/*
** 1. PUBSUB initialization: rgw::amqp::init(cct.get())/rgw::kafka::init(cct.get())
** 2. Parsing of the enabled apis and registration of the RGWRESTMgr instances
*/
For example, registering the S3 Mgr:
RGWREST rest;
rest.register_default_mgr(set_logging(rest_filter(store->getRados(), RGW_REST_S3, new RGWRESTMgr_S3(s3website_enabled, sts_enabled, iam_enabled, pubsub_enabled))));
/*
** 1. Register the auth strategies
** 2. Initialize a socket pair, used for shutting the process down: socketpair(AF_UNIX, SOCK_STREAM, 0, signal_fd);
** 3. Register the signal handlers
*/
auto auth_registry = rgw::auth::StrategyRegistry::create(g_ceph_context, implicit_tenant_context, store->getRados()->pctl);
signal_fd_init();
register_async_signal_handler(SIGTERM, handle_sigterm);
register_async_signal_handler(SIGINT, handle_sigterm);
register_async_signal_handler(SIGUSR1, handle_sigterm);
sighandler_alrm = signal(SIGALRM, godown_alarm);
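The socket pair gives the main thread something to block on until a shutdown is requested. The sketch below shows the mechanism under the assumption that the SIGTERM/SIGINT callback writes to one end while the main thread reads from the other; the function names with the `_sketch` suffix are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <unistd.h>

static int signal_fd[2] = {-1, -1};

// Creates the connected socket pair. Returns 0 on success, -1 on failure.
int signal_fd_init_sketch() {
  return socketpair(AF_UNIX, SOCK_STREAM, 0, signal_fd);
}

// What a shutdown callback would do: wake up whoever waits on the other end.
void signal_shutdown_sketch() {
  const int val = 0;
  const ssize_t ignored = write(signal_fd[0], &val, sizeof(val));
  (void)ignored;
}

// Blocks until signal_shutdown_sketch() has been called, then closes both
// ends. Returns 0 on a clean wake-up and -1 otherwise.
int wait_shutdown_sketch() {
  int val = 0;
  ssize_t r;
  do {
    r = read(signal_fd[1], &val, sizeof(val));
  } while (r < 0 && errno == EINTR);
  close(signal_fd[0]);
  close(signal_fd[1]);
  return r == static_cast<ssize_t>(sizeof(val)) ? 0 : -1;
}
```

The written value stays buffered in the socket, so a shutdown request that arrives before the main thread starts waiting is not lost.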
4. Starting Beast
The flow is as follows:
RGWProcessEnv env{ store, &rest, olog, port, uri_prefix, auth_registry };
fe = new RGWAsioFrontend(env, config, sched_ctx);
fe->init();
fe->run();
store->getRados()->register_to_service_map("rgw", service_map_meta);
wait_shutdown();
fe->stop();
fe->join();
delete fe;