ceph rgw: the rgw I/O path (part 2)

In the previous article I walked through the flow of rgw's main function, where fe->run() starts the frontend. This article picks up from that run() function.

rgw supports several frontends; this analysis uses civetweb, the default one.

RGWCivetWebFrontend::run

The run function is long, but most of it deals with configuration. The functional code comes down to the following few lines:

  struct mg_callbacks cb;
  memset((void *)&cb, 0, sizeof(cb));
  cb.begin_request = civetweb_callback;
  cb.log_message = rgw_civetweb_log_callback;
  cb.log_access = rgw_civetweb_log_access_callback;
  ctx = mg_start(&cb, this, options.data());

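The code is simple: it is plain civetweb usage. It first registers our own event handlers and then starts the server with mg_start (which runs in new threads).
For mg_callbacks and mg_start, see: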
https://github.com/civetweb/civetweb/blob/master/docs/api/mg_callbacks.md
https://github.com/civetweb/civetweb/blob/master/docs/api/mg_start.md

civetweb_callback

Among these, the request handler civetweb_callback is implemented as follows:

static int civetweb_callback(struct mg_connection* conn)
{
  const struct mg_request_info* const req_info = mg_get_request_info(conn);
  return static_cast<RGWCivetWebFrontend *>(req_info->user_data)->process(conn);
}

As you can see, it only fetches the parameter and forwards the call. The real handler is RGWCivetWebFrontend::process.
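The forwarding trick used by civetweb_callback (a C-style callback that recovers an object from an opaque user_data pointer) can be sketched in isolation. This is a minimal sketch and not civetweb itself: FakeRequestInfo, FakeConnection, Frontend, and dispatch_one are hypothetical stand-ins invented for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the C library's types. These are NOT civetweb's
// real definitions, only the minimum needed to show the pattern.
struct FakeRequestInfo {
  void* user_data;  // opaque pointer handed over when the server was started
};

struct FakeConnection {
  FakeRequestInfo info;
  std::string uri;
};

// C-style callback signature: no `this`, only the connection.
using BeginRequestFn = int (*)(FakeConnection*);

class Frontend {
public:
  // Member function that does the real work.
  int process(FakeConnection* conn) {
    last_uri = conn->uri;
    ++handled;
    return 1;  // non-zero means "request was handled"
  }
  int handled = 0;
  std::string last_uri;
};

// The trampoline: recover the object from user_data and forward the call.
static int frontend_callback(FakeConnection* conn) {
  assert(conn != nullptr && conn->info.user_data != nullptr);
  return static_cast<Frontend*>(conn->info.user_data)->process(conn);
}

// Simulates the library invoking the registered callback for one request.
int dispatch_one(BeginRequestFn cb, Frontend* owner, const std::string& uri) {
  FakeConnection conn{FakeRequestInfo{owner}, uri};
  return cb(&conn);
}
```

Calling dispatch_one(frontend_callback, &fe, "/bucket/key") on a Frontend fe ends up in fe.process, which is the same role that the `this` argument to mg_start plays in the real code.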

RGWCivetWebFrontend::process

int RGWCivetWebFrontend::process(struct mg_connection*  const conn)
{
  /* Hold a read lock over access to env.store for reconfiguration. */
  RWLock::RLocker lock(env.mutex);

  RGWCivetWeb cw_client(conn);
  auto real_client_io = rgw::io::add_reordering(
                          rgw::io::add_buffering(dout_context,
                            rgw::io::add_chunking(
                              rgw::io::add_conlen_controlling(
                                &cw_client))));
  RGWRestfulIO client_io(dout_context, &real_client_io);

  RGWRequest req(env.store->get_new_req_id());
  // the function that actually handles the request
  int ret = process_request(env.store, env.rest, &req, env.uri_prefix,
                            *env.auth_registry, &client_io, env.olog);
  if (ret < 0) {
    /* We don't really care about return code. */
    dout(20) << "process_request() returned " << ret << dendl;
  }

  /* Mark as processed. */
  return 1;
}
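The nested add_reordering(add_buffering(add_chunking(...))) call in process stacks I/O filters around the raw civetweb client, each layer wrapping the next. The sketch below illustrates that layering idea with two toy filters only. ClientIO, RawClient, ChunkingFilter, BufferingFilter, and make_stack are hypothetical names; the real rgw::io filters have different interfaces and do more.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical minimal interface; the real rgw::io interfaces are richer.
struct ClientIO {
  virtual ~ClientIO() = default;
  virtual void send(const std::string& data) = 0;
  virtual void complete() = 0;
};

// Innermost layer: stands in for the object that talks to the socket.
class RawClient : public ClientIO {
public:
  explicit RawClient(std::string& wire) : wire_(wire) {}
  void send(const std::string& data) override { wire_ += data; }
  void complete() override {}
private:
  std::string& wire_;
};

// Frames every send() as an HTTP/1.1 chunk and ends with the zero-size chunk.
class ChunkingFilter : public ClientIO {
public:
  explicit ChunkingFilter(std::unique_ptr<ClientIO> next) : next_(std::move(next)) {}
  void send(const std::string& data) override {
    if (data.empty()) return;  // an empty chunk would terminate the body early
    std::ostringstream head;
    head << std::hex << data.size() << "\r\n";
    next_->send(head.str() + data + "\r\n");
  }
  void complete() override {
    next_->send("0\r\n\r\n");
    next_->complete();
  }
private:
  std::unique_ptr<ClientIO> next_;
};

// Collects small writes and passes them down as one block on complete().
class BufferingFilter : public ClientIO {
public:
  explicit BufferingFilter(std::unique_ptr<ClientIO> next) : next_(std::move(next)) {}
  void send(const std::string& data) override { buf_ += data; }
  void complete() override {
    next_->send(buf_);
    buf_.clear();
    next_->complete();
  }
private:
  std::unique_ptr<ClientIO> next_;
  std::string buf_;
};

// Builds the stack outermost-first, mirroring the nesting order in process().
std::unique_ptr<ClientIO> make_stack(std::string& wire) {
  return std::make_unique<BufferingFilter>(
           std::make_unique<ChunkingFilter>(
             std::make_unique<RawClient>(wire)));
}
```

The caller only ever sees the outermost object, so layers can be added or removed without touching the code that writes the response.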

rgw_process.cc/process_request

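The process function prepares the request and the environment needed to handle it, then calls process_request to do the work. That function is fairly long, so only the key fragments are shown: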

  struct req_state rstate(g_ceph_context, &rgw_env, &userinfo);
  struct req_state *s = &rstate;
  
  ......
  
  RGWRESTMgr *mgr;
  RGWHandler_REST *handler = rest->get_handler(store, s,
    auth_registry,
    frontend_prefix,
    client_io, &mgr, &init_error);
  
  ......

  ret = rgw_process_authenticated(handler, op, req, s);
  
  ......
  
  client_io->complete_request();
  ......

RGWREST::get_handler

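process_request stores the request state and some necessary environment in the rstate object, then calls rest->get_handler to obtain the handler for the corresponding API. Note that rest here is the env.rest that was passed into process earlier, so let us trace what env.rest actually is.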

Let us go back to the main function in rgw_main.cc:

RGWREST rest;
......
if (apis_map.count("s3") > 0 || s3website_enabled) {
    if (! swift_at_root) {
        rest.register_default_mgr(set_logging(rest_filter(store, RGW_REST_S3,new RGWRESTMgr_S3(s3website_enabled))));
    } else {
        derr << "Cannot have the S3 or S3 Website enabled together with "
            << "Swift API placed in the root of hierarchy" << dendl;
        return EINVAL;
    }
}
......
RGWProcessEnv env = { store, &rest, olog, 0, uri_prefix, auth_registry };
fe = new RGWCivetWebFrontend(env, config);

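The code above makes it clear: what env.rest contains depends on which APIs are configured. The rest of this article continues the analysis of get_handler, using the S3 API as the example.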
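The registration step can be pictured as a small registry that holds one default manager plus optional managers keyed by a URI prefix. This is only a sketch under that assumption: Manager, Registry, register_prefix, and lookup are hypothetical names, and the real RGWREST and RGWRESTMgr lookup logic is more elaborate.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical manager type; it only carries a name for illustration.
struct Manager {
  std::string name;
};

class Registry {
public:
  void register_default(Manager* m) { default_ = m; }
  void register_prefix(const std::string& prefix, Manager* m) { by_prefix_[prefix] = m; }

  // Returns the manager registered for the first path segment, else the default.
  Manager* lookup(const std::string& uri) const {
    std::size_t start = (!uri.empty() && uri[0] == '/') ? 1 : 0;
    std::size_t end = uri.find('/', start);
    std::size_t len = (end == std::string::npos) ? std::string::npos : end - start;
    auto it = by_prefix_.find(uri.substr(start, len));
    return it != by_prefix_.end() ? it->second : default_;
  }

private:
  Manager* default_ = nullptr;
  std::map<std::string, Manager*> by_prefix_;
};
```

With an S3 manager registered as the default and a second manager under the prefix "swift", a request for /swift/v1/c/o resolves to the second manager, while /bucket/key falls through to S3.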

rest->get_handler (RGWHandler_REST* RGWREST::get_handler) is fairly involved, so only the key fragment is listed:

RGWRESTMgr *m = mgr.get_manager(s, frontend_prefix, s->decoded_uri,&s->relative_uri);
RGWHandler_REST* handler = m->get_handler(s, auth_registry, frontend_prefix);
return handler;

RGWRESTMgr_S3::get_handler

As you can see, it in turn calls the get_handler function of the specific API. For S3, that is RGWHandler_REST* RGWRESTMgr_S3::get_handler(..):

RGWHandler_REST* RGWRESTMgr_S3::get_handler(struct req_state* const s,
                                            const rgw::auth::StrategyRegistry& auth_registry,
                                            const std::string& frontend_prefix)
{
  // Decide between HTML and XML output based on the configuration
  bool is_s3website = enable_s3website && (s->prot_flags & RGW_REST_WEBSITE);
  int ret =
    RGWHandler_REST_S3::init_from_header(s,
                    is_s3website ? RGW_FORMAT_HTML :
                    RGW_FORMAT_XML, true);
  if (ret < 0)
    return NULL;

  RGWHandler_REST* handler;
  // HTML-based handlers
  if (is_s3website) {
    // Return a different handler depending on what the request targets
    if (s->init_state.url_bucket.empty()) {
      handler = new RGWHandler_REST_Service_S3Website(auth_registry);
    } else if (s->object.empty()) {
      handler = new RGWHandler_REST_Bucket_S3Website(auth_registry);
    } else {
      handler = new RGWHandler_REST_Obj_S3Website(auth_registry);
    }
    // XML-based handlers
  } else {
    // Return a different handler depending on what the request targets
    if (s->init_state.url_bucket.empty()) {
      handler = new RGWHandler_REST_Service_S3(auth_registry);
    } else if (s->object.empty()) {
      handler = new RGWHandler_REST_Bucket_S3(auth_registry);
    } else {
      handler = new RGWHandler_REST_Obj_S3(auth_registry);
    }
  }

  ldout(s->cct, 20) << __func__ << " handler=" << typeid(*handler).name()
            << dendl;
  return handler;
}
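Stripped of the S3 website branch, the selection rule in get_handler is: no bucket means a service-level handler, a bucket without an object means a bucket handler, and otherwise an object handler. A minimal sketch of that rule follows. ParsedRequest, parse_path, and select_handler are hypothetical helpers, and the parser assumes path-style URIs only, whereas the real init_from_header also handles other addressing styles.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class HandlerKind { Service, Bucket, Object };

// Hypothetical, trimmed-down view of the request state.
struct ParsedRequest {
  std::string bucket;
  std::string object;
};

// Splits "/bucket/key/with/slashes" into bucket and object.
// Simplification: path-style URIs only, no query string, no URL decoding.
ParsedRequest parse_path(const std::string& uri) {
  ParsedRequest r;
  std::size_t start = (!uri.empty() && uri[0] == '/') ? 1 : 0;
  std::size_t slash = uri.find('/', start);
  if (slash == std::string::npos) {
    r.bucket = uri.substr(start);
  } else {
    r.bucket = uri.substr(start, slash - start);
    r.object = uri.substr(slash + 1);
  }
  return r;
}

// Same three-way decision as the if / else if / else chain above.
HandlerKind select_handler(const ParsedRequest& r) {
  if (r.bucket.empty()) return HandlerKind::Service;
  if (r.object.empty()) return HandlerKind::Bucket;
  return HandlerKind::Object;
}
```

Under these assumptions, "/" selects the service handler, "/photos" the bucket handler, and "/photos/2020/a.jpg" the object handler.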

Back to rgw_process.cc/process_request

  struct req_state rstate(g_ceph_context, &rgw_env, &userinfo);
  struct req_state *s = &rstate;
  
  ......
  
  RGWRESTMgr *mgr;
  RGWHandler_REST *handler = rest->get_handler(store, s,
    auth_registry,
    frontend_prefix,
    client_io, &mgr, &init_error);
  
  ......
  // The code from here on is analyzed next
  ret = rgw_process_authenticated(handler, op, req, s);
  
  ......
  
  client_io->complete_request();
  ......

We have already analyzed the first part of process_request and seen how the handler is obtained.

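Once the handler is obtained, and after various parameter checks and authorization, the request is actually executed in rgw_process_authenticated. When that finishes, complete_request is called to complete the request.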

rgw_process.cc/rgw_process_authenticated

This is the execution-related code in rgw_process_authenticated:

  req->log(s, "pre-executing");
  op->pre_exec(); // build the response header and send it to the client

  req->log(s, "executing");
  op->execute(); // execute the operation

  req->log(s, "completing");
  op->complete(); // call send_response to return the result to the client
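The three calls form a fixed lifecycle that every operation goes through. The sketch below shows the shape of that lifecycle with a toy operation that records its phases. Op, ToyGetOp, and run_phases are hypothetical names, not the real RGWOp interface, which has more hooks and error handling between the phases.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical three-phase operation interface, modeled on the calls above.
struct Op {
  virtual ~Op() = default;
  virtual void pre_exec() = 0;
  virtual void execute() = 0;
  virtual void complete() = 0;
};

// Toy operation that records what each phase would do.
class ToyGetOp : public Op {
public:
  explicit ToyGetOp(std::vector<std::string>& trace) : trace_(trace) {}
  void pre_exec() override { trace_.push_back("op: send early headers"); }
  void execute() override { trace_.push_back("op: read object data"); }
  void complete() override { trace_.push_back("op: send response"); }
private:
  std::vector<std::string>& trace_;
};

// Drives the phases in a fixed order, logging a stage name before each one.
void run_phases(Op& op, std::vector<std::string>& trace) {
  trace.push_back("log: pre-executing");
  op.pre_exec();
  trace.push_back("log: executing");
  op.execute();
  trace.push_back("log: completing");
  op.complete();
}
```

Because the driver only depends on the three virtual functions, a new operation plugs in by overriding them and nothing in the driver changes.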

As for how op is obtained, a brief addendum:

op = handler->get_op(store);

get_op uses the request information to call the matching op_xxx function of the handler. For example, RGWHandler_REST_Obj_S3 implements the following operations:

  RGWOp *op_get() override;
  RGWOp *op_head() override;
  RGWOp *op_put() override;
  RGWOp *op_delete() override;
  RGWOp *op_post() override;
  RGWOp *op_options() override;

Each operation corresponds to a subclass of RGWOp, such as RGWGetObj_ObjStore_S3, RGWGetObjTags_ObjStore_S3, RGWListBucket_ObjStore_S3, and so on.
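The dispatch from request method to op_xxx factory can be sketched as follows. This is a simplified illustration under stated assumptions: it keys on the method string and returns std::unique_ptr, and Op, GetObjOp, PutObjOp, Handler, and ObjHandler are hypothetical names rather than the real RGW classes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical operation base class; name() only exists for the demonstration.
struct Op {
  virtual ~Op() = default;
  virtual std::string name() const = 0;
};
struct GetObjOp : Op {
  std::string name() const override { return "get_obj"; }
};
struct PutObjOp : Op {
  std::string name() const override { return "put_obj"; }
};

class Handler {
public:
  virtual ~Handler() = default;

  // Picks the factory method that matches the HTTP method of the request.
  std::unique_ptr<Op> get_op(const std::string& method) {
    if (method == "GET") return op_get();
    if (method == "PUT") return op_put();
    if (method == "DELETE") return op_delete();
    return nullptr;  // method not supported by this sketch
  }

protected:
  // Defaults return nullptr; concrete handlers override what they support.
  virtual std::unique_ptr<Op> op_get() { return nullptr; }
  virtual std::unique_ptr<Op> op_put() { return nullptr; }
  virtual std::unique_ptr<Op> op_delete() { return nullptr; }
};

// Object-level handler that supports GET and PUT only in this sketch.
class ObjHandler : public Handler {
protected:
  std::unique_ptr<Op> op_get() override { return std::make_unique<GetObjOp>(); }
  std::unique_ptr<Op> op_put() override { return std::make_unique<PutObjOp>(); }
};
```

A handler that does not override a factory returns nullptr for that method, so the caller can reject the request before any operation runs.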

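At this point the path from the frontend to the execution of an operation is complete, and you can go on to read whichever operation you want to study in detail. You only need to look at the execute function of the corresponding op object; pre_exec and complete are largely the same across ops, as described in the code comments.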
