Investigating a Stack Overflow

During load testing last week, a function that calls the hiredis library produced a core dump. The call stack was:

Program terminated with signal 11, Segmentation fault.
#0  0x000000000052c497 in wh::common::redis::RedisConn::HashMultiGet(std::string const&, std::vector > const&, std::map, std::allocator > >&) ()
(gdb) bt
#0  0x000000000052c497 in wh::common::redis::RedisConn::HashMultiGet(std::string const&, std::vector > const&, std::map, std::allocator > >&) ()
#1  0x00000000004cc418 in wh::server::user_activityHandler::getUserRating(wh::server::GetUserRatingResult&, std::vector > const&) ()
#2  0x00000000004e54cf in wh::server::user_activityProcessor::process_getUserRating(int, apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, void*) ()
#3  0x00000000004e3ad3 in wh::server::user_activityProcessor::dispatchCall(apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, std::string const&, int, void*) ()

The HashMultiGet code in RedisConn is as follows:

 int RedisConn::HashMultiGet(
       const string&         key,
       const vector<string>& fields,
       map<string, string>&  fvs)
 {
   if(key.empty() || fields.empty())
     return 0;

   if ( !conn_ )
   {
     LOG(LOG_ERR, "ERROR!!! conn is NULL!!!");
     return kErrConnBroken;
   }

   size_t argc = fields.size() + 2;
   const char* argv[argc]; // allocated directly on the stack
   size_t argvlen[argc];

   std::string cmd = "HMGET";
   argv[0]    = cmd.data();
   argvlen[0] = cmd.length();
   argv[1]    = key.data();
   argvlen[1] = key.length();
   size_t i = 2;
   for(vector< string >::const_iterator cit = fields.begin();
         cit != fields.end(); ++cit )
   {
     // put value into arg list
     argv[i] = cit->data();
     argvlen[i] = cit->length();
     ++i;
   }

   redisReply* reply = static_cast<redisReply*>( redisCommandArgv( conn_, argc, argv, argvlen ) );
   if ( !reply )
   {
     this->Release();
     LOG(LOG_ERR, "ERROR!!! Redis connection broken!!!");
     return kErrConnBroken;
   }

   int32_t ret = kErrOk;
   if ( reply->type != REDIS_REPLY_ARRAY )
   {
     this->CheckReply( reply );
     LOG(LOG_ERR, "RedisReply ERROR: %d %s", reply->type, reply->str);
     ret = kErrUnknown;
   }
...

The problem lies in how the arguments for the hiredis redisCommandArgv request are built: both argument arrays are allocated directly on the stack.

   const char* argv[argc]; // allocated directly on the stack
   size_t argvlen[argc];

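During the load test, HashMultiGet(key, fields, fvs) was called with more than 100,000 entries in fields, so the two arrays took about 100,000 * (8 + 8) = 1,600,000 bytes = 1.6 MB of stack (on a 64-bit system). Added to the stack space already in use, this overflowed the stack and caused the core dump.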
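The arithmetic above can be checked with a small helper. This is only a sketch: the name VlaStackBytes is made up for illustration and is not part of the original code. It mirrors argc = fields.size() + 2 from HashMultiGet and uses sizeof, so the result depends on the platform. With 8-byte pointers and an 8-byte size_t, 100,000 fields come to 1,600,032 bytes.

```cpp
#include <cstddef>

// Hypothetical helper (not in the original code): bytes that the two
// variable-length arrays argv and argvlen occupy for a given field count.
// argc = field_count + 2 accounts for the "HMGET" command and the key.
constexpr std::size_t VlaStackBytes(std::size_t field_count) {
    const std::size_t argc = field_count + 2;
    // One const char* in argv plus one size_t in argvlen per argument.
    return argc * (sizeof(const char*) + sizeof(std::size_t));
}
```

Comparing this number with the stack size of the thread that runs the handler (which may be much smaller than the main thread's limit) shows how little headroom is left.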

Why allocate the arguments on the stack at all? One possible reason: with heap allocation you have to remember to free the memory.

Solution:
Allocate argv and argvlen on the heap, since the heap is far larger than the stack.
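One way to do this without worrying about free is to let std::vector own the two arrays. The sketch below is an assumption about how the fix could look, not the code that was actually shipped. HmgetArgs and BuildHmgetArgs are hypothetical names, and the hiredis call is shown only in a comment so the example compiles on its own.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical holder (not in the original code): both arrays live on the
// heap and are released automatically when the object goes out of scope.
struct HmgetArgs {
    std::vector<const char*> argv;
    std::vector<std::size_t> argvlen;
};

// Builds the argument arrays for "HMGET key field1 field2 ...".
// The pointers refer to storage owned by `key` and `fields`, so both must
// outlive the returned object.
HmgetArgs BuildHmgetArgs(const std::string& key,
                         const std::vector<std::string>& fields) {
    static const char kCmd[] = "HMGET";
    HmgetArgs args;
    args.argv.reserve(fields.size() + 2);
    args.argvlen.reserve(fields.size() + 2);

    args.argv.push_back(kCmd);
    args.argvlen.push_back(sizeof(kCmd) - 1);  // length without the '\0'
    args.argv.push_back(key.data());
    args.argvlen.push_back(key.size());
    for (const std::string& field : fields) {
        args.argv.push_back(field.data());
        args.argvlen.push_back(field.size());
    }
    return args;
}

// Inside HashMultiGet the call would then look roughly like:
//   HmgetArgs args = BuildHmgetArgs(key, fields);
//   redisReply* reply = static_cast<redisReply*>(redisCommandArgv(
//       conn_, static_cast<int>(args.argv.size()),
//       args.argv.data(), args.argvlen.data()));
```

With this layout the stack frame holds only the two vector objects, whatever the size of fields. It also avoids variable-length arrays, which are a compiler extension rather than standard C++.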
