Linux C: resolving symbol conflicts caused by several third-party shared libraries (.so) that bundle different OpenSSL versions

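1. A strange symptom

One feature of our application (sending messages to a DingTalk group through a DingTalk group robot) uses libcurl, so the application links against libcurl. After that, a very odd problem appeared:

The build succeeded and the program ran normally, but as soon as it sent an HTTPS POST request the whole program hung. With libcurl's VERBOSE output enabled, the log stopped at "ALPN, offering http/1.1" and one CPU core stayed at 100%. A separate test project exercising the same libcurl functionality worked fine; the hang only appeared inside the application project.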

* TCP_NODELAY set
* Connected to oapi.dingtalk.com (203.119.206.75) port 443 (#0)
* ALPN, offering http/1.1

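Along the way I tried capturing packets with Wireshark, which did not reveal much: after the TCP connection to the peer was established, the hung program did nothing further. The encrypted exchange that should have followed never started.

In short, a libcurl that works when built into a standalone program hangs with one CPU core fully loaded as soon as it is integrated into the existing application.

I then ran the program under a debugger and interrupted it during the hang. Every time, the call stack looked like the one in the figure below: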

[Figure 1: call stack captured in the debugger while the program was hung]

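2. Finding the cause

2.1. Analysis

Judging from the symptom, libcurl was ending up in the wrong code when it called SSL-related functions. I was confident the problem was in the encryption path because an earlier application of ours also integrated libcurl and worked fine; the only difference was that the earlier application connected over http rather than https.

Searching online turned up the article Shared Library Symbol Conflicts (on Linux). I am borrowing one of its figures to illustrate our situation: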

[Figure 2: diagram borrowed from the article "Shared Library Symbol Conflicts (on Linux)"]

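Most likely another third-party library in the application defines symbols that collide with the ones libcurl uses. The prime suspects are the symbols seen in the call stack when the debugger interrupted the program: OPENSSL_sk_insert and the ones beginning with X509_STORE.

2.2. Confirming which library causes the conflict

Following the article mentioned above, I inspected the libraries with nm. One library in the application contains a large number of symbols beginning with OPENSSL_ and X509_STORE_, all of them defined and exported as global symbols (type T in the dynamic symbol table), which means the code is linked into the library itself. These turned out to be OpenSSL symbols: the third-party library our application already used had statically linked the whole of OpenSSL and exported every one of its symbols globally, needed or not. The dynamic linker binds an undefined symbol to the first matching definition it finds in the global lookup order, so when libcurl calls OpenSSL through dynamic linking, the calls land in the OpenSSL copy embedded in the third-party library. That copy is presumably a different version from the OpenSSL shared library installed on the system, which leads to unpredictable failures.

X509_STORE_ symbols in the third-party library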

faund@Sirius:/usr/lib$ nm -CDa libthostmduserapi_se_v6.3.15.so | grep X509_STORE_
00000000002f3720 T X509_STORE_add_cert
00000000002f35f0 T X509_STORE_add_crl
00000000002f2c20 T X509_STORE_add_lookup
00000000002dbf20 T X509_STORE_CTX_cleanup
00000000002dc3b0 T X509_STORE_CTX_free
00000000002dad70 T X509_STORE_CTX_get0_cert
00000000002dacf0 T X509_STORE_CTX_get0_chain
00000000002dad10 T X509_STORE_CTX_get0_current_crl
00000000002dad00 T X509_STORE_CTX_get0_current_issuer
00000000002daeb0 T X509_STORE_CTX_get0_param
00000000002dad20 T X509_STORE_CTX_get0_parent_ctx
00000000002dae80 T X509_STORE_CTX_get0_policy_tree
00000000002f2810 T X509_STORE_CTX_get0_store
00000000002dad80 T X509_STORE_CTX_get0_untrusted
00000000002f3310 T X509_STORE_CTX_get1_certs
00000000002db210 T X509_STORE_CTX_get1_chain
00000000002f31a0 T X509_STORE_CTX_get1_crls
00000000002f2fb0 T X509_STORE_CTX_get1_issuer
00000000002f2ed0 T X509_STORE_CTX_get_by_subject
00000000002dae30 T X509_STORE_CTX_get_cert_crl
00000000002dae20 T X509_STORE_CTX_get_check_crl
00000000002dadf0 T X509_STORE_CTX_get_check_issued
00000000002dae40 T X509_STORE_CTX_get_check_policy
00000000002dae00 T X509_STORE_CTX_get_check_revocation
00000000002dae70 T X509_STORE_CTX_get_cleanup
00000000002dacd0 T X509_STORE_CTX_get_current_cert
00000000002dac90 T X509_STORE_CTX_get_error
00000000002dacb0 T X509_STORE_CTX_get_error_depth
00000000002db230 T X509_STORE_CTX_get_ex_data
00000000002dae90 T X509_STORE_CTX_get_explicit_policy
00000000002dae10 T X509_STORE_CTX_get_get_crl
00000000002dade0 T X509_STORE_CTX_get_get_issuer
00000000002dae50 T X509_STORE_CTX_get_lookup_certs
00000000002dae60 T X509_STORE_CTX_get_lookup_crls
00000000002daea0 T X509_STORE_CTX_get_num_untrusted
00000000002f3490 T X509_STORE_CTX_get_obj_by_subject
00000000002dadd0 T X509_STORE_CTX_get_verify
00000000002dadb0 T X509_STORE_CTX_get_verify_cb
00000000002dbfc0 T X509_STORE_CTX_init
00000000002db060 T X509_STORE_CTX_new
00000000002db0b0 T X509_STORE_CTX_purpose_inherit
00000000002dad40 T X509_STORE_CTX_set0_crls
00000000002daec0 T X509_STORE_CTX_set0_dane
00000000002daed0 T X509_STORE_CTX_set0_param
00000000002dad50 T X509_STORE_CTX_set0_trusted_stack
00000000002dad90 T X509_STORE_CTX_set0_untrusted
00000000002dbee0 T X509_STORE_CTX_set0_verified_chain
00000000002dad30 T X509_STORE_CTX_set_cert
00000000002dace0 T X509_STORE_CTX_set_current_cert
00000000002daf00 T X509_STORE_CTX_set_default
00000000002daf50 T X509_STORE_CTX_set_depth
00000000002daca0 T X509_STORE_CTX_set_error
00000000002dacc0 T X509_STORE_CTX_set_error_depth
00000000002db240 T X509_STORE_CTX_set_ex_data
00000000002daf40 T X509_STORE_CTX_set_flags
00000000002db200 T X509_STORE_CTX_set_purpose
00000000002daf30 T X509_STORE_CTX_set_time
00000000002db1f0 T X509_STORE_CTX_set_trust
00000000002dadc0 T X509_STORE_CTX_set_verify
00000000002dada0 T X509_STORE_CTX_set_verify_cb
00000000002f2ca0 T X509_STORE_free
00000000002f2670 T X509_STORE_get0_objects
00000000002f2680 T X509_STORE_get0_param
00000000002f2780 T X509_STORE_get_cert_crl
00000000002f2760 T X509_STORE_get_check_crl
00000000002f2700 T X509_STORE_get_check_issued
00000000002f27a0 T X509_STORE_get_check_policy
00000000002f2720 T X509_STORE_get_check_revocation
00000000002f2800 T X509_STORE_get_cleanup
00000000002f2820 T X509_STORE_get_ex_data
00000000002f2740 T X509_STORE_get_get_crl
00000000002f26e0 T X509_STORE_get_get_issuer
00000000002f27c0 T X509_STORE_get_lookup_certs
00000000002f27e0 T X509_STORE_get_lookup_crls
00000000002f26a0 T X509_STORE_get_verify
00000000002f26c0 T X509_STORE_get_verify_cb
00000000002f28a0 T X509_STORE_lock
00000000002f2b40 T X509_STORE_new
00000000002f2840 T X509_STORE_set1_param
00000000002f2770 T X509_STORE_set_cert_crl
00000000002f2750 T X509_STORE_set_check_crl
00000000002f26f0 T X509_STORE_set_check_issued
00000000002f2790 T X509_STORE_set_check_policy
00000000002f2710 T X509_STORE_set_check_revocation
00000000002f27f0 T X509_STORE_set_cleanup
00000000002f2870 T X509_STORE_set_depth
00000000002f2830 T X509_STORE_set_ex_data
00000000002f2890 T X509_STORE_set_flags
00000000002f2730 T X509_STORE_set_get_crl
00000000002f26d0 T X509_STORE_set_get_issuer
00000000002f27b0 T X509_STORE_set_lookup_certs
00000000002f27d0 T X509_STORE_set_lookup_crls
00000000002f2860 T X509_STORE_set_purpose
00000000002f2850 T X509_STORE_set_trust
00000000002f2690 T X509_STORE_set_verify
00000000002f26b0 T X509_STORE_set_verify_cb
00000000002f28b0 T X509_STORE_unlock
00000000002f2b00 T X509_STORE_up_ref

X509_STORE_ symbols in libcurl

faund@Sirius:/usr/lib/x86_64-linux-gnu$ nm -CDa libcurl.so | grep X509_STORE_
                 U X509_STORE_add_lookup
                 U X509_STORE_set_flags

OPENSSL_ symbols in the third-party library

faund@Sirius:/usr/lib$ nm -CDa libthostmduserapi_se_v6.3.15.so | grep OPENSSL_
0000000000271490 T OPENSSL_asc2uni
00000000001d44e0 T OPENSSL_atexit
00000000001db850 T OPENSSL_atomic_add
00000000001d70c0 T OPENSSL_buf2hexstr
00000000001dba10 T OPENSSL_cleanse
00000000001d4b90 T OPENSSL_cleanup
00000000001e9ec0 T OPENSSL_config
00000000001e9fb0 T OPENSSL_die
00000000002d4c90 T OPENSSL_gmtime
00000000002d4ac0 T OPENSSL_gmtime_adj
00000000002d49f0 T OPENSSL_gmtime_diff
00000000001d6ff0 T OPENSSL_hexchar2int
00000000001d7240 T OPENSSL_hexstr2buf
00000000001db880 T OPENSSL_ia32_cpuid
00000000001dbbd0 T OPENSSL_ia32_rdrand
00000000001dbbf0 T OPENSSL_ia32_rdrand_bytes
00000000001dbc50 T OPENSSL_ia32_rdseed
00000000001dbc70 T OPENSSL_ia32_rdseed_bytes
00000000001d4520 T OPENSSL_init_crypto
0000000000244d80 T OPENSSL_INIT_free
0000000000244df0 T OPENSSL_INIT_new
0000000000244da0 T OPENSSL_INIT_set_config_appname
00000000001dbb10 T OPENSSL_instrument_bus
00000000001dbb60 T OPENSSL_instrument_bus2
00000000001e9ef0 T OPENSSL_isservice
00000000001d4f50 T OPENSSL_LH_delete
00000000001d51f0 T OPENSSL_LH_doall
00000000001d4da0 T OPENSSL_LH_doall_arg
00000000001d4ec0 T OPENSSL_LH_error
00000000001d4ed0 T OPENSSL_LH_free
00000000001d4ea0 T OPENSSL_LH_get_down_load
00000000001d5250 T OPENSSL_LH_insert
00000000001d5120 T OPENSSL_LH_new
00000000001d4e90 T OPENSSL_LH_num_items
00000000001d5480 T OPENSSL_LH_retrieve
00000000001d4eb0 T OPENSSL_LH_set_down_load
00000000001d4e10 T OPENSSL_LH_strhash
00000000002455b0 T OPENSSL_load_builtin_modules
00000000001d6f70 T OPENSSL_memcmp
0000000000867cfc B OPENSSL_NONPIC_relocated
00000000001db870 T OPENSSL_rdtsc
00000000001e9f00 T OPENSSL_showfatal
00000000002295e0 T OPENSSL_sk_deep_copy
0000000000229330 T OPENSSL_sk_delete
00000000002293f0 T OPENSSL_sk_delete_ptr
0000000000229760 T OPENSSL_sk_dup
0000000000229820 T OPENSSL_sk_find
0000000000229290 T OPENSSL_sk_find_ex
00000000002291c0 T OPENSSL_sk_free
0000000000229430 T OPENSSL_sk_insert
0000000000229170 T OPENSSL_sk_is_sorted
0000000000229550 T OPENSSL_sk_new
00000000002295d0 T OPENSSL_sk_new_null
0000000000229100 T OPENSSL_sk_num
00000000002293b0 T OPENSSL_sk_pop
0000000000229200 T OPENSSL_sk_pop_free
0000000000229540 T OPENSSL_sk_push
0000000000229140 T OPENSSL_sk_set
00000000002290e0 T OPENSSL_sk_set_cmp_func
00000000002293d0 T OPENSSL_sk_shift
0000000000229180 T OPENSSL_sk_sort
0000000000229530 T OPENSSL_sk_unshift
0000000000229110 T OPENSSL_sk_value
0000000000229260 T OPENSSL_sk_zero
00000000001d7200 T OPENSSL_strlcat
00000000001d71a0 T OPENSSL_strlcpy
00000000001d6fc0 T OPENSSL_strnlen
00000000001d4b20 T OPENSSL_thread_stop
00000000002713e0 T OPENSSL_uni2asc
0000000000271710 T OPENSSL_uni2utf8
0000000000271540 T OPENSSL_utf82uni
00000000001dbaa0 T OPENSSL_wipe_cpu

OPENSSL_ symbols in libcurl

faund@Sirius:/usr/lib/x86_64-linux-gnu$ nm -CDa libcurl.so | grep OPENSSL_
0000000000000000 A CURL_OPENSSL_4
                 U OPENSSL_load_builtin_modules
                 U OPENSSL_sk_num
                 U OPENSSL_sk_pop
                 U OPENSSL_sk_pop_free
                 U OPENSSL_sk_value
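
The nm listings show which library defines the symbols and which one only imports them. To confirm at run time which shared object a name actually binds to, something like the following sketch may help. It is not from the original investigation; it assumes glibc (dladdr and RTLD_DEFAULT are GNU extensions), and the function name library_providing is made up for illustration.

```cpp
// Build (assumption): g++ -std=c++11 which_lib.cpp -ldl
#include <dlfcn.h>   // dlsym, dladdr, RTLD_DEFAULT (GNU extensions; g++ defines _GNU_SOURCE)
#include <string>

// Hypothetical helper: returns the path of the shared object that currently
// supplies `symbol` in the global lookup scope, or an empty string if the
// symbol cannot be resolved.
std::string library_providing(const char *symbol)
{
    // RTLD_DEFAULT searches the global scope in the default lookup order.
    void *addr = dlsym(RTLD_DEFAULT, symbol);
    if (addr == nullptr)
        return std::string();

    // dladdr maps an address back to the object that contains it.
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr)
        return std::string();
    return info.dli_fname;
}
```

Called from inside the application as library_providing("OPENSSL_sk_num"), a result that names the third-party library rather than the system's libcrypto would support the diagnosis above.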

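3. Conflicting advice on how to fix it

With the cause identified, the next step was to find a fix.

We do not have the source code of the third-party library, so the approach from the article mentioned above is not available to us: rebuilding the shared library so that it exports only the necessary symbols (compile with -fvisibility=hidden and mark the functions to export with __attribute__ ((visibility ("default")))). We had to find another way.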
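
For readers who do control the library's source, the visibility approach looks roughly like the sketch below. The names MYLIB_API, internal_checksum and mylib_checksum are invented for illustration, and the build line in the comment is an assumption about a typical GCC setup.

```cpp
// Build as a shared library (assumption):
//   g++ -std=c++11 -fPIC -shared -fvisibility=hidden mylib.cpp -o libmylib.so

// Marks the few functions that should stay visible to users of the library.
#define MYLIB_API __attribute__((visibility("default")))

// Internal helper: with -fvisibility=hidden it is kept out of the dynamic
// symbol table, so it cannot collide with a same-named symbol elsewhere.
int internal_checksum(const unsigned char *data, int len)
{
    int sum = 0;
    for (int i = 0; i < len; ++i)
        sum += data[i];
    return sum;
}

// Public entry point: explicitly exported despite the hidden default.
extern "C" MYLIB_API int mylib_checksum(const unsigned char *data, int len)
{
    return internal_checksum(data, len);
}
```

After such a build, nm -D on the resulting library should list mylib_checksum but not internal_checksum, which is exactly what the third-party library in this story failed to do for its embedded OpenSSL.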

I found many suggestions:

Static and shared library symbol conflicts?

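This one suggests changing the link options so that only the required symbols are exported, and points to the GNU manual for the options. But we have no control over the third-party library that over-exports its symbols. libcurl could be rebuilt, but it is already well behaved: it exports only the few symbols it must, and it links dynamically against the system's OpenSSL shared library. So this option does not help us.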

 

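Conflicts between two libraries containing the same symbol table on Linux (original title: Linux下包含相同符号表的两个库的冲突问题(郁闷))

Resolving same-name symbol conflicts on Linux (original title: linux 下同名符号冲突问题解决方法)

Both of these are from CSDN. They suggest adding the link options -Wl,-Bsymbolic,--version-script,version when building the .so, where version is a script file that specifies which functions to export. Since we do not have the third-party library's source, this is useless to us for the same reason as the previous suggestion. Linking two shared libraries with some of the same symbols says much the same: besides the options above, it also mentions the __attribute__ ((visibility ("default"))) approach.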

 

What should I do if two libraries provide a function with the same name generating a conflict?

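mouviciel's answer suggests loading the two conflicting libraries separately with dlopen(), dlsym() and dlclose(), and closing each one with dlclose as soon as you are done with it. But in our application the third-party library has to stay loaded the whole time, so this does not work either.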
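
A minimal sketch of that load, call, unload pattern is shown below. It is not code from the answer: the helper name call_and_unload is invented, and it assumes the target function has the signature double(double), which is true for something like cos in libm.so.6 but is an assumption for any other symbol.

```cpp
// Build (assumption): g++ -std=c++11 load_call_unload.cpp -ldl
#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper: loads `library`, calls `symbol` (assumed to be a
// double(double) function) on `x`, then unloads the library again.
// Returns true on success and stores the result in *out.
bool call_and_unload(const char *library, const char *symbol, double x, double *out)
{
    void *handle = dlopen(library, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return false;
    }

    dlerror();  // clear any stale error before dlsym
    void *sym = dlsym(handle, symbol);
    const char *error = dlerror();
    if (error != nullptr) {
        std::fprintf(stderr, "dlsym failed: %s\n", error);
        dlclose(handle);
        return false;
    }

    double (*fn)(double) = reinterpret_cast<double (*)(double)>(sym);
    *out = fn(x);

    dlclose(handle);  // drop our reference to the library right after use
    return true;
}
```

For example, call_and_unload("libm.so.6", "cos", 0.0, &result) should leave 1.0 in result. The pattern only helps when neither library needs to stay resident, which is why it did not fit our case.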

 

How can I link with (or work around) two third-party static libraries that define the same symbols?

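This person showed real perseverance: he renamed the conflicting symbols in the OpenSSL sources one by one, to the point of dreaming about editing OpenSSL source code.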

 

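Solving symbol conflicts (original title: 符号冲突问题解决)

This one proposes renaming symbols: use objcopy --redefine-sym to rename the offending symbols in the third-party library we cannot control. Unfortunately, objcopy's --redefine-sym option had no effect on the .so file (the dynamic symbols that the dynamic linker uses stayed unchanged), so this approach was no use either.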

 

linking 2 conflicting versions of a libraries

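This one says that dlopen(..., RTLD_LOCAL); lets the third-party library work properly, and also suggests building libcurl with OpenSSL and its other dependencies linked statically and their symbols hidden. I tried the RTLD_LOCAL approach: everything compiled, but at run time I got the error: invalid mode for dlopen(). This was in fact very close to the solution; unfortunately I did not understand why the answer as given did not work for me.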
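
A likely explanation for that error, offered here as an assumption rather than something the original investigation confirmed: glibc's dlopen() requires exactly one of RTLD_LAZY or RTLD_NOW in its flags, and RTLD_LOCAL on its own contains neither. The sketch below tries both variants; the helper name try_dlopen is invented, and the behaviour described is that of glibc, so other C libraries may differ.

```cpp
// Build (assumption): g++ -std=c++11 dlopen_mode.cpp -ldl
#include <dlfcn.h>
#include <string>

// Hypothetical helper: tries to open `library` with the given flags and
// returns the dlerror() text on failure, or an empty string on success.
std::string try_dlopen(const char *library, int flags)
{
    void *handle = dlopen(library, flags);
    if (handle == nullptr) {
        const char *error = dlerror();
        return error ? error : "unknown dlopen error";
    }
    dlclose(handle);
    return std::string();
}
```

On a glibc system, try_dlopen("libm.so.6", RTLD_LOCAL) is expected to report an "invalid mode for dlopen()" error, while try_dlopen("libm.so.6", RTLD_LAZY | RTLD_LOCAL) should succeed.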

 

How can I remove a symbol from a shared object?

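This one had the idea of deleting the unwanted symbols with objcopy. I tried it: objcopy -N ran without an error or any other output, and I was briefly pleased, since UNIX tradition holds that "no news is good news". But when I checked the symbols again with nm, the symbol I wanted to delete was still there.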

 

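4. A sudden solution

I was ready to change course. In the worst case I would split the libcurl-dependent feature out into a separate program that communicates with the original application; that was my last line of retreat. There were also the alternatives listed under "Failed attempts" below. Just then, while browsing aimlessly, I came across this article: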

Dynamic loading of shared library with RTLD_DEEPBIND 

As soon as I passed RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND to dlopen, the program worked. It worked!

The problem that had plagued me for days was solved, just like that.

When we are supposed to use RTLD_DEEPBIND? explains why:

You should use RTLD_DEEPBIND when you want to ensure that symbols looked up in the loaded library start within the library, and its dependencies before looking up the symbol in the global namespace.

This allows you to have the same named symbol being used in the library as might be available in the global namespace because of another library carrying the same definition; which may be wrong, or cause problems.

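In other words, a library loaded this way prefers its own definitions and those of its dependencies over same-named global symbols, which may have been introduced by some other library and may be wrong for it or cause problems.

libcurl is now loaded at run time with dlopen, using the RTLD_DEEPBIND flag. It therefore looks up symbols first in libcurl.so and the libraries it depends on, and so avoids the problematic global symbols.

The relevant code: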

// Send a RESTful POST request with libcurl's C API.
// libcurl is loaded at run time with RTLD_DEEPBIND so that its OpenSSL calls
// bind to the libraries it depends on, not to the copy exported by the
// third-party library.
// Needs <dlfcn.h>, <cstdlib>, <string> and <curl/curl.h> (types and constants only).
int DingDing::post_request_curl(const std::string& url, const std::string& jsonBody, std::string& strReturn)
{
    void *handle;
    static CURLcode (*f_global_init)(long) = NULL;
    static CURL *(*f_easy_init)(void) = NULL;
    static struct curl_slist *(*f_slist_append)(struct curl_slist *, const char *) = NULL;
    static void (*f_slist_free_all)(struct curl_slist *) = NULL;
    static CURLcode (*f_easy_setopt)(CURL *, CURLoption, ...) = NULL;
    static CURLcode (*f_easy_perform)(CURL *) = NULL;
    static void (*f_easy_cleanup)(CURL *) = NULL;
    static void (*f_global_cleanup)(void) = NULL;

    (void)strReturn;    // the response body is not captured in this snippet

    // One of RTLD_LAZY / RTLD_NOW is mandatory; RTLD_LOCAL is the default.
    //handle = dlopen ("libcurl.so", RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND);
    handle = dlopen ("libcurl.so", RTLD_LAZY | RTLD_DEEPBIND);
    //handle = dlopen ("libcurl.so", RTLD_LAZY | RTLD_LOCAL);
    if (!handle) {
        LOGINFO << "Failed to load libcurl: " << dlerror();
        exit(1);
    }
    // The handle is never passed to dlclose(): libcurl stays loaded.

    // Resolve one symbol from libcurl, exiting on failure. dlerror() is read
    // exactly once per lookup, because reading it also clears the error.
    auto resolve = [handle](const char *name) -> void * {
        dlerror();    /* Clear any existing error */
        void *sym = dlsym(handle, name);
        const char *error = dlerror();
        if (error != NULL) {
            LOGINFO << "Failed to resolve " << name << " in libcurl: " << error;
            exit(1);
        }
        return sym;
    };
    f_global_init = (CURLcode (*)(long)) resolve("curl_global_init");
    f_easy_init = (CURL *(*)(void)) resolve("curl_easy_init");
    f_slist_append = (struct curl_slist *(*)(struct curl_slist *, const char *))
                     resolve("curl_slist_append");
    f_slist_free_all = (void (*)(struct curl_slist *)) resolve("curl_slist_free_all");
    f_easy_setopt = (CURLcode (*)(CURL *, CURLoption, ...)) resolve("curl_easy_setopt");
    f_easy_perform = (CURLcode (*)(CURL *)) resolve("curl_easy_perform");
    f_easy_cleanup = (void (*)(CURL *)) resolve("curl_easy_cleanup");
    f_global_cleanup = (void (*)(void)) resolve("curl_global_cleanup");

    CURL *ch;
    CURLcode rv;

    f_global_init(CURL_GLOBAL_ALL);
    ch = f_easy_init();
    if (!ch) {
        LOGINFO << "curl_easy_init failed";
        f_global_cleanup();
        return CURLE_FAILED_INIT;
    }

    struct curl_slist *chunk = NULL;
    chunk = f_slist_append(chunk, "Content-Type: application/json");

    f_easy_setopt(ch, CURLOPT_VERBOSE, 1L);
    f_easy_setopt(ch, CURLOPT_HTTPHEADER, chunk);
    //curl_easy_setopt(ch, CURLOPT_HEADER, 0L);
    f_easy_setopt(ch, CURLOPT_POSTFIELDS, jsonBody.c_str());
    //curl_easy_setopt(ch, CURLOPT_NOPROGRESS, 1L);
    //curl_easy_setopt(ch, CURLOPT_NOSIGNAL, 1L);
    //curl_easy_setopt(ch, CURLOPT_WRITEFUNCTION, *writefunction);
    //curl_easy_setopt(ch, CURLOPT_WRITEDATA, stdout);
    //curl_easy_setopt(ch, CURLOPT_HEADERFUNCTION, *writefunction);
    //curl_easy_setopt(ch, CURLOPT_HEADERDATA, stderr);
    //curl_easy_setopt(ch, CURLOPT_SSLCERTTYPE, "PEM");
    // Disables peer certificate verification (insecure; fine only for testing).
    f_easy_setopt(ch, CURLOPT_SSL_VERIFYPEER, 0L);
    f_easy_setopt(ch, CURLOPT_URL, url.c_str());

    /* The default CA locations detected at libcurl's build time are left
     * in place; the two options below would turn them off.
     */
    //curl_easy_setopt(ch, CURLOPT_CAINFO, NULL);
    //curl_easy_setopt(ch, CURLOPT_CAPATH, NULL);

    rv = f_easy_perform(ch);
    if(rv == CURLE_OK)
        LOGINFO << "libcurl request sent successfully";
    else
        LOGINFO << "libcurl request failed";

    f_slist_free_all(chunk);    // release the header list
    f_easy_cleanup(ch);
    f_global_cleanup();
    return rv;
}

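Failed attempts

1. Linking against libcurl.a. libcurl still uses the OpenSSL shared library this way, so the program still ends up calling the wrong version of OpenSSL and still hangs.

2. Statically linking OpenSSL into the libcurl .so. libcurl has no build option for this, and the many suggestions I read from other people involve a lot of tricks, so I did not pursue it (mainly because the "sudden solution" happened while I was trying this :)). If anyone has tried this approach, please tell me whether it works.

3. Finding an HTTP library that does not use OpenSSL to replace libcurl (curl is generous enough to publish a list of all its competitors). This proved close to impossible: OpenSSL is the de facto standard, and libraries with HTTPS support, such as beast, all list OpenSSL among their requirements.

Further reading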

https://cseweb.ucsd.edu/~gbournou/CSE131/the_inside_story_on_shared_libraries_and_dynamic_loading.pdf

 
