How To Write Shared Libraries(10)

1.5.2 Symbol Relocations (3)

To measure the effectiveness of the hashing, two numbers are important:
• The average chain length for a successful lookup.
• The average chain length for an unsuccessful lookup.
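The two averages can be sketched as follows. This is a minimal illustration, not the computation readelf performs: it assumes a failed lookup is equally likely to land in any bucket and that every defined symbol is looked up equally often. The function name and the sample chain lengths are made up for the example.

```python
def average_chain_lengths(chain_lengths):
    """Return (successful, unsuccessful) average numbers of comparisons.

    chain_lengths[i] is the number of symbols stored in hash bucket i.
    Assumes at least one bucket and at least one symbol.
    """
    buckets = len(chain_lengths)
    symbols = sum(chain_lengths)
    # A failed lookup walks the entire chain of the bucket it hashes to,
    # so the average is simply symbols per bucket (empty buckets included).
    unsuccessful = symbols / buckets
    # A successful lookup stops at the symbol's position in its chain:
    # a chain of n symbols costs 1 + 2 + ... + n comparisons in total.
    successful = sum(n * (n + 1) // 2 for n in chain_lengths) / symbols
    return successful, unsuccessful


# Hypothetical table: 8 buckets holding 8 symbols.
succ, unsucc = average_chain_lengths([0, 1, 2, 0, 3, 1, 0, 1])
print(f"successful: {succ:.3f}  unsuccessful: {unsucc:.3f}")
```

Under these assumptions a larger table with the same number of symbols lowers the unsuccessful average directly, because the same symbols are spread over more buckets.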

It might be surprising to talk about unsuccessful lookups here, but in fact they are the rule. Note that “unsuccessful” means only unsuccessful in the object currently being searched. Only for objects which implement almost everything they get looked in for is the successful lookup number more important. In this category there are basically only two objects on a Linux system: the C library and the dynamic linker itself.

Some versions of the readelf program compute these values directly, and the output is similar to figures 3 and 4. The data in these examples shows us a number of things. Based on the number of symbols (2027 versus 106) the chosen table size is radically different. For the smaller table the linker can afford to “waste” 53.9% of the hash table entries, which contain no data. That’s only 412 bytes on a gABI-compliant system. If the same amount of overhead were allowed for the libc.so binary, the table would be 4 kilobytes or more larger. That is a big difference. The linker has a fixed cost function integrated which takes the table size into account.

The increased relative table size means we have significantly shorter hash chains. This is especially true for the average chain length for an unsuccessful lookup. The average for the small table is only 28% of that of the large table.

What these numbers should show is the effect of reducing the number of symbols in the dynamic symbol table. With significantly fewer symbols the linker has a much better chance to counter the effects of the suboptimal hashing function.

Another factor in the cost of the lookup algorithm is connected with the strings themselves. Simple string comparison is used on the symbol names, which are stored in a string table associated with the symbol table data structures. Strings are stored in the C format: they are terminated by a NUL byte and no initial length field is used. This means string comparisons have to proceed until a non-matching character is found or until the end of the string. This approach is susceptible to long strings with common prefixes. Unfortunately this is not uncommon.
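To make the cost of common prefixes concrete, the following sketch counts how many character positions a strcmp-style loop examines before it can decide. It imitates NUL-terminated comparison in Python purely for illustration; the function name and the symbol names are hypothetical and not taken from any real library.

```python
def chars_compared(a: str, b: str) -> int:
    """Count the character positions a strcmp-style loop examines.

    Both strings are treated as NUL-terminated, so two equal strings
    cost len(a) + 1 positions (the terminator is compared as well).
    """
    count = 0
    for x, y in zip(a + "\0", b + "\0"):
        count += 1
        if x != y or x == "\0":
            break  # first mismatch, or end of both strings
    return count


# Hypothetical names sharing a 28-character prefix.
prefix = "some_library__some_package__"
print(chars_compared(prefix + "read", prefix + "write"))
# Names that differ in the first character are rejected immediately.
print(chars_compared("read_" + prefix, "write_" + prefix))
```

The first pair forces the loop through the whole shared prefix before the mismatch is found, while the second pair is rejected after a single character.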

The name mangling scheme used by the GNU C++ compiler before version 3.0 put the name of a class member first, along with a description of the parameter list, followed by the other parts of the name such as namespaces and nested class names. The result is a name which is distinguishable at the beginning if the member names are different. For the example above, the mangled names of the two member functions are shown in figure 5.

In the new mangling scheme used in today’s gcc versions and all other compilers which are compatible with the common C++ ABI, the names start with the namespaces and class names and end with the member names. Figure 6 shows the result for the little example. The mangled names for the two member functions differ only after the 43rd character. This is really bad performance-wise if the two symbols should fall into the same hash bucket.

Ada has similar problems. The standard Ada library for gcc has all symbols prefixed with ada__, then the package and sub-package names, followed by the function name. Figure 7 shows a short excerpt of the list of symbols from the library. The first 23 characters are the same for all the names.

The length of the strings in both mangling schemes is worrisome since each string has to be compared completely when the symbol itself is searched for. The names in the example are not extraordinarily long either. Looking through the standard C++ library one can find many names longer than 120 characters, and even this is not the longest. Other system libraries feature names longer than 200 characters, and complicated, “well designed” C++ projects with many namespaces, templates, and nested classes can feature names with more than 1,000 characters. One plus point for design, but minus 100 points for performance.

With the knowledge of the hashing function and the details of the string lookup, let us look at a real-world example: OpenOffice.org. The package contains 144 separate DSOs. During startup about 20,000 relocations are performed. Many of the relocations are performed as the result of dlopen calls and therefore cannot be optimized away by using prelink [7]. The number of string comparisons needed during the symbol resolution can be used as a fair value for the startup overhead. We compute an approximation of this value now.

The average chain length for unsuccessful lookup in all DSOs of the OpenOffice.org 1.0 release on IA-32 is 1.1931. This means that for each symbol lookup the dynamic linker has to perform on average 72 × 1.1931 = 85.9032 string comparisons. For 20,000 symbols the total is 1,718,064 string comparisons. The average length of an exported symbol defined in the DSOs of OpenOffice.org is 54.13. Even if we assume that only 20% of the string is searched before finding a mismatch (which is an optimistic guess since every symbol name is compared completely at least once to match itself), this would mean a total of more than 18.5 million characters have to be loaded from memory and compared. No wonder that the startup is so slow, especially since we ignored other costs.
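The estimate can be reproduced step by step. The sketch below only restates the arithmetic with the figures quoted above; the function and parameter names are invented for the example. The text does not say where the factor 72 comes from; it happens to equal half of the 144 DSOs, which is noted in a comment as an observation rather than a confirmed derivation.

```python
def startup_cost(dsos_searched, avg_chain, lookups, avg_name_len, fraction):
    """Return (comparisons per lookup, total comparisons, characters compared)."""
    per_lookup = dsos_searched * avg_chain   # string comparisons per lookup
    total = lookups * per_lookup             # string comparisons overall
    chars = total * avg_name_len * fraction  # characters loaded and compared
    return per_lookup, total, chars


# Figures quoted in the text for OpenOffice.org 1.0 on IA-32.
per_lookup, total, chars = startup_cost(
    dsos_searched=72,    # as used in the text; equals half of the 144 DSOs
    avg_chain=1.1931,    # average chain length, unsuccessful lookup
    lookups=20_000,      # relocations performed during startup
    avg_name_len=54.13,  # average length of an exported symbol name
    fraction=0.20,       # share of each name examined before a mismatch
)
print(f"{per_lookup:.4f} comparisons per lookup")
print(f"{total:,.0f} comparisons in total")
print(f"{chars / 1e6:.1f} million characters compared")
```

Changing any single argument shows how strongly each factor (number of DSOs, chain length, name length) feeds into the final character count.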

To compute the number of lookups the dynamic linker performs, one can use the help of the dynamic linker itself. If the environment variable LD_DEBUG is set to symbols, one only has to count the number of lines which start with symbol=. It is best to redirect the dynamic linker’s output into a file with LD_DEBUG_OUTPUT. The number of string comparisons can then be estimated by multiplying the count by the average hash chain length. Since the collected output contains the name of the file which is looked at, it would even be possible to get more accurate results by multiplying by the exact hash chain length for each object.
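The counting step could be scripted along these lines. This is a sketch under assumptions: it expects lines shaped like the output of the glibc dynamic linker with LD_DEBUG=symbols (an optional process ID prefix, then symbol=NAME; lookup in file=PATH), and the sample lines, function names, and chain lengths below are illustrative rather than captured from a real run.

```python
import re
from collections import Counter

# Assumed line shape: "   1234:  symbol=NAME;  lookup in file=PATH [0]"
LOOKUP = re.compile(
    r"^\s*(?:\d+:\s*)?symbol=(?P<name>[^;]+);\s*lookup in file=(?P<file>\S+)"
)


def count_lookups(lines):
    """Count symbol lookups per searched file in LD_DEBUG=symbols output."""
    per_file = Counter()
    for line in lines:
        match = LOOKUP.match(line)
        if match:
            per_file[match.group("file")] += 1
    return per_file


def estimate_comparisons(per_file, average_chain, chain_by_file=None):
    """Multiply lookups by chain length, per object where it is known."""
    chain_by_file = chain_by_file or {}
    return sum(
        count * chain_by_file.get(name, average_chain)
        for name, count in per_file.items()
    )


# Illustrative sample; real input would be read from the LD_DEBUG_OUTPUT file.
sample = [
    "      1234:\tsymbol=printf;  lookup in file=./a.out [0]",
    "      1234:\tsymbol=printf;  lookup in file=/lib/libc.so.6 [0]",
    "      1234:\tcalling init: /lib/libc.so.6",
]
lookups = count_lookups(sample)
print(sum(lookups.values()), "lookups")
print(estimate_comparisons(lookups, average_chain=1.5), "estimated comparisons")
```

To collect real input, one would run the program with LD_DEBUG=symbols and LD_DEBUG_OUTPUT set to a file name prefix; glibc appends the process ID to that prefix, and the resulting file can be passed line by line to count_lookups.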

Changing any of the factors ‘number of exported symbols’, ‘length of the symbol strings’, ‘number and length of common prefixes’, ‘number of DSOs’, and ‘hash table size optimization’ can reduce the costs dramatically. In general, relocations account for around 50-70% of the time the dynamic linker uses during startup if the binary is already in the file system cache, and about 20-30% if the file has to be loaded from disk. It is therefore worth spending time on these issues, and in the remainder of the text we will introduce methods to do just that. One thing to remember so far: pass -O1 to the linker to generate the final product.

todo:
dlopen
Usage of “export LD_DEBUG=bindings # symbol bindings”
export LD_DEBUG=symbols
export LD_DEBUG=help
