Modular characters are a method of storing compressed integer values. They are used in the object map to
indicate both handle offsets and file location offsets. They consist of a stream of bytes, terminating when
the high bit of the byte is 0.
In each byte, the high bit is a flag; when set, it indicates that another byte follows. The remaining
seven bits hold data, and the bytes are stored least significant first. The concept is easier to show
than to explain, so let's look at an example.
Assume the next two bytes in the file are:
10000010 00100100
We read bytes until we reach a byte with a high bit of 0. The second byte meets that criterion, so the
value occupies two bytes. Since the bytes are stored from least significant to most significant, let's
reverse the order of the bytes so that they read MSB to LSB from left to right:
00100100 10000010
Now we drop the high order flag bits:
0100100 0000010
And then re-group the bits from right to left, padding on the left with 0's:
00010010 00000010
Result = 2 + 18*256 = 4610
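The same decoding can be done without physically reversing and regrouping bits: each byte contributes its low seven bits, shifted left by seven more positions than the previous byte. The following is a minimal sketch of that idea, not a reference implementation. The function name `read_unsigned_modular_char` and the `(value, next_pos)` return shape are illustrative choices, and the sketch assumes the input is not truncated (a truncated input would raise `IndexError`).

```python
def read_unsigned_modular_char(data, pos=0):
    """Decode one unsigned modular char from data, starting at index pos.

    Returns (value, next_pos). Name and return shape are illustrative only.
    """
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        # The low 7 bits are data; the least significant group comes first.
        value |= (byte & 0x7F) << shift
        shift += 7
        # A clear high bit marks the final byte of the value.
        if not (byte & 0x80):
            return value, pos


print(read_unsigned_modular_char(bytes([0b10000010, 0b00100100])))  # (4610, 2)
```

Returning the next position lets a caller read consecutive values from the same buffer.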
Here's another example using the basic form F1101001 F0010111 F1100110 00110101, where each F is a
flag bit that is set:
11101001 10010111 11100110 00110101
We read bytes until we reach a byte with a high bit of 0. The fourth byte meets that criterion, so the
value occupies four bytes. Since the bytes are stored from least significant to most significant, let's
reverse the order of the bytes so that they read MSB to LSB from left to right:
00110101 11100110 10010111 11101001
Now we drop the high order flag bits:
0110101 1100110 0010111 1101001
And then re-group the bits from right to left, padding on the left with 0's:
00000110 10111001 10001011 11101001
Result: 233 + 139*256 + 185*256^2 + 6*256^3 = 112823273
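To check the intermediate byte values of a worked example, the written procedure (reverse, drop flag bits, regroup) can be followed literally on a bit string. This is only an illustrative sketch for verifying the arithmetic; the helper name `regroup_steps` is hypothetical, and a real reader would more likely use shifts.

```python
def regroup_steps(raw):
    """Follow the written procedure literally.

    Returns the regrouped 8-bit values, least significant byte first.
    """
    # Keep bytes up to and including the first one whose high bit is 0.
    end = next(i for i, b in enumerate(raw) if not (b & 0x80)) + 1
    # Reverse so the most significant byte comes first, then drop each flag bit.
    bits = "".join(format(b & 0x7F, "07b") for b in reversed(raw[:end]))
    # Pad on the left with 0's to a multiple of 8 bits, then cut into bytes.
    bits = bits.zfill(-(-len(bits) // 8) * 8)
    groups = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    return groups[::-1]


groups = regroup_steps(bytes([0b11101001, 0b10010111, 0b11100110, 0b00110101]))
print(groups)  # [233, 139, 185, 6]
print(sum(g * 256**i for i, g in enumerate(groups)))  # 112823273
```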
This process is further complicated by a sign flag: if the final byte (the one with high bit 0) also has
the bit with value 64 (0x40) set, the number must be negated.
This is a negative number:
10000101 01001011
Since the bytes are stored from least significant to most significant, let's reverse the order of the
bytes so that they read MSB to LSB from left to right:
01001011 10000101
We then clear the bit that was used to represent the negative number, and note that the result must be
negated:
00001011 10000101
Now we drop the high order flag bits:
0001011 0000101
And then re-group the bits from right to left, padding on the left with 0's:
00000101 10000101
Result: 133 + 5*256 = 1413, which we negate to get -1413
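A signed reader differs from an unsigned one only in how it treats the final byte: bit 0x40 is the sign flag, so that byte contributes six data bits instead of seven. The sketch below illustrates this under the same assumptions as the text; the function name `read_signed_modular_char` is hypothetical and truncated input is not handled.

```python
def read_signed_modular_char(data, pos=0):
    """Decode one signed modular char; returns (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        if byte & 0x80:
            # Continuation byte: 7 data bits, more bytes follow.
            value |= (byte & 0x7F) << shift
            shift += 7
        else:
            # Final byte: bit 0x40 is the sign flag, the low 6 bits are data.
            negative = bool(byte & 0x40)
            value |= (byte & 0x3F) << shift
            return (-value if negative else value), pos


print(read_signed_modular_char(bytes([0b10000101, 0b01001011])))  # (-1413, 2)
```

Because the sign flag is only examined in the final byte, a value whose final byte has bit 0x40 clear decodes to the same result as with an unsigned reader.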
Modular chars are also used to store handle offsets in the object map. In this case there is no negation
used; handles in the object map are always in increasing order.
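Because handle offsets are never negative, successive handles can be recovered by accumulating unsigned offsets. The sketch below assumes that each offset is relative to the previous handle and that the first is relative to 0; it also uses a hypothetical buffer containing only handle offsets stored back to back. A real object map also stores file location offsets, which this sketch leaves out. The function names are illustrative.

```python
def read_unsigned_modular_char(data, pos=0):
    """Decode one unsigned modular char; returns (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            return value, pos


def handles_from_offsets(data, start_handle=0):
    """Accumulate unsigned handle offsets into absolute handle values.

    Assumption: each offset is relative to the previous handle.
    """
    handles = []
    handle = start_handle
    pos = 0
    while pos < len(data):
        offset, pos = read_unsigned_modular_char(data, pos)
        handle += offset
        handles.append(handle)
    return handles


# Hypothetical buffer: offsets 1, 4610 and 3 stored back to back.
sample = bytes([0x01, 0x82, 0x24, 0x03])
print(handles_from_offsets(sample))  # [1, 4611, 4614]
```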
MODULAR SHORTS
Modular shorts work just like modular chars, except that the base module is a short instead of a char.
From a practical point of view there are only two cases to worry about (one module or two), because,
in the case of shorts, two modules make a long (30 data bits once the flag bits are dropped). Since
modular shorts are used only to indicate object sizes, the resulting maximum object size of 1 GB is
probably sufficient. Consider the following four bytes:
00110001 11110100 10001101 00000000
Reverse the order of the shorts:
10001101 00000000 00110001 11110100
Reverse the order of the bytes in each short:
00000000 10001101 11110100 00110001
Drop the high order flag bit of each short:
0000000 10001101 1110100 00110001
And then re-group the bits from right to left, padding on the left with 0's:
00000000 01000110 11110100 00110001
Result: 62513 + 70*65536 = 4650033
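The modular short procedure can be sketched with shifts as well. Each module is read as a 16-bit little-endian value (low byte first, which is what reversing the bytes in each short amounts to), its low 15 bits are data, and bit 0x8000 is the continuation flag. The function name `read_modular_short` is an illustrative choice, and the sketch assumes the input holds a complete value.

```python
def read_modular_short(data, pos=0):
    """Decode one modular short; returns (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        # Each module is a 16-bit little-endian short.
        word = data[pos] | (data[pos + 1] << 8)
        pos += 2
        # The low 15 bits are data; the least significant module comes first.
        value |= (word & 0x7FFF) << shift
        shift += 15
        # A clear high bit marks the final module.
        if not (word & 0x8000):
            return value, pos


sample = bytes([0b00110001, 0b11110100, 0b10001101, 0b00000000])
print(read_modular_short(sample))  # (4650033, 4)
```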