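1 - Hex Encoding

Encoding Principle

Hex encoding represents each 8-bit byte as two hexadecimal digits. To encode, the 8 bits are split into two 4-bit groups, and each group is widened into a byte of its own: the low 4 bits of the first byte hold the high 4 bits of the original byte, the low 4 bits of the second byte hold the original low 4 bits, and the high 4 bits of both are padded with 0. The hexadecimal digits corresponding to these two bytes are then output as the encoding. The encoded output is therefore twice as long as the source data. The Hex encoding table is: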

 0 0     1 1     2 2     3 3    
 4 4     5 5     6 6     7 7    
 8 8     9 9    10 a    11 b    
12 c    13 d    14 e    15 f

For example, the ASCII character A is Hex-encoded as follows:

ASCII character: A (65)
Binary:          0100_0001
Regrouped:       0000_0100 0000_0001
Hexadecimal:             4         1
Hex encoding:    41

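As a multi-byte example, the character 丁 (U+4E01) occupies three bytes in UTF-8 (e4 b8 81), so its Hex encoding is six characters long:

丁
e4b881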
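
The regrouping described above can be sketched in a few lines of plain Java. This is only an illustration of the principle, not the code of either library discussed later; the class name HexNibbleSketch and its encode method are hypothetical names introduced for this example.

```java
import java.nio.charset.StandardCharsets;

class HexNibbleSketch {
    // Encoding table: index = 4-bit value, entry = hex digit.
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Hypothetical helper: Hex-encodes a byte array into a lowercase string.
    static String encode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            int v = b & 0xff;             // view the signed byte as 0..255
            sb.append(DIGITS[v >>> 4]);   // high 4 bits become the first digit
            sb.append(DIGITS[v & 0x0f]);  // low 4 bits become the second digit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "A" is the single byte 0x41.
        System.out.println(encode("A".getBytes(StandardCharsets.UTF_8)));      // 41
        // \u4e01 is the character 丁, three bytes in UTF-8.
        System.out.println(encode("\u4e01".getBytes(StandardCharsets.UTF_8))); // e4b881
    }
}
```

The output is always exactly twice as long as the input, since every byte contributes two digits.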

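Code Implementation

Implementation Using Bouncy Castle

The code below uses the open-source library Bouncy Castle to perform Hex encoding and decoding; the version used is 1.56.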

import java.io.UnsupportedEncodingException;
import org.bouncycastle.util.encoders.Hex;
public class HexTestBC {
    public static void main(String[] args) 
            throws UnsupportedEncodingException {
        // Encode
        byte[] data = "A".getBytes("UTF-8");
        byte[] encodeData = Hex.encode(data);
        String encodeStr = Hex.toHexString(data);
        System.out.println(new String(encodeData, "UTF-8"));
        System.out.println(encodeStr);
        // Decode
        byte[] decodeData = Hex.decode(encodeData);
        byte[] decodeData2 = Hex.decode(encodeStr);
        System.out.println(new String(decodeData, "UTF-8"));
        System.out.println(new String(decodeData2, "UTF-8"));
    }
}

Program output

41
41
A
A

Implementation Using Apache Commons Codec

The code below uses the open-source library Apache Commons Codec to perform Hex encoding and decoding; the version used is 1.10.

import java.io.UnsupportedEncodingException;
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Hex;
public class HexTestCC {
    public static void main(String[] args)
            throws UnsupportedEncodingException,
                DecoderException {
        // Encode
        byte[] data = "A".getBytes("UTF-8");
        char[] encodeData = Hex.encodeHex(data);
        String encodeStr = Hex.encodeHexString(data);
        System.out.println(new String(encodeData));
        System.out.println(encodeStr);
        // Decode
        byte[] decodeData = Hex.decodeHex(encodeData);
        System.out.println(new String(decodeData, "UTF-8"));
    }
}

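Source Code Analysis

Bouncy Castle Source Code Analysis

In Bouncy Castle, Hex encoding and decoding are implemented by the org.bouncycastle.util.encoders.HexEncoder class. For encoding, it first defines an encoding table: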

protected final byte[] encodingTable =
{
    (byte)'0', (byte)'1', (byte)'2', (byte)'3', 
    (byte)'4', (byte)'5', (byte)'6', (byte)'7',
    (byte)'8', (byte)'9', (byte)'a', (byte)'b', 
    (byte)'c', (byte)'d', (byte)'e', (byte)'f'
};

The encoding code is then:

public int encode(
    byte[]                data,
    int                    off,
    int                    length,
    OutputStream    out) 
    throws IOException
{        
    for (int i = off; i < (off + length); i++)
    {
        int    v = data[i] & 0xff;
        out.write(encodingTable[(v >>> 4)]);
        out.write(encodingTable[v & 0xf]);
    }
    return length * 2;
}

Decoding is slightly more involved. The HexEncoder constructor calls initialiseDecodingTable to build the decoding table, as follows:

protected final byte[] decodingTable = new byte[128];
protected void initialiseDecodingTable()
{
    for (int i = 0; i < decodingTable.length; i++)
    {
        decodingTable[i] = (byte)0xff;
    }
    for (int i = 0; i < encodingTable.length; i++)
    {
        decodingTable[encodingTable[i]] = (byte)i;
    }
    
    decodingTable['A'] = decodingTable['a'];
    decodingTable['B'] = decodingTable['b'];
    decodingTable['C'] = decodingTable['c'];
    decodingTable['D'] = decodingTable['d'];
    decodingTable['E'] = decodingTable['e'];
    decodingTable['F'] = decodingTable['f'];
}

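The decoding table is a byte array of length 128. Each index stands for an ASCII code, and the value stored there is the number that the ASCII character represents as a hex digit, or -1 (0xff read as a signed byte) if the character is not a hex digit. For Hex specifically, indices 48-57 (the ASCII characters 0-9) hold the values 0-9, indices 65-70 (the ASCII characters A-F) hold the values 10-15, and indices 97-102 (the ASCII characters a-f) hold 10-15 as well. For example, decodingTable[65], the entry for A, is 10. The full decoding table is: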

  -1      -1      -1      -1      -1      -1      -1      -1    
  -1      -1      -1      -1      -1      -1      -1      -1    
  -1      -1      -1      -1      -1      -1      -1      -1    
  -1      -1      -1      -1      -1      -1      -1      -1    
  -1    ! -1    " -1    # -1    $ -1    % -1    & -1    ' -1    
( -1    ) -1    * -1    + -1    , -1    - -1    . -1    / -1    
0  0    1  1    2  2    3  3    4  4    5  5    6  6    7  7    
8  8    9  9    : -1    ; -1    < -1    = -1    > -1    ? -1    
@ -1    A 10    B 11    C 12    D 13    E 14    F 15    G -1    
H -1    I -1    J -1    K -1    L -1    M -1    N -1    O -1    
P -1    Q -1    R -1    S -1    T -1    U -1    V -1    W -1    
X -1    Y -1    Z -1    [ -1    \ -1    ] -1    ^ -1    _ -1    
` -1    a 10    b 11    c 12    d 13    e 14    f 15    g -1    
h -1    i -1    j -1    k -1    l -1    m -1    n -1    o -1    
p -1    q -1    r -1    s -1    t -1    u -1    v -1    w -1    
x -1    y -1    z -1    { -1    | -1    } -1    ~ -1      -1
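
As a rough check of the table above, the following sketch rebuilds an equivalent 128-entry table and looks up a few characters. It mirrors the idea of initialiseDecodingTable but is not the Bouncy Castle code itself; HexDecodingTableSketch and buildTable are hypothetical names used only here.

```java
import java.util.Arrays;

class HexDecodingTableSketch {
    // Hypothetical helper: builds a 128-entry ASCII-to-value lookup table.
    static byte[] buildTable() {
        byte[] table = new byte[128];
        Arrays.fill(table, (byte) 0xff);            // 0xff as a signed byte is -1
        String digits = "0123456789abcdef";
        for (int i = 0; i < digits.length(); i++) {
            table[digits.charAt(i)] = (byte) i;     // '0'..'9' -> 0..9, 'a'..'f' -> 10..15
        }
        for (char c = 'A'; c <= 'F'; c++) {
            table[c] = table[Character.toLowerCase(c)]; // uppercase aliases
        }
        return table;
    }

    public static void main(String[] args) {
        byte[] table = buildTable();
        System.out.println(table['7']); // 7
        System.out.println(table['A']); // 10
        System.out.println(table['f']); // 15
        System.out.println(table['G']); // -1
    }
}
```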

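Decoding simply takes two consecutive bytes, looks up the value of each in the decoding table, and joins the two values into one 8-bit value, which is written as the decoded output. The source code is: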

public int decode(
    byte[]          data,
    int             off,
    int             length,
    OutputStream    out)
    throws IOException
{
    byte    b1, b2;
    int     outLen = 0;
    
    int     end = off + length;
    
    while (end > off)
    {
        if (!ignore((char)data[end - 1]))
        {
            break;
        }
        
        end--;
    }
    
    int i = off;
    while (i < end)
    {
        while (i < end && ignore((char)data[i]))
        {
            i++;
        }
        
        b1 = decodingTable[data[i++]];
        
        while (i < end && ignore((char)data[i]))
        {
            i++;
        }
        
        b2 = decodingTable[data[i++]];
        if ((b1 | b2) < 0)
        {
            throw new IOException("invalid "
                  + "characters encountered in Hex data");
        }
        out.write((b1 << 4) | b2);
        
        outLen++;
    }
    return outLen;
}
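
The check (b1 | b2) < 0 in the listing above works because valid table entries are 0-15 while the invalid marker is -1: the bitwise OR of two values is negative exactly when at least one of them has its sign bit set. The sketch below isolates that check and the final combination step. HexCombineSketch, valid and combine are hypothetical names for this illustration, and the exception type is chosen here for convenience rather than taken from the library.

```java
class HexCombineSketch {
    // True when both looked-up values are real hex digits (0..15, not -1).
    static boolean valid(byte b1, byte b2) {
        return (b1 | b2) >= 0;
    }

    // Joins a high and a low 4-bit value into one value in 0..255.
    static int combine(byte b1, byte b2) {
        if (!valid(b1, b2)) {
            throw new IllegalArgumentException("invalid hex digit");
        }
        return (b1 << 4) | b2;
    }

    public static void main(String[] args) {
        System.out.println(combine((byte) 4, (byte) 1));   // 65, the code of 'A'
        System.out.println(combine((byte) 14, (byte) 4));  // 228, i.e. 0xe4
        System.out.println(valid((byte) -1, (byte) 4));    // false
    }
}
```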

The ignore method is shown below; because of it, decoding skips whitespace at the start, at the end, and in between.

private static boolean ignore(
    char    c)
{
    return c == '\n' || c =='\r' || c == '\t' || c == ' ';
}

The Hex utility class used in the sample code holds a HexEncoder instance and works on byte arrays through ByteArrayOutputStream; it is not covered in further detail here.

public class Hex
{
    private static final Encoder encoder = new HexEncoder();
    public static byte[] encode(
        byte[]    data,
        int       off,
        int       length)
    {
        ByteArrayOutputStream    bOut = new ByteArrayOutputStream();
        
        try
        {
            encoder.encode(data, off, length, bOut);
        }
        catch (Exception e)
        {
            throw new EncoderException("exception encoding Hex string: " 
                      + e.getMessage(), e);
        }
        
        return bOut.toByteArray();
    }
    ......
}

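Apache Commons Codec Source Code Analysis

To Hex-encode, Apache Commons Codec directly allocates a character array twice the length of the source data and then converts each source byte into two characters stored in that target array. Apache Commons Codec lets the caller choose whether the output is lowercase or uppercase.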

private static final char[] DIGITS_LOWER =
    {'0', '1', '2', '3', '4', '5', '6', '7',
     '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'};
private static final char[] DIGITS_UPPER =
    {'0', '1', '2', '3', '4', '5', '6', '7',
     '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};
public static char[] encodeHex(final byte[] data) {
    return encodeHex(data, true);
}
public static char[] encodeHex(final byte[] data, 
                               final boolean toLowerCase) {
        return encodeHex(data, 
                toLowerCase ? DIGITS_LOWER : DIGITS_UPPER);
}
protected static char[] encodeHex(final byte[] data,
                                  final char[] toDigits) {
    final int l = data.length;
    final char[] out = new char[l << 1];
    // two characters form the hex value.
    for (int i = 0, j = 0; i < l; i++) {
        out[j++] = toDigits[(0xF0 & data[i]) >>> 4];
        out[j++] = toDigits[0x0F & data[i]];
    }
    return out;
}
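
The masks 0xF0 and 0x0F in the loop above matter because Java bytes are signed: a byte such as 0xE4 is promoted to the int -28 before shifting, and masking first keeps only the 8 relevant bits. The following sketch, using the hypothetical names HexMaskSketch, high and low, shows the two digits extracted from a negative byte and what an unmasked shift would produce instead.

```java
class HexMaskSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // First hex digit: mask the high 4 bits, then shift them down.
    static char high(byte b) {
        return DIGITS[(0xF0 & b) >>> 4];
    }

    // Second hex digit: mask the low 4 bits.
    static char low(byte b) {
        return DIGITS[0x0F & b];
    }

    public static void main(String[] args) {
        byte b = (byte) 0xE4;
        System.out.println(b);          // -28, because bytes are signed
        System.out.println(high(b));    // e
        System.out.println(low(b));     // 4
        // Without the mask the sign-extended bits survive the shift,
        // so the result would be far outside the 0..15 table range.
        System.out.println(Integer.toHexString(b >>> 4)); // ffffffe
    }
}
```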

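To Hex-decode, Apache Commons Codec first allocates a byte array half the length of the input character array, then converts each pair of consecutive hexadecimal digits into one byte. The conversion uses the JDK's Character.digit method.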

public static byte[] decodeHex(final char[] data)
           throws DecoderException {
    final int len = data.length;
    if ((len & 0x01) != 0) {
        throw new DecoderException("Odd number of characters.");
    }
    final byte[] out = new byte[len >> 1];
    // two characters form the hex value.
    for (int i = 0, j = 0; j < len; i++) {
        int f = toDigit(data[j], j) << 4;
        j++;
        f = f | toDigit(data[j], j);
        j++;
        out[i] = (byte) (f & 0xFF);
    }
    return out;
}
protected static int toDigit(final char ch, final int index)
        throws DecoderException {
    final int digit = Character.digit(ch, 16);
    if (digit == -1) {
        throw new DecoderException(""
                + "Illegal hexadecimal character "
                + ch + " at index " + index);
    }
    return digit;
}
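
For readers who want to try the Character.digit approach without adding a dependency, here is a minimal JDK-only sketch of the same idea. It is not the Commons Codec implementation; HexDigitSketch and decode are hypothetical names, and IllegalArgumentException stands in for the library's DecoderException.

```java
import java.nio.charset.StandardCharsets;

class HexDigitSketch {
    // Hypothetical helper: decodes a hex string into bytes using Character.digit.
    static byte[] decode(String hex) {
        if ((hex.length() & 1) != 0) {
            throw new IllegalArgumentException("Odd number of characters.");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);     // -1 if not a hex digit
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi == -1 || lo == -1) {
                throw new IllegalArgumentException(
                        "Illegal hexadecimal character near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Character.digit('a', 16)); // 10
        System.out.println(Character.digit('F', 16)); // 15
        System.out.println(Character.digit('g', 16)); // -1
        System.out.println(new String(decode("41"), StandardCharsets.UTF_8)); // A
    }
}
```

Because Character.digit accepts both cases, this sketch decodes "E4B881" and "e4b881" to the same three bytes.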