MySQL Protocol Study 1 (Basics)

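Why did I suddenly want to look into this? I wanted to build something on MySQL, but found that its client library, MySql Connector/Net, is licensed under the GPL; in other words, it cannot be shipped in a closed-source product. I have heard that MariaDB uses exactly the same wire protocol, so its client library is compatible with MySQL, and that one is LGPL. But there is no .Net version of it. Come on, that is no way to treat .Net. So I got the idea of writing my own connector (whether I can actually pull it off is another matter), and that is why I am studying the MySQL protocol.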


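A disclaimer: this is not a tutorial but a set of notes taken while studying, so mistakes are inevitable. Also, since I do not plan to write a full-featured component, I will deliberately ignore some things.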


Reference: http://dev.mysql.com/doc/internals/en/client-server-protocol.html


First, some background knowledge:


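1. Communication protocol

The client and server can talk over several transports. TCP is by far the most widely used; named pipes and shared memory are also supported. The two sides exchange data in a half-duplex fashion: on a given TCP connection, once the client has sent a request, it must wait until it has received the server's entire response before sending the next request, and it cannot send anything else in between. Ordering is strictly enforced. Take login as an example: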

Client               Server
  |      handshake     |
  |<-------------------|
  |   authentication   |
  |------------------->|
  |     auth result    |
  |<-------------------|
  |                    |


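2. Protocol break

MySQL extended its protocol in version 4.1, so servers of different versions have to be spoken to with different protocols. Fortunately, practically everything in use today is 5.0 or later, so we can ignore this issue.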


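3. Basic data types

Note that this means the data types used in the protocol itself, not column data types. Broadly speaking, everything in the protocol is built on just two data types: integers and strings.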


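Integers come in fixed-length and variable-length forms. A fixed-length integer can be 1/2/3/4/6/8 bytes. A variable-length integer is stored in a different number of bytes depending on how large the value is: 1/2/3/8 bytes (in the multi-byte forms a one-byte marker precedes the value). We will look at the details when we run into them.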
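To make the two integer forms concrete, here is a rough sketch in Python (used only for illustration; it is not the eventual .Net connector). The function names are my own. It assumes what the protocol documentation states: integers are little-endian, a first byte below 0xFB is the value itself, and 0xFC/0xFD/0xFE announce a 2/3/8-byte value.

```python
def read_fixed_int(buf, offset, size):
    """Read a little-endian unsigned integer of `size` bytes.

    Returns (value, offset just past the integer).
    """
    end = offset + size
    if end > len(buf):
        raise ValueError("buffer too short for a %d-byte integer" % size)
    return int.from_bytes(buf[offset:end], "little"), end


def read_lenenc_int(buf, offset):
    """Read a variable-length integer. Returns (value, new offset)."""
    if offset >= len(buf):
        raise ValueError("buffer too short")
    first = buf[offset]
    if first < 0xFB:
        # Small values are stored directly in the single byte.
        return first, offset + 1
    if first == 0xFC:
        return read_fixed_int(buf, offset + 1, 2)
    if first == 0xFD:
        return read_fixed_int(buf, offset + 1, 3)
    if first == 0xFE:
        return read_fixed_int(buf, offset + 1, 8)
    # 0xFB and 0xFF have other meanings in the protocol;
    # this sketch simply rejects them.
    raise ValueError("0x%02X does not start an integer" % first)
```

For example, the bytes FC 00 01 decode to 256 and consume three bytes in total.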


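Strings come in fixed-length, NULL-terminated, variable-length, and a few other forms; again, we will look at the details when we run into them.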
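As a small illustration of the two simplest string forms, the sketch below (Python, with helper names I made up) reads a fixed-length string and a NULL-terminated string from a byte buffer. It returns raw bytes, on the assumption that decoding depends on the character set agreed for the connection. The variable-length form is left out here.

```python
def read_fixed_string(buf, offset, length):
    """Read exactly `length` bytes. Returns (bytes, new offset)."""
    end = offset + length
    if end > len(buf):
        raise ValueError("buffer too short for a %d-byte string" % length)
    return buf[offset:end], end


def read_null_terminated_string(buf, offset):
    """Read bytes up to, but not including, the next 0x00 byte.

    Returns (bytes, offset just past the terminator).
    bytes.index raises ValueError if no terminator is found.
    """
    end = buf.index(b"\x00", offset)
    return buf[offset:end], end + 1
```

Both helpers return the new offset, so fields can be read one after another from the same packet body.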


4. Basic packet format

+-------------------+------------------+---------------+
| data_len(3 bytes) | sequence(1 byte) | data(n bytes) |
+-------------------+------------------+---------------+

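Anyone who has written TCP programs will recognize this; it is a very common packet layout. Note that the data length is the length of the data that follows and does not include the 4-byte header.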
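The layout above can be sketched in a few lines of Python (illustration only; the function names are my own). It assumes the 3-byte length is little-endian, like the other integers in the protocol.

```python
def parse_packet_header(header):
    """Split the 4-byte header into (data_len, sequence)."""
    if len(header) != 4:
        raise ValueError("a packet header is exactly 4 bytes")
    data_len = int.from_bytes(header[0:3], "little")
    sequence = header[3]
    return data_len, sequence


def build_packet(sequence, data):
    """Prefix `data` with its 3-byte length and the 1-byte sequence."""
    if len(data) > 0xFFFFFF:
        raise ValueError("data does not fit in a 3-byte length")
    return len(data).to_bytes(3, "little") + bytes([sequence & 0xFF]) + data
```

A reader would first pull 4 bytes off the socket, call parse_packet_header, and then keep reading until it has data_len more bytes.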


5. Character sets

Character sets come up in many places later on, so here is the list:

+-----+----------------------+
| id  | collation_name       |
+-----+----------------------+
|   1 | big5_chinese_ci      |
|   2 | latin2_czech_cs      |
|   3 | dec8_swedish_ci      |
|   4 | cp850_general_ci     |
|   5 | latin1_german1_ci    |
|   6 | hp8_english_ci       |
|   7 | koi8r_general_ci     |
|   8 | latin1_swedish_ci    |
|   9 | latin2_general_ci    |
|  10 | swe7_swedish_ci      |
|  11 | ascii_general_ci     |
|  12 | ujis_japanese_ci     |
|  13 | sjis_japanese_ci     |
|  14 | cp1251_bulgarian_ci  |
|  15 | latin1_danish_ci     |
|  16 | hebrew_general_ci    |
|  18 | tis620_thai_ci       |
|  19 | euckr_korean_ci      |
|  20 | latin7_estonian_cs   |
|  21 | latin2_hungarian_ci  |
|  22 | koi8u_general_ci     |
|  23 | cp1251_ukrainian_ci  |
|  24 | gb2312_chinese_ci    |
|  25 | greek_general_ci     |
|  26 | cp1250_general_ci    |
|  27 | latin2_croatian_ci   |
|  28 | gbk_chinese_ci       |
|  29 | cp1257_lithuanian_ci |
|  30 | latin5_turkish_ci    |
|  31 | latin1_german2_ci    |
|  32 | armscii8_general_ci  |
|  33 | utf8_general_ci      |
|  34 | cp1250_czech_cs      |
|  35 | ucs2_general_ci      |
|  36 | cp866_general_ci     |
|  37 | keybcs2_general_ci   |
|  38 | macce_general_ci     |
|  39 | macroman_general_ci  |
|  40 | cp852_general_ci     |
|  41 | latin7_general_ci    |
|  42 | latin7_general_cs    |
|  43 | macce_bin            |
|  44 | cp1250_croatian_ci   |
|  47 | latin1_bin           |
|  48 | latin1_general_ci    |
|  49 | latin1_general_cs    |
|  50 | cp1251_bin           |
|  51 | cp1251_general_ci    |
|  52 | cp1251_general_cs    |
|  53 | macroman_bin         |
|  57 | cp1256_general_ci    |
|  58 | cp1257_bin           |
|  59 | cp1257_general_ci    |
|  63 | binary               |
|  64 | armscii8_bin         |
|  65 | ascii_bin            |
|  66 | cp1250_bin           |
|  67 | cp1256_bin           |
|  68 | cp866_bin            |
|  69 | dec8_bin             |
|  70 | greek_bin            |
|  71 | hebrew_bin           |
|  72 | hp8_bin              |
|  73 | keybcs2_bin          |
|  74 | koi8r_bin            |
|  75 | koi8u_bin            |
|  77 | latin2_bin           |
|  78 | latin5_bin           |
|  79 | latin7_bin           |
|  80 | cp850_bin            |
|  81 | cp852_bin            |
|  82 | swe7_bin             |
|  83 | utf8_bin             |
|  84 | big5_bin             |
|  85 | euckr_bin            |
|  86 | gb2312_bin           |
|  87 | gbk_bin              |
|  88 | sjis_bin             |
|  89 | tis620_bin           |
|  90 | ucs2_bin             |
|  91 | ujis_bin             |
|  92 | geostd8_general_ci   |
|  93 | geostd8_bin          |
|  94 | latin1_spanish_ci    |
|  95 | cp932_japanese_ci    |
|  96 | cp932_bin            |
|  97 | eucjpms_japanese_ci  |
|  98 | eucjpms_bin          |
|  99 | cp1250_polish_ci     |
| 128 | ucs2_unicode_ci      |
| 129 | ucs2_icelandic_ci    |
| 130 | ucs2_latvian_ci      |
| 131 | ucs2_romanian_ci     |
| 132 | ucs2_slovenian_ci    |
| 133 | ucs2_polish_ci       |
| 134 | ucs2_estonian_ci     |
| 135 | ucs2_spanish_ci      |
| 136 | ucs2_swedish_ci      |
| 137 | ucs2_turkish_ci      |
| 138 | ucs2_czech_ci        |
| 139 | ucs2_danish_ci       |
| 140 | ucs2_lithuanian_ci   |
| 141 | ucs2_slovak_ci       |
| 142 | ucs2_spanish2_ci     |
| 143 | ucs2_roman_ci        |
| 144 | ucs2_persian_ci      |
| 145 | ucs2_esperanto_ci    |
| 146 | ucs2_hungarian_ci    |
| 192 | utf8_unicode_ci      |
| 193 | utf8_icelandic_ci    |
| 194 | utf8_latvian_ci      |
| 195 | utf8_romanian_ci     |
| 196 | utf8_slovenian_ci    |
| 197 | utf8_polish_ci       |
| 198 | utf8_estonian_ci     |
| 199 | utf8_spanish_ci      |
| 200 | utf8_swedish_ci      |
| 201 | utf8_turkish_ci      |
| 202 | utf8_czech_ci        |
| 203 | utf8_danish_ci       |
| 204 | utf8_lithuanian_ci   |
| 205 | utf8_slovak_ci       |
| 206 | utf8_spanish2_ci     |
| 207 | utf8_roman_ci        |
| 208 | utf8_persian_ci      |
| 209 | utf8_esperanto_ci    |
| 210 | utf8_hungarian_ci    |
+-----+----------------------+
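Since the table maps an id to a collation name, a connector will probably want a lookup from id to character set. A possible sketch in Python (illustration only): the dictionary holds just a handful of rows copied from the table above, and charset_of is a hypothetical helper that relies on collation names starting with the character set name.

```python
# A few rows from the table above; a real lookup would hold all of them.
COLLATIONS = {
    8: "latin1_swedish_ci",
    28: "gbk_chinese_ci",
    33: "utf8_general_ci",
    63: "binary",
    83: "utf8_bin",
}


def charset_of(collation_id):
    """Return the character set name for a collation id.

    Raises KeyError for ids missing from this excerpt.
    """
    name = COLLATIONS[collation_id]
    # The part before the first underscore is the character set;
    # "binary" has no underscore and is returned unchanged.
    return name.split("_", 1)[0]
```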


To be continued...


