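Improving MySQL BLOB performance through compression
For BLOB columns: our current version is old (5.0), where the default row format is compact and the compressed format is not yet supported. I think the following approaches can save space.
1. Compress on the application server, for example with the zlib algorithm, and store the compressed result in the database. For existing data, the application has to read the rows, compress them, write them into a new table, and then swap the two tables with rename.
Pros:
1.1 Compression may raise CPU usage on the app side, but moving this work to the app layer reduces pressure on the db layer.
1.2 Compressed data is smaller, which saves network overhead (app<--->db).
1.3 Lower CPU cache overhead.
1.4 Lower memory overhead.
1.5 Lower db cache overhead.
1.6 Less I/O at the db layer as well.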
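A minimal sketch of the app-side approach, assuming a Python application and its standard zlib module. The helper names pack_blob and unpack_blob and the sample payload are made up for illustration; they are not part of the original setup.

```python
import zlib

def pack_blob(raw: bytes, level: int = 6) -> bytes:
    """Compress a payload on the application server before writing it to the BLOB column."""
    return zlib.compress(raw, level)

def unpack_blob(stored: bytes) -> bytes:
    """Decompress a value read back from the BLOB column."""
    return zlib.decompress(stored)

# Hypothetical, highly repetitive payload; real savings depend on the data.
sample = b'{"hp": 100, "mp": 50, "items": []}' * 200
packed = pack_blob(sample)
print(len(sample), len(packed))
```

Note that, as far as I know, the output of MySQL's COMPRESS() carries a 4-byte length prefix in front of the zlib stream, so values packed this way should be unpacked by the application rather than by UNCOMPRESS().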
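2. Compress at the database layer, using the COMPRESS()/UNCOMPRESS() functions the database itself provides.
Cons[1]:
2.1 Trades CPU for I/O, so CPU load goes up.
2.2 The db buffer has to cache the decompressed data in addition to the compressed data, which wastes memory.
Pros[2]:
2.1 Saves storage space; fewer disk seeks, hence less disk I/O.
2.2 Lower CPU cache overhead.
The first approach is of course the better one. For the db-layer approach, let's test: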
mysql> select sum(length(base)-length(compress(base))) from player;
+------------------------------------------+
| sum(length(base)-length(compress(base))) |
+------------------------------------------+
| 213423997 |
+------------------------------------------+
213423997 / 1024 / 1024 ≈ 203.54 MB
So compress saves 203.5 MB on this one column alone, a big improvement in space utilization.
How much space would db-layer compression save across all the BLOB columns?
mysql> select sum(length(base)-length(compress(base)))+
-> sum(length(shortcut)-length(compress(shortcut)))+
-> sum(length(item)-length(compress(item)))+
-> sum(length(spell)-length(compress(spell)))+
-> sum(length(buff)-length(compress(buff)))+
-> sum(length(pet)-length(compress(pet)))+
-> sum(length(task)-length(compress(task)))+
-> sum(length(mail)-length(compress(mail)))+
-> sum(length(friend)-length(compress(friend)))+
-> sum(length(corps)-length(compress(corps)))+
-> sum(length(mounts)-length(compress(mounts)))+
-> sum(length(titles)-length(compress(titles)))+
-> sum(length(apply)-length(compress(apply)))+
-> sum(length(armor)-length(compress(armor)))+
-> sum(length(base2)-length(compress(base2)))+
-> sum(length(bond)-length(compress(bond)))
-> from player;
+----------------------------------------------------------------------------
| 2959392318 |
+---------------------------------------------------------------------------
1 row in set (6 min 11.49 sec)
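2959392318 / 1024 / 1024 / 1024 ≈ 2.756 GB
The data totals roughly 4.6 GB, and compression saves more than half of it (about 60%). For an I/O-bound table, compression may therefore be a viable option. The final choice should depend on actual test results!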
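The 60% figure can be checked with a few lines of arithmetic. This is only a sketch: the 4.6 GB total is the approximate size quoted in the text, and it is treated here as 4.6 GiB.

```python
saved_bytes = 2_959_392_318   # result of the query above
total_gib = 4.6               # approximate table size quoted in the text

saved_gib = saved_bytes / 1024 ** 3      # bytes -> GiB
saved_ratio = saved_gib / total_gib      # fraction of the table saved

print(f"saved {saved_gib:.2f} GiB, about {saved_ratio:.0%} of {total_gib} GiB")
```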
The storage requirements of the string types are as follows:
Data Type | Storage Required |
---|---|
CHAR(M) | M × w bytes, 0 <= M <= 255, where w is the number of bytes required for the maximum-length character in the character set |
BINARY(M) | M bytes, 0 <= M <= 255 |
VARCHAR(M), VARBINARY(M) | L + 1 bytes if column values require 0 – 255 bytes, L + 2 bytes if values may require more than 255 bytes |
TINYBLOB, TINYTEXT | L + 1 bytes, where L < 2^8 |
BLOB, TEXT | L + 2 bytes, where L < 2^16 |
MEDIUMBLOB, MEDIUMTEXT | L + 3 bytes, where L < 2^24 |
LONGBLOB, LONGTEXT | L + 4 bytes, where L < 2^32 |
ENUM('value1','value2',...) | 1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum) |
SET('value1','value2',...) | 1, 2, 3, 4, or 8 bytes, depending on the number of set members (64 members maximum) |
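Each column saves only a little, but across all rows this can add up to a lot of space and hence a noticeable performance gain, so choosing data types sensibly really matters.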
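To make the BLOB rows of the table concrete, here is a small sketch that computes the storage required for a value of L bytes. The function name blob_storage_bytes is made up for illustration.

```python
# Length-prefix size in bytes for each BLOB type, taken from the table above.
PREFIX_BYTES = {"TINYBLOB": 1, "BLOB": 2, "MEDIUMBLOB": 3, "LONGBLOB": 4}

def blob_storage_bytes(type_name: str, payload_len: int) -> int:
    """Return L + prefix, or raise if L does not fit the type (L < 2^(8 * prefix))."""
    prefix = PREFIX_BYTES[type_name]
    if payload_len < 0 or payload_len >= 2 ** (8 * prefix):
        raise ValueError(f"{payload_len} bytes does not fit in {type_name}")
    return payload_len + prefix

print(blob_storage_bytes("BLOB", 1000))      # L + 2
print(blob_storage_bytes("LONGBLOB", 1000))  # L + 4
```

For a 1000-byte value, choosing BLOB over LONGBLOB saves 2 bytes per row, which is the kind of small per-column saving the paragraph above refers to.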
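For InnoDB tables, BLOBs are stored roughly as follows:
When the BLOB column length is < 8000 bytes it is stored within a single page. Otherwise it is stored via overflow pages: the first 768 bytes stay in the current page and a pointer refers to the rest in separate overflow pages.
In theory, overflow storage means that within a record each BLOB larger than 8000 bytes occupies only 768 bytes of the data page, freeing room for more records. And when the BLOB column is not queried, the rest of it need not be read from the overflow pages, which saves I/O nicely.
Of course, if the BLOB data is queried often, the disk may need extra random seeks to read the overflow data, which adds overhead!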
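The rule described above can be written down as a deliberately simplified model. It only mirrors the two numbers quoted in the text (about 8000 bytes and 768 bytes); it ignores the other columns of the row, the pointer size and page headers, and it is not InnoDB's actual algorithm.

```python
INLINE_LIMIT = 8000   # approximate limit quoted in the text
PREFIX_BYTES = 768    # prefix kept in the data page when the value overflows

def compact_layout(blob_len: int) -> tuple[int, int]:
    """Return (bytes kept in the data page, bytes stored in overflow pages)."""
    if blob_len < INLINE_LIMIT:
        return blob_len, 0                        # whole value stays in the page
    return PREFIX_BYTES, blob_len - PREFIX_BYTES  # prefix in page, rest off-page

print(compact_layout(2400))     # small blob: fully inline
print(compact_layout(100_000))  # large blob: 768-byte prefix plus overflow
```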
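Some optimization suggestions for large TEXT/BLOB columns in InnoDB
Recently I looked at some old applications whose table designs use text or blob columns. One of them depends very heavily on a blob column, queries and updates it very frequently, and a single table has grown to nearly 100 GB. At that point the application is effectively locked to the database, and any change to the application or query logic becomes almost impossible.
To understand how large columns affect performance, we need to know how the InnoDB storage engine handles rows underneath:
Point 1: In 5.1 the default InnoDB row format is compact (redundant exists for compatibility with older versions). For large columns such as blob, text or varchar(8099), InnoDB keeps only the first 768 bytes in the data page and stores the rest in the overflow segment (this applies when overflow occurs).
Point 2: The default InnoDB block size is 16 KB. Because InnoDB tables are index-organized and the leaf level of the tree is a doubly linked list, each page must hold at least two rows. This means a row stored by InnoDB cannot exceed about 8 KB (8098 bytes).
Point 3: Does using a blob type always mean the data goes to the overflow segment? We usually assume large objects such as blob and clob are stored outside the data page, but that is not so. What matters is whether a page can hold two rows: a blob can live entirely in the data page (row length not above 8098 bytes), and a varchar can end up in overflow pages (row length above 8098 bytes, with the first 768 bytes kept in the data page).
Point 4: The innodb_plugin in 5.1 introduced a new file format, barracuda (compact and redundant are collectively called antelope). It adds two row formats, compressed and dynamic. Both store blob columns fully off-page: the data page keeps only 20 bytes and everything else goes to the overflow segment.
Point 5: MySQL operates on data in units of pages. Updating, inserting or deleting a row requires reading the page containing that row into memory first. This makes the hit rate matter: the more rows a page holds, the higher the hit rate and the better the performance.
With these points in mind, let's look at this application. The table structure: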
CREATE TABLE `xx_msg` (
`col_user` VARCHAR(64) NOT NULL,
`col_smallint` SMALLINT(6) NOT NULL,
`col_lob` longblob,
`gmt_create` datetime DEFAULT NULL,
`gmt_modified` datetime DEFAULT NULL,
PRIMARY KEY (`xxx`)
) ENGINE=InnoDB DEFAULT CHARSET=gbk
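col_lob is the blob column that stores all of a user's messages; its average length is about 2.4 KB. The remaining columns are very small, about 60 bytes in total.
SELECT avg(LENGTH(col_lob)) FROM (SELECT * FROM xx_msg LIMIT 30000) a;
| 2473.8472 |
The table is used in these scenarios:
1) select col_user, col_smallint, DATE_FORMAT(gmt_modified,'%Y-%m-%d') from xx_msg;
2) update xx_msg set gmt_modified='2012-03-31 23:16:30', col_smallint=1, col_lob='xxx' where col_user='xxx';
3) select col_smallint from xx_msg where col_user='xxx';
Since the average row length (2.5 KB) is far below the InnoDB page size (16 KB) (though some rows do exceed 8 KB), Point 3 applies: the blob is not moved to the overflow segment but stays in the data segment, and InnoDB stores all columns of the row (including the longblob) in the data page.
Per Point 5, MySQL does I/O in units of pages, so unneeded data (the large column) is read into memory along with the data actually being operated on. Because the large column takes up much more memory than the small columns, memory is used poorly and more random reads result.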
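A rough back-of-the-envelope sketch of the row density involved, using the figures quoted above (16 KB pages, about 60 bytes of small columns, an average blob of about 2474 bytes). It ignores page headers, per-row overhead and fill factor, so the numbers are only indicative.

```python
PAGE_SIZE = 16 * 1024   # default InnoDB page size in bytes

def rows_per_page(row_bytes: int) -> int:
    """Upper bound on rows per page, ignoring page and row overhead."""
    return PAGE_SIZE // row_bytes

with_blob = rows_per_page(60 + 2474)   # small columns plus the inline blob
without_blob = rows_per_page(60)       # small columns only

print(with_blob, without_blob)
```

Under these assumptions a page holds only a handful of rows when the blob is inline, versus a few hundred when only the small columns are stored, which is why the buffer pool hit rate suffers.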
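From the analysis above, the bottleneck is that the large column sits in the data page, which wastes memory and causes too many random reads. How can we optimize away the impact of this large column?
1. Compression:
As noted in Point 4, InnoDB's barracuda file format stores large columns fully in the overflow segment and keeps only 20 bytes in the data segment. This greatly reduces the space used in the data page, so each page holds more rows and the memory hit rate improves (in this case most rows are under 8 KB, so the gain is limited). Compressing the data in the overflow segment also cuts space usage considerably; the compressed page size can be tuned with key_block_size.
2. Splitting:
Split the main table into two tables in a one-to-one relationship: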
CREATE TABLE `xx_msg` (
`col_user` VARCHAR(64) NOT NULL,
`col_smallint` SMALLINT(6) NOT NULL,
`gmt_create` datetime DEFAULT NULL,
`gmt_modified` datetime DEFAULT NULL,
PRIMARY KEY (`xxx`)
) ENGINE=InnoDB DEFAULT CHARSET=gbk;
CREATE TABLE `xx_msg_lob` (
`col_user` VARCHAR(64) NOT NULL,
`col_lob` longblob,
PRIMARY KEY (`xxx`)
) ENGINE=InnoDB DEFAULT CHARSET=gbk
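With the large column moved to a separate table, rows in xx_msg become very short and the row density per page is far higher than in the original table. Many more rows can be cached, and the SELECTs on the table benefit from a high buffer pool hit rate. The extra cost is that the application has to maintain a child table for the large column.
3. Covering index:
Both SELECT queries above (scenarios 1 and 3) read only small columns. Under the old design they scan the whole table or locate rows by primary key, but still operate page by page, so the blob column still lowers the buffer pool hit rate. If a covering index serves these two queries, the index is stored separately from the table data: access moves from sparse data pages to dense index pages, random I/O becomes sequential I/O, and the memory hit rate rises sharply. The extra cost is that the database maintains one more index.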
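What "maintaining a child table" means for the application can be sketched as follows. This is only an illustration: Python's built-in sqlite3 module stands in for MySQL/InnoDB, col_user is assumed to be the primary key (the original DDL only shows a placeholder), the gmt_create column is left out for brevity, and the helper names are made up.

```python
import sqlite3

# SQLite is used here only as a stand-in so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE xx_msg (
        col_user     VARCHAR(64) NOT NULL PRIMARY KEY,  -- assumed key
        col_smallint SMALLINT NOT NULL,
        gmt_modified DATETIME
    );
    CREATE TABLE xx_msg_lob (
        col_user VARCHAR(64) NOT NULL PRIMARY KEY,      -- assumed key
        col_lob  BLOB
    );
""")

def save_message(user, flag, modified, lob):
    """Write the small row and its blob row in a single transaction."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute("REPLACE INTO xx_msg VALUES (?, ?, ?)", (user, flag, modified))
        conn.execute("REPLACE INTO xx_msg_lob VALUES (?, ?)", (user, lob))

def load_flag(user):
    """Hot-path query: touches only the narrow table."""
    row = conn.execute(
        "SELECT col_smallint FROM xx_msg WHERE col_user = ?", (user,)).fetchone()
    return row[0] if row else None

def load_lob(user):
    """Reads the blob only when the caller actually needs it."""
    row = conn.execute(
        "SELECT col_lob FROM xx_msg_lob WHERE col_user = ?", (user,)).fetchone()
    return row[0] if row else None

save_message("alice", 1, "2012-03-31 23:16:30", b"x" * 2400)
print(load_flag("alice"), len(load_lob("alice")))
```

The point of the design is that frequent queries such as load_flag never read pages that contain blob data; the cost is that every write has to keep the two tables consistent.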
alter table xx_msg add index ind_msg(col_user ,col_smallint,gmt_modified);
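For the first query (scenario 1), the original plan was a full table scan; it now completes with a full index scan.
For the second query (scenario 3), the original plan located the data through the primary key; it now uses the covering index ind_msg.
Note that to keep the execution plans stable, both queries need an index hint in the SQL to force them to use the index.
Summary: all three approaches to optimizing large columns share one core idea, which is to let a single page hold enough rows and keep raising the memory hit rate. The methods differ, but all roads lead to Rome. Starting from how the database stores data underneath lets you optimize it more deeply, play to its strengths, and get surprisingly good results.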
ref:《innodb 技术内幕》
ref:MySQL Blob Compression performance benefits
ref: Data compression in InnoDB for text and blob fields
ref:Handling long texts/blobs in InnoDB – 1 to 1 relationship, covering index
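BLOB and TEXT column storage and performance in InnoDB
Storing large blob columns in InnoDB has a major impact on indexes and query performance (improved starting with MySQL 5.1). The following analyzes and summarizes the reasons:
1. http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/ (reposted)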
I’m running in this misconception second time in a week or so, so it is time to blog about it.
How blobs are stored in Innodb ? This depends on 3 factors. Blob size; Full row size and Innodb row format.
But before we look into how BLOBs are really stored, let’s see what the misconception is about. A lot of people seem to think that for the standard (“Antelope”) format the first 768 bytes are stored in the row itself while the rest is stored in external pages, which would make such blobs really bad. I have even seen a solution that stores several smaller blobs or varchar fields which are then concatenated to get the real data. This is not exactly what happens.
With COMPACT and REDUNDANT row formats (used in before Innodb plugin and named “Antelope” in Innodb Plugin and XtraDB) Innodb would try to fit the whole row onto Innodb page. At least 2 rows have to fit to each page plus some page data, which makes the limit about 8000 bytes. If row fits completely Innodb will store it on the page and not use external blob storage pages. For example 7KB blob can be stored on the page. However if row does not fit on the page, for example containing two 7KB blobs Innodb will have to pick some of them and store them in external blob pages. It however will keep at least 768 bytes from each of the BLOBs on the row page itself. With two of 7KB blobs we will have one blob stored on the page completely while another will have 768 bytes stored on the row page and the remainder at external page.
Such decision to store first 768 bytes of the BLOB may look strange, especially as MySQL internally has no optimizations to read portions of the blob – it is either read completely or not at all, so the 768 bytes on the row page is a little use – if BLOB is accessed external page will always have to be read. This decision seems to be rooted in desire to keep code simple while implementing initial BLOB support for Innodb – BLOB can have prefix index and it was easier to implement index BLOBs if their prefix is always stored on the row page.
This decision also causes strange data storage “bugs” – you can store 200K BLOB easily, however you can’t store 20 of 10K blobs. Why ? Because each of them will try to store 768 bytes on the row page itself and it will not fit.
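The arithmetic behind this "bug" can be sketched as follows, assuming the roughly 8000-byte row limit mentioned earlier in the post and ignoring all other columns and per-column overhead. The function names are made up for illustration.

```python
ROW_LIMIT = 8000   # approximate per-row limit quoted in the post
PREFIX = 768       # bytes each externally stored blob still keeps on the row page

def min_inline_bytes(num_long_blobs: int) -> int:
    """Smallest on-page footprint when every long blob is pushed off-page."""
    return num_long_blobs * PREFIX

def row_fits(num_long_blobs: int) -> bool:
    return min_inline_bytes(num_long_blobs) <= ROW_LIMIT

print(row_fits(1))    # one 200K blob: only 768 bytes stay on the page
print(row_fits(20))   # twenty 10K blobs: 20 * 768 bytes is already too much
```

Under this simplified model, the row stops fitting once it has more than 10 long blobs, regardless of how the rest of each blob is stored.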
Another thing to beware with Innodb BLOB storage is the fact external blob pages are not shared among the blobs. Each blob, even if it has 1 byte which does not fit on the page will have its own 16K allocated. This can be pretty inefficient so I’d recommend avoiding multiple large blobs per row when possible. Much better decision in many cases could be combine data in the single large Blob (and potentially compress it)
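A small sketch of why unshared external pages can be wasteful, assuming 16K pages and ignoring page headers inside the overflow pages. The function name is made up for illustration.

```python
import math

PAGE = 16 * 1024   # size of one external blob page in bytes

def overflow_allocated(overflow_bytes: int) -> int:
    """Bytes allocated for one blob's external pages (pages are not shared)."""
    if overflow_bytes <= 0:
        return 0
    return math.ceil(overflow_bytes / PAGE) * PAGE

# Ten blobs that each spill a single byte versus one combined blob spilling ten bytes.
separate = sum(overflow_allocated(1) for _ in range(10))
combined = overflow_allocated(10)
print(separate, combined)
```

This is the reasoning behind the advice to combine data into a single large blob where possible.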
If all columns do not fit on the page completely, Innodb will automatically choose some of them to be on the page and some to be stored externally. This is not clearly documented, nor can it be hinted or seen. Furthermore, depending on column sizes it may vary for different rows. I wish Innodb had some way to tune it, allowing me to force actively read columns to be stored inline while pushing some others to external storage. Maybe one day we’ll come to implementing this in XtraDB.
So BLOB storage was not very efficient in REDUNDANT (MySQL 4.1 and below) and COMPACT (MySQL 5.0 and above) format and the fix comes with Innodb Plugin in “Barracuda” format and ROW_FORMAT=DYNAMIC. In this format Innodb stores either whole blob on the row page or only 20 bytes BLOB pointer giving preference to smaller columns to be stored on the page, which is reasonable as you can store more of them. BLOBs can have prefix index but this no more requires column prefix to be stored on the page – you can build prefix indexes on blobs which are often stored outside the page.
COMPRESSED row format is similar to DYNAMIC when it comes to handling blobs and will use the same strategy storing BLOBs completely off page. It however will always compress blobs which do not fit to the row page, even if KEY_BLOCK_SIZE is not specified and compression for normal data and index pages is not enabled.
If you’re interested to learn more about Innodb row format check out this page in Innodb docs:
It is worth noting that I use BLOB here as a very general term. From a storage perspective BLOB, TEXT as well as long VARCHAR are handled the same way by Innodb. This is why the Innodb manual calls them “long columns” rather than BLOBs.
2. Explanation from the official MySQL 5.1 documentation:
All data in InnoDB is stored in database pages comprising a B-tree index (the so-called clustered index or primary key index). The essential idea is that the nodes of the B-tree contain, for each primary key value (whether user-specified or generated or chosen by the system), the values of the remaining columns of the row as well as the key. In some other database systems, a clustered index is called an “index-organized table”. Secondary indexes in InnoDB are also B-trees, containing pairs of values of the index key and the value of the primary key, which acts as a pointer to the row in the clustered index.
There is an exception to this rule. Variable-length columns (such as BLOB and VARCHAR) that are too long to fit on a B-tree page are stored on separately allocated disk (“overflow”) pages. We call these “off-page columns”. The values of such columns are stored on singly-linked lists of overflow pages, and each such column has its own list of one or more overflow pages. In some cases, all or a prefix of the long column values is stored in the B-tree, to avoid wasting storage and eliminating the need to read a separate page.
The new “Barracuda” file format provides a new option (KEY_BLOCK_SIZE) to control how much column data is stored in the clustered index, and how much is placed on overflow pages.
COMPACT and REDUNDANT Row Formats
Previous versions of InnoDB used an unnamed file format (now called “Antelope”) for database files. With that format, tables were defined with ROW_FORMAT=COMPACT (or ROW_FORMAT=REDUNDANT) and InnoDB stored up to the first 768 bytes of variable-length columns (such as BLOB and VARCHAR) in the index record within the B-tree node, with the remainder stored on the overflow page(s).
To preserve compatibility with those prior versions, tables created with the InnoDB Plugin use the prefix format, unless one of ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED is specified (or implied) on the CREATE TABLE command.
With the “Antelope” file format, if the value of a column is not longer than 768 bytes, no overflow page is needed, and some savings in I/O may result, since the value is in the B-tree node. This works well for relatively short BLOBs, but may cause B-tree nodes to fill with data rather than key values, thereby reducing their efficiency. Tables with many BLOB columns could cause B-tree nodes to become too full of data, and contain too few rows, making the entire index less efficient than if the rows were shorter or if the column values were stored off-page.
DYNAMIC Row Format
When innodb_file_format is set to “Barracuda” and a table is created with ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED, long column values are stored fully off-page, and the clustered index record contains only a 20-byte pointer to the overflow page.
Whether any columns are stored off-page depends on the page size and the total size of the row. When the row is too long, InnoDB chooses the longest columns for off-page storage until the clustered index record fits on the B-tree page.
The DYNAMIC row format maintains the efficiency of storing the entire row in the index node if it fits (as do the COMPACT and REDUNDANT formats), but this new format avoids the problem of filling B-tree nodes with a large number of data bytes of long columns. The DYNAMIC format is predicated on the idea that if a portion of a long data value is stored off-page, it is usually most efficient to store all of the value off-page. With DYNAMIC format, shorter columns are likely to remain in the B-tree node, minimizing the number of overflow pages needed for any given row.
Specifying a Table’s Row Format
The row format used for a table is specified with the ROW_FORMAT clause of the CREATE TABLE and ALTER TABLE commands. Note that COMPRESSED format implies DYNAMIC format. See Section 3.2, “Enabling Compression for a Table” for more details on the relationship between this clause and other clauses of these commands.
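3. http://blog.csdn.net/gtuu0123/article/details/5354822 (reposted)
I. InnoDB row formats
(1) The REDUNDANT and COMPACT formats, named “Antelope”
REDUNDANT: MySQL 4.1 and below
COMPACT: MySQL 5.0 and above (the default)
Characteristics:
1) Each page stores at least 2 rows, so a row stored entirely within the page is limited to about 8000 bytes.
2) If a row with blob columns is smaller than this limit, all of its data is stored in one page. Otherwise the first 768 bytes of a blob column are stored in the page and the rest in external pages. For example, if a row has two 7K blob columns, the first blob is stored in the page in full, while the second keeps its first 768 bytes in the page and the remainder in other pages.
3) The first 768 bytes of a blob column are kept in the page because this made prefix indexes on blob columns easy to implement.
4) The 768-byte decision causes an odd "bug": a row can store a single 200K blob, but it cannot store 20 blobs of 10K each, because each of them needs 768 bytes in the page and together they do not fit.
5) External pages used for blob data are not shared. If a blob column has even one byte that must be stored externally, a 16K page is allocated for it and no other blob column can use that page. So avoid multiple blob columns in one row; merging them into a single blob is recommended.
6) If the columns do not all fit in the page, InnoDB automatically chooses some to keep in the page and some to store externally. The choice depends on the column sizes, so it can differ from row to row.
Drawbacks:
In InnoDB a b-tree node stores key + row data, so the more rows a page can hold, the more efficient searches are. With blob columns, the storage format above lowers efficiency: blob data stored in the page reduces the number of rows the page holds, so searches are slower. If blob data is stored in separate pages instead, each page holds more rows and searches are faster.
(2) ROW_FORMAT=DYNAMIC (named Barracuda)
Characteristics:
1) Either the whole row (including blob data) is stored in the page, or only a 20-byte pointer is kept in the page and the blob data is stored in external pages.
2) The COMPRESSED format is smaller than DYNAMIC and otherwise behaves the same as DYNAMIC.
BLOB, TEXT and VARCHAR are stored the same way, so the above also applies to TEXT and VARCHAR.
From the above we can conclude:
(1) If blobs are expected to be small, whole rows stay under 8000 bytes, and queries are mostly single-row lookups, the REDUNDANT and COMPACT formats are a good fit.
(2) If blobs are expected to be large and queries often return a range of rows, DYNAMIC is the better choice.
(3) COMPRESSED consumes more CPU time.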
Other: Accessing MySQL blob columns through stored procedures