Innodb/XtraDB tables do benefit from being reorganized often. You can get data physically laid out in primary key order as well as get better feel for primary key and index pages and so using less space, it is just OPTIMIZE TABLE might not be best way to do it.
Innodb/XtraDB表在重建时常常来带来益处。你可以重建表数据的物理存储结构,获得更好的主键和索引页的效率,使用更少的存储空间,为了做到这点,仅仅执行OPTIMIZE TABLE命令恐怕不是最好的办法。
If you’re running Innodb Plugin on Percona Server with XtraDB you get benefit of a great new feature – ability to build indexes by sort instead of via insertion. This process can be a lot faster, especially for large indexes which would get inserts in very random order, such as indexes on UUID column or something similar. It also produces a lot better fill factor. The problem is…. OPTIMIZE TABLE for Innodb tables does not get advantage of it for whatever reason.
如果你使用的是XtraDB的INNODB PLUGIN,那么这里有一个很强大的新特性--通过排序创建索引,而不是通过插入。这个过程可以快非常多。特别是对于非常大的索引,他们会导致在写入时有很多随机的排序,比如使用UUID列的索引或其他类似的情况。它也能使数据页更加的紧凑。问题是。。。OPTIMIZE TABLE不能利用这个特性,也不知道什么原因。
Lets take a look at little benchmark I done by running OPTIMIZE for a second time on a table which is some 10 times larger than amount of memory I allocated for buffer pool:
看看下面这个小的基准测试,我在一个表上重复执行OPTIMIZE,这个表大概有分配的BP内存的10倍大小
CREATE TABLE `a` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `c` char(64) DEFAULT NULL, PRIMARY KEY (`id`), KEY `c` (`c`) ) ENGINE=InnoDB AUTO_INCREMENT=12582913 DEFAULT CHARSET=latin1 mysql> select * from a order by id limit 10; +----+------------------------------------------+ | id | c | +----+------------------------------------------+ | 1 | 813cf02d7d65de2639014dd1fb574d4c481ecac7 | | 2 | 62960f5d5d50651e5a5983dacaedfa9a73a9ee87 | | 3 | cea33998792ffe28b16b9272b950102a9633439f | | 4 | 8346a7afa0a0791693338d96a07a944874340a1c | | 5 | b00faaa432f507a0d16d2940ca8ec36699f141c8 | | 6 | 8e00926cf6c9b13dc8e0664a744b7116c5c61036 | | 7 | f151fe34b66fd4d28521d5e7ccb68b0d5d81f21b | | 8 | 7fceb5afa200a27b81cab45f94903ce04d6f24db | | 9 | 0397562dc35b5242842d68de424aa9f0b409d60f | | 10 | af8efbaef7010a1a3bfdff6609e5c233c897e1d5 | +----+------------------------------------------+ 10 rows in set (0.04 sec) # This is just random SHA(1) hashes mysql> optimize table a; +--------+----------+----------+-------------------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +--------+----------+----------+-------------------------------------------------------------------+ | test.a | optimize | note | Table does not support optimize, doing recreate + analyze instead | | test.a | optimize | status | OK | +--------+----------+----------+-------------------------------------------------------------------+ 2 rows in set (3 hours 3 min 35.15 sec) mysql> alter table a drop key c; Query OK, 0 rows affected (0.46 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> optimize table a; +--------+----------+----------+-------------------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +--------+----------+----------+-------------------------------------------------------------------+ | test.a | optimize | note | Table does not support optimize, doing recreate + analyze instead | | test.a | optimize | status | OK | +--------+----------+----------+-------------------------------------------------------------------+ 2 rows in set (4 min 5.52 sec) mysql> alter table a add key(c); Query OK, 0 rows affected (5 min 51.83 sec) Records: 0 Duplicates: 0 Warnings: 0
That’s right ! Optimizing table straight away takes over 3 hours, while dropping indexes besides primary key, optimizing table and adding them back takes about 10 minutes, which is close than 20x speed difference and more compact index in the end.
很强大啊,直接执行OPTIMIZE操作花了3个多小时,移除全部二级索引,再OPTIMIZE表,再把二级索引加回来,只花了10分钟。这比前面的方法快了20倍,并且索引结构更紧凑。
So if you’re considering running OPTIMIZE on your tables consider using this trick, it is especially handy when you’re running it on the Slave where it is OK table is exposed without indexes for some time. Note though nothing stops you from using LOCK TABLES on Innodb table to ensure there is not ton of queries starting reading table with no indexes and bringing box down.
如果你准备在表上使用这一招做OPTIMIZE操作,这在从服务器上执行特别有效,有时候表没有索引也是可以的。注意你可以在INNODB表上只LOCK TABLES,去确保在没有索引的时候没有大量的查询读这个表从而导致服务器宕机。
You can also use this trick for ALTER TABLE which requires table rebuild. Dropping all indexes; doing ALTER and when adding them back can be a lot faster than straight ALTER TABLE.
你也可以使用这个方法做ALTER TABLE操作,它也会导致表的重建。移除所有的索引,做ALTER 操作,再加回索引,这样都比直接做ALTER TABLE要快很多(实际测试来看,并没有快很多,看这篇测试)
P.S I do not know why this was not done when support for creating index by sorting was implemented. It looks very strange to me to have this feature implemented but majority of high level commands or tools (like mysqldump) do not get advantage of it and will use old slow method of building indexes by insertion.