It felt like the right time for us to look back at some useful commands for table maintenance that some of us may not have mastered as much as we might like to think.
In my post about gathering index statistics , I referred to OPTIMIZE TABLE
, ANALYZE TABLE
, and REPAIR TABLE
— but I never explained in depth what the different commands do, and what the differences between them are. That is what I thought I would do with this post, focusing on InnoDB and MyISAM, and the differences in how they treat those commands. I will also look at different cases and see which one is right for in each case.
In this article, I will be looking at are online tools* , rather than the other (very useful) tools that are offline–such as myisamchk
or setting innodb_force_recovery
and starting up the server–both of which can sometimes save our skins. I’ll cover offline methods in another post, as they are outside of the scope for this post.
For InnoDB, this is a fast operation — although not exactly the most trustworthy statistics come from it. The reason, is that with InnoDB, when you execute ANALYZE TABLE randomTable
, in order to update the index cardinality, there will be 10 random dives into each index, fetching an estimate cardinality and presenting that to you and the optimizer. Hence, several ANALYZE TABLE
s in a row can (and probably will) produce different results each time. Be aware however, that this causes a write lock on the table.
With MyISAM, this operation may take longer, as it will scan the index and populate it accordingly. It only places a read lock while doing so, but on the plus side, the statistics gathered is trustworthy and the information you get is correct.
One of the reasons to issue ANALYZE TABLE
is that when we have a lot of modification of a table (INSERT
or DELETE
for example), over time the optimizer might not make the best choices when trying to decide if it should use a specific index or not. By analyzing the table, we help it to make a more educated choice.
If we are in a replicated setup, and don’t want the ANALYZE to spread from the master, use ANALYZE NO_WRITE_TO_BINLOG TABLE randomTable
, and that will stop it from being replicated. (Or do:
SET SQL_LOG_BIN=0;
ANALYZE TABLE randomTable;
SET SQL_LOG_BIN=1;
which will temporarily disable logging to binary log for your session.)
A command that works with InnoDB, ARCHIVE, and MyISAM, CHECK TABLE
does what it sounds like — it check a table for errors or other issues.
CHECK TABLE
also checks if a table is compatible with the current version after an upgrade (CHECK TABLE randomTable FOR UPGRADE
), and if it’s found not to be compatible, a full check is done and then the .frm
file is updated with the current version number.
With MyISAM, you have the option of specifying different levels of checking. (You can specify it with InnoDB as well, but it is just ignored.) The different levels are:
If you do not specify any options, CHECK TABLE
uses the MEDIUM
option, which checks the key checksums and also that all links are valid.
The output might look like this:
sql01 sakila> CHECK TABLE film_actor EXTENDED;
+-------------------+-------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-------------------+-------+----------+----------+
| sakila.film_actor | check | status | OK |
+-------------------+-------+----------+----------+
1 row in set (0.09 sec)
Or like this:
sql01 sakila> CHECK TABLE rental_rep \G
*************************** 1. row ***************************
Table: sakila.rental_rep
Op: check
Msg_type: warning
Msg_text: 1 client is using or hasn't closed the table properly
*************************** 2. row ***************************
Table: sakila.rental_rep
Op: check
Msg_type: warning
Msg_text: Size of indexfile is: 26052608 Should be: 25295872
*************************** 3. row ***************************
Table: sakila.rental_rep
Op: check
Msg_type: warning
Msg_text: Size of datafile is: 20836352 Should be: 20185030
*************************** 4. row ***************************
Table: sakila.rental_rep
Op: check
Msg_type: error
Msg_text: Key in wrong position at page 15993856
*************************** 5. row ***************************
Table: sakila.rental_rep
Op: check
Msg_type: error
Msg_text: Corrupt
5 rows in set (0.14 sec)
EXTENDED
, checks all keys for all rows. This is a very slow operation.
QUICK
doesn’t check the rows for invalid links.
FAST
only checks tables that wasn’t closed in the right way.
checks for tables that wern’t closed properly or that have been changed since the last time you ran a check.
CHANGED
When running your cron jobs to check if a table is okay, the preferred method of doing so is FAST
, as it is quick and doesn’t interrupt normal operations too much, while it still catches the major problems. Then perhaps ANALYZE TABLE
biweekly/monthly, to keep them up to date (if you have a lot of changing data — for a static table, this is not needed).
Just as with CHECK TABLE
, the compatibility with various storage engines isn’t the best for OPTIMIZE
, but let’s see why, and what it means, after we’ve taken a look at its usage.
OPTIMIZE TABLE
is used to remove overhead, sort indexes for better access, and generally keep your tables in good order — but is often overrated and used more often than necessary. I’ve seen a lot of people using OPTIMIZE TABLE
daily or even hourly, but that is generally not needed, unless you do massive amount of INSERT
and/or DELETE
on a few tables, and even then daily or weekly will be enough — just remember to run it only on the affected tables.
Just like ANALYZE TABLE
, you can specify NO_WRITE_TO_BINLOG
to avoid logging the statement and having it spread to your slaves if you are using a replication setup.
First, the limitations.
OPTIMIZE TABLE
will not sort R-tree indexes unless you are using version >= 5.0.34 OR >= 5.1.15, in which this issue has been fixed.
With InnoDB, OPTIMIZE
yet again maps to an ALTER TABLE
statement, which means that there will be a table rebuild, which in turn updates the indexes (causing, obviously, a lock on the table).
For MyISAM, it starts by locking the table, then sorts the index pages and updates the statistics (if they’re not already up to date).
However, if you try to optimize a crashed table:
sql01 sakila> OPTIMIZE TABLE rental_rep \G
*************************** 1. row ***************************
Table: sakila.rental_rep
Op: optimize
Msg_type: error
Msg_text: Table './sakila/rental_rep' is marked as crashed and should be repaired
1 row in set, 2 warnings (0.00 sec)
So, let’s skip ahead and look at how to repair this table:
sql01 sakila> REPAIR TABLE rental_rep \G
*************************** 1. row ***************************
Table: sakila.rental_rep
Op: repair
Msg_type: warning
Msg_text: Number of rows changed from 651130 to 672140
*************************** 2. row ***************************
Table: sakila.rental_rep
Op: repair
Msg_type: status
Msg_text: OK
2 rows in set (8.52 sec)
So we just solved the problem (I faked a server crash by killing the mysqld process while inserting data to the table). But let’s do something else to this table, before I get into explaining how REPAIR
works:
%rm var/sakila/rental_rep.MYI
%ls var/sakila/rental_rep.*
var/sakila/rental_rep.MYD var/sakila/rental_rep.frm
%
Ouch! We don’t have the index data there any longer. What does the server say?
sql01 sakila> CHECK TABLE rental_rep \G
*************************** 1. row ***************************
Table: sakila.rental_rep
Op: check
Msg_type: error
Msg_text: Can't find file: 'rental_rep' (errno: 2)
1 row in set (0.00 sec)
sql01 sakila> SELECT * FROM rental_rep LIMIT 1;
ERROR 1017 (HY000): Can't find file: 'rental_rep' (errno: 2)
sql01 sakila> SHOW CREATE TABLE rental_rep;
ERROR 1017 (HY000): Can't find file: 'rental_rep' (errno: 2)
sql01 sakila> REPAIR TABLE rental_rep \G
*************************** 1. row ***************************
Table: sakila.rental_rep
Op: repair
Msg_type: error
Msg_text: Can't find file: 'rental_rep' (errno: 2)
1 row in set, 1 warning (0.00 sec)
That looks pretty bad, no? errno: 2 means “No such file or directory” which we know, since we deleted the file — but can we solve this without restoring from a backup?
Yes! There is a specific usage for this with REPAIR TABLE
:
sql01 sakila> REPAIR TABLE rental_rep USE_FRM \G
*************************** 1. row ***************************
Table: sakila.rental_rep
Op: repair
Msg_type: warning
Msg_text: Number of rows changed from 0 to 672140
*************************** 2. row ***************************
Table: sakila.rental_rep
Op: repair
Msg_type: status
Msg_text: OK
2 rows in set (8.17 sec)
sql01 sakila> CHECK TABLE rental_rep;
+-------------------+-------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+-------------------+-------+----------+----------+
| sakila.rental_rep | check | status | OK |
+-------------------+-------+----------+----------+
1 row in set (0.80 sec)
sql01 sakila> SELECT * FROM rental_rep LIMIT 1 \G
*************************** 1. row ***************************
rental_id: 1
rental_date: 2005-05-24 23:04:41
inventory_id: 2452
customer_id: 333
return_date: 2005-06-03 01:43:41
staff_id: 2
last_update: 2006-02-15 21:30:53
1 row in set (0.00 sec)
So we have just successfully restored the index data, and made the table usable again, without restoring from a backup. Pretty neat! Let me explain what just happened, and how REPAIR
worksREPAIR TABLE
tries to repair a corrupted or broken table, and it can also be used as REPAIR NO_WRITE_TO_BINLOG TABLE randomTable
to disable writing to the binary log and replicate the statement.
REPAIR TABLE
works for MyISAM and ARCHIVE, but with some caveats. For example, there is a risk of data loss if something happens during the repair.
Besides USE_FRM
, REPAIR
has two other options that can be used. The first, EXTENDED
, basically means that the indexes are repaired row by row, rather than creating one index at a time with sorting.
The other option, QUICK
, means that REPAIR
will only try to repair the index rather than index and data.
So, what about USE_FRM
which I used above? Well, it’s a bit of a double-edged sword, one that might save you from a lot of headaches, but which might also give you even more trouble if you are not careful.
If you create table rental_rep
with version 5.0.27, upgrade to 5.0.51a for example, and then someone removes your MYI file, and you run the above command, you risk losing all data in the table, because of the difference in MySQL version that the table was created with. However, if you are running version >= 5.0.62 (currently Enterprise only), there will instead be an error such as Failed repairing incompatible .FRM file
, making sure you do not make a mistake.
However, if you do successfully import your data after the version upgrade, and perform a vanilla REPAIR TABLE <table>
(or CHECK TABLE <table> FOR UPGRADE
), the .frm
will be upgraded to the current version, and potentially solve your problem.
With an ARCHIVE
table, REPAIR TABLE
might also improve the compression, depending on how much data addition there has been.
* (Well, as online as it gets — table locking is still done.)