Reposted from (http://www.mysqlperformanceblog.com/2008/03/17/researching-your-mysql-table-sizes/)
Researching your MySQL table sizes
Posted by peter
I posted a simple INFORMATION_SCHEMA query to find the largest tables last month, and it got a good response. Today I needed a few small modifications to that query to look into a few more aspects of data sizes, so here it goes:
Find the total number of tables, rows, and total data and index size for a given MySQL instance
SELECT count(*) TABLES,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES;
+--------+--------+-------+-------+------------+---------+
| TABLES | rows   | DATA  | idx   | total_size | idxfrac |
+--------+--------+-------+-------+------------+---------+
|     35 | 17.38M | 0.32G | 0.00G | 0.32G      |    0.00 |
+--------+--------+-------+-------+------------+---------+
Find the same data using some filter
I often use similar queries to find the space used by a particular table "type" in a sharded environment, where multiple tables with the same structure and similar names exist:
SELECT count(*) TABLES,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES
WHERE table_name LIKE "%test%";
+--------+--------+-------+-------+------------+---------+
| TABLES | rows   | DATA  | idx   | total_size | idxfrac |
+--------+--------+-------+-------+------------+---------+
|      1 | 21.85M | 0.41G | 0.00G | 0.41G      |    0.00 |
+--------+--------+-------+-------+------------+---------+
1 row in set (0.00 sec)
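For the sharded case, the filter can be passed as a bound parameter instead of being edited into the SQL by hand. The following is only a sketch: build_size_query and shard_pattern are hypothetical helper names, and the table name user_posts is made up for illustration. The alias `rows` is backtick-quoted because ROWS is a reserved word in MySQL 8.0. The %s placeholder assumes a DB-API driver with the "format" paramstyle, such as MySQLdb.

```python
# Sketch: build the summary query with an optional name or schema filter.
# build_size_query and shard_pattern are hypothetical helpers.

SUMMARY_SQL = """SELECT count(*) TABLES,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES"""


def build_size_query(name_pattern=None, schema=None):
    """Return (sql, params) suitable for cursor.execute(sql, params)."""
    conditions = []
    params = []
    if name_pattern is not None:
        conditions.append("table_name LIKE %s")
        params.append(name_pattern)
    if schema is not None:
        conditions.append("table_schema = %s")
        params.append(schema)
    sql = SUMMARY_SQL
    if conditions:
        sql += "\nWHERE " + " AND ".join(conditions)
    return sql, tuple(params)


def shard_pattern(base_name):
    """LIKE pattern matching tables whose names start with base_name.

    '_' and '%' are LIKE wildcards, so they are escaped with a backslash
    (the default LIKE escape character in MySQL).
    """
    escaped = (base_name.replace("\\", "\\\\")
                        .replace("%", "\\%")
                        .replace("_", "\\_"))
    return escaped + "%"


if __name__ == "__main__":
    sql, params = build_size_query(shard_pattern("user_posts"), schema="test")
    print(sql)
    print(params)
```

With a live connection, the pair would be passed as cursor.execute(sql, params). Escaping the underscore matters in sharded setups, because an unescaped `_` in a LIKE pattern matches any single character and can pull in unrelated tables.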
Find biggest databases
SELECT count(*) TABLES,
table_schema,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES
GROUP BY table_schema
ORDER BY sum(data_length+index_length) DESC LIMIT 10;
+--------+--------------------+-------+-------+-------+------------+---------+
| TABLES | table_schema       | rows  | DATA  | idx   | total_size | idxfrac |
+--------+--------------------+-------+-------+-------+------------+---------+
|      1 | test               | 7.31M | 0.14G | 0.00G | 0.14G      |    0.00 |
|     17 | mysql              | 0.00M | 0.00G | 0.00G | 0.00G      |    0.15 |
|     17 | information_schema | NULL  | 0.00G | 0.00G | 0.00G      |    NULL |
+--------+--------------------+-------+-------+-------+------------+---------+
3 rows in set (0.01 sec)
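The NULL values in the information_schema row follow from standard MySQL semantics: SUM() over only NULL inputs is NULL, concat() with a NULL argument is NULL, and division by zero in a SELECT yields NULL. The sketch below mirrors that arithmetic in Python so the behaviour can be checked without a server. The function names are hypothetical and the input tuples are made-up values, not data from the server above. Python's rounding of floats can differ from MySQL's decimal rounding in exact-half cases, so treat it as an approximation.

```python
# Sketch of the query's arithmetic with SQL-style NULL handling (None = NULL).
# All names here are hypothetical helpers for illustration.

GIB = 1024 * 1024 * 1024


def sql_sum(values):
    """SUM() skips NULLs and returns NULL when there is no non-NULL input."""
    present = [v for v in values if v is not None]
    return sum(present) if present else None


def sql_div(a, b):
    """Division returns NULL if an operand is NULL or the divisor is 0."""
    if a is None or b is None or b == 0:
        return None
    return float(a) / b


def with_unit(value, divisor, unit):
    """Mirror concat(round(value/divisor, 2), unit); NULL in, NULL out."""
    if value is None:
        return None
    return "%.2f%s" % (float(value) / divisor, unit)


def summarize(tables):
    """tables: list of (table_rows, data_length, index_length) tuples."""
    rows = sql_sum(t[0] for t in tables)
    data = sql_sum(t[1] for t in tables)
    idx = sql_sum(t[2] for t in tables)
    # data_length + index_length is NULL for a row if either part is NULL.
    total = sql_sum(
        None if t[1] is None or t[2] is None else t[1] + t[2]
        for t in tables
    )
    frac = sql_div(idx, data)
    return {
        "TABLES": len(tables),
        "rows": with_unit(rows, 1000000, "M"),
        "DATA": with_unit(data, GIB, "G"),
        "idx": with_unit(idx, GIB, "G"),
        "total_size": with_unit(total, GIB, "G"),
        "idxfrac": None if frac is None else round(frac, 2),
    }


if __name__ == "__main__":
    # 17 tables with NULL row counts and zero data and index length.
    print(summarize([(None, 0, 0)] * 17))
```

In this sketch, tables with a NULL row count and zero lengths give NULL for rows and idxfrac but 0.00G for the size columns, which is the same pattern as the information_schema row.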
Data Distribution by Storage Engines
You can change this query a bit to get the most popular storage engines by number of tables or number of rows instead of by data stored.
SELECT engine,
count(*) TABLES,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES
GROUP BY engine
ORDER BY sum(data_length+index_length) DESC LIMIT 10;
+--------+--------+--------+-------+-------+------------+---------+
| engine | TABLES | rows   | DATA  | idx   | total_size | idxfrac |
+--------+--------+--------+-------+-------+------------+---------+
| MyISAM |     22 | 26.54M | 0.49G | 0.00G | 0.49G      |    0.00 |
| MEMORY |     13 | NULL   | 0.00G | 0.00G | 0.00G      |    NULL |
+--------+--------+--------+-------+-------+------------+---------+
2 rows in set (0.01 sec)
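Ranking engines by number of tables or number of rows only requires a different ORDER BY expression. One way to sketch that is a small builder that picks the expression from a fixed whitelist. engine_query and the key names size, tables and rows are hypothetical, introduced here for illustration.

```python
# Sketch: the per-engine query with a selectable ranking.
# engine_query and the ORDERINGS keys are hypothetical names.

ORDERINGS = {
    "size": "sum(data_length+index_length)",  # ranking used in the query above
    "tables": "count(*)",
    "rows": "sum(table_rows)",
}


def engine_query(rank_by="size", limit=10):
    """Return the per-engine summary SQL ordered by the chosen metric."""
    if rank_by not in ORDERINGS:
        raise ValueError("rank_by must be one of: " + ", ".join(sorted(ORDERINGS)))
    return (
        "SELECT engine,\n"
        "count(*) TABLES,\n"
        "concat(round(sum(table_rows)/1000000,2),'M') `rows`,\n"
        "concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,\n"
        "concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,\n"
        "concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,\n"
        "round(sum(index_length)/sum(data_length),2) idxfrac\n"
        "FROM information_schema.TABLES\n"
        "GROUP BY engine\n"
        "ORDER BY " + ORDERINGS[rank_by] + " DESC LIMIT " + str(int(limit)) + ";"
    )


if __name__ == "__main__":
    print(engine_query("tables"))
```

The whitelist keeps the ORDER BY expression out of user-supplied strings, since ORDER BY expressions cannot be passed as bound parameters.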
Trivial but handy.
==============================================================
I wrote a Python script to test the database.
It generated 26.54M rows of data.
liuhui@ubuntu:~$ cat test.py
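#!/usr/bin/env python
import MySQLdb

con = MySQLdb.connect(host="localhost", port=3306, user="root", passwd="password", db="test")
cursor = con.cursor()
sql = "insert into test1 (id,name) values (1,'test1')"
for i in range(0, 1000000):
    # Print a progress dot every 10,000 inserts.
    if i % 10000 == 0:
        print(".")
    cursor.execute(sql)
# MyISAM does not need a commit; transactional engines such as InnoDB do.
con.commit()
cursor.close()
con.close()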