Researching your MySQL table sizes

Reposted from (http://www.mysqlperformanceblog.com/2008/03/17/researching-your-mysql-table-sizes/)
Posted by peter

Last month I posted a simple INFORMATION_SCHEMA query to find the largest tables, and it got a good response. Today I needed to modify that query slightly to look into a few more aspects of data sizes, so here it goes:

Find the total number of tables, rows, data size and index size for a given MySQL instance
SELECT count(*) TABLES,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES;

+--------+--------+-------+-------+------------+---------+
| TABLES | rows   | DATA  | idx   | total_size | idxfrac |
+--------+--------+-------+-------+------------+---------+
|     35 | 17.38M | 0.32G | 0.00G | 0.32G      |    0.00 |
+--------+--------+-------+-------+------------+---------+
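The arithmetic behind these columns can be checked outside MySQL. The following is a minimal Python sketch of the same conversions, assuming the per-table figures have already been fetched as (table_rows, data_length, index_length) tuples with no NULLs; the function name summarize and the sample figures are hypothetical and used only for illustration.

```python
GIB = 1024 * 1024 * 1024  # the query divides byte counts by 1024*1024*1024


def summarize(tables):
    """Aggregate (table_rows, data_length, index_length) tuples like the query.

    Assumes every value is a number (no NULLs) for simplicity.
    """
    rows = sum(t[0] for t in tables)
    data = sum(t[1] for t in tables)
    idx = sum(t[2] for t in tables)
    return {
        "TABLES": len(tables),
        "rows": "%.2fM" % (rows / 1000000.0),
        "DATA": "%.2fG" % (data / float(GIB)),
        "idx": "%.2fG" % (idx / float(GIB)),
        "total_size": "%.2fG" % ((data + idx) / float(GIB)),
        # MySQL yields NULL when dividing by zero; mirror that with None
        "idxfrac": round(idx / float(data), 2) if data else None,
    }


# Hypothetical figures: one 2 GiB table with 512 MiB of indexes, one empty table
sample = [(3000000, 2 * GIB, GIB // 2), (0, 0, 0)]
print(summarize(sample))
```

Note that idxfrac is the ratio of index bytes to data bytes, so a value of 0.25 here means the indexes take a quarter of the space the data does.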

Find the same data using some filter
I often use similar queries to find the space used by a particular table "type" in a sharded environment, where multiple tables with the same structure and similar names exist:

SELECT count(*) TABLES,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES
WHERE  table_name LIKE "%test%";
+--------+--------+-------+-------+------------+---------+
| TABLES | rows   | DATA  | idx   | total_size | idxfrac |
+--------+--------+-------+-------+------------+---------+
|      1 | 21.85M | 0.41G | 0.00G | 0.41G      |    0.00 |
+--------+--------+-------+-------+------------+---------+
1 row in set (0.00 sec)
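When the name filter comes from a script rather than being typed by hand, it may be safer to pass it as a query parameter. The sketch below assumes a Python DB-API driver that uses %s placeholders (as MySQLdb does); the helper name shard_size_query is hypothetical. It also escapes the LIKE wildcards, since shard names such as user_01 contain an underscore, which LIKE would otherwise treat as "any single character".

```python
SIZE_QUERY = """
SELECT count(*) TABLES,
       concat(round(sum(table_rows)/1000000,2),'M') `rows`,
       concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
       concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
       concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
       round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES
WHERE table_name LIKE %s
"""


def shard_size_query(name_fragment):
    """Return (sql, params) matching tables whose name contains name_fragment."""
    # Escape the LIKE escape character first, then the two LIKE wildcards,
    # so they are matched literally (backslash is MySQL's default LIKE escape).
    escaped = (name_fragment.replace("\\", "\\\\")
                            .replace("%", "\\%")
                            .replace("_", "\\_"))
    return SIZE_QUERY, ("%" + escaped + "%",)


sql, params = shard_size_query("test")
print(params)
# With an open connection, the pair would be used as: cursor.execute(sql, params)
```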

Find biggest databases

SELECT count(*) TABLES,
table_schema,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES
GROUP BY table_schema
ORDER BY sum(data_length+index_length) DESC LIMIT 10;
+--------+--------------------+-------+-------+-------+------------+---------+
| TABLES | table_schema       | rows  | DATA  | idx   | total_size | idxfrac |
+--------+--------------------+-------+-------+-------+------------+---------+
|      1 | test               | 7.31M | 0.14G | 0.00G | 0.14G      |    0.00 |
|     17 | mysql              | 0.00M | 0.00G | 0.00G | 0.00G      |    0.15 |
|     17 | information_schema | NULL  | 0.00G | 0.00G | 0.00G      |    NULL |
+--------+--------------------+-------+-------+-------+------------+---------+
3 rows in set (0.01 sec)
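The NULL entries in the information_schema row presumably follow from SQL aggregate rules: SUM() over nothing but NULL values is NULL, and MySQL returns NULL for a division by zero. A minimal Python sketch of the per-schema grouping, with None standing in for NULL, makes that behaviour visible. The function names and sample rows are hypothetical.

```python
from collections import defaultdict


def sql_sum(values):
    """SUM() semantics: skip None (NULL); return None if nothing is left."""
    kept = [v for v in values if v is not None]
    return sum(kept) if kept else None


def biggest_schemas(tables, limit=10):
    """Group (schema, table_rows, data_length, index_length) tuples by schema
    and return the groups ordered by total size, largest first."""
    groups = defaultdict(list)
    for schema, rows, data, idx in tables:
        groups[schema].append((rows, data, idx))

    result = []
    for schema, members in groups.items():
        data = sql_sum(m[1] for m in members)
        idx = sql_sum(m[2] for m in members)
        result.append({
            "table_schema": schema,
            "TABLES": len(members),
            "rows": sql_sum(m[0] for m in members),
            # Simplification: a NULL size counts as 0 for ordering purposes
            "total_bytes": (data or 0) + (idx or 0),
            # None when the data size is NULL or zero, like NULL from x/0 in MySQL
            "idxfrac": round(idx / float(data), 2) if data and idx is not None else None,
        })
    result.sort(key=lambda r: r["total_bytes"], reverse=True)
    return result[:limit]


# Hypothetical per-table figures
sample = [
    ("test", 7310000, 150000000, 1024),
    ("mysql", 10, 4000, 600),
    ("mysql", 5, 0, 0),
    ("information_schema", None, 0, 0),
]
for row in biggest_schemas(sample):
    print(row)
```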

Data Distribution by Storage Engines
You can change this query a bit to rank storage engines by number of tables or number of rows instead of by the amount of data stored.
SELECT engine,
count(*) TABLES,
concat(round(sum(table_rows)/1000000,2),'M') `rows`,
concat(round(sum(data_length)/(1024*1024*1024),2),'G') DATA,
concat(round(sum(index_length)/(1024*1024*1024),2),'G') idx,
concat(round(sum(data_length+index_length)/(1024*1024*1024),2),'G') total_size,
round(sum(index_length)/sum(data_length),2) idxfrac
FROM information_schema.TABLES
GROUP BY engine
ORDER BY sum(data_length+index_length) DESC LIMIT 10;

+--------+--------+--------+-------+-------+------------+---------+
| engine | TABLES | rows   | DATA  | idx   | total_size | idxfrac |
+--------+--------+--------+-------+-------+------------+---------+
| MyISAM |     22 | 26.54M | 0.49G | 0.00G | 0.49G      |    0.00 |
| MEMORY |     13 | NULL   | 0.00G | 0.00G | 0.00G      |    NULL |
+--------+--------+--------+-------+-------+------------+---------+
2 rows in set (0.01 sec)
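Ranking engines by table count or row count only changes the ORDER BY expression. One way to sketch that, assuming a small Python helper (engine_query and its metric names are hypothetical and not part of the original post), is to pick the expression from a fixed whitelist so that no free-form text is interpolated into the SQL.

```python
# Whitelisted ORDER BY expressions; the keys are hypothetical names for this sketch
ORDERINGS = {
    "size": "sum(data_length+index_length)",  # the ordering used in the post
    "tables": "count(*)",
    "rows": "sum(table_rows)",
}


def engine_query(metric="size", limit=10):
    """Build a per-engine summary query ordered by the chosen metric."""
    if metric not in ORDERINGS:
        raise ValueError("unknown metric: %r" % (metric,))
    return (
        "SELECT engine, count(*) TABLES, sum(table_rows) total_rows,\n"
        "       sum(data_length+index_length) total_bytes\n"
        "FROM information_schema.TABLES\n"
        "GROUP BY engine\n"
        "ORDER BY %s DESC LIMIT %d" % (ORDERINGS[metric], int(limit))
    )


print(engine_query("tables"))
```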


Trivial but handy.

==============================================================
I wrote a Python script to populate the test database.
It generated 26.54M rows of data.
liuhui@ubuntu:~$ cat test.py
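#!/usr/bin/env python
# Fill test.test1 with identical rows to get a large table to measure.
import MySQLdb

con = MySQLdb.connect(host="localhost", port=3306, user="root", passwd="password", db="test")
cursor = con.cursor()

sql = "insert into test1 (id,name) values (1,'test1')"

for i in range(0, 1000000):
    # Print a progress dot every 10,000 inserts.
    if i % 10000 == 0:
        print(".")
    cursor.execute(sql)

# Commit so the rows persist on transactional engines (harmless for MyISAM).
con.commit()
con.close()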
