Notes from Scaling MySQL

Here are quick notes from the session "Scaling MySQL - Up or Out?", moderated by Kaj Arno as part of today's keynote.

The panelists are ordered by Alexa ranking.

Here is the list of questions and answers from the panelists:

Company	How many servers	Number of DBAs	How many web servers	Number of caching servers	Version of MySQL	Language, platform	Operating System
MySQL	1 master, 3 slaves	1/10	2	2	5.1.23	Perl, PHP and bash	Fedora Linux
Sun	2 clustered, 2 individual	1.5	160+	8	5.0.21	Lots of stuff (mostly Java)	OpenSolaris
Flickr	166	At present 0	244	14	5.0.51	PHP and some Java	Linux
Fotolog	140 databases on 37 instances	10 instances per DBA	70	40 (2 on each, 80 total)	4.11 and 4.4	PHP, 90% Java	Solaris 10
Wikipedia	20	None, but everybody is kind of a DBA	70+200	40 (2 on each, 80 total)		PHP, C++, Python	Fedora / Ubuntu
Facebook	30000 databases, 1800 db servers	2	1200	805	5.0.44 with relay log corruption patch	PHP, Python, C++ and Erlang	Fedora / RHEL
YouTube	I cannot say	3	I cannot say	I cannot say	5.0.24	Python	SuSE 9

A few more miscellaneous questions …

FR: All of our servers are federated as pairs of servers; we can lose either side of a shard, we can lose boxes. Traffic normally goes to either side of the shard; after a failure it goes to the remaining one until we get another one (very transparent to the user)
WK: Users shout at them on IRC then they moderate … fixed in seconds
FB: one of the 1800-1900 servers will always be failing; we just operate well, so the impact is minor, with some data going away for a while … we restore from the binlog and restart the server quickly, promote a slave to master, or use a number of other ways
FL: we simply mount the snapshots to different servers and get
YT: SAN etc. for very important data … recover the server from a mirrored disk … a mirrored hard drive is crucial
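
Flickr's answer above describes each shard as a pair of servers where either side can take traffic and the surviving side carries on if the other is lost. A minimal sketch of that idea, not Flickr's actual code: the class name `ShardPair`, the host names, and the modulo placement rule in `shard_for_user` are all hypothetical and only there for illustration.

```python
import random


class ShardPair:
    """One shard served by two hosts; either side may take traffic."""

    def __init__(self, side_a, side_b):
        self.hosts = [side_a, side_b]
        self.down = set()  # hosts currently marked as failed

    def mark_down(self, host):
        self.down.add(host)

    def mark_up(self, host):
        self.down.discard(host)

    def pick_host(self):
        # Route to any side that is still alive; if one side is lost,
        # all traffic goes to the other without the caller noticing.
        alive = [h for h in self.hosts if h not in self.down]
        if not alive:
            raise RuntimeError("both sides of the shard are down")
        return random.choice(alive)


def shard_for_user(user_id, shards):
    # Hypothetical placement rule: plain modulo on the user id.
    return shards[user_id % len(shards)]
```

With two pairs such as `ShardPair("db1a", "db1b")` and `ShardPair("db2a", "db2b")`, marking `db2a` down makes every `pick_host()` call on the second pair return `db2b` until a replacement is marked up again.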

FL: UltraSPARC T1 (excellent master, multi-threaded) and UltraSPARC T2 for slaves (single-threaded)
WK: good network switch
FB: cheap switches cause problems, and we learned our lessons; we do not use SAN; servers are neatly partitioned, so they scale independently and fail independently
MY: cluster very sad

FB: app design is the key to use resources, data center power supply and consumption
FL: Google has to approve our lab power (we cut app servers by 1/2 by moving from PHP to Java)
YT: not at all

better you know what the systems are, then you can
take performance and scaling seriously
nothing is more permanent than temporary solutions (if you don't know when you will fail, then you will)
architect properly from the start: schema, cost of serving data
