之前参照他人的做法,使用DBUtils.PooledDB来建立多个可复用的MySQL连接,部分文章有误,方法不当,导致我走了很多弯路,专研几天后,终于找到了正确的使用方法。
网上有很多使用DBUtils.PooledDB模块建立连接池,再加threading多线程连接MySQL的例子,不仅没有告诉读者如何验证是否为多线程跑,而且大多是使用cursor()来建立多线程连接,这是错误的。(验证是否为多线程方法请见文章最后)
使用cursor()来建立多线程连接,在执行SQL查询的时候会报错:2014, "Commands out of sync; you can't run this command now"
究其原因是:
1、查询结果未释放,然又执行查询
2、两次查询之间没有存储结果应该重复使用connection(),而不是cursor()
Mysql文档:Commands out of sync
If you get Commands out of sync; you can't run this command now in your client code, you are calling client functions in the wrong order.
This can happen, for example, if you are using mysql_use_result() and try to execute a new query before you have called mysql_free_result().
It can also happen if you try to execute two queries that return data without calling mysql_use_result() or mysql_store_result() in between.
DBUtils.PooledDB模块里关于这两个函数的说明如下:
def connection(self, shareable=True):
"""Get a steady, cached DB-API 2 connection from the pool.
If shareable is set and the underlying DB-API 2 allows it,
then the connection may be shared with other threads.
"""
def cursor(self, *args, **kwargs):
"""Return a new Cursor Object using the connection."""
return SteadyDBCursor(self, *args, **kwargs)
所以要并发连接MySQL执行SQL,应该是使用connect(),而不是使用cursor。
多线程连接MySQL.demo
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import MySQLdb
import threading
import time
from DBUtils.PooledDB import PooledDB
'''
建立连接池,返回连接池地址
'''
def dbpool(ip,port,username,password,dbname,char_set='utf8'):
connKwargs = {'host': ip, 'port': port, 'user': username, 'passwd': password, 'db': dbname,'charset': char_set}
pool = PooledDB(MySQLdb, mincached=10, maxcached=10, maxshared=10, maxconnections=10, **connKwargs)
return pool
'''
从连接池中取出一个连接,执行SQL
num:用于统计总的影响行数
'''
num=0
def dbconnect(db_pool):
global num
conn = db_pool.connection()
cur = conn.cursor()
try:
cur.execute("select * from table_name")
lock.acquire()
num += cur.rowcount
lock.release()
except Exception as e:
print e
finally:
cur.close()
conn.close()
if __name__ == '__main__':
'''
lock:生成全局锁,用于执行语句中,被影响行数的统计值加锁使用,每次只允许一个线程修改被锁变量
'''
lock = threading.Lock()
st = time.time()
db_pool = dbpool(ip='127.0.0.1',port = 1234,username='root',password='1234',dbname ='try',char_set='utf8')
'''
同时连接MySQL执行的线程数要小于等于前面PooledDB中设置的maxconnections,如果大于这个量,会报异常:TooManyConnections。
设置每次只跑10个线程,跑完后再循环。
'''
thread_list = []
for i in range(100):
t = threading.Thread(target=dbconnect,args=(db_pool,))
thread_list.append(t)
while len(thread_list)!=0:
if len(thread_list)>10:
thread_list_length = 10
else:
thread_list_length = len(thread_list)
sub_thread_list = []
for n in range(thread_list_length):
sub_thread_list.append(thread_list[0])
thread_list.remove(thread_list[0])
for i in sub_thread_list:
i.start()
for j in sub_thread_list:
j.join()
et = time.time()
print et - st,num
检查程序是否为多线程在跑:
1、可以在执行主机上,通过”pstree -p PID“查看执行的进程数
2、在MySQL中查看连接数及SQL执行情况:select * from information_schema.processlist where host like '192.168.1.1%' order by 1