
[Repost] Python study notes


Official site: http://www.python.org/

Official library reference: http://docs.python.org/library/

PyPI: https://pypi.python.org/pypi

Chinese manual, good for getting started quickly: http://download.csdn.net/detail/xiarendeniao/4236870

Python Cookbook, Chinese edition: http://download.csdn.net/detail/XIARENDENIAO/3231793

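1. Numbers (real numbers in particular) are convenient, string handling is slick, and there are lists

a = complex(1,0.4)

a.real

a.imag

unicode()

An r/R prefix marks a raw string; a u/U prefix marks a unicode string.

The list method append() adds a new element to the end of the list.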

2. Flow control

while:

if:

if xxx:

...

elif yyy:

...

elif xxx:

...

else:

...

for

range()

break, continue, and the else clause on loops

pass
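
The else clause on a loop is the least familiar item in the list above, so here is a minimal sketch of it. The function name first_negative is made up for illustration, and the snippet is written to run on Python 3 (it also works on Python 2).

```python
def first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            found = n
            break          # leaving the loop via break skips the else clause
    else:
        found = None       # runs only when the loop finished without break
    return found


print(first_negative([3, -1, 4]))
print(first_negative([3, 1, 4]))
```

The else branch also runs when the sequence is empty, because the loop then ends without ever reaching break.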

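3. Functions

1) def funA(para): a function without a return statement returns None; arguments are passed in as object references

2) Default arguments: take care when the default value is a list, a dict, or a class instance, because the default object is created once and shared between calls

3) Variable arguments: def funB(king, *arguments, **keywords). Positional values without a keyword are collected in the tuple arguments; keyword/value pairs are collected in the dict keywords. This is really a combination of tuple packing and sequence unpacking.

4) def funC(para1, para2, para3): the call funC(*list) spreads the elements of a list out into the function's arguments

5) Anonymous functions: lambda arg1, arg2, ...: expression

Properties: lambda creates a function object without binding it to an identifier (unlike def); lambda is an expression, not a statement; only a single expression may follow the ":"

6) if 'ok' in ('y', 'ye', 'yes'): xxxxx shows how the keyword in is used

7) f = lambda x: x*2 is equivalent to def f(x): return x*2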
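
Item 2) warns about mutable default arguments without showing the problem. The sketch below illustrates it; append_bad and append_good are hypothetical names, and the None sentinel is the usual workaround rather than anything prescribed by these notes.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and the same object is reused by every call that omits bucket.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on every call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)      # same list object as first, now holding both items
fresh_a = append_good(1)
fresh_b = append_good(2)    # independent lists
```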

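4. Data structures

1) [] help(list) append(x) extend(L) insert(i,x) remove(x) pop([i]) index(x) count(x) sort() reverse()

2) Functional programming on lists: filter() map() reduce()

3) List comprehension: aimTags = [aimTag for aimTag in aimTags if aimTag not in filterAimTags]

4) del deletes a list slice or an entire variable

5) () help(tuple) A tuple's elements, like the characters of a string, cannot be changed. Tuples, strings, and lists are all sequences. Python requires a trailing comma in a one-element tuple to remove the ambiguity with a parenthesized expression; forgetting it is a common beginner mistake

6) {} help(dict) Dictionaries: keys() has_key(). A dict can be built directly from a list whose elements are (key, value) tuples

7) Looping over a dict: for k, v in xxx.iteritems():... for item in xxx.items():... Over a sequence: for i, v in enumerate(['tic', 'tac', 'toe']):... Over several sequences at once: for q, a in zip(questions, answers):...

8) in, not in, is, is not

9) Sequence objects of the same type can be compared with < > ==

10) Two ways to check a variable's type: isinstance(var, int) and type(var).__name__ == "int"

To test against several types, isinstance(s, (str, unicode)) returns True when s is either a plain string or a unicode string

11) Be especially careful when deleting list elements inside a loop. for i in listA:... listA.remove(i) goes wrong because the elements after a removed one shift forward, so some are skipped; for i in range(len(listA)):... del listA[i] also goes wrong because the length shrinks after a deletion and the index runs past the end

filter(lambda x: x != 4, listA) is a more elegant way to do it

listA = [i for i in listA if i != 4] is also fine; either way you simply build a new list

Efficiency:

1) "if k in my_dict" is better than "if my_dict.has_key(k)"

2) "for k in my_dict" is better than "for k in my_dict.keys()", and also better than "for k in [....]"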
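
Item 11) above describes two ways that deleting inside a loop goes wrong. As a hedged sketch, here are three ways to drop every 4 from a list safely; the variable names are illustrative only.

```python
list_a = [1, 4, 4, 2, 4, 3]

# Option 1: iterate over a copy, so removals do not disturb the loop.
copy_based = list(list_a)
for item in list(copy_based):
    if item == 4:
        copy_based.remove(item)

# Option 2: walk the indexes backwards, so a deletion never shifts
# an element that has not been visited yet.
index_based = list(list_a)
for i in range(len(index_based) - 1, -1, -1):
    if index_based[i] == 4:
        del index_based[i]

# Option 3: build a new list, as the notes recommend.
rebuilt = [item for item in list_a if item != 4]
```

Option 3 is usually the simplest; the first two matter only when the original list object has to be modified in place.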

12) set is implemented much like a dict that stores only keys https://docs.python.org/2/library/stdtypes.html#set-types-set-frozenset

>>> s1 = set([1,2,3,4,5])

>>> s2 = set([3,4,5,6,7,8])

>>> s1|s2

set([1, 2, 3, 4, 5, 6, 7, 8])

>>> s1-s2

set([1, 2])

>>> s2-s1

set([8, 6, 7])

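5. Modules

1) A module's name is available in the global variable __name__. The file fibo.py can be imported into another file or into the interpreter as the module fibo with import fibo. (The tutorial's fibo.py happens to name its functions fib and fib2; the fib prefix is a convention of that example, not a requirement.)

2) Import variant: from fibo import fib, fib2, after which the functions can be used directly without the module prefix

3) sys.path sys.ps1 sys.ps2

4) The built-in function dir() lists the names a module defines. It returns a sorted list of strings covering every kind of name: variables, modules, functions, and so on

help() serves a similar purpose

5) Packages: import packet1.packet2.module from packet1.packet2 import module from packet1.packet2.module import functionA

6) How from package import * is resolved: if the package's __init__.py defines a list named __all__, the module names given in that list are the ones imported

7) sys.path shows the paths currently searched for Python libraries; a program can add a new search path with sys.path.append("/xxx/xxx/xxx")

8) Python modules can be installed with easy_install; to uninstall, use easy_install -m pkg_name

9) __doc__ gives the documentation of a module, function, or object, and __name__ gives its name (typical use: if __name__=='__main__': ...)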

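6. IO

1) str() unicode() repr() print rjust() ljust() center() zfill() xxx%v xxx%(v1,v2). The pprint module is handy for printing complex objects (very useful when debugging)

For a user-defined type to work with pprint it needs a __repr__ method. If you do not want the result sent straight to standard output (pprint.pprint(var)), use pprint.pformat(var) to get it as a string.

2) f = open("fileName", "w") Modes: w r a r+. On Windows and Macintosh there is also a "b" mode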

f.read(size)

f.readline()

f.write(string)

f.writelines(list)

f.tell()

f.seek(offset, from_what) from_what: 0 = start of file, 1 = current position, 2 = end of file; offset: number of bytes http://www.linuxidc.com/Linux/2007-12/9644p3.htm

f.close()
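
To make the three from_what values of seek() concrete, here is a small sketch that writes ten bytes to a temporary file and reads parts of it back. The temporary file and the variable names are assumptions for the demo; the file is opened in binary mode because Python 3 only allows seeks relative to the current position or the end on binary files.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb") as f:
        f.seek(3, 0)              # 3 bytes from the start
        from_start = f.read(2)
        position = f.tell()       # offset after reading two bytes
        f.seek(2, 1)              # 2 bytes forward from the current position
        from_current = f.read(1)
        f.seek(-2, 2)             # 2 bytes before the end
        from_end = f.read()
finally:
    os.remove(path)
```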

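The linecache module makes it easy to fetch a given line of a file, but be careful when using it inside an HTTP server. It is especially dangerous with large files: under concurrency it can quickly exhaust the machine's memory and bring the system down (a lesson I learned the hard way)

shutil is handy for file operations

os.walk() traverses every file under a directory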

3) The pickle module (it is not limited to writing to files)

Pickling is similar to serialization in PHP: pickle.dump(objectX, fileHandle)

Unpickling is similar to deserialization in PHP: objectX = pickle.load(fileHandle)
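
Since pickle is not limited to files, a round trip can also go through an in-memory byte string with pickle.dumps() and pickle.loads(). A minimal sketch with a made-up record:

```python
import pickle

record = {"name": "fibo", "values": [1, 1, 2, 3, 5]}

blob = pickle.dumps(record)      # serialize to a byte string in memory
restored = pickle.loads(blob)    # rebuild an equal, independent object
```

Only unpickle data from a source you trust: loading a pickle can execute arbitrary code.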

msgpack (easy_install msgpack-python) is somewhat nicer to use than pickle and cPickle, and faster

msgpack.dump(my_var, file('test_file_name','w'))

msgpack.load(file('test_file_name','r'))

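4) raw_input() reads input from the user

7. class

1) Member variables and member functions whose names begin with two underscores and end with at most one underscore are private (their names are mangled), so a parent class's private members cannot be accessed by the same name in a subclass

2) Calling a parent-class method: 1> ParentClass.FuncName(self,args) 2> super(ChildName,self).FuncName(args). The second form requires that the class ultimately inherits from object; otherwise super raises an error

3) To define a static method, put @staticmethod on the line before the method definition. It can then be called directly through the class name.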

#!/bin/python

#encoding=utf8

class A(object):

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def show(self):
        print "A::show() a=%s b=%s" % (self.a,self.b)

class B(A):

    def __init__(self, a, b, c):
        #A.__init__(self,a,b)
        super(B,self).__init__(a,b) # this use of super requires that the parent class inherits from object
        self.c = c

if __name__ == "__main__":
    b = B(1,2,3)
    print b.a,b.b,b.c
    b.show()

# Output

xudongsong@sysdev:~$ python class_test.py

1 2 3

A::show() a=1 b=2

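8. Encoding

The common encoding conversions fall into the following cases:

unicode -> another encoding

For example, if a is a unicode string to be converted to gb2312: a.encode('gb2312')

another encoding -> unicode

For example, if a is gb2312-encoded and is to be converted to unicode: unicode(a, 'gb2312') or a.decode('gb2312')

encoding 1 -> encoding 2

Convert to unicode first, then to encoding 2

For example, gb2312 to big5:

unicode(a, 'gb2312').encode('big5')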
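
The "encoding 1 -> unicode -> encoding 2" route can be wrapped in a small helper. This is a sketch only: transcode is a hypothetical name, and it uses bytes.decode() and str.encode(), which exist on both Python 2 and Python 3, instead of the Python 2-only unicode() built-in.

```python
def transcode(data, source, target):
    """Re-encode a byte string from one encoding to another via text."""
    text = data.decode(source)      # bytes in the source encoding -> text
    return text.encode(target)      # text -> bytes in the target encoding


# The two characters are written as escapes so the snippet does not
# depend on the encoding of the file it is saved in.
gb_bytes = u"\u4e2d\u6587".encode("gb2312")
big5_bytes = transcode(gb_bytes, "gb2312", "big5")
```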

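Determining what kind of string you have

isinstance(s, str) tests whether s is a plain (byte) string

isinstance(s, unicode) tests whether s is a unicode string

If a string is already unicode, converting it to unicode again sometimes fails (not always)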

>>> str2 = u"sfdasfafasf"

>>> type(str2)

>>> isinstance(str2,str)

False

>>> isinstance(str2,unicode)

True

>>> type(str2)

>>> str3 = "safafasdf"

>>> type(str3)

>>> isinstance(str3,unicode)

False

>>> isinstance(str3,str)

True

>>> str4 = r'asdfafadf'

>>> isinstance(str4,str)

True

>>> isinstance(str4,unicode)

False

>>> type(str4)

A general-purpose convert-to-unicode function can be written as:

def u(s, encoding):
    if isinstance(s, unicode):
        return s
    else:
        return unicode(s, encoding)

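9. Threads

1) To make a child thread exit together with the parent thread, call setDaemon(True) on the child before starting it

2) Calling join() on a child thread makes the parent thread wait for the child to exit before it exits itself

3) Ctrl+C can only be caught by the parent (main) thread; a child thread cannot install a handler with signal.signal(signal, function). Calling join() on a child thread keeps the parent from seeing Ctrl+C until the child has exited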
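
A common workaround for point 3) is to call join() with a short timeout in a loop, so the main thread regains control regularly and can react to Ctrl+C. The sketch below assumes a trivial worker function and a 0.05 second timeout, both chosen only for illustration.

```python
import threading
import time


def worker(results):
    time.sleep(0.1)             # stand-in for real work
    results.append("done")


results = []
t = threading.Thread(target=worker, args=(results,))
t.daemon = True                 # same effect as t.setDaemon(True)
t.start()

# Instead of one blocking join(), wait in short slices so that a
# KeyboardInterrupt can be raised in the main thread between them.
while t.is_alive():
    t.join(0.05)
```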

Appendix: an email from Cheng Yingyuan on Python signals, for reference

http://stackoverflow.com/questions/631441/interruptible-thread-join-in-python

From http://docs.python.org/library/signal.html#module-signal:

Some care must be taken if both signals and threads are used in the same program. The fundamental thing to remember in using signals and threads simultaneously is: always perform signal() operations in the main thread of execution. Any thread can perform an alarm(), getsignal(), pause(), setitimer() or getitimer(); only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can't be used as a means of inter-thread communication. Use locks instead.

In other words: always install signal handlers with signal() from the main thread; the main thread is the only one that handles signals. Do not rely on signals for communication between threads; use locks instead.

The second, from http://docs.python.org/library/thread.html#module-thread:

Threads interact strangely with interrupts: the KeyboardInterrupt exception will be received by an arbitrary thread. (When the signal module is available, interrupts always go to the main thread.)

When the signal module is available, KeyboardInterrupt is always received by the main thread; otherwise it may be received by an arbitrary thread. Pressing Ctrl+C makes Python receive SIGINT, which is turned into a KeyboardInterrupt raised in some thread; any threads that have not been marked with setDaemon carry on running regardless. If kill sends a signal other than SIGINT and no handler has been set for it, the whole process dies, no matter how many threads have yet to finish.

Here is an example of using signal:

>>> import signal, time

>>> def f():
...     signal.signal(signal.SIGINT, sighandler)
...     signal.signal(signal.SIGTERM, sighandler)
...     while True:
...         time.sleep(1)
...

>>> def sighandler(signum,frame):
...     print signum,frame
...

>>> f()

^C2

^C2

^C2

^C2

Setting and clearing signal handlers:

import signal, time

term = False

def sighandler(signum, frame):
    print "terminate signal received..."
    global term
    term = True

def set_signal():
    signal.signal(signal.SIGTERM, sighandler)
    signal.signal(signal.SIGINT, sighandler)

def clear_signal():
    # restore the default handlers (the original passed 0, which is the value of signal.SIG_DFL)
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    signal.signal(signal.SIGINT, signal.SIG_DFL)

set_signal()

while not term:
    print "hello"
    time.sleep(1)

print "jumped out of while loop"

clear_signal()

term = False

for i in range(5):
    if term:
        break
    else:
        print "hello, again"
        time.sleep(1)

[dongsong@bogon python_study]$ python signal_test.py

hello

hello

hello

^Cterminate signal received...

jumped out of while loop

hello, again

hello, again

^C

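[dongsong@bogon python_study]$

When a multi-process program uses signals and the parent process should catch the signal and then act on its children, register the signal handlers only after the children have been started. Otherwise each child, which inherits the parent's address space, has the same handler installed too, and the program's behavior becomes a mess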

from multiprocessing import Process, Pipe
import logging, time, signal

g_logLevel = logging.DEBUG
g_logFormat = "%(asctime)s %(levelname)s [%(filename)s:%(lineno)d]%(message)s"

def f(conn):
    conn.send([42, None, 'hello'])
    #conn.close()
    logging.basicConfig(level=g_logLevel,format=g_logFormat,stream=None)
    logging.debug("hello,world")

def f2():
    while True:
        print "hello,world"
        time.sleep(1)

termFlag = False

def sighandler(signum, frame):
    print "terminate signal received..."
    global termFlag
    termFlag = True

if __name__ == '__main__':
    # parent_conn, child_conn = Pipe()
    # p = Process(target=f, args=(child_conn,))
    # p.start()
    # print parent_conn.recv() # prints "[42, None, 'hello']"
    # print parent_conn.recv()
    # p.join()
    p = Process(target=f2)
    p.start()
    signal.signal(signal.SIGTERM, sighandler)
    signal.signal(signal.SIGINT, sighandler)
    while not termFlag:
        time.sleep(0.5)
    print "jump out of the main loop"
    p.terminate()
    p.join()

10. The Python built-in function locals() returns a dict that maps the names of all local variables to their values

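11. Extended positional arguments

def func(*args): ...

A single asterisk before the parameter name lets the function accept any number of positional arguments.

Python collects them into a tuple bound to args. If there are no positional arguments beyond the explicitly declared parameters, args is an empty tuple.

Related: item 3.4

12. Extended keyword arguments

def accept(**kwargs): ...

Two asterisks before the parameter name let the function accept any number of keyword arguments.

Note: kwargs is an ordinary Python dict holding the argument names and values. If there are no extra keyword arguments, kwargs is an empty dict.

For more on positional and keyword arguments, see this article: http://blog.csdn.net/qinyilang/article/details/5484415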

>>> def func(arg1, arg2 = "hello", *arg3, **arg4):
...     print arg1
...     print arg2
...     print arg3
...     print arg4
...

>>> func("xds","t1",t2="t2",t3="t3")

xds

t1

()

{'t2': 't2', 't3': 't3'}

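13. Decorators: writing @another_method on the line before a function wraps the existing function, for tasks such as precondition checks. This article explains them thoroughly: http://daqinbuyi.iteye.com/blog/1161274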
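
As a minimal sketch of the precondition-check use mentioned above, the decorator below rejects non-positive arguments before the wrapped function runs. The names require_positive and halve are invented for the example.

```python
import functools


def require_positive(func):
    """Decorator that checks a precondition before calling func."""
    @functools.wraps(func)          # keep the wrapped function's name and docstring
    def wrapper(x):
        if x <= 0:
            raise ValueError("x must be positive")
        return func(x)
    return wrapper


@require_positive
def halve(x):
    return x / 2.0


print(halve(4))
```

Writing @require_positive above halve is shorthand for halve = require_positive(halve).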

14. Exception-handling syntax

import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except IOError, (errno, strerror):
    print "I/O error(%s): %s" % (errno, strerror)
except ValueError:
    print "Could not convert data to an integer."
except:
    print "Unexpected error:", sys.exc_info()[0]
    raise

>>> try:
...     raise Exception('spam', 'eggs')
... except Exception, inst:
...     print "error %s" % str(inst)
...     print type(inst) # the exception instance
...     print inst.args # arguments stored in .args
...     print inst # __str__ allows args to be printed directly
...     x, y = inst # __getitem__ allows args to be unpacked directly
...     print 'x =', x
...     print 'y =', y
...
error ('spam', 'eggs')
<type 'exceptions.Exception'>
('spam', 'eggs')
('spam', 'eggs')
x = spam
y = eggs

15. Command-line arguments can be handled with Python's optparse library; for detailed usage see this article http://blog.chinaunix.net/space.php?uid=16981447&do=blog&id=2840082

from optparse import OptionParser

[...]

def main():
    usage = "usage: %prog [options] arg"
    parser = OptionParser(usage)
    parser.add_option("-f", "--file", dest="filename",
                      help="read data from FILENAME")
    parser.add_option("-v", "--verbose",
                      action="store_true", dest="verbose")
    parser.add_option("-q", "--quiet",
                      action="store_false", dest="verbose")
    [...]
    (options, args) = parser.parse_args()
    if len(args) != 1:
        parser.error("incorrect number of arguments")
    if options.verbose:
        print "reading %s..." % options.filename
    [...]

if __name__ == "__main__":
    main()

Put simply, make_option() and add_option() define how one command-line option of the script is parsed. After parse_args(), positional arguments are stored in the list args and option values are stored as attributes of options. dest names the attribute; if it is omitted, the option's long name is used. help is the text shown for the option when the script is called with --help/-h. action describes how the option is handled: the default, store, saves the value that follows the option under dest; store_true saves True under dest when the option appears; store_false saves False under dest when the option appears
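
The behavior described above can be checked without a real command line by passing a list to parse_args(). In this sketch the --level option and the argument values are made up; --level has no dest, to show the long name being used as the attribute.

```python
from optparse import OptionParser

parser = OptionParser(usage="usage: %prog [options] arg")
parser.add_option("-f", "--file", dest="filename",
                  help="read data from FILENAME")
parser.add_option("-v", "--verbose", action="store_true", dest="verbose")
parser.add_option("-q", "--quiet", action="store_false", dest="verbose")
parser.add_option("--level")    # no dest given: stored as options.level

# Parse an explicit argument list instead of sys.argv[1:].
options, args = parser.parse_args(
    ["-f", "data.txt", "-v", "--level", "3", "input"])
```

Values stored by the default store action are strings unless a type is given, so options.level is "3" here, not the integer 3.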

16. I recently used BeautifulSoup v4 and hit the following error (I had used an older BeautifulSoup before and never saw it)

HTMLParser.HTMLParseError: malformed start tag

Fix: install html5lib with easy_install html5lib and use it in place of HTMLParser

Reference: http://topic.csdn.net/u/20090531/09/956454dd-ba13-4fa3-af3c-6bf7af5726dc.html

BeautifulSoup home page: http://www.crummy.com/software/BeautifulSoup/

BeautifulSoup documentation: http://www.crummy.com/software/BeautifulSoup/bs4/doc/

Chinese documentation (for a quick start): http://www.leeon.me/upload/other/beautifulsoup-documentation-zh.html

Some examples of BeautifulSoup usage follow

[dongsong@localhost boosenspider]$ vpython

Python 2.6.6 (r266:84292, Dec 7 2011, 20:48:22)

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>>

>>>

>>> from bs4 import BeautifulSoup as soup

>>> s = soup('

')

>>> s

打卡

>>> type(s)

>>>

>>>

>>> t = s.body.contents[0]

>>> t

打卡

>>> import re

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dks")})

[]

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")})

[打卡]

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'href':None})

[]

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'href':re.compile('')})

[打卡]

>>> t.contents[0]

打卡

>>> t.contents[0].string = "hello"

>>> t

>>> t.contents[0].text

u'hello'

>>> t.contents[0].string

u'hello'

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'text':re.compile('')})

[]

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'text':re.compile('h')})

[]

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'text':re.compile('^h')})

[]

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")})

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")},text=re.compile(r''))

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")},text=re.compile(r'a'))

[]

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")},text=re.compile(r'^hell'))

>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")},text=re.compile(r'^hello$'))

>>>

>>> t.findAll(name='a',attrs={},text=re.compile(r'^hello$'))

>>>

>>> t

>>> t1 = soup('

').body.contents[0]

>>>

>>> t1

>>> t == t1

True

>>> re.search(r'(^hello)|(^bbb)','hello')

<_sre.SRE_Match object at 0x25ef718>

>>> re.search(r'(^hello)|(^bbb)','hellosdfsd')

<_sre.SRE_Match object at 0x25ef7a0>

>>> re.search(r'(^hello)|(^bbb)','bbbsdfsdf')

<_sre.SRE_Match object at 0x25ef718>

>>> t2 = t1.contents[0]

>>> t2

>>> t2.findAll(name='a')

[]

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>>

>>>

>>> from bs4 import BeautifulSoup as soup

>>> s = soup('

天涯婚礼堂

')

>>> s.findAll(name='a',attrs={'href':None})

[]

>>> s.findAll(name='a',attrs={'href':True})

[天涯婚礼堂]

>>> import re

>>> s.findAll(name='a',attrs={'href':re.compile(r'')})

[天涯婚礼堂]

>>> s1 =s

>>> s1

天涯婚礼堂

>>> id(s1)

140598579280080

>>> id(s)

140598579280080

>>> s1.body.contents[0].contents[0]['href']=None

>>> s1

天涯婚礼堂

>>> s

天涯婚礼堂

>>> id(s)

140598579280080

>>> id(s1)

140598579280080

>>> s.findAll(name='a',attrs={'href':re.compile(r'')})

[]

>>> s.findAll(name='a',attrs={'href':True})

[]

>>> s.findAll(name='a',attrs={'href':None})

[天涯婚礼堂]

>>> s.findAll(name='a')

[天涯婚礼堂]

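# text is an argument for searching NavigableString objects. Its value can be a string, a regular expression, a list or dictionary, True or None, or a callable that takes a NavigableString as its argument

# None, False, and '' mean no requirement; re.compile('') and True mean a NavigableString must be present (unlike attrs, where an attribute given as False in the dict must not be present)

# Note how the text argument of findAll is used, as follows: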

>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text=re.compile(r''))

>>> len(rts)

0

>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text='')

>>> len(rts)

1

>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text=True)

>>> len(rts)

0

>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text=False)

>>> len(rts)

1

>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text=None)

>>> len(rts)

1

# How the string attribute is used, and on which kinds of elements it is set

>>> from bs4 import BeautifulSoup as soup

>>> soup1 = soup('hello,aaaa').body.contents[0]

>>> soup1

hello,aaaa

>>> soup1.string

>>> soup1.name

u'b'

>>> soup1.text

u'hello,aaaa'

>>> type(soup1)

>>> soup1.contents[0]

u'hello,'

>>> type(soup1.contents[0])

>>> soup1.contents[0].string

u'hello,'

>>> soup2 = soup('hello').body.contents[0]

>>> type(soup2)

>>> soup2.string

u'hello'

# Using limit; zero means no limit

>>> soup2.findAll(name='a',text=False,limit=0)

[, 匆匆那年]

>>> soup2.findAll(name='a',text=False,limit=1)

[]

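BeautifulSoup's performance is mediocre, but it is very good at repairing and tolerating invalid HTML tags. As for encodings: when the source page's encoding is known, it can be given to the BeautifulSoup constructor through the from_encoding parameter (when parsing Tianya pages, for instance, I passed from_encoding='gbk'); when it is not known, you can rely on bs's automatic encoding detection and conversion (which may produce garbled text; a machine is not as clever as a person, after all).

The objects BeautifulSoup returns, and the data inside each node, are unicode after its conversion.

---------->

A small problem I ran into today

A piece of HTML source failed to build a bs object under bs3.2.1, raising UnicodeEncodeError no matter whether the source was passed in as unicode, utf-8, or latin1, and the bs3.2.1 constructor does not even have a from_encoding parameter

Under bs4 it went through without a hitch, whether passed in as unicode or as utf-8, with no need to specify from_encoding (with utf-8 input and no from_encoding the output was garbled, but at least there was no error; nothing is as fragile as bs3!)

The lesson: once code has been tested and is stable against a particular library version, just install that version wherever the code is used. Why bend over backwards for compatibility? If an older version of the library has bugs, am I supposed to be compatible with those too?

<--------------------2012-06-08 18:20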

Building objects with bs4:

[dongsong@bogon boosenspider]$ cat bs_constrator.py

#encoding=utf-8

from bs4 import BeautifulSoup as soup

from bs4 import Tag

if __name__ == '__main__':

sou = soup('

')

tag1 = Tag(sou, name='div')

tag1['id'] = 'gentie1'

tag1.string = 'hello,tag1'

sou.div.insert(0,tag1)

tag2 = Tag(sou, name='div')

tag2['id'] = 'gentie2'

tag2.string = 'hello,tag2'

sou.div.insert(1,tag2)

print sou

[dongsong@bogon boosenspider]$ vpython bs_constrator.py

hello,tag1

hello,tag2

cgi can escape HTML strings (escape); HTMLParser can undo the escaping (unescape)

>>> t = Tag(name='t')

>>> t.string=""

>>> t

>>> str(t)

""

>>> t.string

u""

>>> HTMLParser.HTMLParser().unescape(str(t))

u""

>>> s1

u""

>>>

>>> s2 = cgi.escape(s1)

>>> s2

u"<t><img src='www.baidu.com'/></t>"

>>> HTMLParser.HTMLParser().unescape(s2)

u""

17. Hashing with the md5 module or the hashlib module

>>> import md5

>>> md5.md5("asdfadf").hexdigest()

'aee0014b14124efe03c361e1eed93589'

>>> import hashlib

>>> hashlib.md5("asdfadf").hexdigest()

'aee0014b14124efe03c361e1eed93589'

18. Without a timeout, urllib2.urlopen(url) may wait forever for the remote server's response and hang the program

urlFile = urllib2.urlopen(url, timeout=g_url_timeout)

urlData = urlFile.read()

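19. Regular expression matching: the re module

A string in triple single quotes can span lines, and the resulting string contains \n characters; keep this in mind

A string in single or double quotes can also be continued across lines with a trailing \, in which case the resulting string contains neither the \ nor a \n. Do not write regular expressions this way as raw strings, though: in a raw string the \ and the newline stay in the pattern, so it will not match as intended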

>>> ss = '''

... hell0,a

... shhh

... liumingdong

... xudongsong

... hello

... '''

>>> ss

'\nhell0,a\nshhh\nliumingdong\nxudongsong\nhello\n'

SyntaxError: EOL while scanning string literal

>>> sss = 'aaaa\

... bbbb\

... cccccc'

>>> sss

'aaaabbbbcccccc'

>>> s3 = r'(^hello)|\

... (abc$)'

>>>

>>> re.search(s3,'hello,world')

<_sre.SRE_Match object at 0x7f95233047a0>

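# the alternative on the first line of the pattern matches

>>> re.search(s3,'aaa,hello,worldabc')

# the alternative on the second line fails to match

>>> s4 = r'(^hello)|(abc$)'

# s4 is not split across lines with a quote plus \, so both alternatives match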

>>> re.search(s4,"hello,world")

<_sre.SRE_Match object at 0x182e690>

>>> re.search(s4,"aaa,hello,worldabc")

<_sre.SRE_Match object at 0x7f95233047a0>

>>>

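# Note how to extract the matched substrings (wrap the part of the pattern to extract in parentheses; groups numbered from 1 upward are the parenthesized substrings)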

>>> re.search(r'^(\d+)abc(\d+)$','232abc1').group(0,1,2)

('232abc1', '232', '1')

# Below is an example that combines re and lambda

#encoding=utf-8

import re

f = lambda arg: re.search(u'^(\d+)\w+',arg).group(1)

print f(u'1111条评论')

try:
    f(u'aaaa')
except AttributeError,e:
    print str(e)

:!python re_lambda.py

111

'NoneType' object has no attribute 'group'
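
The AttributeError above appears because re.search() returns None when nothing matches. Rather than catching the exception, the result can be checked before calling group(). A sketch, with leading_number as a hypothetical helper name:

```python
import re


def leading_number(text):
    """Return the digits at the start of text, or None if there are none."""
    match = re.search(r'^(\d+)', text)
    if match is None:           # no match: there is no group to read
        return None
    return match.group(1)


print(leading_number(u'1111abc'))
print(leading_number(u'aaaa'))
```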

re.findall() is very handy too

>>> re.findall(r'\\@[A-Za-z0-9]+', s)

['\\@userA', '\\@userB']

>>> s

'hello,world,\\@userA\\@userB'

>>> re.findall(r'\\@([A-Za-z0-9]+)', s)

['userA', 'userB']

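20. I wrote a crawler. I used to join URLs by hand, handling every case myself, such as ./xxx, #xxxx, /xxx and so on, which was tedious. It turns out there is a ready-made tool for this

>>> from urlparse import urljoin

>>> import urllib

>>> url = urljoin(r"http://book.douban.com/tag/?view=type",u"./网络小说")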

>>> url

u'http://book.douban.com/tag/\u7f51\u7edc\u5c0f\u8bf4'

>>> conn2 = urllib.urlopen(url)

Traceback (most recent call last):

File "", line 1, in

File "/usr/lib64/python2.6/urllib.py", line 86, in urlopen

return opener.open(url)

File "/usr/lib64/python2.6/urllib.py", line 179, in open

fullurl = unwrap(toBytes(fullurl))

File "/usr/lib64/python2.6/urllib.py", line 1041, in toBytes

" contains non-ASCII characters")

UnicodeError: URL u'http://book.douban.com/tag/\u7f51\u7edc\u5c0f\u8bf4' contains non-ASCII characters

>>> conn2 = urllib.urlopen(url.encode('utf-8'))

21. How to add headers and how to read cookie values when making HTTP requests with urllib2

>>> request = urllib2.Request("http://img1.gtimg.com/finance/pics/hv1/46/178/1031/67086211.jpg",headers={'If-Modified-Since':'Wed, 02 May 2012 18:32:20 GMT'})

# equivalent to request.add_header('If-Modified-Since','Wed, 02 May 2012 18:32:20 GMT')

>>> urllib2.urlopen(request)

Traceback (most recent call last):

File "", line 1, in

File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen

return _opener.open(url, data, timeout)

File "/usr/lib64/python2.6/urllib2.py", line 397, in open

response = meth(req, response)

File "/usr/lib64/python2.6/urllib2.py", line 510, in http_response

'http', request, response, code, msg, hdrs)

File "/usr/lib64/python2.6/urllib2.py", line 435, in error

return self._call_chain(*args)

File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain

result = func(*args)

File "/usr/lib64/python2.6/urllib2.py", line 518, in http_error_default

raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)

urllib2.HTTPError: HTTP Error 304: Not Modified

>>> urllib.urlencode({"aaa":"bbb"})

'aaa=bbb'

>>> urllib.urlencode([("aaa","bbb")])

'aaa=bbb'

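# Using urlencode: when submitting a POST form, the key-value parameters are encoded with urlencode and sent as the request body

#urllib2.urlopen(url,data=urllib.urlencode(...))

Today (13.7.4) I ran into a site whose login requires the csrftoken, planted by the server on the first visit, to be sent back as part of the POST data, so I looked into how to read cookie values. I cannot share the actual code, so here is an example from Stack Overflow (the lines that read the cookie data are the ones that matter)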

http://stackoverflow.com/questions/10247054/http-post-and-get-with-cookies-for-authentication-in-python

[dongsong@localhost python_study]$ cat cookie.py

from urllib2 import Request, build_opener, HTTPCookieProcessor, HTTPHandler

import httplib, urllib, cookielib, Cookie, os

conn = httplib.HTTPConnection('webapp.pucrs.br')

#COOKIE FINDER

cj = cookielib.CookieJar()

opener = build_opener(HTTPCookieProcessor(cj),HTTPHandler())

req = Request('http://webapp.pucrs.br/consulta/principal.jsp')

f = opener.open(req)

html = f.read()

import pdb

pdb.set_trace()

for cookie in cj:
    c = cookie

#FIM COOKIE FINDER

params = urllib.urlencode ({'pr1':111049631, 'pr2':'sssssss'})

headers = {"Content-type":"text/html",

"Set-Cookie" : "JSESSIONID=70E78D6970373C07A81302C7CF800349"}

# I couldn't set the value automaticaly here, the cookie object can't be converted to string, so I change this value on every session to the new cookie's value. Any solutions?

conn.request ("POST", "/consulta/servlet/consulta.aluno.ValidaAluno",params, headers) # Validation page

resp = conn.getresponse()

temp = conn.request("GET","/consulta/servlet/consulta.aluno.Publicacoes") # desired content page

resp = conn.getresponse()

print resp.read()

22. How to change the file that logging writes to. This becomes more pressing when using the multiprocessing module, because child processes inherit the parent's log file and format....

def change_log_file(fileName):
    h = logging.FileHandler(fileName)
    h.setLevel(g_logLevel)
    h.setFormatter(logging.Formatter(g_logFormat))
    logger = logging.getLogger()
    #print logger.handlers
    for handler in logger.handlers:
        handler.close()
    while len(logger.handlers) > 0:
        logger.removeHandler(logger.handlers[0])
    logger.addHandler(h)
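
Here is a self-contained variant of the same idea that can be run as-is. It is a sketch under assumptions: LOG_FORMAT, the temporary directory, and the file name child.log are invented for the demo, and the function also sets the root logger's level so that the test message is not filtered out.

```python
import logging
import os
import tempfile

LOG_FORMAT = "%(asctime)s %(levelname)s %(message)s"   # assumed format


def change_log_file(file_name, level=logging.DEBUG):
    """Point the root logger at file_name, dropping its old handlers."""
    root = logging.getLogger()
    for old in list(root.handlers):     # iterate over a copy while removing
        root.removeHandler(old)
        old.close()
    new_handler = logging.FileHandler(file_name)
    new_handler.setLevel(level)
    new_handler.setFormatter(logging.Formatter(LOG_FORMAT))
    root.addHandler(new_handler)
    root.setLevel(level)
    return new_handler


log_path = os.path.join(tempfile.mkdtemp(), "child.log")
handler = change_log_file(log_path)
logging.getLogger().info("hello from the new log file")
handler.flush()

with open(log_path) as f:
    content = f.read()
```

In a multiprocessing program, a function like this would typically be called at the start of the child's target function, so each child writes to its own file.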

For setting up loggers, handlers, and formatters, see Django's configuration file; below is a small example I wrote

[dongsong@localhost python_study]$ cat logging_test.py

#encoding=utf-8

import logging, sys

if __name__ == '__main__':
    logger = logging.getLogger('test')
    logger.setLevel(logging.DEBUG)
    print 'log handlers: %s' % str(logger.manager.loggerDict)
    logger.error('here')
    logger.warning('here')
    logger.info('here')
    logger.debug('here')

    #handler = logging.FileHandler('test.log')
    handler = logging.StreamHandler(sys.stdout)
    handler.setLevel(logging.DEBUG)
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    #logging.getLogger('test').addHandler(logging.NullHandler()) # python 2.7+

    logger.error('here')
    logger.warning('here')
    logger.info('here')
    logger.debug('here')

[dongsong@localhost python_study]$ vpython logging_test.py

log handlers: {'test': }

No handlers could be found for logger "test"

2012-12-26 11:30:48,725 - test - ERROR - here

2012-12-26 11:30:48,725 - test - WARNING - here

2012-12-26 11:30:48,725 - test - INFO - here

2012-12-26 11:30:48,725 - test - DEBUG - here

23. multiprocessing module demo

import multiprocessing

from multiprocessing import Process

import time

def func():
    for i in range(3):
        print "hello"
        time.sleep(1)

proc = Process(target = func)
proc.start()

while True:
    childList = multiprocessing.active_children()
    print childList
    if len(childList) == 0:
        break
    time.sleep(1)

[dongsong@bogon python_study]$ python multiprocessing_children.py

[]

hello

[]

hello

[]

hello

[]

[]

[dongsong@bogon python_study]$ fg

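The multiprocessing Pool class (a process pool) is very handy. Today I nearly wrote one myself for no good reason (writing one is fairly easy, of course, but it would certainly be less thorough than the official one)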

[dongsong@bogon python_study]$ vpython

Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47)

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> from multiprocessing import Pool

>>> import time

>>> poolObj = Pool(processes = 10)

>>> procObj = poolObj.apply_async(time.sleep, (20,))

>>> procObj.get(timeout = 1)

Traceback (most recent call last):

File "", line 1, in

File "/usr/lib64/python2.6/multiprocessing/pool.py", line 418, in get

raise TimeoutError

multiprocessing.TimeoutError

>>> print procObj.get(timeout = 21)

None

>>> poolObj.__dict__['_pool']

[, , , , , , , , , ]

>>> poolObj.close()

>>> poolObj.join()

24. The demo below gives a glimpse of how bs encodes its output and how str() handles encodings (the built-in counterpart of str() is unicode())

#encoding=utf-8

from bs4 import BeautifulSoup as soup

tag = soup((u"

白痴代码

"),from_encoding='unicode').body.contents[0]

newStr = str(tag) # the tag's __str__() returns a utf-8 encoded string (if the tag did not implement __str__(), it would behave as described in item 38 of this article)

print type(newStr),isinstance(newStr,unicode),newStr

try:
    print u"[unicode]hello," + newStr # newStr is implicitly decoded as ASCII to build a unicode result, which raises an error
except Exception,e:
    print str(e)

print "[utf-8]hello," + newStr

print u"[unicode]hello," + newStr.decode('utf-8')

[dongsong@bogon python_study]$ vpython tag_str_test.py

False

白痴代码

'ascii' codec can't decode byte 0xe7 in position 3: ordinal not in range(128)

[utf-8]hello,

白痴代码

[unicode]hello,

白痴代码

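25. Some issues with using MySQLdb

http://mysql-python.sourceforge.net/

1> This refers to database.py, a database access interface I wrapped up for a project in 2011; concrete database operations can inherit from that class and implement the business-specific interfaces

2> cursor.execute() followed by cursor.fetchall() returns unicode strings, even when the connection's charset is given as utf8

3> For pitfalls in query statements see the test code below. The recommended way to call cursor.execute() is cursor.execute(sql, args), because the driver then escapes the strings automatically

If you're not familiar with the Python DB-API, note that the SQL statement in cursor.execute() uses placeholders, "%s", rather than adding parameters directly within the SQL. If you use this technique, the underlying database library will automatically add quotes and escaping to your parameter(s) as necessary. (Also note that Django expects the "%s" placeholder, not the "?" placeholder, which is used by the SQLite Python bindings. This is for the sake of consistency and sanity.)

4> The proper practice is to call conn.commit() after conn.cursor().execute(); otherwise there will be problems on database versions that do not autocommit

5> After a successful insert, the auto-increment primary key of the new row can be obtained with MySQLdb.connections.Connection.insert_id() (MySQLdb.connections.Connection is the MySQL connection returned by MySQLdb.connect()) (2014.5.29)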

#encoding=utf-8

import MySQLdb

conn = MySQLdb.connect(host = "127.0.0.1", port = 3306, user = "xds", passwd = "xds", db = "xds_db", charset = 'utf8')

cursor = conn.cursor()

print cursor

siteName = u"百度贴吧"

bbsNames = [u"明星", u"影视"]

siteName = siteName.encode('utf-8')

for index in range(len(bbsNames)):
    bbsNames[index] = bbsNames[index].encode('utf-8')

# Correct usage (recommended: the parameters are passed separately and escaped by the driver)

#args = tuple([siteName] + bbsNames)

#sql = "select bbs from t_site_bbs where site = %s and bbs in (%s,%s)"

#rts = cursor.execute(sql,args)

#print rts

# Also works, but the values are interpolated into the SQL by hand without escaping, so it is open to SQL injection

args = tuple([siteName] + bbsNames)

sql = "select bbs from t_site_bbs where site = '%s' and bbs in ('%s','%s')" % args

print sql

rts = cursor.execute(sql)

print rts

# Wrong usage, raises an error

#args = tuple([siteName] + bbsNames)

#sql = "select bbs from t_site_bbs where site = %s and bbs in (%s,%s)" % args

#rts = cursor.execute(sql)

print rts

# Wrong usage: no error, but no rows are found (it works when the members of bbsNames are numeric or English strings)

#sql = "select bbs from t_site_bbs where site = '%s' and bbs in %s" % (siteName, str(tuple(bbsNames)))

#print sql

#rts = cursor.execute(sql)

#print rts

rts = cursor.fetchall()

for rt in rts:
    print rt[0]

For a table with an auto-increment column, cursor.lastrowid gives the auto-increment id of the row just inserted; this does not work after an update

Reference: http://stackoverflow.com/questions/706755/how-do-you-safely-and-efficiently-get-the-row-id-after-an-insert-with-mysql-usin
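
The placeholder style and lastrowid can be tried without a MySQL server. The sketch below swaps in the standard-library sqlite3 module as a stand-in for MySQLdb, so it uses "?" placeholders where MySQLdb uses "%s"; the table layout and the values are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table t_site_bbs ("
            "id integer primary key autoincrement, site text, bbs text)")

# Parameters are passed separately, so the quote inside the value
# needs no manual escaping.
cur.execute("insert into t_site_bbs (site, bbs) values (?, ?)",
            ("tieba", "it's a star"))
new_id = cur.lastrowid          # auto-increment id of the inserted row
conn.commit()

# For an IN clause, generate one placeholder per value.
names = ["it's a star", "movies"]
placeholders = ",".join("?" * len(names))
sql = "select bbs from t_site_bbs where site = ? and bbs in (%s)" % placeholders
cur.execute(sql, ["tieba"] + names)
rows = cur.fetchall()
conn.close()
```

Only the placeholders are interpolated into the SQL text; the values themselves always travel through the second argument of execute().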

26. Time

[dongsong@bogon boosencms]$ vpython

Python 2.6.6 (r266:84292, Dec 7 2011, 20:48:22)

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> import time

>>> time.gmtime()

time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=4, tm_min=14, tm_sec=55, tm_wday=4, tm_yday=139, tm_isdst=0)

>>> time.localtime()

time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=12, tm_min=15, tm_sec=2, tm_wday=4, tm_yday=139, tm_isdst=0)

>>> time.time()

1337314595.7790151

>>> time.timezone

-28800

>>> time.gmtime(time.time())

time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=4, tm_min=19, tm_sec=45, tm_wday=4, tm_yday=139, tm_isdst=0)

>>> time.localtime(time.time())

time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=12, tm_min=19, tm_sec=54, tm_wday=4, tm_yday=139, tm_isdst=0)

>>> time.strftime("%a, %d %b %Y %H:%M:%S +0800", time.localtime(time.time()))

'Fri, 18 May 2012 12:21:20 +0800'

>>> time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime(time.time()))

'Fri, 18 May 2012 04:21:36 +0000'

# How is %Z actually supposed to be used? The examples below did not clear it up either

>>> time.strftime("%a, %d %b %Y %H:%M:%S %Z", time.gmtime(time.time()))

'Fri, 18 May 2012 04:23:09 CST'

>>> time.strftime("%a, %d %b %Y %H:%M:%S %Z", time.localtime(time.time()))

'Fri, 18 May 2012 12:23:31 CST'

>>> timeStr = time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime(time.time()))

>>> timeStr

'Fri, 18 May 2012 04:24:29 +0000'

>>> t = time.strptime(timeStr, "%a, %d %b %Y %H:%M:%S %Z")

Traceback (most recent call last):

File "", line 1, in

File "/usr/lib64/python2.6/_strptime.py", line 454, in _strptime_time

return _strptime(data_string, format)[0]

File "/usr/lib64/python2.6/_strptime.py", line 325, in _strptime

(data_string, format))

ValueError: time data 'Fri, 18 May 2012 04:24:29 +0000' does not match format '%a, %d %b %Y %H:%M:%S %Z'

>>> t = time.strptime(timeStr, "%a, %d %b %Y %H:%M:%S +0000")

>>> t

time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=4, tm_min=24, tm_sec=29, tm_wday=4, tm_yday=139, tm_isdst=-1)

# datetime usage follows

>>> import datetime

>>> datetime.datetime.today()

datetime.datetime(2012, 5, 18, 12, 28, 25, 892141)

>>> datetime.datetime(2012,12,12,23,54)

datetime.datetime(2012, 12, 12, 23, 54)

>>> datetime.datetime(2012,12,12,23,54,32)

datetime.datetime(2012, 12, 12, 23, 54, 32)

>>> datetime.datetime.fromtimestamp(time.time())

datetime.datetime(2012, 5, 18, 12, 29, 15, 130257)

>>> datetime.datetime.utcfromtimestamp(time.time())

datetime.datetime(2012, 5, 18, 4, 29, 34, 897017)

>>> datetime.datetime.now()

datetime.datetime(2012, 5, 18, 12, 29, 52, 558249)

>>> datetime.datetime.utcnow()

datetime.datetime(2012, 5, 18, 4, 30, 6, 164009)

>>> datetime.datetime.fromtimestamp(time.time()).strftime("%a, %d %b %Y %H:%M:%S")

'Fri, 18 May 2012 17:05:30'

>>> datetime.datetime.today().strftime("%a, %d %b %Y %H:%M:%S")

'Fri, 18 May 2012 17:05:44'

>>> datetime.datetime.strptime('Fri, 18 May 2012 04:24:29', "%a, %d %b %Y %H:%M:%S")

datetime.datetime(2012, 5, 18, 4, 24, 29)

>>> datetime.datetime.fromtimestamp(time.time()).strftime('%X')

'17:07:14'

>>> datetime.datetime.fromtimestamp(time.time()).strftime('%x')

'02/28/15'

>>> datetime.datetime.fromtimestamp(time.time()).strftime('%c')

'Sat Feb 28 17:07:24 2015'

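%a abbreviated weekday name
%A full weekday name
%b abbreviated month name
%B full month name
%c locale's date and time representation
%d day of the month, 01-31
%H hour (24-hour clock), 00-23
%I hour (12-hour clock), 01-12
%m month, 01-12
%M minute, 00-59
%S second, 00-61 (as written in the official docs)
%j day of the year, 001-366
%w weekday as a number, 0 (Sunday) to 6
%W week number of the year
%x locale's date representation
%X locale's time representation
%y year without century, 00-99
%Y year with century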
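A short sketch of a few of these directives, using only the numeric ones so the result does not depend on the locale (Python 3 syntax; the date is the one from the session above):

```python
from datetime import datetime

moment = datetime(2012, 5, 18, 4, 24, 29)

stamp = moment.strftime('%Y-%m-%d %H:%M:%S')  # full date and 24-hour time
day_of_year = moment.strftime('%j')           # zero-padded day of the year
twelve_hour = moment.strftime('%I')           # zero-padded 12-hour clock
weekday_num = moment.strftime('%w')           # 0 is Sunday, so Friday is 5
print(stamp, day_of_year, twelve_hour, weekday_num)
```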

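27. A pitfall when converting integers to strings

Python 2 has two integer types, int and long. repr() of a long is the digits followed by L (for example 10L), while str() of a bare long gives just the digits. The trap is that calling str() on a container such as a list applies repr() to its elements, so a long inside a list shows up as '[10L]'. Be careful when building strings this way. An integer is an int as long as it fits in the platform word (up to sys.maxint) and is promoted to long automatically beyond that; for values that fit, calling int() or long() explicitly picks the type.

28. Using join() (the elements of the list must be strings)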

>>> l = ['a','b','c','d']

>>> '&'.join(l)

'a&b&c&d'

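29. Debugging Python with pdb

http://www.ibm.com/developerworks/cn/linux/l-cn-pythondebugger/

Very similar to gdb:

b line_number set a breakpoint; a file or a function can also be given
b 180, childWeiboRt.retweetedId == 3508203280986906 conditional breakpoint
b list all breakpoints
cl breakpoint_number clear one breakpoint
cl clear all breakpoints
c continue
n next line
s step into a function
bt show the call stack
whatis obj show the type of a variable (equivalent to the built-in type())
up move one frame up the call stack to inspect the code and variables at that call site (the actual execution point does not change)
down move one frame down the call stack to inspect the code and variables at that call site (the actual execution point does not change)

...

To inspect the attribute values of an instance (instanceObj) while debugging, use the following statement: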

for it in [(attr,getattr(instanceObj,attr)) for attr in dir(instanceObj)]: print it[0],'-->',it[1]

30. Getting a function's name from inside the function

>>> import sys

>>> def f2():

... print sys._getframe().f_code.co_name

...

>>> f2()

f2

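31. Handling spaces and other special characters in URLs

When a URL contains characters such as +, space, /, ?, %, #, & or =, the server may not receive the intended parameter values. The fix is to percent-encode these characters so the server can decode them (or to avoid them altogether, for example by substituting other characters or their full-width forms). The mapping is:

Character   Meaning in a URL                                Encoded
+           represents a space                              %2B
space       may be written as + or encoded                  %20
/           separates directories and subdirectories        %2F
?           separates the URL proper from its parameters    %3F
%           introduces an encoded character                 %25
#           marks a bookmark (fragment)                     %23
&           separates parameters                            %26
=           separates a parameter name from its value       %3D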

>>> import urllib

>>> import urlparse

>>> urlparse.urljoin('http://s.weibo.com/weibo/',urllib.quote('python c++'))

'http://s.weibo.com/weibo/python%20c%2B%2B'

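When URLs with special characters meet a backend that has special characters of its own (a search engine such as Lucene)...

The value has to be escaped twice. Otherwise the special characters pass safely through HTTP, are decoded once, and then reach the search engine unescaped, so the query returns something other than what you wanted.

Reference: http://stackoverflow.com/questions/688766/getting-401-on-twitter-oauth-post-requests

Judging by the URLs it produces, the browser-side script at http://s.weibo.com does the same thing.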

[dongsong@bogon python_study]$ cat url.py

#encoding=utf-8

import urllib, urlparse

if __name__ == '__main__':

baseUrl = 'http://s.weibo.com/weibo/'

url = urlparse.urljoin(baseUrl, urllib.quote(urllib.quote('python c++')))

print url

conn = urllib.urlopen(url)

data = conn.read()

f = file('/tmp/d.html', 'w')

f.write(data)

f.close()

[dongsong@bogon python_study]$ vpython url.py

http://s.weibo.com/weibo/python%2520c%252B%252B
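For comparison, a minimal sketch of the same double escaping in Python 3, where quote and urljoin live in urllib.parse (the network request is left out; only the URL construction is shown):

```python
from urllib.parse import quote, unquote, urljoin

base_url = 'http://s.weibo.com/weibo/'

once = quote('python c++')   # space -> %20, + -> %2B
twice = quote(once)          # every % from the first pass becomes %25
url = urljoin(base_url, twice)
print(url)

# Two rounds of decoding recover the original query text.
round_trip = unquote(unquote(twice))
```

The server's HTTP layer undoes one level of escaping, so the search backend still receives an escaped string.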

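32. Encoding issues with the json module

Default behaviour of json.dumps():

Every string in the data structure is first converted to unicode, the non-ASCII characters are then escaped (\u56fd becomes \\u56fd), and the whole JSON text is returned as a str. The encoding parameter (default utf-8) controls how str inputs are decoded; there is no need to change it.

Mixed encodings among the elements of the data structure do not affect dumps here, because every string element is converted to unicode before the JSON text is produced.

The ensure_ascii parameter defaults to True. Setting it to False changes the behaviour of dumps:

If the strings in the data structure are unicode, the resulting JSON text is a unicode string and the characters are not escaped (\u56fd stays \u56fd);

If the strings in the data structure are utf-8 encoded, the resulting JSON text is a utf-8 string and the bytes are not escaped (\xe5\x9b\xbd stays \xe5\x9b\xbd);

If the elements of the data structure mix the two, dumps raises an error.

A JSON string obtained this way can be transcoded as a whole. One obtained with the default behaviour cannot, because the original string elements were escaped and transcoding the JSON text does not touch them.

warning--->2012-07-11 10:00:

Today I hit a problem: converting a dict containing traditional Chinese characters this way succeeded, but storing the JSON string in the database failed with

_mysql_exceptions.Warning: Incorrect string value: '\xF0\x9F\x91\x91\xE7\xAC...' for column 'detail' at row 1

Storing the output of the first (default) mode worked. My first guess was that json.dumps(ensure_ascii = False) mishandles traditional characters. (The leading bytes \xF0\x9F\x91\x91 are a four-byte UTF-8 sequence, though, which points to the MySQL limitation described in item 63 rather than to json.)

For data with messy encodings, json.loads() may raise UnicodeDecodeError (for example, on 2013.3.19 the utf8 JSON returned by the QQ open platform API kept failing to decode). It can be worked around as follows:

myString = jsonStr.decode('utf-8', 'ignore') # convert to unicode, ignoring errors

jsonObj = json.loads(myString)

Some data may be lost, but that is better than doing nothing.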

#encoding=utf-8

import json

from pprint import pprint

def show_rt(rt):

pprint(rt)

print rt

print "type(rt) is %s" % type(rt)

if __name__ == '__main__':

unDic = {

u'中国':u'北京',

u'日本':u'东京',

u'法国':u'巴黎'

}

utf8Dic = {

r'中国':r'北京',

r'日本':r'东京',

r'法国':r'巴黎'

}

pprint(unDic)

pprint(utf8Dic)

print "\nunicode instance dumps to string:"

rt = json.dumps(unDic)

show_rt(rt)

print "utf-8 instance dumps to string:"

rt = json.dumps(utf8Dic)

show_rt(rt)

#encoding is the character encoding for str instances, default is UTF-8

#If ensure_ascii is False, then the return value will be a unicode instance, default is True

print "\nunicode instance dumps(ensure_ascii=False) to string:"

rt = json.dumps(unDic,ensure_ascii=False)

show_rt(rt)

print "utf-8 instance dumps(ensure_ascii=False) to string:"

rt = json.dumps(utf8Dic,ensure_ascii=False)

show_rt(rt)

print "\n-----------------数据结构混杂编码-----------------"

unDic[u'日本'] = r'东京'

utf8Dic[r'日本'] = u'东京'

pprint(unDic)

pprint(utf8Dic)

print "\nunicode instance dumps to string:"

try:

rt = json.dumps(unDic)

except Exception,e:

print "%s:%s" % (type(e),str(e))

else:

show_rt(rt)

print "utf-8 instance dumps to string:"

try:

rt = json.dumps(utf8Dic)

except Exception,e:

print "%s:%s" % (type(e),str(e))

else:

show_rt(rt)

print "\nunicode instance dumps(ensure_ascii=False) to string:"

try:

rt = json.dumps(unDic, ensure_ascii=False)

except Exception,e:

print "%s:%s" % (type(e),str(e))

else:

show_rt(rt)

print "utf-8 instance dumps to string:"

try:

rt = json.dumps(utf8Dic, ensure_ascii=False)

except Exception,e:

print "%s:%s" % (type(e),str(e))

else:

show_rt(rt)

[dongsong@bogon python_study]$ vpython json_test.py

{u'\u4e2d\u56fd': u'\u5317\u4eac',

u'\u65e5\u672c': u'\u4e1c\u4eac',

u'\u6cd5\u56fd': u'\u5df4\u9ece'}

{'\xe4\xb8\xad\xe5\x9b\xbd': '\xe5\x8c\x97\xe4\xba\xac',

'\xe6\x97\xa5\xe6\x9c\xac': '\xe4\xb8\x9c\xe4\xba\xac',

'\xe6\xb3\x95\xe5\x9b\xbd': '\xe5\xb7\xb4\xe9\xbb\x8e'}

unicode instance dumps to string:

'{"\\u4e2d\\u56fd": "\\u5317\\u4eac", "\\u65e5\\u672c": "\\u4e1c\\u4eac", "\\u6cd5\\u56fd": "\\u5df4\\u9ece"}'

{"\u4e2d\u56fd": "\u5317\u4eac", "\u65e5\u672c": "\u4e1c\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece"}

type(rt) is <type 'str'>

utf-8 instance dumps to string:

'{"\\u4e2d\\u56fd": "\\u5317\\u4eac", "\\u6cd5\\u56fd": "\\u5df4\\u9ece", "\\u65e5\\u672c": "\\u4e1c\\u4eac"}'

{"\u4e2d\u56fd": "\u5317\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece", "\u65e5\u672c": "\u4e1c\u4eac"}

type(rt) is <type 'str'>

unicode instance dumps(ensure_ascii=False) to string:

u'{"\u4e2d\u56fd": "\u5317\u4eac", "\u65e5\u672c": "\u4e1c\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece"}'

{"中国": "北京", "日本": "东京", "法国": "巴黎"}

type(rt) is <type 'unicode'>

utf-8 instance dumps(ensure_ascii=False) to string:

'{"\xe4\xb8\xad\xe5\x9b\xbd": "\xe5\x8c\x97\xe4\xba\xac", "\xe6\xb3\x95\xe5\x9b\xbd": "\xe5\xb7\xb4\xe9\xbb\x8e", "\xe6\x97\xa5\xe6\x9c\xac": "\xe4\xb8\x9c\xe4\xba\xac"}'

{"中国": "北京", "法国": "巴黎", "日本": "东京"}

type(rt) is <type 'str'>

-----------------数据结构混杂编码-----------------

{u'\u4e2d\u56fd': u'\u5317\u4eac',

u'\u65e5\u672c': '\xe4\xb8\x9c\xe4\xba\xac',

u'\u6cd5\u56fd': u'\u5df4\u9ece'}

{'\xe4\xb8\xad\xe5\x9b\xbd': '\xe5\x8c\x97\xe4\xba\xac',

'\xe6\x97\xa5\xe6\x9c\xac': u'\u4e1c\u4eac',

'\xe6\xb3\x95\xe5\x9b\xbd': '\xe5\xb7\xb4\xe9\xbb\x8e'}

unicode instance dumps to string:

'{"\\u4e2d\\u56fd": "\\u5317\\u4eac", "\\u65e5\\u672c": "\\u4e1c\\u4eac", "\\u6cd5\\u56fd": "\\u5df4\\u9ece"}'

{"\u4e2d\u56fd": "\u5317\u4eac", "\u65e5\u672c": "\u4e1c\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece"}

type(rt) is <type 'str'>

utf-8 instance dumps to string:

'{"\\u4e2d\\u56fd": "\\u5317\\u4eac", "\\u6cd5\\u56fd": "\\u5df4\\u9ece", "\\u65e5\\u672c": "\\u4e1c\\u4eac"}'

{"\u4e2d\u56fd": "\u5317\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece", "\u65e5\u672c": "\u4e1c\u4eac"}

type(rt) is <type 'str'>

unicode instance dumps(ensure_ascii=False) to string:

:'ascii' codec can't decode byte 0xe4 in position 1: ordinal not in range(128)

utf-8 instance dumps to string:

:'ascii' codec can't decode byte 0xe4 in position 1: ordinal not in range(128)
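The script above is Python 2, where str and unicode are separate types. As a hedged point of comparison, in Python 3 every str is unicode, so only the escaping behaviour of ensure_ascii remains:

```python
import json

capitals = {'中国': '北京'}

# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(capitals)

# ensure_ascii=False: characters are written as-is.
readable = json.dumps(capitals, ensure_ascii=False)

print(escaped)
print(readable)
```

Both forms decode back to the same dictionary with json.loads().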

33. JSON serialization turns numeric dict keys into strings

>>> import json

>>> d = {1:[1,2,3,4],0:()}

>>> d

{0: (), 1: [1, 2, 3, 4]}

>>> s = json.dumps(d)

>>> s

'{"0": [], "1": [1, 2, 3, 4]}'

>>> json.loads(s)

{u'1': [1, 2, 3, 4], u'0': []}

From the official documentation:

Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.
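If the keys are known to be integers, one possible workaround is to convert them back after decoding. A minimal sketch (Python 3; it assumes every key really is an integer string):

```python
import json

original = {1: [1, 2, 3, 4], 0: []}

text = json.dumps(original)
decoded = json.loads(text)   # keys come back as strings

# Convert the keys back explicitly; int() raises ValueError on non-numeric keys.
restored = {int(key): value for key, value in decoded.items()}
```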

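34. In interactive mode, _ holds the result of the last evaluated expression

35. Comparing the modules for spawning processes

os.popen() and popen2.* are not the officially recommended approach; subprocess is.

os.popen(): unless the command ends with & (run in the background), the parent process blocks. The call is very convenient, but all it returns is a pipe for talking to the child (the default mode is read, which reads the child's stdout). There is no direct way to kill the child or get information about it (you could write a message into the pipe asking the child to terminate itself, but that is clumsy). Calling close() on the pipe returns the child's exit status (I have not used this). I used this call heavily in earlier projects, where control over processes was fairly coarse.

popen2.*: I have not used this module, but as the names suggest, popen2.popen2() returns the child's stdout and stdin, and popen2.popen3() returns stdout, stdin and stderr. It does not seem to be much of an improvement over os.popen.

multiprocessing is a multi-process module modelled on the threading interface; watch out for shared file descriptors and database connections. It is different from the modules that start a child by running a command line.

subprocess: beware of zombie processes. The system keeps a small structure with the exit status of a child that has exited, for the parent's use. When the parent calls wait() on the child, the system knows the structure is no longer needed and frees it. If the parent keeps running without ever calling wait(), the exited child remains a zombie and occupies a process ID.

Using subprocess: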

>>> obj2 = subprocess.Popen('python /home/dongsong/python_study/child2.py', shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

>>> dir(obj2)

['__class__', '__del__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_check_timeout', '_child_created', '_close_fds', '_communicate', '_communicate_with_poll', '_communicate_with_select', '_communication_started', '_execute_child', '_get_handles', '_handle_exitstatus', '_input', '_internal_poll', '_remaining_time', '_set_cloexec_flag', '_translate_newlines', 'communicate', 'kill', 'pid', 'poll', 'returncode', 'send_signal', 'stderr', 'stdin', 'stdout', 'terminate', 'universal_newlines', 'wait']

>>> dir(obj2.stdout)

['__class__', '__delattr__', '__doc__', '__enter__', '__exit__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'closed', 'encoding', 'errors', 'fileno', 'flush', 'isatty', 'mode', 'name', 'newlines', 'next', 'read', 'readinto', 'readline', 'readlines', 'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines', 'xreadlines']

>>> obj2.stdout.read()

'[]\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\n'

>>> obj2.stdout.read()

''

>>> obj2.communicate()[0]

''

>>> obj2.communicate()[1]

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

File "/usr/lib64/python2.6/subprocess.py", line 729, in communicate

stdout, stderr = self._communicate(input, endtime)

File "/usr/lib64/python2.6/subprocess.py", line 1310, in _communicate

stdout, stderr = self._communicate_with_poll(input, endtime)

File "/usr/lib64/python2.6/subprocess.py", line 1364, in _communicate_with_poll

register_and_append(self.stdout, select_POLLIN_POLLPRI)

File "/usr/lib64/python2.6/subprocess.py", line 1343, in register_and_append

poller.register(file_obj.fileno(), eventmask)

ValueError: I/O operation on closed file

>>> obj2.stderr.read()

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

ValueError: I/O operation on closed file

>>> args = shlex.split('python /home/dongsong/python_study/child2.py')

>>> obj = subprocess.Popen(args)
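In the session above the second communicate() call fails because the pipes were already closed by the first one. A minimal sketch of the usual pattern (Python 3; the child command is a hypothetical one-liner standing in for child2.py): call communicate() once and keep both results.

```python
import subprocess
import sys

# Hypothetical child: prints one line to stdout and one to stderr.
child_code = 'import sys; print("out"); sys.stderr.write("err\\n")'
args = [sys.executable, '-c', child_code]

proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# communicate() reads both pipes to EOF and waits for the child,
# so the child is reaped and does not linger as a zombie.
out, err = proc.communicate()
exit_code = proc.returncode
```

Passing the arguments as a list avoids shell=True and the quoting problems that come with it.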

36. Making reads on a file object non-blocking

flags = fcntl.fcntl(procObj.stdout.fileno(), fcntl.F_GETFL)

fcntl.fcntl(procObj.stdout.fileno(), fcntl.F_SETFL, flags|os.O_NONBLOCK)
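A self-contained sketch of the same two fcntl calls, applied to a plain os.pipe() instead of procObj.stdout so it runs on its own (Unix only, Python 3):

```python
import fcntl
import os

read_fd, write_fd = os.pipe()

# Add O_NONBLOCK to the existing flags of the read end.
flags = fcntl.fcntl(read_fd, fcntl.F_GETFL)
fcntl.fcntl(read_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

try:
    data = os.read(read_fd, 1024)   # nothing written yet
except BlockingIOError:
    data = None                     # a blocking fd would have hung here

os.write(write_fd, b'hello')
ready = os.read(read_fd, 1024)      # data is available now

os.close(read_fd)
os.close(write_fd)
```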

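37. Creating a daemon process (which also avoids zombies)

The principle is covered in the encyclopedia entry on zombie processes: fork twice. The parent forks a child and carries on working; the child forks a grandchild and exits, so the grandchild is adopted by init, which reaps it when it finishes. The parent still has to reap the child itself.

This implementation can serve as a reference, for learning purposes only; it has little practical value: http://blog.csdn.net/snleo/article/details/4410305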

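38. The default encoding and the built-in str()

str(xx) converts xx to a printable string in the system default encoding (sys.getdefaultencoding()). The default is normally ascii, so it raises an error if xx is a unicode string of Chinese characters. With the default encoding changed to utf-8 the error goes away.

Changing the system default encoding is not recommended because it affects some libraries; if you must, the methods below work. sys.setdefaultencoding() is not available in every context (setdefaultencoding is used in python-installed-dir/site-packages/pyanaconda/sitecustomize.py).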

[dongsong@bogon python_study]$ vpython

Python 2.6.6 (r266:84292, Dec 7 2011, 20:48:22)

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> import sys

>>> sys.getdefaultencoding()

'ascii'

>>> s = u'中国'

>>> str(s)

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

>>> s.encode('utf-8')

'\xe4\xb8\xad\xe5\x9b\xbd'

>>> sys.setdefaultencoding('utf-8')

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

AttributeError: 'module' object has no attribute 'setdefaultencoding'

>>> d = {u'中国':u'北京'}

>>> d

{u'\u4e2d\u56fd': u'\u5317\u4eac'}

>>> str(d)

"{u'\\u4e2d\\u56fd': u'\\u5317\\u4eac'}"

# Changing the default encoding

[dongsong@bogon python_study]$ cat ~/venv/lib/python2.6/site-packages/sitecustomize.py

import sys

sys.setdefaultencoding('utf-8')

[dongsong@bogon python_study]$ vpython -c 'import sys; print sys.getdefaultencoding();'

utf-8

[dongsong@bogon python_study]$ vpython

Python 2.6.6 (r266:84292, Dec 7 2011, 20:48:22)

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> s = u'中国'

>>> str(s)

'\xe4\xb8\xad\xe5\x9b\xbd'

>>> import sys

>>> print sys.getdefaultencoding()

utf-8

>>> d = {u'中国':u'北京'}

>>> d

{u'\u4e2d\u56fd': u'\u5317\u4eac'}

>>> str(d)

"{u'\\u4e2d\\u56fd': u'\\u5317\\u4eac'}"

Running python -S skips site.py (worth reading in the Python source), after which the sys module exposes setdefaultencoding() directly.

39. traceback

...

except Exception,e:

if not isinstance(e, APIError):

traceback.print_exc(file=sys.stderr)

or

import sys

tp,val,td = sys.exc_info()

sys.exc_info() returns a tuple (type, value/message, traceback), where

type ---- the type of the exception

value/message ---- the exception's message or arguments

traceback ---- an object holding the call stack information.

The traceback module works with traceback objects: traceback.print_tb() prints one, and traceback.format_tb() returns its printable form.

Reference: http://hi.baidu.com/whaway/item/8136af0b404dd1813c42e207
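The pieces above can be combined into a small runnable sketch (Python 3 syntax; risky() is a hypothetical function used only to trigger an exception):

```python
import sys
import traceback

def risky():
    # Hypothetical failing function.
    return 1 / 0

try:
    risky()
except Exception:
    exc_type, exc_value, exc_tb = sys.exc_info()
    frames = traceback.format_tb(exc_tb)   # list of strings, one per frame
    full_text = traceback.format_exc()     # complete traceback as one string
    sys.stderr.write(full_text)
```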

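40. Some options for GUI development in Python: GUI Programming in Python (http://wiki.python.org/moin/GuiProgramming)

cocos2d: "The past and present of the Cocos2D family"; cocos2d official site; cocos2d-x

pygame: pygame wiki; pygame official site

tkinter: tkinter tutorial; tkinter official site

wxpython: wxpython official site

For image processing and charts see another article: http://blog.csdn.net/xiarendeniao/article/details/7991305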

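41. Static methods and class methods (methods wrapped with the built-ins staticmethod() and classmethod())

In Python, both static methods and class methods can be called through the class or through an instance. The differences are:

1> A method decorated with @classmethod is a class method; its first parameter cls receives the class. When a subclass inherits the method and calls it, cls is the subclass, not the parent class. This differs from a static method in C++. Call it as ClassA.func() or ClassA().func() (in the latter case the instance itself is ignored). classmethod() is useful for creating alternate class constructors.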

>>> class A:

... @classmethod

... def func(cls):

... import pdb

... pdb.set_trace()

... pass

...

>>> A.func()

> (6)func()

(Pdb) cls

(Pdb) type(cls)

(Pdb)

>>> type(A())

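2> A method decorated with @staticmethod is a static method; it receives no implicit first argument. It is essentially a global function and closely resembles a static method in C++. Call it as ClassA.func() or ClassA().func() (in the latter case the instance is ignored).

3> A method with neither decorator is an ordinary (instance) method; its first parameter is self, which receives the instance. Call it as ClassA().func()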
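The three kinds of method side by side, as a minimal sketch (the class and method names are made up for illustration):

```python
class Base(object):
    @classmethod
    def create(cls):
        # cls is whichever class the call was made through.
        return cls()

    @staticmethod
    def add(x, y):
        # No implicit first argument at all.
        return x + y

    def who(self):
        # Ordinary instance method.
        return type(self).__name__


class Child(Base):
    pass


made = Child.create()        # an alternate constructor that respects subclasses
total = Child().add(1, 2)    # callable through an instance as well
name = made.who()
```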

42. Merging dictionaries

>>> d1

{1: 6, 11: 12, 12: 13, 13: 14}

>>> d2

{1: 2, 2: 3, 3: 4}

>>> dict(d2, **d1)

{1: 6, 2: 3, 3: 4, 11: 12, 12: 13, 13: 14}

>>> dict(d1,**d2)

{1: 2, 2: 3, 3: 4, 11: 12, 12: 13, 13: 14}

>>> d = dict(d1)

>>> d

{1: 6, 11: 12, 12: 13, 13: 14}

>>> d2

{1: 2, 2: 3, 3: 4}

>>> d.update(d2)

>>> d

{1: 2, 2: 3, 3: 4, 11: 12, 12: 13, 13: 14}

>>> d = dict(d2)

>>> d

{1: 2, 2: 3, 3: 4}

>>> d1

{1: 6, 11: 12, 12: 13, 13: 14}

>>> d.update(d1)

>>> d

{1: 6, 2: 3, 3: 4, 11: 12, 12: 13, 13: 14}
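One caveat worth noting: dict(d2, **d1) with integer keys works in the Python 2 session above, but Python 3 rejects it with a TypeError because keyword names must be strings. A sketch of two forms that work with any hashable keys in Python 3:

```python
d1 = {1: 6, 11: 12, 12: 13, 13: 14}
d2 = {1: 2, 2: 3, 3: 4}

# Copy, then update: values from d1 win on duplicate keys.
merged = dict(d2)
merged.update(d1)

# Dict-display unpacking (Python 3.5+): later entries win as well.
unpacked = {**d2, **d1}
```

Neither form modifies d1 or d2.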

43. Handling network timeouts

1>>urllib2.urlopen(url,timeout=xx)

2>>socket.setdefaulttimeout(xx) # global socket timeout

3>> A timer

from urllib2 import urlopen

from threading import Timer

url = "http://www.python.org"

def handler(fh):

fh.close()

fh = urlopen(url)

t = Timer(20.0, handler,[fh])

t.start()

data = fh.read()

t.cancel()

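44. Working with Excel

I used to use the csv module: read and write CSV files, then open them in Excel and save them as xls.

Today (2012.10.30) I found a more direct and more powerful set of libraries: http://www.python-excel.org/

Versions I used: xlwt-0.7.4, xlrd-0.8.0, xlutils-1.5.2

Row height can be set with sheetObj.row(index).set_style(easyxf('font:height 720;')) and column width with sheetObj.col(index).width = 1000. The other approaches mostly seem to be buggy and have no effect: http://reliablybroken.com/b/2011/10/widths-heights-with-xlwt-python/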

#encoding=utf-8

from xlwt import Workbook, easyxf

book = Workbook(encoding='utf-8')

sheet1 = book.add_sheet('Sheet 1')

sheet1.col_width(20000)

book.add_sheet('Sheet 2')

sheet1.write(0,0,'起点')

sheet1.write(0,1,'B1')

row1 = sheet1.row(1)

row1.write(0,'Ai2')

row1.write(1,'B2')

sheet1.col(0).width = 10000

sheet1.col(1).width = 20000

#sheet1.default_col_width = 20000 #bug invalid

#sheet1.col_width(30000) #bug invalid

#sheet1.default_row_height = 5000 #bug invalid

#sheet1.row(0).height = 5000 #bug invalid

sheet1.row(0).set_style(easyxf('font:height 400;'))

style = easyxf('pattern: pattern solid, fore_colour red;'

'align: vertical center, horizontal center;'

'font: bold true;')

sheet1.write_merge(2,5,2,5,'Merged',style)

sheet2 = book.get_sheet(1)

sheet2.row(0).write(0,'Sheet 2 A1')

sheet2.row(0).write(1,'Sheet 2 B1')

sheet2.flush_row_data()

sheet2.write(1,0,'Sheet 2 A3')

sheet2.col(0).width = 5000

sheet2.col(0).hidden = True

book.save('simple.xls')

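A real headache with this library is not knowing what a given width, height or colour will actually look like. I wrote a script that prints every supported colour and the commonly used widths and heights for reference; see http://blog.csdn.net/xiarendeniao/article/details/8276957

45. When the machine has several IP addresses, how do you choose which one urllib2 uses for an HTTP request? There are two approaches. The convenient but slightly hacky one is to replace the socket module's socket function (the code below does this). The other: a better way is to extend the connect() method in a subclass of HTTPConnection and redefine the http_open() method in a subclass of HTTPHandler.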

def bind_alt_socket(alt_ip):

true_socket = socket.socket

def bound_socket(*a, **k):

sock = true_socket(*a, **k)

sock.bind((alt_ip, 0))

return sock

socket.socket = bound_socket

References:

http://www.rossbates.com/2009/10/urllib2-with-multiple-network-interfaces/

http://stackoverflow.com/questions/1150332/source-interface-with-python-and-urllib2

46. Installing PyQt4:

1. Install sip

wget http://sourceforge.net/projects/pyqt/files/sip/sip-4.14.1/sip-4.14.1.tar.gz

vpython configure.py

make

sudo make install

2. sudo yum install qt qt-devel -y

sudo yum install qtwebkit qtwebkit-devel -y // without this step, the configure step below does not generate a Makefile for QtWebKit

3. Install PyQt

wget http://sourceforge.net/projects/pyqt/files/PyQt4/PyQt-4.9.5/PyQt-x11-gpl-4.9.5.tar.gz

vpython configure.py -q/usr/bin/qmake-qt4 -g

make

make install

A module that dir(PyQt4) does not show is not necessarily missing. The .so extension modules can be imported with from PyQt4 import QtGui or import PyQt4.QtGui. I kept believing the installation had failed and lost a lot of time trying things and hunting for the cause.

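47. Using one Python interpreter's environment (its installed modules) from another interpreter

References: http://mydjangoblog.com/2009/03/30/django-mod_python-and-virtualenv/ and https://pypi.python.org/pypi/virtualenv

The example below uses the callme module, installed in a virtualenv Python, from the default Python environment: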

[dongsong@localhost ~]$ python

Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47)

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> import callme

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

ImportError: No module named callme

>>> activate_this = '/home/dongsong/venv/bin/activate_this.py'

>>> execfile(activate_this, dict(__file__=activate_this))

>>> import callme

>>>

To make mod_python use the virtualenv Python environment, see the link above:

#myvirtualdjango.py

activate_this = '/home/django/progopedia.ru/ve/bin/activate_this.py'

execfile(activate_this, dict(__file__=activate_this))

from django.core.handlers.modpython import handler

ServerName progopedia.ru

ServerAdmin [email protected]

SetHandler python-program

PythonPath "['/home/django/progopedia.ru/ve/bin', '/home/django/progopedia.ru/src/progopedia_ru_project/'] + sys.path"

PythonHandler myvirtualdjango

SetEnv DJANGO_SETTINGS_MODULE settings

SetEnv PYTHON_EGG_CACHE /var/tmp/egg

PythonInterpreter polyprog_ru

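48. Formatted output

%r is a catch-all format specifier: it prints the argument literally, with hints about its type (quotes around strings, for example).

print appends a newline automatically. To suppress it, end the print statement with a comma.

More examples: http://www.pythonclub.org/python-basic/print

%r uses the object's repr() form, %s its str() form.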
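A quick sketch of the difference, in Python 3 syntax (where print is a function and end='' replaces the trailing comma):

```python
text = 'a\tb'

as_str = '%s' % text    # str() form: the tab is kept as a real tab character
as_repr = '%r' % text   # repr() form: quoted, with the tab shown as \t

print(as_str)
print(as_repr)
print('no newline after this', end='')
print()
```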

49. finally is easy to get wrong!

[dongsong@localhost python_study]$ cat finally_test.py

#encoding=utf-8

def func():

a = 1

try:

return a

except Exception,e:

print '%r' % e

else:

print 'no exception'

finally:

print 'finally'

a += 1

a = func()

print 'func returned %s' % a

[dongsong@localhost python_study]$ vpython finally_test.py

finally

func returned 1
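The output shows that the return value is evaluated before the finally block runs, so rebinding a afterwards has no effect on it. A hedged sketch contrasting that with a mutable return value, which the finally block can still change (function names are illustrative):

```python
def rebinding_is_too_late():
    a = 1
    try:
        return a        # the value 1 is captured here
    finally:
        a += 1          # rebinds the local name only


def mutation_is_visible():
    items = [1]
    try:
        return items    # the list object itself is returned
    finally:
        items.append(2) # mutates that same object before the caller sees it


first = rebinding_is_too_late()
second = mutation_is_visible()
```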

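50. stackless

Official site: http://www.stackless.com/

Material in Chinese (with examples): http://gashero.yeax.com/?p=30

1> When stackless.schedule() is called, the currently active tasklet suspends and reinserts itself at the end of the scheduler queue so that the next tasklet can run. Once all the tasklets ahead of it have run, it resumes where it left off. This continues until every active tasklet has finished. That is how stackless achieves cooperative multitasking.

2> A tasklet that calls channel.receive() blocks: it is suspended until something is sent on that channel, and nothing other than a send on the channel can resume it. When another tasklet sends on the channel, the receiving tasklet resumes immediately, wherever the scheduler happens to be, and the sending tasklet is moved to the end of the schedule list as if it had called stackless.schedule(). Note also that sending blocks the current tasklet if no tasklet is receiving on the channel at that moment. The sender is reinserted into the scheduler only after the data has been delivered to another tasklet.

3> No more stack overflow: recall the recursive version of the code mentioned earlier, in which an experienced programmer would spot the flaw at a glance. Frankly, no "computer science" reason prevents it from working; what people firmly believe is only an implementation detail, because most traditional languages use a stack. In a sense, experienced programmers have been conditioned to accept this as normal. Stackless recognised the problem and removed it.

4> Microthreads, i.e. lightweight threads: compared with the threads built into today's operating systems and supported by standard Python, microthreads are lighter, as the name implies. They use less memory than traditional threads, and switching between them costs fewer resources.

5> Timing: we now time several experimental runs. The standard library's timeit.py can be used for this.

6> We set the channel's preference to 1 so that the task is not blocked after calling send and keeps running, so that the correct warehouse information is printed afterwards.

7> In stackless, the balance of a channel is how many tasklets are waiting to send or receive on it. A positive value is the number of waiting senders, a negative value is the number of waiting receivers, and 0 means nothing is waiting.

Summary: stackless python is still limited by the GIL and cannot use multiple cores; it is only an improvement over Python's traditional threads (http://stackoverflow.com/questions/377254/stackless-python-and-multicores). So using multiprocessing for multiple processes and stackless microthreads inside each process is a good combination. The EVE server side is built with stackless (apparently C++/stackless python); I would love to see their code.

Installing stackless python: see http://opensource.hyves.org/concurrence/install.html#installing-stackless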

sudo yum install readline-devel -y

./configure --prefix=/opt/stackless --with-readline --with-zlib=/usr/include

make

make install

51. Loading modules dynamically

The built-in function __import__()

[dongsong@localhost python_study]$ touch mds/__init__.py

[dongsong@localhost python_study]$ vpython

Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47)

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> m = __import__('mds.m1', globals(), locals(), fromlist=[], level = 0)

>>> m

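I used this function in my own code for the first time on 2014.6.25 and found quite a few things to watch out for; read the official documentation carefully.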

class RobotMeta(type):

def __new__(cls, name, bases, attrs):

newbases = list(bases)

import testcase

import pkgutil

for importer, modname, ispkg in pkgutil.iter_modules(testcase.__path__):

if ispkg: continue

mod = __import__('testcase.'+modname, globals(), locals(), fromlist=(modname,), level=1)

if hasattr(mod, 'Robot'):

newbases.append(mod.Robot)

return super(RobotMeta, cls).__new__(cls, name, tuple(newbases), attrs)

The importlib library,

importlib.import_module()

[dongsong@localhost python_study]$ touch mds/__init__.py

[dongsong@localhost python_study]$ vpython

Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47)

[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> import importlib

>>> m = importlib.import_module('mds.m1')

>>> m

>>>
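One of the things to watch for is what each call returns for a dotted name. A sketch using the standard library package xml.dom in place of the local mds.m1 module (Python 3):

```python
import importlib

# import_module returns the submodule that was named.
dom = importlib.import_module('xml.dom')

# With an empty fromlist, __import__ returns the top-level package instead.
top = __import__('xml.dom')

# The module name can also come from data, which is the point of dynamic loading.
module_name = 'json'
mod = importlib.import_module(module_name)
loads = getattr(mod, 'loads')
```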

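52. How do you make a user-defined class support pickle and cPickle? (Below is the change made to an object in our project that inherits from dict and represents a decoded JSON string; see http://stackoverflow.com/questions/5247250/why-does-pickle-getstate-accept-as-a-return-value-the-very-instance-it-requi)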

def __getstate__(self):

return dict(self)

def __setstate__(self, state):

return self.update(state)

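53. Checking what a string is made of

s.isalnum() all characters are letters or digits
s.isalpha() all characters are letters
s.isdigit() all characters are digits
s.islower() all letters are lowercase
s.isupper() all letters are uppercase
s.istitle() every word starts with a capital, as in a title
s.isspace() all characters are whitespace (space, \t, \n, \r)

54. Python networking frameworks. Concurrency in Python cannot be covered in a few words, so it has its own article: http://blog.csdn.net/xiarendeniao/article/details/9143059

Twisted is the most common and widely used (module index)

concurrence is tied to stackless (it combines stackless and libevent), which makes it attractive to me

cogen is similar to the one above, with better portability

gevent combines greenlet and libevent (greenlet is a spin-off of stackless, more primitive than stackless and better at satisfying a coder's urge to control coroutines), so in principle it looks much like concurrence

Source material for the summary above: http://stackoverflow.com/questions/1824418/a-clean-lightweight-alternative-to-pythons-twisted

55. Python and environment variables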

import os

if not os.environ.has_key('DJANGO_SETTINGS_MODULE'):

os.environ['DJANGO_SETTINGS_MODULE'] = 'boosencms.settings'

else:

print 'DJANGO_SETTINGS_MODULE: %s' % os.environ['DJANGO_SETTINGS_MODULE']

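56. yield is the syntax for writing a generator. A generator is an object that can be iterated over once. Its advantage over structures such as list and tuple for iteration is that the data need not all be in memory. See the official documentation and the Stack Overflow thread for details.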

[dongsong@localhost python-study]$ !cat

cat yield.py

def echo(value=None):

print "Execution starts when 'next()' is called for the first time."

try:

while True:

try:

value = (yield value)

except Exception, e:

print "catched an exception", e

value = e

else:

print "yield received ", value

finally:

print "Don't forget to clean up when 'close()' is called."

generator = echo(1)

print generator.next()

print generator.next()

print generator.send(2)

generator.throw(TypeError, "spam")

generator.close()

[dongsong@localhost python-study]$

[dongsong@localhost python-study]$

[dongsong@localhost python-study]$ !python

python yield.py

Execution starts when 'next()' is called for the first time.

1

yield received None

None

yield received 2

2

catched an exception spam

Don't forget to clean up when 'close()' is called.
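A smaller sketch of the same send() mechanism, in Python 3 syntax (next(gen) replaces gen.next(); the generator name is made up for illustration):

```python
def running_total():
    total = 0
    while True:
        # The value passed to send() becomes the result of the yield expression.
        received = yield total
        if received is not None:
            total += received


gen = running_total()
first = next(gen)      # run to the first yield
second = gen.send(5)   # resume with 5, run to the next yield
third = gen.send(10)
gen.close()            # raises GeneratorExit inside the generator
```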

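57. Metaclasses are explained in detail in http://blog.csdn.net/xiarendeniao/article/details/9232021

58. Implementing the singleton pattern. This Stack Overflow thread describes four approaches; I prefer the third: http://stackoverflow.com/questions/6760685/creating-a-singleton-in-python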

[dongsong@localhost python_study]$ cat singleton3.py

#encoding=utf-8

class Singleton(type):

_instances = {}

def __call__(cls, *args, **kwargs):

if cls not in cls._instances:

cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)

return cls._instances[cls]

class MyClass(object):

__metaclass__ = Singleton

singletonObj = Singleton('Test',(),{})

myClassObj1 = MyClass()

myClassObj2 = MyClass()

print singletonObj, singletonObj.__class__

print id(myClassObj1),myClassObj1,myClassObj1.__class__

print id(myClassObj2),myClassObj2,myClassObj2.__class__

[dongsong@localhost python_study]$ vpython singleton3.py

139799414931408 <__main__.MyClass object at 0x7f2596777fd0>

139799414931408 <__main__.MyClass object at 0x7f2596777fd0>

59. python magic methods: rather long, so it gets its own article

http://blog.csdn.net/xiarendeniao/article/details/9270407

60. struct (binary data). Official documentation: http://docs.python.org/3/library/struct.html

Character   Byte order               Size       Alignment
@           native                   native     native
=           native                   standard   none
<           little-endian            standard   none
>           big-endian               standard   none
!           network (= big-endian)   standard   none

Format   C Type               Python type         Standard size   Notes
x        pad byte             no value
c        char                 bytes of length 1   1
b        signed char          integer             1               (1),(3)
B        unsigned char        integer             1               (3)
?        _Bool                bool                1               (1)
h        short                integer             2               (3)
H        unsigned short       integer             2               (3)
i        int                  integer             4               (3)
I        unsigned int         integer             4               (3)
l        long                 integer             4               (3)
L        unsigned long        integer             4               (3)
q        long long            integer             8               (2), (3)
Q        unsigned long long   integer             8               (2), (3)
n        ssize_t              integer                             (4)
N        size_t               integer                             (4)
f        float                float               4               (5)
d        double               float               8               (5)
s        char[]               bytes
p        char[]               bytes
P        void *               integer                             (6)

>>> import struct

>>> struct.pack('HH',1,2)

'\x01\x00\x02\x00'

>>> struct.pack('

'\x01\x00\x02\x00'

>>> struct.pack('>HH',1,2)

'\x00\x01\x00\x02'

>>> s= struct.pack('HH',1,2)

>>> s

'\x01\x00\x02\x00'

>>> len(s)

4

>>> struct.unpack('HH',s)

(1, 2)

>>> struct.unpack_from('H', s, 2)

(2,)

>>> struct.unpack('H',s[0:2])

(1,)
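A sketch of the byte-order prefixes from the table, in Python 3 (where pack() returns bytes rather than str):

```python
import struct

little = struct.pack('<HH', 1, 2)   # two unsigned shorts, little-endian
big = struct.pack('>HH', 1, 2)      # same values, big-endian

values = struct.unpack('<HH', little)
second = struct.unpack_from('<H', little, 2)   # start reading at byte offset 2
size = struct.calcsize('<HH')                  # standard size, no alignment padding
```

Using an explicit < or > prefix makes the layout independent of the machine the code runs on.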

61. Closures

[dongsong@localhost python_study]$ cat enclosing_1.py

#encoding=utf8

a = 1

b = 2

def f(v = 0):

a = 2

c = list()

def g():

print 'a = %s' % a

print 'b = %s' % b

print 'c = %r' % c

if v == 0:

a += 1

else:

a += v

c.append(111)

return g

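g = f() # f returns the function object g, which is assigned to g; it is bound to a (3) and c ([111]), forming a closure

f(10)() # the inner function is bound to a (12) and c ([111]), forming a closure; prints: a = 12, b = 2, c = [111]

f() # prints nothing: the closure bound to a and c is never called

g() # prints: a = 3, b = 2, c = [111]

b = 3

g() # prints: a = 3, b = 3, c = [111] (b is a global variable)

print a # prints the global variable a: 1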

[dongsong@localhost python_study]$ vpython enclosing_1.py

a = 12

b = 2

c = [111]

a = 3

b = 2

c = [111]

a = 3

b = 3

c = [111]

1
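In the listing above the inner function only reads a and c. Rebinding an enclosing variable from inside the inner function is not possible in Python 2; in Python 3 the nonlocal statement allows it. A minimal sketch (the counter is a made-up example):

```python
def make_counter(start=0):
    count = start

    def increment(step=1):
        nonlocal count   # rebind the variable of the enclosing function
        count += step
        return count

    return increment


counter = make_counter()
other = make_counter(10)   # each call creates an independent closure

a = counter()
b = counter(5)
c = other()
d = counter()
```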

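62. How do you keep .pyc files from living next to the .py files? See the Stack Overflow thread http://stackoverflow.com/questions/3522079/changing-the-directory-where-pyc-files-are-created

Since Python 3.2, .pyc files are written to a __pycache__ subdirectory of the source directory instead of alongside the sources (as I understand it; I have not used Python 3).

In Python 2 you can start the interpreter with -B to stop it writing .pyc bytecode files, though this inevitably makes imports slower (modules are recompiled).

63. Storing Weibo data (account descriptions) in the database raised a warning and the data was truncated: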

[dongsong@localhost tfengyun_py]$ vpython new_user.py debug 1852589841

/data/weibofengyun/workspace-php/tfengyun_py/utils.py:26: Warning: Incorrect string value: '\xF0\x9F\x92\x91\xE4\xBD...' for column 'description' at row 1

try: affectCount = self.cursor.execute(sql)

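The eventual solution (copied straight from the Python chat group):

吓人的鸟(362278013) 11:27:58

I have more or less figured out yesterday's problem with the MySQL warning on insert. Sharing it here, with many thanks to @墨迹 !!

http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html

MySQL before 5.5.3 does not support utf8mb4. Last Friday's warning appeared because some unicode characters (emoji from iOS devices) take four bytes when encoded as utf-8 (normally no more than three):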

>>> u'\u8bb0'.encode('utf-8')

'\xe8\xae\xb0'

>>> u'\U0001f497'.encode('utf-8')

'\xf0\x9f\x92\x97'

If you would rather not upgrade MySQL, these characters can be filtered out; there is a related discussion on Stack Overflow:

http://stackoverflow.com/questions/10798605/warning-raised-by-inserting-4-byte-unicode-to-mysql
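One way such a filter could look, as a sketch in Python 3 (the function name is made up; it drops every character outside the Basic Multilingual Plane, which are exactly the ones that need four bytes in UTF-8):

```python
def strip_non_bmp(text):
    # Code points above U+FFFF encode to four bytes in UTF-8.
    return ''.join(ch for ch in text if ord(ch) <= 0xFFFF)


sample = '\u8bb0\U0001f497ok'
cleaned = strip_non_bmp(sample)
emoji_bytes = len('\U0001f497'.encode('utf-8'))
```

This loses the emoji, so it is only suitable when dropping them is acceptable.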

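Then why can a PHP program store the same data in the same MySQL database without trouble (no error, no warning, no truncation)?

It turns out that it silently filters out the four-byte utf8 sequences. After the insert, a query from the mysql command line shows that the emoji are gone, and reading the data back with the PHP program confirms it.

PS: "To know what you know and to know what you do not know, that is knowledge." People who come to ask questions are usually in a hurry, so please offer fewer lectures and more practical advice.

64. (2014.4.25) Mixing Python with C/C++ (extending Python with C/C++, embedding Python in C/C++). The most basic approach is of course to follow the official documentation; I have translated two of the relevant official documents. It is very tedious: there are too many rules about references and the like, and such a low-level interface is not suited to direct use in a project.

For projects the first choice is Boost.Python (http://www.boost.org/doc/libs/1_55_0/libs/python/doc/). Anyone who has used C++ will know Boost; I think of it as the standard library second only to the C++ standard library (the chat server Lao Cheng wrote at Kunlun in 2009 used boost.asio). It includes support for Python. Kingsoft's C++/Python game server uses this library for the interaction between C++ and Python.

Next, a classmate told me their project (apparently not a game) uses Pyrex (http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/About.html), a new language whose syntax mixes C and Python. I have not looked into it and will set it aside for now; Boost.Python interests me more.

Cython (http://cython.org/) is based on Pyrex and is designed for writing C extensions for Python.

PyPy (http://pypy.org/) deserves a mention here, although it is not meant for interacting with C/C++. PyPy is a Python interpreter implemented in Python. Its JIT (just-in-time compilation) makes it run faster than CPython (the official interpreter that most people use), and it supports stackless and cooperative microthreads. Its prospects look bright. There were reports that PyPy would drop the GIL to improve the performance of multithreaded programs, but the official documentation does not seem to say so (http://pypy.org/tmdonate2.html#what-is-the-global-interpreter-lock).

65. exec executes a code fragment directly

eval evaluates a single expression

compile turns a code fragment or a source file into a code object, which both exec and eval can execute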

https://docs.python.org/2/library/functions.html#compile

[dongsong@localhost python-study]$ python

Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36)

[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2

Type "help", "copyright", "credits" or "license" for more information.

>>> s = file("code.py").read()

>>> print s

def func():

print "i am in function func()"

return 1,2,3

>>> codeObj = compile(s,"","exec")

>>> dir()

['__builtins__', '__doc__', '__name__', '__package__', 'codeObj', 's']

>>> codeObj

at 0x7f761cd74738, file "", line 1>

 
  >>> eval(codeObj) 
  >>> dir() 
  ['__builtins__', '__doc__', '__name__', '__package__', 'codeObj', 'func', 's'] 
  >>> func() 
  i am in function func() 
  (1, 2, 3) 
    
  [dongsong@localhost python-study]$ python 
  Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36)  
  [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2 
  Type "help", "copyright", "credits" or "license" for more information. 
  >>> s = file("code.py").read() 
  >>> exec(s) 
  >>> dir() 
  ['__builtins__', '__doc__', '__name__', '__package__', 'func', 's'] 
  >>> func() 
  i am in function func() 
  (1, 2, 3)  
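  The same three built-ins in Python 3 syntax, as a hedged sketch. It executes into an explicit namespace dictionary instead of the interactive globals, and the source text is inlined rather than read from code.py:

```python
source = "def func():\n    return 1, 2, 3\n"

# 'exec' mode compiles a block of statements.
code_obj = compile(source, '<string>', 'exec')

namespace = {}
exec(code_obj, namespace)          # defines func inside namespace
result = namespace['func']()

# 'eval' mode compiles a single expression and yields its value.
expr = compile('1 + 2', '<string>', 'eval')
value = eval(expr)
```

  Passing a dictionary keeps the executed code from adding names to the caller's own globals.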
  66. Random strings http://stackoverflow.com/questions/2257441/random-string-generation-with-upper-case-letters-and-digits-in-python
    
  >>> import string 
  >>> import random 
  >>> def id_generator(size=6, chars=string.ascii_uppercase + string.digits): 
  ...    return ''.join(random.choice(chars) for _ in range(size)) 
  ... 
  >>> id_generator() 
  'G5G74W' 
  >>> id_generator(3, "6793YUIO") 
  'Y3U' 
  >>> string.ascii_uppercase 
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ' 
  >>> string.digits 
  '0123456789' 
  >>> string.ascii_uppercase + string.digits 
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789' 
  >>> string.lowercase 
  'abcdefghijklmnopqrstuvwxyz'  
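  67. The built-in hasattr cannot find an object's private attributes by their plain name, because names starting with two underscores are mangled to _ClassName__name (2014.6.18)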
    
  [dongsong@localhost python-study]$ cat hasattr.py 
  #encoding=utf-8 
    
  class A(object): 
      def __init__(self): 
          self.__a = 100 
          self.a = 200 
      def test(self): 
          if hasattr(self,'__a'): print 'found self.__a:',self.__a 
          else: print 'not found self.__a' 
          if hasattr(self,'a'): print 'found self.a:', self.a 
          else: print 'not found self.a:', self.a 
    
  if __name__ == '__main__': 
      t = A() 
      t.test() 
  [dongsong@localhost python-study]$  
  [dongsong@localhost python-study]$ python hasattr.py  
  not found self.__a 
  found self.a: 200  
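
The reason is name mangling: inside class A, `self.__a` is stored under the name `_A__a`, while the string passed to hasattr is looked up as-is. A minimal sketch, reusing the class from the script above without its test method:

```python
class A(object):
    def __init__(self):
        self.__a = 100   # mangled by the compiler: stored as _A__a
        self.a = 200


t = A()

# The plain string '__a' is not mangled, so the lookup misses.
found_plain = hasattr(t, '__a')
# The mangled name is what actually exists on the instance.
found_mangled = hasattr(t, '_A__a')
# vars() shows the real attribute names kept in the instance dict.
names = sorted(vars(t))

print(found_plain, found_mangled, names)
```

So a check such as `hasattr(self, '_A__a')` works, although relying on the mangled name ties the code to the class name.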
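  68. Circular (or cyclic) imports in Python   
  http://stackoverflow.com/questions/744373/circular-or-cyclic-imports-in-python  
  In short: if a imports b and b imports a, then using symbols from module b (b.xx, from b import xx) in a's top-level code (the code that runs on "import a") can fail, because the other module may be only partially initialized at that point.  
  Also, when you run python a.py, a.py is first loaded as the __main__ module, so a later "import a" executes a.py a second time (the Python source-code analysis (源码剖析) mentions this, and it is the reason for the if __name__ == '__main__' check).  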
    
  [root@test-22 xds]# cat maintest.py 
  import maintest 
  print 'main test in ..' 
  if __name__ == '__main__': 
      print 'aaaa' 
  print 'main test out..' 
  [root@test-22 xds]#  
  [root@test-22 xds]# python maintest.py 
  main test in .. 
  main test out.. 
  main test in .. 
  aaaa 
  main test out..
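
The circular-import failure described at the top of this item can be reproduced with two throwaway modules. Everything here is made up for the demonstration: the module names `circ_a` and `circ_b`, their contents, and the temporary directory they are written to.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical modules that import each other.
MODULES = {
    "circ_a.py": "import circ_b\nVALUE = 1\n",
    # This top-level access runs while circ_a is only half executed,
    # so VALUE has not been defined yet.
    "circ_b.py": "import circ_a\nCOPY = circ_a.VALUE\n",
}

tmp_dir = tempfile.mkdtemp()
for name, text in MODULES.items():
    with open(os.path.join(tmp_dir, name), "w") as fh:
        fh.write(text)

sys.path.insert(0, tmp_dir)
try:
    importlib.import_module("circ_a")
    error = None
except AttributeError as exc:
    # circ_b saw a partially initialized circ_a.
    error = exc
finally:
    sys.path.remove(tmp_dir)

print("import failed:", error)
```

Moving the access to `circ_a.VALUE` into a function that is called after both modules have finished loading is one common way around this.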


    