xiarendeniao

Python study notes

Official site: http://www.python.org/

Official library reference: http://docs.python.org/library/

PyPI https://pypi.python.org/pypi

Chinese manual, good for a quick start: http://download.csdn.net/detail/xiarendeniao/4236870

Python Cookbook (Chinese edition): http://download.csdn.net/detail/XIARENDENIAO/3231793

1. Numbers (complex numbers in particular) are convenient, string operations are slick, and there are lists

   a = complex(1,0.4)
   a.real
   a.imag

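unicode()

An r/R prefix on a string literal marks a raw string (backslashes are not treated as escapes); a u/U prefix marks a unicode string

The list method append() adds a new element to the end of the list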

2. Flow control

while:
if:
	if xxx:
		...
	elif yyy:
		...
	elif xxx:
		...
	else:
		...
for
range()
break, continue, and else on loops
pass
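
The else clause on a loop is the least familiar item in the list above, so here is a minimal sketch of it. The function name first_even is made up for illustration; the else branch runs only when the loop finishes without hitting break.

```python
def first_even(numbers):
    """Return the first even number, or None when the loop finds none."""
    for n in numbers:
        if n % 2 == 0:
            found = n
            break          # leaving via break skips the else branch
    else:
        found = None       # runs only when the loop was never broken
    return found

print(first_even([1, 3, 4, 5]))
print(first_even([1, 3, 5]))
```

The same else clause is available on while loops.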

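3. Functions
   1) def funA(para)   A function without a return statement returns None; arguments are passed in as object references
   2) Default arguments: be careful when the default is a list, dict, or class instance, because the default is created once and shared between calls
   3) Variable arguments: def funB(king, *arguments, **keywords) Positional values without a keyword are collected in the tuple arguments; keyword/value pairs are collected in the dict keywords. This is really a combination of tuple packing and sequence unpacking.
   4) def funC(para1, para2, para3) The call funC(*list) spreads the elements of a list out as the function's arguments

5) Anonymous functions: lambda arg1,arg2...:<expression>

Characteristics: it creates a function object without binding it to an identifier (unlike def); lambda is an expression, not a statement; only a single expression may follow the ":"

   6) if 'ok' in ('y', 'ye', 'yes'): xxxxx  usage of the keyword in
   7) f = lambda x: x*2 is equivalent to def f(x): return x*2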
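
The warning in item 2) about list defaults is easier to see in code. This is a small sketch with made-up function names (append_shared, append_fresh): the first reuses one list across calls, the second uses None as a sentinel to get a fresh list each time.

```python
def append_shared(item, bucket=[]):
    # The default list is built once, when def runs, and reused on every call.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # Using None as a sentinel gives each call its own new list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(append_shared(1))
print(append_shared(2))   # the earlier 1 is still in the list
print(append_fresh(1))
print(append_fresh(2))
```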

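4. Data structures
   1) [] help(list) append(x) extend(L) insert(i,x) remove(x) pop([i]) index(x) count(x) sort() reverse()
   2) Functional programming on lists: filter() map() reduce()
   3) List comprehensions: aimTags = [aimTag for aimTag in aimTags if aimTag not in filterAimTags]
   4) del removes a list slice or an entire variable
   5) () help(tuple) Tuples: like strings, their elements cannot be changed. Tuples, strings, and lists are all sequences. Python requires a trailing comma in a one-element tuple to remove the ambiguity with a parenthesized expression. Forgetting it is a common beginner mistake
   6) {} help(dict) Dictionaries: keys() has_key() A dict can be built directly from a list of key/value tuples
   7) Looping over a dict: for k, v in xxx.iteritems():… for item in xxx.items():... Over a sequence: for i, v in enumerate(['tic', 'tac', 'toe']):… Over several sequences at once: for q, a in zip(questions, answers):…
   8) in   not in   is   is not   a<b==c   and    or   not
   9) Sequence objects of the same type can be compared with < > ==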

10) Two ways to check a variable's type: isinstance(var, int)  type(var).__name__=="int"

Checking against several types: isinstance(s,(str,unicode)) returns True when s is either a regular string or a unicode string

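11) Be especially careful when deleting list elements inside a loop. for i in listA:... listA.remove(i) misbehaves because the elements after a removed one shift forward and the next one is skipped; for i in range(len(listA)):...del listA[i] also misbehaves because the length shrinks as elements are deleted and the index runs out of range

filter(lambda x:x !=4,listA) is a more elegant way to do it

listA = [ i for i in listA if i !=4] is fine too; either way, simply build a new list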
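
To make the pitfall concrete, here is a short sketch that puts the in-place removal next to the rebuild-a-new-list approach. Both function names are invented for this illustration.

```python
def remove_in_place_buggy(values, unwanted):
    # Mutating the list while iterating over it skips the element
    # that slides into the slot of each removed one.
    for v in values:
        if v == unwanted:
            values.remove(v)
    return values

def remove_by_rebuilding(values, unwanted):
    # Building a new list never disturbs the iteration.
    return [v for v in values if v != unwanted]

print(remove_in_place_buggy([1, 4, 4, 2], 4))
print(remove_by_rebuilding([1, 4, 4, 2], 4))
```

With two adjacent 4s, the in-place version leaves one of them behind, while the rebuilt list drops both.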

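Efficiency:
1) "if k in my_dict" is better than "if my_dict.has_key(k)"

2) "for k in my_dict" is better than "for k in my_dict.keys()", and also better than "for k in [....]"

12) A set is implemented much like a dict (a hash table that stores only keys) https://docs.python.org/2/library/stdtypes.html#set-types-set-frozenset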

>>> s1 = set([1,2,3,4,5]) 
>>> s2 = set([3,4,5,6,7,8]) 
>>> s1|s2
set([1, 2, 3, 4, 5, 6, 7, 8])
>>> s1-s2
set([1, 2])
>>> s2-s1
set([8, 6, 7])

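5. Modules
   1) A module's name is available in the global variable __name__. The file fibo.py can be imported as the module fibo with import fibo, from another file or from the interpreter. (In the tutorial's fibo.py the functions are named fib and fib2; that prefix is just the example's naming, not a rule)
   2) An import variant: from fibo import fib, fib2 after which the functions are used directly without a prefix
   3) sys.path   sys.ps1   sys.ps2
   4) The built-in function dir() lists the names a module defines. It returns a sorted list of strings covering every kind of name: variables, modules, functions, and so on
      help() serves a similar purpose
   5) Packages: import packet1.packet2.module       from packet1.packet2 import module       from packet1.packet2.module import functionA
   6) The import statement follows this convention: when from package import * runs, if the package's __init__.py defines a list named __all__, the module names in that list are the ones imported
   7) sys.path holds the paths currently searched for Python libraries; a program can add a new search path with sys.path.append("/xxx/xxx/xxx")
   8) Python modules can be installed with easy_install; to uninstall, use easy_install -m pkg_name
   9) __doc__ gives the documentation of a module, function, or object, and __name__ gives its name (typical use: if __name__=='__main__': ...)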

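6. IO

1) str() unicode() repr() print rjust() ljust() center() zfill() xxx%v xxx%(v1,v2) The pprint module is handy for printing complex objects (very useful when debugging)

For a custom type to work with pprint it needs to provide a __repr__ method. To get pprint's result without sending it straight to standard output (as pprint.pprint(var) does), use pprint.pformat(var).

   2) f = open("fileName", "w") Modes: w r a r+; Windows and Macintosh have an additional mode "b"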
      f.read(size)
      f.readline()
      f.write(string)
      f.writelines(list)
      f.tell()
      f.seek(offset, from_what) from_what: 0 = start, 1 = current position, 2 = end; offset: number of bytes http://www.linuxidc.com/Linux/2007-12/9644p3.htm
      f.close()

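The linecache module makes it easy to fetch a given line of a file, but take care when using it on an HTTP server, especially with large files: it caches file contents in memory, so under concurrency it can easily exhaust the machine's memory and bring the system down (a lesson I learned the hard way)

shutil is handy for file operations

os.walk() traverses every file under a directory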
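
As a rough sketch of the os.walk() traversal mentioned above: the helper list_files and the throwaway directory layout are assumptions made up for this example, not part of the original notes.

```python
import os
import tempfile

def list_files(root):
    """Return the path of every file under root, relative to root, sorted."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)

# Build a small throwaway tree so the sketch is self-contained.
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, "sub"))
for rel in ("a.txt", os.path.join("sub", "b.txt")):
    with open(os.path.join(demo_root, rel), "w") as handle:
        handle.write("demo\n")

print(list_files(demo_root))
```

os.walk() yields one (dirpath, dirnames, filenames) triple per directory, so the inner loop only has to join names onto dirpath.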

   3) The pickle module (it is not limited to writing to files)
   Pickling, similar to serialization in PHP: pickle.dump(objectX, fileHandle)
   Unpickling, similar to deserialization in PHP: objectX = pickle.load(fileHandle)
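
Since pickle is not tied to files, the in-memory variants pickle.dumps() and pickle.loads() may be worth a quick sketch. The record dictionary here is invented sample data.

```python
import pickle

record = {"name": "xds", "scores": [1, 2, 3], "nested": ("a", None)}

blob = pickle.dumps(record)      # serialize to an in-memory byte string
restored = pickle.loads(blob)    # rebuild an equal object from that string

print(restored == record)
print(restored is record)
```

The restored object compares equal to the original but is a separate copy, which also makes this round trip a crude way to deep-copy picklable data.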

   msgpack (easy_install msgpack-python) is somewhat nicer to use than pickle and cPickle, and faster
   msgpack.dump(my_var, file('test_file_name','w'))
   msgpack.load(file('test_file_name','r'))

4) raw_input() reads input from the user

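7. class
1) Member variables and member functions whose names start with two underscores and end with at most one underscore are private; a parent class's private members are not accessible from a subclass under their plain names (Python mangles them to _ClassName__name)

2) Calling a parent-class method: 1> ParentClass.FuncName(self,args) 2> super(ChildName,self).FuncName(args) The second form requires the class to inherit from object, otherwise super raises an error

3) To define a static method, put @staticmethod on the line before the method definition. It can then be called directly through the class name.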

#!/bin/python
#encoding=utf8
class A(object):
        def __init__(self, a, b):
                self.a = a
                self.b = b
        def show(self):
                print "A::show() a=%s b=%s" % (self.a,self.b)

class B(A):
        def __init__(self, a, b, c):
                #A.__init__(self,a,b)
                super(B,self).__init__(a,b) # this use of super requires the parent class to inherit from object
                self.c = c

if __name__ == "__main__":
        b = B(1,2,3) 
        print b.a,b.b,b.c
        b.show()

# Output
xudongsong@sysdev:~$ python class_test.py 
1 2 3
A::show() a=1 b=2
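
The class example above covers super(); item 3) on static methods has no code, so here is a minimal sketch. The Temperature class is a made-up example, not something from the original project.

```python
class Temperature(object):
    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def to_fahrenheit(celsius):
        # No self or cls parameter: the method only uses its arguments.
        return celsius * 9.0 / 5.0 + 32

    def fahrenheit(self):
        # Instances can call the static method as well.
        return self.to_fahrenheit(self.celsius)

print(Temperature.to_fahrenheit(100))
print(Temperature(0).fahrenheit())
```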

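8. Encodings
   Common encoding conversions fall into the following cases:

        unicode -> another encoding
        Example: a is unicode and should become gb2312. a.encode('gb2312')

        another encoding -> unicode
        Example: a is gb2312-encoded and should become unicode. unicode(a, 'gb2312') or a.decode('gb2312')

        encoding 1 -> encoding 2
        Convert to unicode first, then to encoding 2

        For example, gb2312 to big5
        unicode(a, 'gb2312').encode('big5')

        Telling what kind of string you have
        isinstance(s, str) checks for a regular (byte) string
        isinstance(s, unicode) checks for unicode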
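
The "encoding 1 -> unicode -> encoding 2" route can be sketched as below. It uses only decode()/encode() on byte strings so that it should run on both Python 2 and Python 3; the helper name transcode is an assumption for this example.

```python
def transcode(data, source, target):
    """Re-encode a byte string from one encoding to another via unicode."""
    return data.decode(source).encode(target)

text = u"\u4e2d\u6587"                 # two CJK characters, written as escapes
gb_bytes = text.encode("gb2312")
big5_bytes = transcode(gb_bytes, "gb2312", "big5")

print(big5_bytes.decode("big5") == text)
print(gb_bytes == big5_bytes)
```

The two byte strings differ, yet both decode to the same unicode text, which is the whole point of going through unicode in the middle.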

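If a string is already unicode, converting it to unicode again sometimes fails (not always): unicode(s) with no encoding argument returns it unchanged, while unicode(s, 'gb2312') raises TypeError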

>>> str2 = u"sfdasfafasf"
>>> type(str2)
<type 'unicode'>
>>> isinstance(str2,str)
False
>>> isinstance(str2,unicode)
True
>>> type(str2)
<type 'unicode'>
>>> str3 = "safafasdf"
>>> type(str3)        
<type 'str'>
>>> isinstance(str3,unicode)
False
>>> isinstance(str3,str)    
True
>>> str4 = r'asdfafadf'
>>> isinstance(str4,str)
True
>>> isinstance(str4,unicode)
False
>>> type(str4)
<type 'str'>

A general-purpose convert-to-unicode function can be written like this:
        def u(s, encoding):
            if isinstance(s, unicode):
                return s
            else:
                return unicode(s, encoding)

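9. Threads
   1) To make a child thread exit along with its parent, call setDaemon(True) on the child thread before starting it
   2) Calling join() on a child thread makes the parent wait for the child to exit before exiting itself

3) Ctrl+C can only be caught by the parent (main) thread; child threads cannot call the signal registration function signal.signal(signal,function). Calling join() on a child thread keeps the parent from seeing Ctrl+C until the child has exited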
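
A small sketch of items 1) to 3): a daemon thread plus join() with a timeout, so the main thread regains control at intervals instead of blocking in one long join(). The worker function and the stop_requested event are assumptions for this example, and whether Ctrl+C gets through a plain join() depends on the Python version and platform.

```python
import threading
import time

stop_requested = threading.Event()

def worker():
    # Loop until the main thread asks us to stop.
    while not stop_requested.is_set():
        time.sleep(0.01)

child = threading.Thread(target=worker)
child.daemon = True      # same effect as child.setDaemon(True)
child.start()

# join() with a timeout returns control to the main thread regularly,
# instead of blocking until the child is done.
child.join(0.05)
still_running = child.is_alive()

stop_requested.set()
child.join()
print(still_running, child.is_alive())
```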

Appendix: an email from Mr. Cheng Yingyuan (成应元) about signals in Python
See http://stackoverflow.com/questions/631441/interruptible-thread-join-in-python
From http://docs.python.org/library/signal.html#module-signal:
Some care must be taken if both signals and threads are used in the same program. The fundamental thing to remember in using signals and threads simultaneously is: always perform signal() operations in the main thread of execution. Any thread can perform an alarm(), getsignal(), pause(), setitimer() or getitimer(); only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can’t be used as a means of inter-thread communication. Use locks instead.
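Always set signal handlers with signal() in the main thread; the main thread is the only one that handles signals. So do not rely on signals for communication between threads; use locks instead.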
The second, from http://docs.python.org/library/thread.html#module-thread:
Threads interact strangely with interrupts: the KeyboardInterrupt exception will be received by an arbitrary thread. (When the signal module is available, interrupts always go to the main thread.)
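When the signal module is available, KeyboardInterrupt is always received by the main thread; otherwise it may be received by an arbitrary thread.
Pressing Ctrl+C sends SIGINT to Python, which turns it into a KeyboardInterrupt raised in some thread; any threads that have not been marked with setDaemon keep running regardless. If kill delivers a signal other than SIGINT and no handler is set for it, the whole process dies, no matter how many threads are still unfinished.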

Here is an example of using signal:

>>> import signal, time
>>> def f():
...     signal.signal(signal.SIGINT, sighandler)
...     signal.signal(signal.SIGTERM, sighandler)
...     while True:
...             time.sleep(1)
... 
>>> def sighandler(signum,frame):
...     print signum,frame
... 
>>> f()
^C2 <frame object at 0x15b2a40>
^C2 <frame object at 0x15b2a40>
^C2 <frame object at 0x15b2a40>
^C2 <frame object at 0x15b2a40>

Setting and clearing signal handlers:

import signal, time

term = False

def sighandler(signum, frame):
        print "terminate signal received..."
        global term
        term = True

def set_signal():
        signal.signal(signal.SIGTERM, sighandler)
        signal.signal(signal.SIGINT, sighandler)

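def clear_signal():
        # Reset both signals to the OS default action (signal.SIG_DFL has the value 0)
        signal.signal(signal.SIGTERM, signal.SIG_DFL)
        signal.signal(signal.SIGINT, signal.SIG_DFL)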


set_signal()
while not term:
        print "hello"
        time.sleep(1)

print "jumped out of while loop"

clear_signal()
term = False
for i in range(5):
        if term:
                break
        else:
                print "hello, again"
                time.sleep(1)

[dongsong@bogon python_study]$ python signal_test.py 
hello
hello
hello
^Cterminate signal received...
jumped out of while loop
hello, again
hello, again
^C
[dongsong@bogon python_study]$

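In a multi-process program that uses signals, if the parent process should catch a signal and then act on its children, register the signal handler only after the children have been started. Otherwise each child inherits the parent's address space, handler included, and the program becomes a mess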

from multiprocessing import Process, Pipe
import logging, time, signal

g_logLevel = logging.DEBUG
g_logFormat = "%(asctime)s %(levelname)s [%(filename)s:%(lineno)d]%(message)s"

def f(conn):
    conn.send([42, None, 'hello'])
    #conn.close()
    logging.basicConfig(level=g_logLevel,format=g_logFormat,stream=None)
    logging.debug("hello,world")

def f2():
    while True:
        print "hello,world"
        time.sleep(1)

termFlag = False
def sighandler(signum, frame):
    print "terminate signal received..."
    global termFlag
    termFlag = True

if __name__ == '__main__':
#    parent_conn, child_conn = Pipe()
#    p = Process(target=f, args=(child_conn,))
#    p.start()
#    print parent_conn.recv()   # prints "[42, None, 'hello']"
#    print parent_conn.recv()
#    p.join()

    p = Process(target=f2)
    p.start()
    signal.signal(signal.SIGTERM, sighandler)
    signal.signal(signal.SIGINT, sighandler)

    while not termFlag:
        time.sleep(0.5)
    print "jump out of the main loop"
    p.terminate()
    p.join()

10. Python's built-in function locals() returns a dict that maps the names of all local variables to their values
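
A brief sketch of locals() in use. The function describe is invented for illustration; copying the result with dict() keeps a stable snapshot of the names defined up to that point.

```python
def describe(name, count=2):
    label = name * count
    # Take a snapshot of the local namespace at this point.
    return dict(locals())

snapshot = describe("ab")
print(sorted(snapshot))
print("%(name)s repeated %(count)d times is %(label)s" % snapshot)
```

One common use, shown in the last line, is feeding the dict straight into %-style string formatting.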

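11. Extended positional arguments

def func(*args): ...

An asterisk before a parameter name lets the function accept any number of positional arguments.

Python collects them into a tuple bound to the variable args. If no positional arguments are left over beyond the explicitly declared parameters, args is an empty tuple.

Related: item 3.4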

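12. Extended keyword arguments

def accept(**kwargs): ...

Two asterisks before a parameter name let the function accept any number of keyword arguments.

Note: kwargs is an ordinary Python dict holding the argument names and values. If there are no extra keyword arguments, kwargs is an empty dict.

For more on positional and keyword arguments see this article: http://blog.csdn.net/qinyilang/article/details/5484415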

>>> def func(arg1, arg2 = "hello", *arg3, **arg4):
...     print arg1
...     print arg2
...     print arg3
...     print arg4
... 

>>> func("xds","t1",t2="t2",t3="t3")
xds
t1
()
{'t2': 't2', 't3': 't3'}

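13. Decorators: putting @another_method in front of a function wraps the existing function, for precondition checks and similar work. This article explains it thoroughly: http://daqinbuyi.iteye.com/blog/1161274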
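
To illustrate the "precondition check" use of a decorator, here is a minimal sketch. The names require_positive and area are hypothetical and not taken from the linked article.

```python
import functools

def require_positive(func):
    """Hypothetical decorator: reject non-positive arguments before calling func."""
    @functools.wraps(func)          # keep the wrapped function's name and docstring
    def wrapper(*args):
        for value in args:
            if value <= 0:
                raise ValueError("expected positive numbers, got %r" % (value,))
        return func(*args)
    return wrapper

@require_positive
def area(width, height):
    return width * height

print(area(3, 4))
try:
    area(3, -1)
except ValueError as exc:
    print("rejected: %s" % exc)
```

Writing @require_positive above def area is shorthand for area = require_positive(area).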

14. Exception-handling syntax

import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except IOError, (errno, strerror):
    print "I/O error(%s): %s" % (errno, strerror)
except ValueError:
    print "Could not convert data to an integer."
except:
    print "Unexpected error:", sys.exc_info()[0]
    raise

>>> try:
...    raise Exception('spam', 'eggs')
... except Exception, inst:
...    print "error %s" % str(inst)
...    print type(inst)     # the exception instance
...    print inst.args      # arguments stored in .args
...    print inst           # __str__ allows args to be printed directly
...    x, y = inst          # __getitem__ allows args to be unpacked directly
...    print 'x =', x
...    print 'y =', y
...
error ('spam', 'eggs')
<type 'instance'>
('spam', 'eggs')
('spam', 'eggs')
x = spam
y = eggs

15. Command-line argument handling: use Python's optparse library. For details see this article: http://blog.chinaunix.net/space.php?uid=16981447&do=blog&id=2840082

from optparse import OptionParser
[...]
def main():
    usage = "usage: %prog [options] arg"
    parser = OptionParser(usage)
    parser.add_option("-f", "--file", dest="filename",
                      help="read data from FILENAME")
    parser.add_option("-v", "--verbose",
                      action="store_true", dest="verbose")
    parser.add_option("-q", "--quiet",
                      action="store_false", dest="verbose")
    [...]
    (options, args) = parser.parse_args()
    if len(args) != 1:
        parser.error("incorrect number of arguments")
    if options.verbose:
        print "reading %s..." % options.filename
    [...]

if __name__ == "__main__":
    main()

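Put simply, make_option() and add_option() define how one command-line option of a Python script is parsed. After parse_args(), bare positional arguments land in the args list and option values land in options. dest names the attribute under which the value is stored; if omitted, the option's long name is used. help is the text shown for the option when the script is run with --help/-h. action describes how the option is handled: the default, store, saves the value that follows the option under dest; store_true saves True under dest when the option appears; store_false saves False under dest when the option appears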
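
The explanation above can be tried without a real command line by handing parse_args() an explicit argument list. This is only a sketch: the --output option and the sample arguments are made up for illustration.

```python
from optparse import OptionParser

def parse(argv):
    parser = OptionParser(usage="usage: %prog [options] arg")
    # No dest given: the long option name "output" becomes the attribute name.
    parser.add_option("-o", "--output", help="write result to FILE", metavar="FILE")
    parser.add_option("-v", "--verbose", action="store_true", dest="verbose", default=False)
    parser.add_option("-q", "--quiet", action="store_false", dest="verbose")
    return parser.parse_args(argv)

options, args = parse(["-v", "--output", "out.txt", "input.txt"])
print(options.verbose, options.output, args)
```

Because -v and -q share dest="verbose", whichever appears last on the command line wins.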

16. I recently started using BeautifulSoup v4 and hit the following error (an older BeautifulSoup I had used before never produced it)

HTMLParser.HTMLParseError: malformed start tag

Fix: install html5lib with easy_install html5lib and use it in place of HTMLParser

Reference: http://topic.csdn.net/u/20090531/09/956454dd-ba13-4fa3-af3c-6bf7af5726dc.html

BeautifulSoup home page: http://www.crummy.com/software/BeautifulSoup/

BeautifulSoup manual: http://www.crummy.com/software/BeautifulSoup/bs4/doc/

Chinese manual (for a quick start): http://www.leeon.me/upload/other/beautifulsoup-documentation-zh.html

Some BeautifulSoup usage follows

[dongsong@localhost boosenspider]$ vpython
Python 2.6.6 (r266:84292, Dec  7 2011, 20:48:22) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> 
>>> from bs4 import BeautifulSoup as soup
>>> s = soup('<li class="dk_dk" id="dkdk"><a href="javascript:;" onclick="MOP.DZH.clickDaka();" class="btn_dk">打卡</a></li>')
>>> s
<html><head></head><body><li class="dk_dk" id="dkdk"><a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">打卡</a></li></body></html>
>>> type(s)
<class 'bs4.BeautifulSoup'>
>>> 
>>> 
>>> t = s.body.contents[0]
>>> t
<li class="dk_dk" id="dkdk"><a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">打卡</a></li>
>>> import re
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dks")})
[]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")}) 
[<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">打卡</a>]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'href':None})
[]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'href':re.compile('')})
[<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">打卡</a>]
>>> t.contents[0]
<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">打卡</a>
>>> t.contents[0].string = "hello"
>>> t
<li class="dk_dk" id="dkdk"><a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a></li>
>>> t.contents[0].text
u'hello'
>>> t.contents[0].string
u'hello'
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'text':re.compile('')})
[]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'text':re.compile('h')})
[]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk"),'text':re.compile('^h')})
[]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")})                        
[<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a>]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")},text=re.compile(r'')) 
[<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a>]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")},text=re.compile(r'a'))   
[]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")},text=re.compile(r'^hell')) 
[<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a>]
>>> t.findAll(name='a',attrs={'class':re.compile(r"btn_dk")},text=re.compile(r'^hello$'))
[<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a>]
>>> 
>>> t.findAll(name='a',attrs={},text=re.compile(r'^hello$'))                             
[<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a>]
>>> 
>>> t
<li class="dk_dk" id="dkdk"><a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a></li>
>>> t1 = soup('<li class="dk_dk" id="dkdk"><a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a></li>').body.contents[0]
>>> 
>>> t1
<li class="dk_dk" id="dkdk"><a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a></li>
>>> t == t1
True
>>> re.search(r'(^hello)|(^bbb)','hello')
<_sre.SRE_Match object at 0x25ef718>
>>> re.search(r'(^hello)|(^bbb)','hellosdfsd')
<_sre.SRE_Match object at 0x25ef7a0>
>>> re.search(r'(^hello)|(^bbb)','bbbsdfsdf') 
<_sre.SRE_Match object at 0x25ef718>
>>> t2 = t1.contents[0]
>>> t2
<a class="btn_dk" href="javascript:;" onclick="MOP.DZH.clickDaka();">hello</a>
>>> t2.findAll(name='a')
[] 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> 
>>> from bs4 import BeautifulSoup as soup
>>> s = soup('<li><a href="http://www.tianya.cn/techforum/articleslist/0/24.shtml" id="item天涯婚礼堂">天涯婚礼堂</a></li>') 
>>> s.findAll(name='a',attrs={'href':None})
[]
>>> s.findAll(name='a',attrs={'href':True})
[<a href="http://www.tianya.cn/techforum/articleslist/0/24.shtml" id="item天涯婚礼堂">天涯婚礼堂</a>]
>>> import re
>>> s.findAll(name='a',attrs={'href':re.compile(r'')})
[<a href="http://www.tianya.cn/techforum/articleslist/0/24.shtml" id="item天涯婚礼堂">天涯婚礼堂</a>]
>>> s1 =s
>>> s1
<html><head></head><body><li><a href="http://www.tianya.cn/techforum/articleslist/0/24.shtml" id="item天涯婚礼堂">天涯婚礼堂</a></li></body></html>
>>> id(s1)
140598579280080
>>> id(s)
140598579280080
>>> s1.body.contents[0].contents[0]['href']=None
>>> s1
<html><head></head><body><li><a href id="item天涯婚礼堂">天涯婚礼堂</a></li></body></html>
>>> s
<html><head></head><body><li><a href id="item天涯婚礼堂">天涯婚礼堂</a></li></body></html>
>>> id(s)
140598579280080
>>> id(s1)
140598579280080
>>> s.findAll(name='a',attrs={'href':re.compile(r'')})
[]
>>> s.findAll(name='a',attrs={'href':True})           
[]
>>> s.findAll(name='a',attrs={'href':None})
[<a href id="item天涯婚礼堂">天涯婚礼堂</a>]
>>> s.findAll(name='a')
[<a href id="item天涯婚礼堂">天涯婚礼堂</a>]
# text is an argument for searching NavigableString objects. Its value may be a string, a regular expression, a list or dictionary, True or None, or a callable that takes a NavigableString as its argument
# None, False, and '' mean no requirement; re.compile('') and True mean a NavigableString must be present (unlike attrs, where an attribute given as False in the dict must not exist)
# Note how the text argument of findAll behaves, as follows:
>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text=re.compile(r''))
>>> len(rts)
0
>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text='')
>>> len(rts)
1
>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text=True)
>>> len(rts)
0
>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text=False)
>>> len(rts)
1
>>> rts = s2.findAll(name=u'ul',attrs={u'id': u'contentbar', u'st_type': 'nav'}, text=None) 
>>> len(rts)
1 
# Using the string attribute, and which kinds of element it shows up on
>>> from bs4 import BeautifulSoup as soup
>>> soup1 = soup('<b>hello,<img href="sfdsf">aaaa</img></b>').body.contents[0]
>>> soup1
<b>hello,<img href="sfdsf"/>aaaa</b>
>>> soup1.string
>>> soup1.name
u'b'
>>> soup1.text
u'hello,aaaa'
>>> type(soup1)
<class 'bs4.element.Tag'>
>>> soup1.contents[0]
u'hello,'
>>> type(soup1.contents[0])
<class 'bs4.element.NavigableString'>
>>> soup1.contents[0].string
u'hello,'
>>> soup2 = soup('<b>hello</b>').body.contents[0]
>>> type(soup2)
<class 'bs4.element.Tag'>
>>> soup2.string
u'hello'
# Using limit; zero means no limit
>>> soup2.findAll(name='a',text=False,limit=0)
[<a href="http://book.douban.com/subject/4172417/"><img class="m_sub_img" src="http://img1.douban.com/spic/s4424194.jpg"/></a>, <a href="http://book.douban.com/subject/4172417/">匆匆那年</a>]
>>> soup2.findAll(name='a',text=False,limit=1)
[<a href="http://book.douban.com/subject/4172417/"><img class="m_sub_img" src="http://img1.douban.com/spic/s4424194.jpg"/></a>]

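BeautifulSoup's performance is mediocre, but it is very good at repairing and tolerating invalid HTML tags. As for encodings: when the source page's encoding is known, pass it to the BeautifulSoup constructor through the from_encoding parameter (for instance, I pass from_encoding='gbk' when parsing Tianya pages); when it is not known, rely on bs's automatic encoding detection and conversion (garbled text is possible, since a machine is not as clever as a person).

The objects BeautifulSoup returns, and the data inside each of their nodes, are all in unicode after its conversion.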

---------->

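Ran into a small problem today

A piece of HTML source failed to build a bs object under bs3.2.1 and raised UnicodeEncodeError, whether the source was passed in as unicode, utf-8, or latin1; on top of that, the bs3.2.1 constructor has no from_encoding parameter at all

Damn it, under bs4 it goes straight through, whether passed in as unicode or as utf-8, with no need to specify from_encoding (with utf-8 input and no from_encoding the text comes out garbled, but at least there is no error; nothing else is as fragile as bs3!)

The lesson: once code has been tested and is stable against a given library version, just install that version wherever the code is used. Why bend over backwards for compatibility? If an older version of the library has bugs, am I supposed to be compatible with those too? Compatible? More like cheap!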

<--------------------2012-06-08 18:20

Building objects with bs4:

[dongsong@bogon boosenspider]$ cat bs_constrator.py                                        
#encoding=utf-8

from bs4 import BeautifulSoup as soup
from bs4 import Tag

if __name__ == '__main__':
        sou = soup('<div></div>')

        tag1 = Tag(sou, name='div')
        tag1['id'] = 'gentie1'
        tag1.string = 'hello,tag1'
        sou.div.insert(0,tag1)

        tag2 = Tag(sou, name='div')
        tag2['id'] = 'gentie2'
        tag2.string = 'hello,tag2'
        sou.div.insert(1,tag2)

        print sou

[dongsong@bogon boosenspider]$ vpython bs_constrator.py
<html><head></head><body><div><div id="gentie1">hello,tag1</div><div id="gentie2">hello,tag2</div></div></body></html>

cgi can escape HTML strings; HTMLParser can undo HTML escaping (unescape)

>>> t = Tag(name='t')                            
>>> t.string="<img src='www.baidu.com'/>"                          
>>> t
<t>&lt;img src='www.baidu.com'/&gt;</t>
>>> str(t)
"<t>&lt;img src='www.baidu.com'/&gt;</t>"
>>> t.string
u"<img src='www.baidu.com'/>"
>>> HTMLParser.HTMLParser().unescape(str(t))
u"<t><img src='www.baidu.com'/></t>"
>>> s1
u"<t><img src='www.baidu.com'/></t>"
>>> 
>>> s2 = cgi.escape(s1)
>>> s2
u"&lt;t&gt;&lt;img src='www.baidu.com'/&gt;&lt;/t&gt;"
>>> HTMLParser.HTMLParser().unescape(s2)
u"<t><img src='www.baidu.com'/></t>"
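
The transcript above is Python 2. On Python 3 the same round trip is available from the standard html module, which is a substitution on my part and not what these notes used. A minimal sketch:

```python
import html

raw = "<t><img src='www.baidu.com'/></t>"
escaped = html.escape(raw, quote=False)   # quote=False leaves ' and " alone, like cgi.escape(s)
print(escaped)
print(html.unescape(escaped) == raw)
```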

17. Hashing with the md5 module or the hashlib module

>>> import md5
>>> md5.md5("asdfadf").hexdigest()
'aee0014b14124efe03c361e1eed93589'
>>> import hashlib
>>> hashlib.md5("asdfadf").hexdigest()
'aee0014b14124efe03c361e1eed93589'
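
hashlib also accepts data in pieces through update(), which helps with large inputs. A small sketch, using a bytes literal so that it should run on Python 3 as well as Python 2:

```python
import hashlib

data = b"asdfadf"                      # hashlib works on byte strings

one_shot = hashlib.md5(data).hexdigest()

incremental = hashlib.md5()
incremental.update(data[:3])           # feeding the data in pieces...
incremental.update(data[3:])           # ...gives the same digest
print(one_shot == incremental.hexdigest())
print(len(one_shot))
```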

18. Without a timeout, urllib2.urlopen(url) may wait forever for the remote server to respond and hang the program

urlFile = urllib2.urlopen(url, timeout=g_url_timeout)
urlData = urlFile.read()

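19. Regular expressions: the re module

A string wrapped in three single quotes can span lines; the resulting string contains \n, which is worth keeping in mind

A single- or double-quoted string can also be continued across lines with a trailing \, and the resulting string contains neither the \ nor the \n. But do not write a regex pattern this way when it is a raw string (r'...'): there the \ and the newline stay in the pattern and it fails to match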

>>> ss = '''
... hell0,a
... shhh
... liumingdong
... xudongsong
... hello
... '''
>>> ss
'\nhell0,a\nshhh\nliumingdong\nxudongsong\nhello\n'
SyntaxError: EOL while scanning string literal
>>> sss = 'aaaa\
... bbbb\
... cccccc'
>>> sss
'aaaabbbbcccccc'
>>> s3 = r'(^hello)|\
... (abc$)'
>>> 
>>> re.search(s3,'hello,world')
<_sre.SRE_Match object at 0x7f95233047a0>
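# the alternative on the first line matches
>>> re.search(s3,'aaa,hello,worldabc')
# the alternative on the second line fails to match
>>> s4 = r'(^hello)|(abc$)'
# s4 is not split across lines with \, so both alternatives match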
>>> re.search(s4,"hello,world")
<_sre.SRE_Match object at 0x182e690>
>>> re.search(s4,"aaa,hello,worldabc")
<_sre.SRE_Match object at 0x7f95233047a0>
>>> 
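# Note how to extract the matched substrings (wrap the part of the pattern to extract in parentheses; groups numbered from 1 correspond to the parenthesized parts)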
>>> re.search(r'^(\d+)abc(\d+)$','232abc1').group(0,1,2)
('232abc1', '232', '1')
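
The same group extraction as a small script, with a check for the no-match case added. This is a sketch; the variable names are arbitrary.

```python
import re

pattern = re.compile(r'^(\d+)abc(\d+)$')

match = pattern.search('232abc1')
print(match.group(0))        # the whole match
print(match.group(1, 2))     # the two parenthesized groups
print(match.groups())        # every group, as a tuple

# search() returns None when nothing matches, so check before calling group().
print(pattern.search('abc') is None)
```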

# Below is an example that combines re and lambda

#encoding=utf-8

import re

f = lambda arg: re.search(u'^(\d+)\w+',arg).group(1)
print f(u'1111条评论')
try:
        f(u'aaaa')
except AttributeError,e:
        print str(e)
:!python re_lambda.py
111
'NoneType' object has no attribute 'group'

re.findall() is very handy too

>>> re.findall(r'\\@[A-Za-z0-9]+', s)
['\\@userA', '\\@userB']
>>> s
'hello,world,\\@userA\\@userB'
>>> re.findall(r'\\@([A-Za-z0-9]+)', s)
['userA', 'userB']

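20. I wrote a crawler. When joining URLs I used to handle every case by hand, things like ./xxx, #xxxx, /xxx and so on, which was tedious. Then I found there is a ready-made tool for it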

>>> from urlparse import urljoin
>>> import urllib
>>> url = urljoin(r"http://book.douban.com/tag/?view=type",u"./网络小说")
>>> url
u'http://book.douban.com/tag/\u7f51\u7edc\u5c0f\u8bf4'
>>> conn2 = urllib.urlopen(url)               
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib64/python2.6/urllib.py", line 179, in open
    fullurl = unwrap(toBytes(fullurl))
  File "/usr/lib64/python2.6/urllib.py", line 1041, in toBytes
    " contains non-ASCII characters")
UnicodeError: URL u'http://book.douban.com/tag/\u7f51\u7edc\u5c0f\u8bf4' contains non-ASCII characters
>>> conn2 = urllib.urlopen(url.encode('utf-8'))
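
For the ./xxx, #xxxx, and /xxx cases mentioned at the start of this item, a quick sketch of what urljoin does with each. The base URL is a made-up example, and the import falls back so the snippet should run on either Python 2 or 3.

```python
try:
    from urllib.parse import urljoin      # Python 3
except ImportError:
    from urlparse import urljoin          # Python 2, as used in these notes

base = "http://example.com/a/b/page.html?view=type"   # hypothetical base URL

print(urljoin(base, "./other.html"))
print(urljoin(base, "../up.html"))
print(urljoin(base, "/root.html"))
print(urljoin(base, "#frag"))
print(urljoin(base, "http://other.example.org/x"))
```

Relative paths replace the last path segment and drop the query string, a bare fragment keeps both, and an absolute URL replaces the base entirely.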

21. How to add headers to an HTTP request with urllib2, and how to read cookie values

>>> request = urllib2.Request("http://img1.gtimg.com/finance/pics/hv1/46/178/1031/67086211.jpg",headers={'If-Modified-Since':'Wed, 02 May 2012 18:32:20 GMT'})
# equivalent to request.add_header('If-Modified-Since','Wed, 02 May 2012 18:32:20 GMT')
>>> urllib2.urlopen(request)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib64/python2.6/urllib2.py", line 397, in open
    response = meth(req, response)
  File "/usr/lib64/python2.6/urllib2.py", line 510, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python2.6/urllib2.py", line 435, in error
    return self._call_chain(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 518, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 304: Not Modified
>>> urllib.urlencode({"aaa":"bbb"})
'aaa=bbb'
>>> urllib.urlencode([("aaa","bbb")])
'aaa=bbb'
# Using urlencode: when submitting a POST form, run the key/value parameters through urlencode and send the result as the request body
# urllib2.urlopen(url,data=urllib.urlencode(...))

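Today (2013-07-04) I ran into this: logging in to a certain site requires sending the csrftoken that the server planted on the first visit back to the server as part of the POST data, so I looked into how to read cookie values. I can't share the actual code, so here is an example from Stack Overflow (the lines that read the cookie data are the ones that matter)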

http://stackoverflow.com/questions/10247054/http-post-and-get-with-cookies-for-authentication-in-python

[dongsong@localhost python_study]$ cat cookie.py 
from urllib2 import Request, build_opener, HTTPCookieProcessor, HTTPHandler
import httplib, urllib, cookielib, Cookie, os

conn = httplib.HTTPConnection('webapp.pucrs.br')

#COOKIE FINDER
cj = cookielib.CookieJar()
opener = build_opener(HTTPCookieProcessor(cj),HTTPHandler())
req = Request('http://webapp.pucrs.br/consulta/principal.jsp')
f = opener.open(req)
html = f.read()
import pdb
pdb.set_trace()
for cookie in cj:
    c = cookie
#FIM COOKIE FINDER

params = urllib.urlencode ({'pr1':111049631, 'pr2':'sssssss'})
headers = {"Content-type":"text/html",
           "Set-Cookie" : "JSESSIONID=70E78D6970373C07A81302C7CF800349"}
            # I couldn't set the value automaticaly here, the cookie object can't be converted to string, so I change this value on every session to the new cookie's value. Any solutions?

conn.request ("POST", "/consulta/servlet/consulta.aluno.ValidaAluno",params, headers) # Validation page
resp = conn.getresponse()

temp = conn.request("GET","/consulta/servlet/consulta.aluno.Publicacoes") # desired content page
resp = conn.getresponse()

print resp.read()

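22. How to change the file that logging writes to. This becomes more pressing with multi-process programs built on the multiprocessing module, because child processes inherit the parent's log output file and format....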

def change_log_file(fileName):
	h = logging.FileHandler(fileName)
	h.setLevel(g_logLevel)
	h.setFormatter(logging.Formatter(g_logFormat))
	
	logger = logging.getLogger()
	#print logger.handlers
	for handler in logger.handlers:
		handler.close()
	while len(logger.handlers) > 0:
		logger.removeHandler(logger.handlers[0])
		
	logger.addHandler(h)

For setting up loggers, handlers, and formatters with logging, see Django's settings file; below is a small example of my own

[dongsong@localhost python_study]$ cat logging_test.py 
#encoding=utf-8
import logging, sys

if __name__ == '__main__':
        logger = logging.getLogger('test')
        logger.setLevel(logging.DEBUG)
        print 'log handlers: %s' % str(logger.manager.loggerDict)
        logger.error('here')
        logger.warning('here')
        logger.info('here')
        logger.debug('here')

        #handler = logging.FileHandler('test.log')
        handler = logging.StreamHandler(sys.stdout)
        handler.setLevel(logging.DEBUG)
        formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
        handler.setFormatter(formatter)
        logger.addHandler(handler)
        #logging.getLogger('test').addHandler(logging.NullHandler()) # python 2.7+
        logger.error('here')
        logger.warning('here')
        logger.info('here')
        logger.debug('here')
[dongsong@localhost python_study]$ vpython logging_test.py 
log handlers: {'test': <logging.Logger instance at 0x7f1dde0c2758>}
No handlers could be found for logger "test"
2012-12-26 11:30:48,725 - test - ERROR - here
2012-12-26 11:30:48,725 - test - WARNING - here
2012-12-26 11:30:48,725 - test - INFO - here
2012-12-26 11:30:48,725 - test - DEBUG - here

23. A demo of the multiprocessing module

import multiprocessing
from multiprocessing import Process
import time

def func():
        for i in range(3):
                print "hello"
                time.sleep(1)

proc = Process(target = func)
proc.start()

while True:
        childList = multiprocessing.active_children()
        print childList
        if len(childList) == 0:
                break
        time.sleep(1)

[dongsong@bogon python_study]$ python multiprocessing_children.py 
[<Process(Process-1, started)>]
hello
[<Process(Process-1, started)>]
hello
[<Process(Process-1, started)>]
hello
[<Process(Process-1, started)>]
[]
[dongsong@bogon python_study]$ fg

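The Pool class in multiprocessing (a process pool) is very handy; today I nearly wrote my own for no good reason (writing one is easy enough, but it would certainly not be as well thought out as the official one)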

[dongsong@bogon python_study]$ vpython
Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from multiprocessing import Pool
>>> import time
>>> poolObj = Pool(processes = 10)
>>> procObj = poolObj.apply_async(time.sleep, (20,))
>>> procObj.get(timeout = 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 418, in get
    raise TimeoutError
multiprocessing.TimeoutError
>>> print procObj.get(timeout = 21)
None
>>> poolObj.__dict__['_pool']
[<Process(PoolWorker-1, started daemon)>, <Process(PoolWorker-2, started daemon)>, <Process(PoolWorker-3, started daemon)>, <Process(PoolWorker-4, started daemon)>, <Process(PoolWorker-5, started daemon)>, <Process(PoolWorker-6, started daemon)>, <Process(PoolWorker-7, started daemon)>, <Process(PoolWorker-8, started daemon)>, <Process(PoolWorker-9, started daemon)>, <Process(PoolWorker-10, started daemon)>]
>>> poolObj.close()
>>> poolObj.join()

24. The demo below gives a glimpse of how bs encodings and the encoding used by str() interact (the built-in counterpart of str() is unicode())

#encoding=utf-8
from bs4 import BeautifulSoup as soup

tag = soup((u"<p>白痴代码</p>"),from_encoding='unicode').body.contents[0]
newStr = str(tag) # the tag's __str__() returns a utf-8 encoded string (if tag did not implement __str__(), it would behave as described in item 38 of these notes)
print type(newStr),isinstance(newStr,unicode),newStr
try:
        print u"[unicode]hello," + newStr # newStr is implicitly decoded as ASCII to build a unicode result, which raises an error
except Exception,e:
        print str(e)
print "[utf-8]hello," + newStr
print u"[unicode]hello," + newStr.decode('utf-8')

[dongsong@bogon python_study]$ vpython tag_str_test.py 
<type 'str'> False <p>白痴代码</p>
'ascii' codec can't decode byte 0xe7 in position 3: ordinal not in range(128)
[utf-8]hello,<p>白痴代码</p>
[unicode]hello,<p>白痴代码</p>

25.关于MySQLdb使用的一些问题 http://mysql-python.sourceforge.net/
1> 这里是鸟人11年在某个项目中封装的数据库操作接口database.py，具体的数据库操作可以继承该类并实现跟业务相关的接口
2>cursor.execute(), cursor.fetchall()查出来的是unicode编码，即使指定connect的charset为utf8

3>查询语句需要注意的问题见下述测试代码；推荐的cursor.execute()用法是cursor.execute(sql, args)，因为底层会自动做字符串逃逸

If you're not familiar with the Python DB-API, notethat the SQL statement incursor.execute() uses placeholders,"%s",rather than adding parameters directly within the SQL. If you use thistechnique, the underlying database library will automatically add quotes andescaping to your parameter(s) as necessary. (Also note that Django expects the"%s" placeholder,not the "?" placeholder, which is used by the SQLitePython bindings. This is for the sake of consistency and sanity.)

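4> The proper practice is to call conn.commit() after conn.cursor().execute(); otherwise there will be problems on database versions that do not auto-commit

5> After a successful insert, the auto-increment primary key of the new row can be fetched with MySQLdb.connections.Connection.insert_id() (MySQLdb.connections.Connection is the connection object returned by MySQLdb.connect()) (2014.5.29)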

#encoding=utf-8
import MySQLdb

conn = MySQLdb.connect(host = "127.0.0.1", port = 3306, user = "xds", passwd = "xds", db = "xds_db", charset = 'utf8')
cursor = conn.cursor()
print cursor

siteName = u"百度贴吧"
bbsNames = [u"明星", u"影视"]


siteName = siteName.encode('utf-8')
for index in range(len(bbsNames)):
        bbsNames[index] = bbsNames[index].encode('utf-8')

# Correct usage: the driver substitutes and escapes the parameters
#args = tuple([siteName] + bbsNames)
#sql = "select bbs from t_site_bbs where site = %s and bbs in (%s,%s)"
#rts = cursor.execute(sql,args)
#print rts

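# Also works, but the SQL is built with string formatting, so nothing is escaped (unsafe for untrusted input)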
args = tuple([siteName] + bbsNames)
sql = "select bbs from t_site_bbs where site = '%s' and bbs in ('%s','%s')" % args
print sql
rts = cursor.execute(sql)
print rts

# Wrong usage, raises an error (the string values end up in the SQL without quotes)
#args = tuple([siteName] + bbsNames)
#sql = "select bbs from t_site_bbs where site = %s and bbs in (%s,%s)" % args
#rts = cursor.execute(sql)
#print rts

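# Wrong usage: no error, but no rows are found, because str(tuple(...)) embeds the repr of each element and non-ASCII bytes become \x escapes (it works when the members of bbsNames are numeric or English strings)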
#sql = "select bbs from t_site_bbs where site = '%s' and bbs in %s" % (siteName, str(tuple(bbsNames)))
#print sql
#rts = cursor.execute(sql)
#print rts


rts = cursor.fetchall()
for rt in rts:
        print rt[0]

For a table with an auto-increment column, cursor.lastrowid gives the auto-increment id of the row just inserted; it does not work after an update

Reference: http://stackoverflow.com/questions/706755/how-do-you-safely-and-efficiently-get-the-row-id-after-an-insert-with-mysql-usin
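
When the IN list has a length known only at run time, the placeholders themselves can be generated while the values still go through cursor.execute(sql, args). A minimal sketch: build_in_query is a hypothetical helper (not part of MySQLdb), and the table and column names are the ones from the test code above.

```python
def build_in_query(site, bbs_names):
    """Return (sql, args) with one %s placeholder per value in the IN list."""
    if not bbs_names:
        raise ValueError("bbs_names must not be empty")
    # Only the placeholders are formatted into the SQL, never the values.
    placeholders = ",".join(["%s"] * len(bbs_names))
    sql = ("select bbs from t_site_bbs "
           "where site = %s and bbs in (" + placeholders + ")")
    args = tuple([site] + list(bbs_names))
    return sql, args

sql, args = build_in_query("tieba", ["stars", "movies"])
print(sql)
print(args)
# With a real connection: cursor.execute(sql, args)
```

The values never touch the SQL text, so quoting and escaping stay the driver's job.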

26. Working with time

[dongsong@bogon boosencms]$ vpython
Python 2.6.6 (r266:84292, Dec  7 2011, 20:48:22) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> time.gmtime()
time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=4, tm_min=14, tm_sec=55, tm_wday=4, tm_yday=139, tm_isdst=0)
>>> time.localtime()
time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=12, tm_min=15, tm_sec=2, tm_wday=4, tm_yday=139, tm_isdst=0)
>>> time.time()
1337314595.7790151
>>> time.timezone
-28800
>>> time.gmtime(time.time())
time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=4, tm_min=19, tm_sec=45, tm_wday=4, tm_yday=139, tm_isdst=0)
>>> time.localtime(time.time())
time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=12, tm_min=19, tm_sec=54, tm_wday=4, tm_yday=139, tm_isdst=0)
>>> time.strftime("%a, %d %b %Y %H:%M:%S +0800", time.localtime(time.time()))
'Fri, 18 May 2012 12:21:20 +0800'
>>> time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime(time.time()))   
'Fri, 18 May 2012 04:21:36 +0000'
# How %Z is really meant to be used is still unclear to me; the experiments below did not settle it
>>> time.strftime("%a, %d %b %Y %H:%M:%S %Z", time.gmtime(time.time()))
'Fri, 18 May 2012 04:23:09 CST'
>>> time.strftime("%a, %d %b %Y %H:%M:%S %Z", time.localtime(time.time()))
'Fri, 18 May 2012 12:23:31 CST'
>>> timeStr = time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime(time.time()))         
>>> timeStr
'Fri, 18 May 2012 04:24:29 +0000'
>>> t = time.strptime(timeStr, "%a, %d %b %Y %H:%M:%S %Z")              
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/_strptime.py", line 454, in _strptime_time
    return _strptime(data_string, format)[0]
  File "/usr/lib64/python2.6/_strptime.py", line 325, in _strptime
    (data_string, format))
ValueError: time data 'Fri, 18 May 2012 04:24:29 +0000' does not match format '%a, %d %b %Y %H:%M:%S %Z'
>>> t = time.strptime(timeStr, "%a, %d %b %Y %H:%M:%S +0000")
>>> t
time.struct_time(tm_year=2012, tm_mon=5, tm_mday=18, tm_hour=4, tm_min=24, tm_sec=29, tm_wday=4, tm_yday=139, tm_isdst=-1)
# datetime usage follows
>>> import datetime
>>> datetime.datetime.today()
datetime.datetime(2012, 5, 18, 12, 28, 25, 892141)
>>> datetime.datetime(2012,12,12,23,54)
datetime.datetime(2012, 12, 12, 23, 54)
>>> datetime.datetime(2012,12,12,23,54,32)
datetime.datetime(2012, 12, 12, 23, 54, 32)
>>> datetime.datetime.fromtimestamp(time.time())
datetime.datetime(2012, 5, 18, 12, 29, 15, 130257)
>>> datetime.datetime.utcfromtimestamp(time.time())
datetime.datetime(2012, 5, 18, 4, 29, 34, 897017)
>>> datetime.datetime.now()
datetime.datetime(2012, 5, 18, 12, 29, 52, 558249)
>>> datetime.datetime.utcnow()
datetime.datetime(2012, 5, 18, 4, 30, 6, 164009)
>>> datetime.datetime.fromtimestamp(time.time()).strftime("%a, %d %b %Y %H:%M:%S")                                                    
'Fri, 18 May 2012 17:05:30'
>>> datetime.datetime.today().strftime("%a, %d %b %Y %H:%M:%S")                          
'Fri, 18 May 2012 17:05:44'
>>> datetime.datetime.strptime('Fri, 18 May 2012 04:24:29', "%a, %d %b %Y %H:%M:%S")    
datetime.datetime(2012, 5, 18, 4, 24, 29)

>>> datetime.datetime.fromtimestamp(time.time()).strftime('%X')  
'17:07:14'
>>> datetime.datetime.fromtimestamp(time.time()).strftime('%x')  
'02/28/15'
>>> datetime.datetime.fromtimestamp(time.time()).strftime('%c')  
'Sat Feb 28 17:07:24 2015'

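%a abbreviated weekday name
%A full weekday name
%b abbreviated month name
%B full month name
%c locale's date and time representation
%d day of the month, 01-31
%H hour (24-hour clock), 00-23
%I hour (12-hour clock), 01-12
%m month, 01-12
%M minute, 00-59
%S second, 00-61 (that is what the official docs say)
%j day of the year, 001-366
%w weekday as a number, 0 (Sunday) to 6
%W week number of the year, with Monday as the first day of the week
%x locale's date representation
%X locale's time representation
%y year without century, 00-99
%Y year with century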
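
A short round trip tying the directives together: parse a UTC time string, convert it to epoch seconds, and format it back. This is only a sketch. calendar.timegm is the standard-library inverse of time.gmtime, the +0000 offset is matched as literal text (as in the session above), and the helper names are made up for illustration. It assumes an English/C locale for %a and %b.

```python
import calendar
import time

FMT = "%a, %d %b %Y %H:%M:%S +0000"  # +0000 is literal text, not a directive

def utc_string_to_epoch(text):
    """Parse a UTC time string in FMT and return seconds since the epoch."""
    return calendar.timegm(time.strptime(text, FMT))

def epoch_to_utc_string(seconds):
    """Format epoch seconds as a UTC time string in FMT."""
    return time.strftime(FMT, time.gmtime(seconds))

epoch = utc_string_to_epoch("Fri, 18 May 2012 04:24:29 +0000")
print(epoch)
print(epoch_to_utc_string(epoch))
```

time.mktime() would interpret the struct_time as local time instead, which is why calendar.timegm is used for UTC values.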

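27. A pitfall when converting integers to strings

Some integers are int and some are long. The repr() of a long is the digits followed by L. str() of a bare long has no L, but str() of a list or other container holding a long uses repr() for its elements, so the L shows up there. Use with care!
As for when an integer is an int and when it is a long: in Python 2 a value too large for a machine int (greater than sys.maxint) automatically becomes a long (and of course an explicit int() or long() call gives that type when the value fits)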

28. Using join() (the elements of the list must be strings)

>>> l = ['a','b','c','d']
>>> '&'.join(l)
'a&b&c&d'
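
When the list holds non-string elements, join() raises TypeError, so each element has to be converted first. A small sketch (the sample values are arbitrary):

```python
values = ['a', 1, 2.5, None]

# join() only accepts strings, so convert every element explicitly.
joined = '&'.join(str(v) for v in values)
print(joined)
```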

29. Debugging Python with pdb

http://www.ibm.com/developerworks/cn/linux/l-cn-pythondebugger/

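Very similar to gdb:

b line_number sets a breakpoint; a file or a function can be specified as well

b 180, childWeiboRt.retweetedId == 3508203280986906 conditional breakpoint

b lists all breakpoints

cl breakpoint_number clears one breakpoint

cl clears all breakpoints

c continue

n next line

s step into the function

bt print the call stack

whatis obj shows the type of a variable (equivalent to the built-in type())

up moves one frame up the call stack, so the code and variables at that call site can be inspected (where the program is actually executing does not change, of course)

down moves one frame down the call stack, so the code and variables at that call site can be inspected (where the program is actually executing does not change, of course)

...

To inspect the attribute values of an instance (instanceObj) while debugging, use the following statement: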

for it in [(attr,getattr(instanceObj,attr)) for attr in dir(instanceObj)]: print it[0],'-->',it[1]

30. Getting a function's name from inside the function

>>> import sys
>>> def f2():
...     print sys._getframe().f_code.co_name
... 
>>> f2()
f2
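
The same lookup can be wrapped in a helper so it is not repeated in every function. A sketch, with current_function_name as a made-up name; sys._getframe is a CPython implementation detail and may be missing in other interpreters.

```python
import sys

def current_function_name():
    """Return the name of the function that called this helper."""
    # Frame 0 is this helper itself; frame 1 is its caller.
    return sys._getframe(1).f_code.co_name

def f2():
    return current_function_name()

print(f2())
```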

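31. Handling spaces and other special characters in URLs

When a URL contains special characters such as +, space, /, ?, %, #, & or =, the server may not receive the correct parameter values. What to do?
Solution
Convert these characters into a form the server can recognize (or replace them with other characters, for example full-width ones). The mapping is:
URL character escapes
Character  Meaning in a URL                                Escape
+          stands for a space                              %2B
space      can be written as + or percent-encoded          %20
/          separates directories and subdirectories        %2F
?          separates the URL proper from the parameters    %3F
%          introduces an escaped character                 %25
#          marks a bookmark (fragment)                     %23
&          separates the parameters                        %26
=          separates a parameter name from its value       %3D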

>>> import urllib
>>> import urlparse
>>> urlparse.urljoin('http://s.weibo.com/weibo/',urllib.quote('python c++')) 
'http://s.weibo.com/weibo/python%20c%2B%2B'

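When special characters in a URL meet a search engine (Lucene etc.) that gives special characters a meaning of its own...

the URL has to be escaped twice; otherwise the special characters pass safely through HTTP and then reach the search engine bare, and what comes back is not what you asked for...

Reference: http://stackoverflow.com/questions/688766/getting-401-on-twitter-oauth-post-requests

Judging from its URLs, the browser-side script of http://s.weibo.com does the same thing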

[dongsong@bogon python_study]$ cat url.py 
#encoding=utf-8

import urllib, urlparse

if __name__ == '__main__':
        baseUrl = 'http://s.weibo.com/weibo/'
        url = urlparse.urljoin(baseUrl, urllib.quote(urllib.quote('python c++')))
        print url
        conn = urllib.urlopen(url)
        data = conn.read()
        f = file('/tmp/d.html', 'w')
        f.write(data)
        f.close()

[dongsong@bogon python_study]$ vpython url.py 
http://s.weibo.com/weibo/python%2520c%252B%252B
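
The effect of escaping once versus twice can be checked without any network access. A sketch that runs on both Python 2 and Python 3 (in Python 3 quote lives in urllib.parse):

```python
try:
    from urllib.parse import quote, unquote   # Python 3
except ImportError:
    from urllib import quote, unquote         # Python 2

term = 'python c++'
once = quote(term)     # what a single escape produces
twice = quote(once)    # the % signs themselves get escaped to %25
print(once)
print(twice)
print(unquote(unquote(twice)))
```

One unquote() on the receiving side leaves the singly escaped form, which is what the downstream search engine then receives.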

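32. Encoding issues in the json module

Default behaviour of json.dumps():

Every string in the data structure is converted to unicode (str instances are decoded using the encoding parameter, which defaults to utf-8 and need not be touched), then the non-ASCII characters are escaped (\u56fd becomes \\u56fd), and the result is a str containing only ASCII, which is therefore also valid utf-8

Mixed encodings among the elements of the source structure do not affect dumps, because every string element is converted to unicode before the JSON string is produced

ensure_ascii defaults to True; setting it to False changes the behaviour of dumps:

If the strings in the source structure are unicode, the resulting JSON string is a unicode string and the unicode strings inside are not escaped (\u56fd stays \u56fd);

If the strings in the source structure are utf-8 encoded str, the resulting JSON string is a utf-8 str and the utf-8 strings inside are not escaped (\xe5\x9b\xbd stays \xe5\x9b\xbd);

If the elements of the source structure have mixed encodings, dumps raises an error

A JSON string obtained this way can be transcoded as a whole; one produced by the default behaviour cannot (the string elements of the source structure were escaped, so transcoding the whole JSON string does not touch them)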

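warning--->2012-07-11 10:00:

Today I hit a problem: converting a dict containing traditional Chinese characters this way succeeded, but storing the JSON string in the database failed with

_mysql_exceptions.Warning: Incorrect string value: '\xF0\x9F\x91\x91\xE7\xAC...' for column 'detail' at row 1

whereas storing the output of the first (default) approach worked. My first guess was that json.dumps(ensure_ascii = False) mishandles traditional Chinese. Note, however, that \xF0\x9F\x91\x91 is a 4-byte UTF-8 sequence (U+1F451), which MySQL's 3-byte utf8 charset cannot store, so the column charset is the more likely culprit; the default output is pure ASCII, which would explain why it stored without trouble

With messily encoded data, json.loads() may raise UnicodeDecodeError (for example, today (2013.3.19) the utf8 JSON returned by the QQ open platform API kept failing to parse); this works around it:

myString = jsonStr.decode('utf-8', 'ignore') # convert to unicode, ignoring errors

jsonObj = json.loads(myString)

Some data may be lost, but that beats doing nothing.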

#encoding=utf-8

import json
from pprint import pprint

def show_rt(rt):
        pprint(rt)
        print rt
        print "type(rt) is %s" % type(rt)

if __name__ == '__main__':
        unDic = {
                        u'中国':u'北京',
                        u'日本':u'东京',
                        u'法国':u'巴黎'
                }
        utf8Dic = {
                        r'中国':r'北京',
                        r'日本':r'东京',
                        r'法国':r'巴黎'
                }

        pprint(unDic)
        pprint(utf8Dic)

        print "\nunicode instance dumps to string:"
        rt = json.dumps(unDic)
        show_rt(rt)
        print "utf-8 instance dumps to string:"
        rt = json.dumps(utf8Dic)
        show_rt(rt)

        #encoding is the character encoding for str instances, default is UTF-8
        #If ensure_ascii is False, then the return value will be a unicode instance, default is True
        print "\nunicode instance dumps(ensure_ascii=False) to string:"
        rt = json.dumps(unDic,ensure_ascii=False)
        show_rt(rt)
        print "utf-8 instance dumps(ensure_ascii=False) to string:"
        rt = json.dumps(utf8Dic,ensure_ascii=False)
        show_rt(rt)

        print "\n-----------------数据结构混杂编码-----------------"
        unDic[u'日本'] = r'东京'
        utf8Dic[r'日本'] = u'东京'
        pprint(unDic)
        pprint(utf8Dic)

        print "\nunicode instance dumps to string:"
        try:
                rt = json.dumps(unDic)
        except Exception,e:
                print "%s:%s" % (type(e),str(e))
        else:
                show_rt(rt)
        print "utf-8 instance dumps to string:"
        try:
                rt = json.dumps(utf8Dic)
        except Exception,e:
                print "%s:%s" % (type(e),str(e))
        else:
                show_rt(rt)

        print "\nunicode instance dumps(ensure_ascii=False) to string:"
        try:
                rt = json.dumps(unDic, ensure_ascii=False)
        except Exception,e:
                print "%s:%s" % (type(e),str(e))
        else:
                show_rt(rt)
        print "utf-8 instance dumps to string:"
        try:
                rt = json.dumps(utf8Dic, ensure_ascii=False)
        except Exception,e:
                print "%s:%s" % (type(e),str(e))
        else:
                show_rt(rt)

[dongsong@bogon python_study]$ vpython json_test.py 
{u'\u4e2d\u56fd': u'\u5317\u4eac',
 u'\u65e5\u672c': u'\u4e1c\u4eac',
 u'\u6cd5\u56fd': u'\u5df4\u9ece'}
{'\xe4\xb8\xad\xe5\x9b\xbd': '\xe5\x8c\x97\xe4\xba\xac',
 '\xe6\x97\xa5\xe6\x9c\xac': '\xe4\xb8\x9c\xe4\xba\xac',
 '\xe6\xb3\x95\xe5\x9b\xbd': '\xe5\xb7\xb4\xe9\xbb\x8e'}

unicode instance dumps to string:
'{"\\u4e2d\\u56fd": "\\u5317\\u4eac", "\\u65e5\\u672c": "\\u4e1c\\u4eac", "\\u6cd5\\u56fd": "\\u5df4\\u9ece"}'
{"\u4e2d\u56fd": "\u5317\u4eac", "\u65e5\u672c": "\u4e1c\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece"}
type(rt) is <type 'str'>
utf-8 instance dumps to string:
'{"\\u4e2d\\u56fd": "\\u5317\\u4eac", "\\u6cd5\\u56fd": "\\u5df4\\u9ece", "\\u65e5\\u672c": "\\u4e1c\\u4eac"}'
{"\u4e2d\u56fd": "\u5317\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece", "\u65e5\u672c": "\u4e1c\u4eac"}
type(rt) is <type 'str'>

unicode instance dumps(ensure_ascii=False) to string:
u'{"\u4e2d\u56fd": "\u5317\u4eac", "\u65e5\u672c": "\u4e1c\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece"}'
{"中国": "北京", "日本": "东京", "法国": "巴黎"}
type(rt) is <type 'unicode'>
utf-8 instance dumps(ensure_ascii=False) to string:
'{"\xe4\xb8\xad\xe5\x9b\xbd": "\xe5\x8c\x97\xe4\xba\xac", "\xe6\xb3\x95\xe5\x9b\xbd": "\xe5\xb7\xb4\xe9\xbb\x8e", "\xe6\x97\xa5\xe6\x9c\xac": "\xe4\xb8\x9c\xe4\xba\xac"}'
{"中国": "北京", "法国": "巴黎", "日本": "东京"}
type(rt) is <type 'str'>

-----------------数据结构混杂编码-----------------
{u'\u4e2d\u56fd': u'\u5317\u4eac',
 u'\u65e5\u672c': '\xe4\xb8\x9c\xe4\xba\xac',
 u'\u6cd5\u56fd': u'\u5df4\u9ece'}
{'\xe4\xb8\xad\xe5\x9b\xbd': '\xe5\x8c\x97\xe4\xba\xac',
 '\xe6\x97\xa5\xe6\x9c\xac': u'\u4e1c\u4eac',
 '\xe6\xb3\x95\xe5\x9b\xbd': '\xe5\xb7\xb4\xe9\xbb\x8e'}

unicode instance dumps to string:
'{"\\u4e2d\\u56fd": "\\u5317\\u4eac", "\\u65e5\\u672c": "\\u4e1c\\u4eac", "\\u6cd5\\u56fd": "\\u5df4\\u9ece"}'
{"\u4e2d\u56fd": "\u5317\u4eac", "\u65e5\u672c": "\u4e1c\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece"}
type(rt) is <type 'str'>
utf-8 instance dumps to string:
'{"\\u4e2d\\u56fd": "\\u5317\\u4eac", "\\u6cd5\\u56fd": "\\u5df4\\u9ece", "\\u65e5\\u672c": "\\u4e1c\\u4eac"}'
{"\u4e2d\u56fd": "\u5317\u4eac", "\u6cd5\u56fd": "\u5df4\u9ece", "\u65e5\u672c": "\u4e1c\u4eac"}
type(rt) is <type 'str'>

unicode instance dumps(ensure_ascii=False) to string:
<type 'exceptions.UnicodeDecodeError'>:'ascii' codec can't decode byte 0xe4 in position 1: ordinal not in range(128)
utf-8 instance dumps to string:
<type 'exceptions.UnicodeDecodeError'>:'ascii' codec can't decode byte 0xe4 in position 1: ordinal not in range(128)
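
A compact version of the unicode-only case, which avoids the mixed-encoding failure shown at the end of the output above by keeping every string as unicode. This sketch is written to run on Python 3 as well, where all str values are already unicode:

```python
# -*- coding: utf-8 -*-
import json

data = {u'中国': u'北京'}

escaped = json.dumps(data)                  # default: \uXXXX escapes, pure ASCII
raw = json.dumps(data, ensure_ascii=False)  # characters kept as they are

print(escaped)
# Both forms decode back to the same structure.
print(json.loads(escaped) == json.loads(raw) == data)
```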

33. json serialization turns numeric dict keys into strings

>>> import json
>>> d = {1:[1,2,3,4],0:()}
>>> d
{0: (), 1: [1, 2, 3, 4]}
>>> s = json.dumps(d)
>>> s
'{"0": [], "1": [1, 2, 3, 4]}'
>>> json.loads(s)
{u'1': [1, 2, 3, 4], u'0': []}

Official documentation:

Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.
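
If the keys are known to have been integers, they can be converted back after loading. A minimal sketch for a flat (non-nested) object; loads_int_keys is a hypothetical helper name.

```python
import json

def loads_int_keys(text):
    """json.loads() for a flat object whose keys were integers before dumping."""
    return dict((int(k), v) for k, v in json.loads(text).items())

d = {1: [1, 2, 3, 4], 0: []}
restored = loads_int_keys(json.dumps(d))
print(restored == d)
```

Tuples are a separate loss: they come back as lists, as the session above shows, and this helper does not undo that.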

34. In interactive mode, _ holds the result of the last evaluated expression

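35. Comparing the modules for running multiple processes

os.popen() and popen2.* are not the officially recommended approach; subprocess is

os.popen(): if the command is not followed by an ampersand (&), the parent process gets blocked. It is very convenient to use, but all it returns is a pipe to the child (the default mode is read, which reads the child's stdout). There is no direct way to kill the child or query it (you could write to the pipe to ask the child to terminate itself, but that is rather silly). Calling close() on the pipe returns the child's exit status (I have not used this, ^_^). I used this call a lot in earlier projects, where only coarse-grained process control was needed

popen2.*: I have not used this module yet, but as the names suggest, popen2.popen2() returns the child's stdin and stdout, and popen2.popen3() returns stdout, stdin and stderr... not much of an improvement over os.popen

multiprocessing is a multi-process module that mimics the threading interface; watch out for file descriptors and database connections being shared between processes. It differs from the modules that start a child by running a command line

With subprocess, beware of zombie processes. The system keeps a structure holding the exit status etc. of an exited child for the parent's use and releases it once the parent wait()s for the child. If the parent keeps running without calling wait(), the exited child stays a zombie and occupies a process id

subprocess usage: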

>>> obj2 = subprocess.Popen('python /home/dongsong/python_study/child2.py', shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>>> dir(obj2)
['__class__', '__del__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_check_timeout', '_child_created', '_close_fds', '_communicate', '_communicate_with_poll', '_communicate_with_select', '_communication_started', '_execute_child', '_get_handles', '_handle_exitstatus', '_input', '_internal_poll', '_remaining_time', '_set_cloexec_flag', '_translate_newlines', 'communicate', 'kill', 'pid', 'poll', 'returncode', 'send_signal', 'stderr', 'stdin', 'stdout', 'terminate', 'universal_newlines', 'wait']
>>> dir(obj2.stdout)
['__class__', '__delattr__', '__doc__', '__enter__', '__exit__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'closed', 'encoding', 'errors', 'fileno', 'flush', 'isatty', 'mode', 'name', 'newlines', 'next', 'read', 'readinto', 'readline', 'readlines', 'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines', 'xreadlines']
>>> obj2.stdout.read()
'[<logging.StreamHandler instance at 0x7fdb0ad63248>]\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\naaaaa\n'
>>> obj2.stdout.read()
''
>>> obj2.communicate()[0]
''
>>> obj2.communicate()[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/subprocess.py", line 729, in communicate
    stdout, stderr = self._communicate(input, endtime)
  File "/usr/lib64/python2.6/subprocess.py", line 1310, in _communicate
    stdout, stderr = self._communicate_with_poll(input, endtime)
  File "/usr/lib64/python2.6/subprocess.py", line 1364, in _communicate_with_poll
    register_and_append(self.stdout, select_POLLIN_POLLPRI)
  File "/usr/lib64/python2.6/subprocess.py", line 1343, in register_and_append
    poller.register(file_obj.fileno(), eventmask)
ValueError: I/O operation on closed file
>>> obj2.stderr.read()   
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file
>>> args = shlex.split('python /home/dongsong/python_study/child2.py')
>>> obj = subprocess.Popen(args)
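
The ValueError in the session above comes from calling communicate() again after the pipes were already read and closed. Calling it once and unpacking both streams avoids that. A sketch in which the child command is a stand-in for child2.py:

```python
import subprocess
import sys

# Stand-in child: writes one line to stdout and one to stderr.
cmd = [sys.executable, '-c',
       'import sys; sys.stdout.write("out\\n"); sys.stderr.write("err\\n")']

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# communicate() reads both pipes to EOF and waits for the child,
# which also reaps it so no zombie is left behind.
out, err = proc.communicate()
print(out, err, proc.returncode)
```

Passing the command as a list (as shlex.split() produces) also avoids shell=True.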

36. Making reads on a file object non-blocking

import fcntl, os
flags = fcntl.fcntl(procObj.stdout.fileno(), fcntl.F_GETFL)
fcntl.fcntl(procObj.stdout.fileno(), fcntl.F_SETFL, flags|os.O_NONBLOCK)
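
What a non-blocking read looks like once the flag is set, demonstrated on a plain os.pipe() instead of a child process. A Unix-only sketch; set_nonblocking is a made-up helper name.

```python
import errno
import fcntl
import os

def set_nonblocking(fd):
    """Add O_NONBLOCK to the flags of a file descriptor."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

read_fd, write_fd = os.pipe()
set_nonblocking(read_fd)

try:
    os.read(read_fd, 1024)      # nothing has been written yet
    got_eagain = False
except OSError as exc:
    # Instead of blocking, the read fails straight away with EAGAIN.
    got_eagain = exc.errno == errno.EAGAIN

os.write(write_fd, b'data')
chunk = os.read(read_fd, 1024)  # data is available now
print(got_eagain, chunk)

os.close(read_fd)
os.close(write_fd)
```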

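37. How to create a daemon process (which also avoids zombies)

The principle is mentioned in the encyclopedia entry on zombie processes: fork twice. The parent forks a child and carries on working; the child forks a grandchild and exits, so the grandchild is adopted by init, which reaps it when it finishes. The parent still has to reap the child itself.

See this implementation for reference; it is for learning purposes only and of little practical use http://blog.csdn.net/snleo/article/details/4410305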
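
The double fork described above can be sketched as follows. This is a Unix-only illustration, not a complete daemon: a real one would also change directory, reset the umask and redirect the standard streams. spawn_detached is a hypothetical name, and the pipe exists only so the demo can observe the grandchild.

```python
import os

def spawn_detached(func):
    """Run func() in a grandchild process that init will adopt and reap."""
    pid = os.fork()
    if pid > 0:
        os.waitpid(pid, 0)      # parent: reap the short-lived child
        return
    try:
        os.setsid()             # child: start a new session
        if os.fork() > 0:
            os._exit(0)         # child exits at once, orphaning the grandchild
        func()                  # grandchild: do the actual work
    finally:
        os._exit(0)             # never fall back into the caller's code

read_fd, write_fd = os.pipe()
spawn_detached(lambda: os.write(write_fd, b'done'))
os.close(write_fd)              # keep only the read end in the parent
message = os.read(read_fd, 4)   # blocks until the grandchild has written
os.close(read_fd)
print(message)
```

os._exit() is used instead of sys.exit() so the forked processes skip cleanup handlers and buffered output inherited from the parent.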

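38. The default encoding and the built-in str()

str(xx) converts xx into a printable string in the system default encoding (sys.getdefaultencoding()). The default is usually ascii, so str() raises an error when xx is a unicode string of Chinese characters; with the default encoding changed to utf-8 it no longer does

Changing the system default encoding is not recommended because it affects some libraries; if you must, these methods can be used. Note that sys.setdefaultencoding() does not work in every situation (The setdefaultencoding is used in python-installed-dir/site-packages/pyanaconda/sitecustomize.py)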

[dongsong@bogon python_study]$ vpython
Python 2.6.6 (r266:84292, Dec  7 2011, 20:48:22) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> s = u'中国'
>>> str(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> s.encode('utf-8')
'\xe4\xb8\xad\xe5\x9b\xbd'
>>> sys.setdefaultencoding('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'setdefaultencoding'
>>> d = {u'中国':u'北京'}
>>> d
{u'\u4e2d\u56fd': u'\u5317\u4eac'}
>>> str(d)
"{u'\\u4e2d\\u56fd': u'\\u5317\\u4eac'}"
# Changing the default encoding
[dongsong@bogon python_study]$ cat ~/venv/lib/python2.6/site-packages/sitecustomize.py
import sys
sys.setdefaultencoding('utf-8')
[dongsong@bogon python_study]$ vpython -c 'import sys; print sys.getdefaultencoding();'
utf-8
[dongsong@bogon python_study]$ vpython
Python 2.6.6 (r266:84292, Dec  7 2011, 20:48:22) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = u'中国' 
>>> str(s)
'\xe4\xb8\xad\xe5\x9b\xbd'
>>> import sys
>>> print sys.getdefaultencoding()
utf-8
>>> d = {u'中国':u'北京'}
>>> d
{u'\u4e2d\u56fd': u'\u5317\u4eac'}
>>> str(d)
"{u'\\u4e2d\\u56fd': u'\\u5317\\u4eac'}"

python -S skips site.py (its contents are worth reading in the Python source), after which the sys module supports setdefaultencoding() directly.
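
An alternative that leaves the default encoding alone is to encode and decode explicitly at the boundaries. A sketch with made-up helper names (to_bytes, to_text); it runs on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-

def to_bytes(value, encoding='utf-8'):
    """Return value as a byte string, encoding unicode text explicitly."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)

def to_text(value, encoding='utf-8'):
    """Return value as unicode text, decoding byte strings explicitly."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

s = u'中国'
encoded = to_bytes(s)
print(repr(encoded))
print(to_text(encoded) == s)
```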

39. traceback

...
except Exception,e:
                if not isinstance(e, APIError):
                    traceback.print_exc(file=sys.stderr)

Or

import sys
tp,val,td = sys.exc_info()

sys.exc_info() returns a tuple (type, value/message, traceback)
type ---- the exception type
value/message ---- the exception message or arguments

traceback ---- an object holding the call stack information.

The traceback module works on traceback objects: traceback.print_tb() prints one, and traceback.format_tb() returns its printable form as a list of strings

Reference: http://hi.baidu.com/whaway/item/8136af0b404dd1813c42e207
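
Putting sys.exc_info() and traceback.format_tb() together to build a log message. A minimal sketch; risky is a placeholder function that always fails.

```python
import sys
import traceback

def risky():
    return 1 / 0

try:
    risky()
except Exception:
    exc_type, exc_value, tb = sys.exc_info()
    # format_tb() returns a list of strings, one per stack frame.
    frames = traceback.format_tb(tb)
    report = ''.join(frames) + '%s: %s' % (exc_type.__name__, exc_value)
    del tb   # do not keep the frames alive longer than needed

print(report)
```

traceback.format_exc() produces a similar string in one call when the pieces are not needed separately.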

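40. Some options for GUI development in Python: GUI Programming in Python (http://wiki.python.org/moin/GuiProgramming)

cocos2d: The past and present of the Cocos2D family

cocos2d official site

cocos2d-x

pygame: pygame wiki

pygame official site

tkinter: tkinter tutorial

tkinter official site

wxpython: wxpython official site

For image processing and charts see another article http://blog.csdn.net/xiarendeniao/article/details/7991305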

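41. Static methods and class methods of a class (methods decorated with the built-ins staticmethod() and classmethod())

In Python, both static methods and class methods can be called through the class or through an instance of it. The differences are:

1> A method decorated with @classmethod is a class method; its first parameter cls receives the class. When a subclass inherits the method and calls it, cls is the subclass, not the parent class. This is unlike a static method in C++. Call it as ClassA.func() or ClassA().func() (in the latter form the instance is ignored). classmethod() is useful for creating alternate class constructors.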

>>> class A:
...     @classmethod
...     def func(cls):
...             import pdb
...             pdb.set_trace()
...             pass
... 
>>> A.func()
> <stdin>(6)func()
(Pdb) cls
<class __main__.A at 0x7fc8b056ea70>
(Pdb) type(cls)
<type 'classobj'>
(Pdb) 
>>> type(A())
<type 'instance'>

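2> A method decorated with @staticmethod is a static method; it receives no implicit first argument. It is essentially the same as a global function and closely resembles a static method in C++. Call it as ClassA.func() or ClassA().func() (in the latter form the instance is ignored)

3> A method without either decorator is an ordinary (instance) method; its first parameter is self, which receives the instance. Call it as ClassA().func()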
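
The three cases side by side, including the alternate-constructor use of classmethod. The Point classes are invented for this illustration.

```python
class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y

    @classmethod
    def from_string(cls, text):
        """Alternate constructor; cls is whichever class it was called on."""
        x, y = text.split(',')
        return cls(int(x), int(y))

    @staticmethod
    def distance_sq(a, b):
        """No implicit first argument: behaves like a plain function."""
        return (a.x - b.x) ** 2 + (a.y - b.y) ** 2

class Point3(Point):
    pass

p = Point.from_string('1,2')
q = Point3.from_string('4,6')   # cls is Point3 here, not Point
print(type(q).__name__, Point.distance_sq(p, q))
```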

42. Merging dictionaries

>>> d1
{1: 6, 11: 12, 12: 13, 13: 14}
>>> d2
{1: 2, 2: 3, 3: 4}
>>> dict(d2, **d1)
{1: 6, 2: 3, 3: 4, 11: 12, 12: 13, 13: 14}
>>> dict(d1,**d2) 
{1: 2, 2: 3, 3: 4, 11: 12, 12: 13, 13: 14}
>>> d = dict(d1)
>>> d
{1: 6, 11: 12, 12: 13, 13: 14}
>>> d2
{1: 2, 2: 3, 3: 4}
>>> d.update(d2)
>>> d
{1: 2, 2: 3, 3: 4, 11: 12, 12: 13, 13: 14}
>>> d = dict(d2)
>>> d
{1: 2, 2: 3, 3: 4}
>>> d1
{1: 6, 11: 12, 12: 13, 13: 14}
>>> d.update(d1)
>>> d
{1: 6, 2: 3, 3: 4, 11: 12, 12: 13, 13: 14}
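
The dict(d2, **d1) form only works with non-string keys on CPython 2; Python 3 rejects it with a TypeError because keyword names must be strings. Copy-then-update works everywhere and can be wrapped in a small helper (merge_dicts is a made-up name):

```python
def merge_dicts(base, override):
    """Return a new dict; values from override win on duplicate keys."""
    merged = dict(base)      # shallow copy, so base is left untouched
    merged.update(override)
    return merged

d1 = {1: 6, 11: 12, 12: 13, 13: 14}
d2 = {1: 2, 2: 3, 3: 4}
print(merge_dicts(d2, d1))
print(merge_dicts(d1, d2))
```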

43. Handling network timeouts

1>>urllib2.urlopen(url,timeout=xx)

2>>socket.setdefaulttimeout(xx) # (global socket timeout setting)

3>>A timer

from urllib2 import urlopen
from threading import Timer
url = "http://www.python.org"
def handler(fh):
        fh.close()
fh = urlopen(url)
t = Timer(20.0, handler,[fh])
t.start()
data = fh.read()
t.cancel()

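44. Working with Excel

I used to use the csv module to read and write csv files, then open them in Excel and save them as xls

Today (2012.10.30) I found that this library is more direct and more powerful http://www.python-excel.org/

The versions I use: (xlwt-0.7.4 xlrd-0.8.0 xlutils-1.5.2)

Row height can be set with sheetObj.row(index).set_style(easyxf('font:height 720;')) and column width with sheetObj.col(index).width = 1000. Nearly all the other ways appear to be buggy and have no effect http://reliablybroken.com/b/2011/10/widths-heights-with-xlwt-python/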

#encoding=utf-8
from xlwt import Workbook, easyxf

book = Workbook(encoding='utf-8')
sheet1 = book.add_sheet('Sheet 1')
sheet1.col_width(20000)
book.add_sheet('Sheet 2')
sheet1.write(0,0,'起点')
sheet1.write(0,1,'B1')
row1 = sheet1.row(1)
row1.write(0,'Ai2')
row1.write(1,'B2')
sheet1.col(0).width = 10000
sheet1.col(1).width = 20000
#sheet1.default_col_width = 20000 #bug invalid
#sheet1.col_width(30000) #bug invalid
#sheet1.default_row_height = 5000 #bug invalid
#sheet1.row(0).height = 5000 #bug invalid
sheet1.row(0).set_style(easyxf('font:height 400;'))
style = easyxf('pattern: pattern solid, fore_colour red;'
                'align: vertical center, horizontal center;'
                'font: bold true;')
sheet1.write_merge(2,5,2,5,'Merged',style)
sheet2 = book.get_sheet(1)
sheet2.row(0).write(0,'Sheet 2 A1')
sheet2.row(0).write(1,'Sheet 2 B1')
sheet2.flush_row_data()
sheet2.write(1,0,'Sheet 2 A3')
sheet2.col(0).width = 5000
sheet2.col(0).hidden = True
book.save('simple.xls')

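One headache with this library is not knowing what a given width/height/colour will actually look like. I wrote a script that prints all the supported colours and the commonly used widths and heights for reference; see http://blog.csdn.net/xiarendeniao/article/details/8276957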

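45. On a machine with several IP addresses, how do you choose which one urllib2 uses for an HTTP request? There are two ways. The convenient, slightly hacky one is to replace the socket function of the socket module (the code below). The other: A better way is to extend connect() method in subclass of HTTPConnection and redefine http_open() method in subclass of HTTPHandler

import socket

def bind_alt_socket(alt_ip):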
	true_socket = socket.socket
	def bound_socket(*a, **k):
		sock = true_socket(*a, **k)
		sock.bind((alt_ip, 0))
		return sock
	socket.socket = bound_socket

Reference: http://www.rossbates.com/2009/10/urllib2-with-multiple-network-interfaces/

http://stackoverflow.com/questions/1150332/source-interface-with-python-and-urllib2
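
A rough sketch of the second, handler-based way. It leans on the source_address argument that HTTPConnection accepts in Python 2.7 and Python 3 rather than overriding connect(), so it is a variation on the quoted advice, not the exact recipe from the links. SourceAddressHTTPHandler is a made-up name, 127.0.0.1 is a placeholder address, and only plain HTTP is covered.

```python
import functools

try:                                    # Python 3
    from urllib.request import HTTPHandler, build_opener
    from http.client import HTTPConnection
except ImportError:                     # Python 2.7
    from urllib2 import HTTPHandler, build_opener
    from httplib import HTTPConnection

class SourceAddressHTTPHandler(HTTPHandler):
    """HTTP handler whose connections are bound to one local IP address."""

    def __init__(self, source_ip):
        HTTPHandler.__init__(self)
        self.source_ip = source_ip

    def http_open(self, req):
        # Port 0 lets the operating system pick a free local port.
        factory = functools.partial(
            HTTPConnection, source_address=(self.source_ip, 0))
        return self.do_open(factory, req)

opener = build_opener(SourceAddressHTTPHandler('127.0.0.1'))
# opener.open('http://www.python.org') would now connect from that address.
print([type(h).__name__ for h in opener.handlers])
```

Unlike replacing socket.socket, this only affects requests made through this opener.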

46. Installing PyQt4:

1. Install sip
wget http://sourceforge.net/projects/pyqt/files/sip/sip-4.14.1/sip-4.14.1.tar.gz
vpython configure.py
make
sudo make install

2.sudo yum install qt qt-devel -y
  sudo yum install qtwebkit qtwebkit-devel -y //without this step, the configure step below does not generate the Makefile for QtWebKit

3. Install pyqt
wget http://sourceforge.net/projects/pyqt/files/PyQt4/PyQt-4.9.5/PyQt-x11-gpl-4.9.5.tar.gz
vpython configure.py -q/usr/bin/qmake-qt4 -g
make 
make install

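A module that does not show up in dir(PyQt4) is not necessarily missing! The .so extension modules can be imported with from PyQt4 import QtGui or import PyQt4.QtGui. I kept thinking the installation had failed and tried all sorts of things hunting for the cause...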

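47. Making one Python interpreter use the environment (installed modules) of another

Reference: http://mydjangoblog.com/2009/03/30/django-mod_python-and-virtualenv/ https://pypi.python.org/pypi/virtualenv

The example below uses the callme module, which is installed in a virtualenv Python, from the default Python environment: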

[dongsong@localhost ~]$ python
Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import callme
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named callme
>>> activate_this = '/home/dongsong/venv/bin/activate_this.py'            
>>> execfile(activate_this, dict(__file__=activate_this))
>>> import callme
>>>

As for making mod_python use a virtualenv Python environment, see the link above:

#myvirtualdjango.py

activate_this = '/home/django/progopedia.ru/ve/bin/activate_this.py'
execfile(activate_this, dict(__file__=activate_this))

from django.core.handlers.modpython import handler

<VirtualHost 127.0.0.1:81>
    ServerName progopedia.ru
    ServerAdmin [email protected]

    <Location "/">
        SetHandler python-program
        PythonPath "['/home/django/progopedia.ru/ve/bin', '/home/django/progopedia.ru/src/progopedia_ru_project/'] + sys.path"
        PythonHandler myvirtualdjango
        SetEnv DJANGO_SETTINGS_MODULE settings
        SetEnv PYTHON_EGG_CACHE /var/tmp/egg
        PythonInterpreter polyprog_ru
    </Location>
</VirtualHost>

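48. Formatted output

%r is a catch-all format specifier: it prints the argument as it would be written in source, including type information such as the quotes around a string

print automatically appends a newline; to suppress it, end the print statement with a comma ","

More usage examples at http://www.pythonclub.org/python-basic/print

%r uses the object's repr form, %s uses its str form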
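
The difference is easiest to see on a string that contains a control character (the sample value is arbitrary):

```python
value = 'tic\ttac'

as_str = '%s' % value    # str form: the tab is printed as a real tab
as_repr = '%r' % value   # repr form: quotes added, tab shown as \t

print(as_str)
print(as_repr)
```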

49. finally is easy to get wrong!

[dongsong@localhost python_study]$ cat finally_test.py 
#encoding=utf-8

def func():
        a = 1
        try:
                return a
        except Exception,e:
                print '%r' % e
        else:
                print 'no exception'
        finally:
                print 'finally'
                a += 1

a = func()
print 'func returned %s' % a
[dongsong@localhost python_study]$ vpython finally_test.py 
finally
func returned 1
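
What happens in the script above: the value of the return expression is evaluated before the finally block runs, so rebinding a afterwards does not change it. A return inside finally, by contrast, replaces the pending value. Both cases in a small sketch (the function names are made up, and newer Python versions may warn about the second pattern):

```python
def returns_before_finally():
    a = 1
    try:
        return a          # the return value 1 is fixed here
    finally:
        a += 1            # runs, but the value to return is already chosen

def finally_overrides():
    try:
        return 'from try'
    finally:
        return 'from finally'   # replaces the pending return value

print(returns_before_finally(), finally_overrides())
```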

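50. stackless

Official site: http://www.stackless.com/

Chinese material (with examples): http://gashero.yeax.com/?p=30

1> When stackless.schedule() is called, the currently active tasklet pauses and re-inserts itself at the end of the scheduler queue so that the next tasklet can run.
Once all the tasklets ahead of it have had their turn, it resumes from where it stopped. This goes on until every active tasklet has finished running. That is how stackless achieves cooperative multitasking.
2> When the receiving tasklet calls channel.receive(), it blocks: the tasklet pauses until something is sent over the channel. Sending on that channel is the only way to make it resume.
When another tasklet sends on the channel, the receiving tasklet resumes immediately, wherever the scheduler currently is, and the sending tasklet is moved to the end of the scheduling list, just as if it had called stackless.schedule().
Note also that sending blocks the current tasklet if no tasklet is receiving on the channel at that moment.
The sending tasklet is re-inserted into the scheduler only after its data has been delivered to another tasklet.
3> Eliminating stack overflow: remember that I mentioned earlier that an experienced programmer would spot the flaw in the recursive version of that code at a glance. Honestly, though, there is no "computer science" reason preventing it from working; what people firmly believe in is merely an implementation detail, namely that most traditional programming languages use a stack. In a sense, experienced programmers have been brainwashed into accepting the problem. Stackless actually noticed the problem and removed it.
4> Microthreads, i.e. lightweight threads: as the name suggests, a microthread is much lighter than the ordinary threads built into today's operating systems and supported by standard Python code. It uses less memory than a traditional thread, and switching between microthreads costs less than switching between traditional threads.
5> Timing: now we time several experimental runs. The Python standard library has a timeit.py program that serves this purpose.
6> We set the channel's preference to 1, so that the task is not blocked after calling send and keeps running, which lets it print the correct warehouse information afterwards.
7> In stackless, the balance of a channel is how many tasklets are waiting to send or receive on it. A positive value is the number of waiting senders; a negative value is the number of waiting receivers; 0 means nobody is waiting.

Summary: stackless python is still constrained by the GIL and cannot use multiple cores; it is only an improvement over Python's traditional threads (http://stackoverflow.com/questions/377254/stackless-python-and-multicores). So creating processes with multiprocessing and using stackless microthreads inside each process is a good combination. The EVE server side is built with stackless (apparently C++/stackless python); I would love to see their code, hahaha.

Installing stackless python: see http://opensource.hyves.org/concurrence/install.html#installing-stackless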

sudo yum install readline-devel -y
./configure --prefix=/opt/stackless --with-readline --with-zlib=/usr/include
make
make install
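
The scheduling behaviour described in point 1> can be imitated in plain Python with generators. This is only an analogy that runs without a stackless installation, not the stackless API: yield plays the role of stackless.schedule(), and worker and run_round_robin are invented names.

```python
from collections import deque

def worker(name, steps, log):
    for i in range(steps):
        log.append('%s%d' % (name, i))
        yield                      # give up control, like stackless.schedule()

def run_round_robin(tasks):
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)             # run the task until it yields
        except StopIteration:
            continue               # the task finished: drop it
        queue.append(task)         # still alive: back to the end of the queue

log = []
run_round_robin([worker('a', 2, log), worker('b', 3, log)])
print(log)
```

Each task runs until it yields and then waits for all the others to have their turn, which is the cooperative multitasking described above.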

51. Loading modules dynamically

The built-in function __import__()

[dongsong@localhost python_study]$ touch mds/__init__.py
[dongsong@localhost python_study]$ vpython
Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> m = __import__('mds.m1', globals(), locals(), fromlist=[], level = 0)
>>> m
<module 'mds' from 'mds/__init__.py'>

The first time I used this function in my own code (2014.6.25) I found quite a few things to watch out for; read the official documentation carefully

class RobotMeta(type):
    def __new__(cls, name, bases, attrs):
        newbases = list(bases)
        import testcase
        import pkgutil
        for importer, modname, ispkg in pkgutil.iter_modules(testcase.__path__):
            if ispkg: continue
            mod = __import__('testcase.'+modname, globals(), locals(), fromlist=(modname,), level=1)
            if hasattr(mod, 'Robot'):
                newbases.append(mod.Robot)
        return super(RobotMeta, cls).__new__(cls, name, tuple(newbases), attrs)

The importlib library: importlib.import_module()

[dongsong@localhost python_study]$ touch mds/__init__.py
[dongsong@localhost python_study]$ vpython
Python 2.6.6 (r266:84292, Jun 18 2012, 14:18:47) 
[GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import importlib
>>> m = importlib.import_module('mds.m1')
>>> m
<module 'mds.m1' from 'mds/m1.py'>
>>>
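
The difference between the two sessions above, shown on a standard-library package so it can be run anywhere: with an empty fromlist, __import__() returns the top-level package, while importlib.import_module() returns the submodule that was named. A sketch using xml.dom as a stand-in for mds.m1:

```python
import importlib
import xml.dom

top = __import__('xml.dom')                        # top-level package xml
leaf = importlib.import_module('xml.dom')          # the submodule itself
leaf2 = __import__('xml.dom', fromlist=['dom'])    # non-empty fromlist: submodule

print(top.__name__, leaf.__name__, leaf2.__name__)
```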

52. How do you make a user-defined class support pickle and cPickle? (Below is the change made to a class in a project that inherits from dict and wraps decoded JSON; reference http://stackoverflow.com/questions/5247250/why-does-pickle-getstate-accept-as-a-return-value-the-very-instance-it-requi)

def __getstate__(self): 
        return dict(self)
    
def __setstate__(self, state):
        return self.update(state)
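
The two methods in context. JsonObject is a stand-in for the project class, which is not shown in these notes; the attribute-style access via __getattr__ is an assumption about why such a class might need explicit pickle support.

```python
import pickle

class JsonObject(dict):
    """dict whose keys can also be read as attributes (obj.key)."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            # pickle probes for optional attributes and expects AttributeError.
            raise AttributeError(name)

    def __getstate__(self):
        return dict(self)

    def __setstate__(self, state):
        self.update(state)

obj = JsonObject(name='bbs', count=3)
clone = pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))
print(type(clone).__name__, clone == obj, clone.count)
```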

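53. Checking what a string is made of

s.isalnum()  all characters are digits or letters
s.isalpha()  all characters are letters
s.isdigit()  all characters are digits
s.islower()  all cased characters are lowercase
s.isupper()  all cased characters are uppercase
s.istitle()  every word starts with an uppercase letter, like a title
s.isspace()  all characters are whitespace (space, \t, \n, \r)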
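
A quick way to see which predicates hold for a given string; note that all of them return False for the empty string. classify is a made-up helper.

```python
CHECKS = ('isalnum', 'isalpha', 'isdigit', 'islower',
          'isupper', 'istitle', 'isspace')

def classify(s):
    """Return the names of the predicates that are true for s."""
    return [name for name in CHECKS if getattr(s, name)()]

for sample in ['abc123', '123', 'Hello World', ' \t\n', '']:
    print(repr(sample), classify(sample))
```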

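54. python networking framework: Python concurrency cannot be covered in a few words, so it has its own article at http://blog.csdn.net/xiarendeniao/article/details/9143059

Twisted is the common, widely used one (module index)

concurrence is tied to stackless (a combination of stackless and libevent), which makes it attractive to me

cogen is similar to the one above, with better portability

gevent is a combination of greenlet and libevent (greenlet is a by-product of stackless, only more primitive, which better suits coders who want control over their coroutines), so in principle it looks much like concurrence

Source material for the summary above: http://stackoverflow.com/questions/1824418/a-clean-lightweight-alternative-to-pythons-twisted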

55. Python environment variables

import os
if not os.environ.has_key('DJANGO_SETTINGS_MODULE'):
    os.environ['DJANGO_SETTINGS_MODULE'] = 'boosencms.settings'
else:
    print 'DJANGO_SETTINGS_MODULE: %s' % os.environ['DJANGO_SETTINGS_MODULE']
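
dict.has_key() no longer exists in Python 3; os.environ.setdefault() expresses the same "set only if missing" logic in one call and works on both versions. The variable names and values below are placeholders.

```python
import os

# Sets the variable only when it is not already present, then returns its value.
value = os.environ.setdefault('DEMO_SETTINGS_MODULE', 'demo.settings')
print(value)

# get() reads a variable without modifying the environment.
fallback = os.environ.get('DEMO_MISSING_VARIABLE', 'fallback')
print(fallback)
```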

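56. yield is the syntax for creating a generator, an object that can be iterated once. Compared with structures such as list and tuple, iterating with a generator has the advantage that not all the data has to be in memory at once. For details see the official documentation and the Stack Overflow discussion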

[dongsong@localhost python-study]$ !cat
cat yield.py 
def echo(value=None):
    print "Execution starts when 'next()' is called for the first time."
    try:
        while True:
            try:
                value = (yield value)
            except Exception, e:
                print "catched an exception", e
                value = e
            else:
                print "yield received ", value
    finally:
        print "Don't forget to clean up when 'close()' is called."

generator = echo(1)
print generator.next()
print generator.next()
print generator.send(2)
generator.throw(TypeError, "spam")
generator.close()
[dongsong@localhost python-study]$ 
[dongsong@localhost python-study]$ 
[dongsong@localhost python-study]$ !python
python yield.py 
Execution starts when 'next()' is called for the first time.
1
yield received  None
None
yield received  2
2
catched an exception spam
Don't forget to clean up when 'close()' is called.
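
A simpler generator that shows the two properties mentioned above: values are produced on demand, and the generator can be consumed only once. read_chunks is an invented example function.

```python
def read_chunks(items, size):
    """Yield lists of at most `size` items, one chunk at a time."""
    chunk = []
    for item in items:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk            # the last, possibly shorter chunk

gen = read_chunks(range(7), 3)
first = next(gen)              # runs only as far as the first yield
rest = list(gen)               # consumes whatever is left
again = list(gen)              # exhausted: nothing more to iterate
print(first, rest, again)
```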

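57. Metaclasses are explained in detail in the article http://blog.csdn.net/xiarendeniao/article/details/9232021

58. Implementing the singleton pattern: this Stack Overflow post describes four approaches, and I prefer the third http://stackoverflow.com/questions/6760685/creating-a-singleton-in-python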

[dongsong@localhost python_study]$ cat singleton3.py 
#encoding=utf-8

class Singleton(type):
        _instances = {}
        def __call__(cls, *args, **kwargs):
                if cls not in cls._instances:
                        cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
                return cls._instances[cls]

class MyClass(object):
        __metaclass__ = Singleton

singletonObj = Singleton('Test',(),{})
myClassObj1 = MyClass()
myClassObj2 = MyClass()
print singletonObj, singletonObj.__class__
print id(myClassObj1),myClassObj1,myClassObj1.__class__
print id(myClassObj2),myClassObj2,myClassObj2.__class__
[dongsong@localhost python_study]$ vpython singleton3.py 
<class '__main__.Test'> <class '__main__.Singleton'>
139799414931408 <__main__.MyClass object at 0x7f2596777fd0> <class '__main__.MyClass'>
139799414931408 <__main__.MyClass object at 0x7f2596777fd0> <class '__main__.MyClass'>
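
The listing above uses the Python 2 __metaclass__ attribute. A sketch of the same metaclass in Python 3 syntax follows; the value argument on MyClass is an assumption added only to show that __init__ runs once.

```python
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # type.__call__ is what normally runs __new__ and __init__;
        # it is invoked only the first time each class is instantiated.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class MyClass(metaclass=Singleton):
    def __init__(self, value=0):
        self.value = value

first = MyClass(1)
second = MyClass(2)   # __init__ is not run again; the cached instance is returned
print(first is second, first.value)
```

One consequence of this design is that constructor arguments on later calls are silently ignored, which may or may not be what a project wants.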

59. Python magic methods: fairly long, so they get their own article http://blog.csdn.net/xiarendeniao/article/details/9270407

60. struct, for binary data. Official documentation: http://docs.python.org/3/library/struct.html

Character	Byte order	Size	Alignment
@	native	native	native
=	native	standard	none
<	little-endian	standard	none
>	big-endian	standard	none
!	network (= big-endian)	standard	none

Format	C Type	Python type	Standard size	Notes
x	pad byte	no value
c	char	bytes of length 1	1
b	signed char	integer	1	(1),(3)
B	unsigned char	integer	1	(3)
?	_Bool	bool	1	(1)
h	short	integer	2	(3)
H	unsigned short	integer	2	(3)
i	int	integer	4	(3)
I	unsigned int	integer	4	(3)
l	long	integer	4	(3)
L	unsigned long	integer	4	(3)
q	long long	integer	8	(2), (3)
Q	unsigned long long	integer	8	(2), (3)
n	ssize_t	integer		(4)
N	size_t	integer		(4)
f	float	float	4	(5)
d	double	float	8	(5)
s	char[]	bytes
p	char[]	bytes
P	void *	integer		(6)

>>> import struct
>>> struct.pack('HH',1,2)
'\x01\x00\x02\x00'
>>> struct.pack('<HH',1,2)
'\x01\x00\x02\x00'
>>> struct.pack('>HH',1,2)
'\x00\x01\x00\x02'
>>> s= struct.pack('HH',1,2)
>>> s
'\x01\x00\x02\x00'
>>> len(s)
4
>>> struct.unpack('HH',s)  
(1, 2)
>>> struct.unpack_from('H', s, 2) 
(2,)
>>> struct.unpack('H',s[0:2])
(1,)
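
As a sketch of how the byte-order prefix and the format characters combine, the Python 3 example below packs a hypothetical message header (a 2-byte type followed by a 4-byte length; this layout is invented for illustration). Note that Python 3 returns bytes rather than str.

```python
import struct

HEADER = '<HI'   # little-endian, unsigned short + unsigned int, standard sizes, no padding

packet = struct.pack(HEADER, 7, 1024) + b'payload'
size = struct.calcsize(HEADER)               # 2 + 4 = 6 bytes with the '<' prefix
msg_type, length = struct.unpack_from(HEADER, packet, 0)
body = packet[size:]

# Without a prefix, native alignment applies and padding may be inserted
# between the fields, so the size depends on the platform.
native_size = struct.calcsize('HI')
print(size, native_size, msg_type, length, body)
```

Using an explicit prefix such as '<' or '!' is the usual choice for data that crosses machine boundaries, because the result does not depend on the local compiler's alignment rules.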

61. Closures

[dongsong@localhost python_study]$ cat enclosing_1.py 
#encoding=utf8
a = 1
b = 2

def f(v = 0):
        a = 2
        c = list()
        def g():
                print 'a = %s' % a
                print 'b = %s' % b
                print 'c = %r' % c

        if v == 0:
                a += 1
        else:
                a += v
        c.append(111)
        return g

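g = f() # f returns the function object g, which is assigned to g; g is bound to a (3) and c ([111]), forming a closure
f(10)() # the nested function is bound to a (12) and c ([111]), forming a closure; prints: a = 12, b = 2, c = [111]
f()     # no output; the closure bound to a/c is created but never called
g()     # prints: a = 3, b = 2, c = [111]
b = 3
g() # prints: a = 3, b = 3, c = [111] (b is a global variable)

print a # prints the global variable a: 1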
[dongsong@localhost python_study]$ vpython enclosing_1.py 
a = 12
b = 2
c = [111]
a = 3
b = 2
c = [111]
a = 3
b = 3
c = [111]
1
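
In the example above the enclosing function f modifies a, and the nested function only reads it. In Python 2 a nested function cannot rebind a variable of the enclosing scope; Python 3 adds the nonlocal statement for that. A small Python 3 sketch, where make_counter is a made-up example:

```python
def make_counter(start=0):
    count = start

    def increment(step=1):
        nonlocal count      # rebind the variable of the enclosing scope
        count += step
        return count

    return increment

counter_a = make_counter()
counter_b = make_counter(10)     # each call to make_counter creates a separate closure
results = [counter_a(), counter_a(), counter_b(5)]
print(results)
```

The two counters do not interfere, because each closure holds its own count cell.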

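62. How do you keep .pyc files from living alongside .py files? See the Stack Overflow discussion http://stackoverflow.com/questions/3522079/changing-the-directory-where-pyc-files-are-created

Since Python 3.2, .pyc files are written automatically into a __pycache__ directory beside the source files, so they no longer sit next to the .py files (I have not used Python 3 myself)

With Python 2, you can start the interpreter with the -B option to stop .pyc bytecode files from being written to disk, though this inevitably slows down imports (modules are recompiled every time)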
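
The same switch is also available from inside a program. The Python 3 sketch below assumes that changing the setting at runtime is acceptable for your case; "pkg/mod.py" is a hypothetical path used only to show where the cache file would go.

```python
import importlib.util
import sys

# Same effect as the -B flag or the PYTHONDONTWRITEBYTECODE environment variable;
# it only affects modules imported after this line runs.
sys.dont_write_bytecode = True

# Where Python 3 would cache the bytecode for a given source file
# (the file itself does not need to exist for this call).
cached = importlib.util.cache_from_source("pkg/mod.py")
print(cached)
```

On a default setup the printed path points into a __pycache__ directory and carries the interpreter tag in the file name.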

63. Inserting Weibo data (account descriptions) into the database raises a warning and the data is truncated:

[dongsong@localhost tfengyun_py]$ vpython new_user.py debug 1852589841
/data/weibofengyun/workspace-php/tfengyun_py/utils.py:26: Warning: Incorrect string value: '\xF0\x9F\x92\x91\xE4\xBD...' for column 'description' at row 1
  try: affectCount = self.cursor.execute(sql)

The final solution (copied straight from the Python chat group):

吓人的鸟(362278013) 11:27:58 
I have more or less figured out yesterday's problem with the Warning when inserting data into MySQL, and I am sharing it here. Many thanks to @墨迹 !!

http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html 
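MySQL did not support utf8mb4 before 5.5.3. Last Friday's insert warning occurred because some Unicode characters (emoji from iOS devices) take four bytes when encoded as UTF-8 (normally no more than three):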
>>> u'\u8bb0'.encode('utf-8')
'\xe8\xae\xb0'
>>> u'\U0001f497'.encode('utf-8')
'\xf0\x9f\x92\x97'
If you would rather not upgrade MySQL to solve this, you can filter such characters out; there is a related discussion on Stack Overflow
http://stackoverflow.com/questions/10798605/warning-raised-by-inserting-4-byte-unicode-to-mysql

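So why can a PHP program insert the same data into the same MySQL database without trouble (no error, no warning, no truncation)?
It turns out that it silently filters out the four-byte UTF-8 sequences internally. After the insert, a query from the mysql command line shows that the emoji are gone, and reading the data back with the PHP program confirms it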

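PS: To know what you know and to admit what you do not know, that is true knowledge. People who come with questions are usually in a hurry; I hope everyone will lecture less and offer more practical, effective advice.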
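
A minimal sketch of the filtering approach mentioned above, written for Python 3, where a str is a sequence of code points. The helper name strip_non_bmp is made up. It drops every character above U+FFFF, which are exactly the ones that need four bytes in UTF-8.

```python
def strip_non_bmp(text):
    """Drop code points above U+FFFF, which need four bytes in UTF-8."""
    return ''.join(ch for ch in text if ord(ch) <= 0xFFFF)

sample = '\u8bb0\U0001f497ok'     # a CJK character, an emoji, and two ASCII letters
cleaned = strip_non_bmp(sample)
print(len(sample.encode('utf-8')), len(cleaned.encode('utf-8')))
```

This loses the emoji, just as the PHP behaviour described above does; switching the column and the connection to utf8mb4 is the way to keep them.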

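64. (2014.4.25) Mixing Python with C/C++ (Python using C/C++ extensions, C/C++ embedding Python). The most basic approach is of course to follow the official documentation. I have translated two of the relevant official documents, and it is a huge hassle! There are too many rules about references and the like, so this low-level interface is not suitable for direct use in a project.

The first choice in a project is Boost.Python (http://www.boost.org/doc/libs/1_55_0/libs/python/doc/). Anyone who has used C++ should be familiar with Boost; I think of it as the standard library second only to the C++ standard library (the chat server Lao Cheng wrote at Kunlun in 2009 used boost.asio). It includes support for Python. Kingsoft's C++/Python game server uses this library for the interaction between C++ and Python.

Next, a classmate told me that their project (apparently not a game) uses Pyrex (http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/About.html), a new language whose syntax mixes C and Python. I have not looked into it in depth, so I will set it aside for now; I am still more interested in Boost.Python.

Cython (http://cython.org/): based on Pyrex, designed for writing C extensions for Python

At this point I have to mention PyPy (http://pypy.org/), although PyPy is not meant for interacting with C/C++. PyPy is a Python interpreter implemented in Python (more precisely in RPython, a restricted subset). Its JIT (just-in-time compilation) makes it run faster than CPython (the official interpreter, the one most people use), and it supports Stackless-style microthreads for cooperative concurrency. The outlook seems bright! There is word that PyPy will drop the GIL to improve the performance of multithreaded programs, but the official documentation does not seem to say so (http://pypy.org/tmdonate2.html#what-is-the-global-interpreter-lock).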

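65. exec can execute a code fragment directly

eval evaluates a single expression

compile turns a code fragment or a source file into a code object; both exec and eval can execute a code object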

https://docs.python.org/2/library/functions.html#compile

[dongsong@localhost python-study]$ python
Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = file("code.py").read()
>>> print s
def func():
        print "i am in function func()"
        return 1,2,3

>>> codeObj = compile(s,"<string>","exec")  
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'codeObj', 's']
>>> codeObj
<code object <module> at 0x7f761cd74738, file "<string>", line 1>
>>> eval(codeObj)
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'codeObj', 'func', 's']
>>> func()
i am in function func()
(1, 2, 3)

[dongsong@localhost python-study]$ python
Python 2.6.6 (r266:84292, Jan 22 2014, 09:42:36) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = file("code.py").read()
>>> exec(s)
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'func', 's']
>>> func()
i am in function func()
(1, 2, 3)
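
The sessions above run the code in the interactive namespace. A Python 3 sketch of the same idea with an explicit namespace dictionary follows, assuming you prefer to keep the executed definitions out of globals(); it also shows the "eval" compile mode for a single expression.

```python
source = "def func():\n    return 1, 2, 3\n"

namespace = {}
code_obj = compile(source, "<string>", "exec")   # "exec" mode accepts statements
exec(code_obj, namespace)                        # definitions land in namespace
result = namespace["func"]()

expr = compile("x * 2 + 1", "<string>", "eval")  # "eval" mode accepts one expression
value = eval(expr, {"x": 20})
print(result, value)
```

Neither exec nor eval is a sandbox, so this should only be used with trusted input.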

66. Random strings http://stackoverflow.com/questions/2257441/random-string-generation-with-upper-case-letters-and-digits-in-python

>>> import string
>>> import random
>>> def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
...    return ''.join(random.choice(chars) for _ in range(size))
...
>>> id_generator()
'G5G74W'
>>> id_generator(3, "6793YUIO")
'Y3U'
>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.digits
'0123456789'
>>> string.ascii_uppercase + string.digits
'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
>>> string.lowercase
'abcdefghijklmnopqrstuvwxyz'
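
The random module is fine for identifiers, but it is not meant for secrets. If the string is used as a password or token, a variant based on the secrets module (Python 3.6 and later) is a reasonable sketch; token_generator is a made-up name mirroring id_generator above. Also note that string.lowercase is Python 2 only; Python 3 provides string.ascii_lowercase.

```python
import secrets
import string

def token_generator(size=6, chars=string.ascii_uppercase + string.digits):
    # secrets.choice draws from the operating system's secure random source.
    return ''.join(secrets.choice(chars) for _ in range(size))

token = token_generator(8)
print(token)
```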

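67. The built-in hasattr cannot find an object's private attributes (2014.6.18): inside the class, self.__a is name-mangled to _A__a, but the string '__a' passed to hasattr is not mangled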

[dongsong@localhost python-study]$ cat hasattr.py
#encoding=utf-8

class A(object):
    def __init__(self):
        self.__a = 100
        self.a = 200
    def test(self):
        if hasattr(self,'__a'): print 'found self.__a:',self.__a
        else: print 'not found self.__a'
        if hasattr(self,'a'): print 'found self.a:', self.a
        else: print 'not found self.a'

if __name__ == '__main__':
    t = A()
    t.test()
[dongsong@localhost python-study]$ 
[dongsong@localhost python-study]$ python hasattr.py 
not found self.__a
found self.a: 200
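
The result follows from name mangling: the compiler rewrites self.__a inside class A to self._A__a, while a string argument is left alone. A short sketch that runs on Python 2 and 3 alike (the check method is made up for the demonstration):

```python
class A(object):
    def __init__(self):
        self.__a = 100          # actually stored as _A__a

    def check(self):
        # The string '__a' is not mangled, so only the second lookup succeeds.
        return hasattr(self, '__a'), hasattr(self, '_A__a')

t = A()
inside = t.check()
outside = getattr(t, '_A__a')   # the mangled name is reachable from outside too
print(inside, outside)
```

So the attribute is hidden by renaming rather than protected, and hasattr works once it is given the mangled name.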

68. Circular imports in Python: Circular (or cyclic) imports

http://stackoverflow.com/questions/744373/circular-or-cyclic-imports-in-python

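Put plainly: if a imports b and b imports a, then using symbols from module b (b.xx, from b import xx) in a's top-level code (the code that runs on "import a") can fail, because b may be only partially initialized at that point.

Also, when you run python a.py, a.py is first loaded as the __main__ module, so "import a" executes a a second time (this is mentioned in the Python source-code analysis (源码剖析), and it is the reason for the if __name__ == '__main__' check)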

[root@test-22 xds]# cat maintest.py
import maintest
print 'main test in ..'
if __name__ == '__main__':
    print 'aaaa'
print 'main test out..'
[root@test-22 xds]# 
[root@test-22 xds]# python maintest.py
main test in ..
main test out..
main test in ..
aaaa
main test out..
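
A common way around the problem is to touch the other module's symbols only inside functions, so the lookup happens after both modules have finished loading. The Python 3 sketch below writes two throwaway modules into a temporary directory to show this; the module names circ_demo_a and circ_demo_b and the helper write_module are made up for the demonstration.

```python
import os
import sys
import tempfile
import textwrap

tmp_dir = tempfile.mkdtemp()

def write_module(name, source):
    with open(os.path.join(tmp_dir, name + ".py"), "w") as handle:
        handle.write(textwrap.dedent(source))

write_module("circ_demo_a", """
    import circ_demo_b   # circ_demo_b runs while circ_demo_a is only partly initialised

    def greet():
        return "hello from a, " + circ_demo_b.name()
""")

write_module("circ_demo_b", """
    import circ_demo_a   # receives the partly initialised module; nothing is used yet

    def name():
        return "b"

    def call_a():
        # The attribute lookup happens at call time, after both modules have loaded.
        return circ_demo_a.greet()
""")

sys.path.insert(0, tmp_dir)
import circ_demo_a
import circ_demo_b

result = circ_demo_b.call_a()
print(result)
```

If circ_demo_b called circ_demo_a.greet() at its top level instead, the import would fail with an AttributeError, since greet is not yet defined when circ_demo_b runs.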
