python
1. staticmethod and classmethod:
class MethodTest():
    var1 = "class var"
    def __init__(self, var2="object var"):
        self.var2 = var2
    @staticmethod
    def staticFun():
        print 'static method'
    @classmethod
    def classFun(cls):
        print 'class method'
Similarities:
1. Both can be called via the class or via an instance:
mt = MethodTest()
MethodTest.staticFun()
mt.staticFun()
MethodTest.classFun()
mt.classFun()
2. Neither can access instance members:
@staticmethod
def staticFun():
    print var2  # wrong: raises NameError; instance members are not reachable
@classmethod
def classFun(cls):
    print var2  # wrong: raises NameError; instance members are not reachable
Differences:
1. A staticmethod takes no implicit first argument; a classmethod receives the class itself (not an instance of the class) as its first argument:
def classFun(cls):
    print 'class method'  # cls is the class itself, passed automatically
2. A classmethod can access class members; a staticmethod cannot (unless it references the class by name):
@staticmethod
def staticFun():
    print var1  # wrong: raises NameError
@classmethod
def classFun(cls):
    print cls.var1  # right: reads the class attribute through cls
For details see: http://blog.sina.com.cn/s/blog_45ac0d0a01017mfd.html
2. Commonly used functions:
2.1 Decimal and round, for handling decimal places, etc.
Example (requires from decimal import Decimal):
quotedPrice = price.get("quoted_price", None)
if quotedPrice is not None:
    return int(round(Decimal(quotedPrice) / 100) * 100)  # round to the nearest 100
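A self-contained check of that rounding, with made-up values:
from decimal import Decimal
print int(round(Decimal('12345') / 100) * 100)  # 12300
print int(round(Decimal('12367') / 100) * 100)  # 12400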
2.2 db.connection.findAndModify(), for atomic find-and-modify operations on MongoDB; see the sketch below.
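A minimal sketch of the same operation from Python, assuming pymongo 2.x (the db/collection/field names here are made up):
from pymongo import MongoClient

client = MongoClient('localhost', 27017)
coll = client['test_db']['counters']
# atomically increment a counter and return the updated document
doc = coll.find_and_modify(
    query={'_id': 'page_views'},
    update={'$inc': {'count': 1}},
    upsert=True,
    new=True,  # return the document as it looks after the update
)
print doc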
2.3 The difference between xrange and range: in Python 2, range builds the whole list in memory, while xrange produces values lazily, like a generator; see below.
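A quick demonstration (Python 2):
print range(5)   # [0, 1, 2, 3, 4] -- a full list in memory
print xrange(5)  # xrange(5) -- values are produced on demand
for i in xrange(3):
    print i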
3. A universal header for fixing character-encoding problems:
import sys
reload(sys)  # restores sys.setdefaultencoding, which site.py deletes at startup
sys.setdefaultencoding("utf-8")
# When stdout is redirected (e.g. piped), its encoding is None; wrap it in UTF-8:
if sys.stdout.encoding is None:
    import codecs
    writer = codecs.getwriter("utf-8")
    sys.stdout = writer(sys.stdout)
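With that header in place, printing non-ASCII text works even when output is piped (e.g. python script.py | cat), where sys.stdout.encoding would otherwise be None:
print u'中文测试'  # would raise UnicodeEncodeError without the header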
4. Working with CSV files in Python
1. Write and generate a CSV file
# coding: utf-8
import csv
csvfile = open('csv_test.csv', 'wb')  # binary mode, as the Python 2 csv module expects
writer = csv.writer(csvfile)
writer.writerow(['姓名', '年龄', '电话'])  # header row: name, age, phone
data = [('小河', '25', '1234567'), ('小芳', '18', '789456')]
writer.writerows(data)
csvfile.close()
2. Read a CSV file
csvfile = open('csv_test.csv', 'rb')
reader = csv.reader(csvfile)
for line in reader:
    print line  # each line is a list of column values
csvfile.close()
3. Fixing garbled characters (mojibake)
import csv
import codecs
# Write with cp936 (GBK) so that Excel on Windows displays Chinese correctly;
# header and result_list are assumed to be defined elsewhere.
with codecs.open('/home/yjy/workspace/savedfile/1.csv', 'wb', 'cp936') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(header)
    print header
    writer.writerows(result_list)
    print result_list
5. JSON conversion in Python
5.1 String -> JSON:
import json
result = json.loads(item)  # item is a JSON string
picInfo = result[3]        # the parsed value here is a list/array
5.2 Dict -> JSON:
carinfo['salesperson'] = salesperson
carinfoJson = json.dumps(carinfo)  # carinfo is a dict
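A self-contained round trip tying 5.1 and 5.2 together (the data is made up):
import json

carinfo = {'brand': 'BMW', 'salesperson': 'Zhang'}
carinfoJson = json.dumps(carinfo)   # dict -> JSON string
print carinfoJson                   # e.g. {"brand": "BMW", "salesperson": "Zhang"}
restored = json.loads(carinfoJson)  # JSON string -> dict
print restored['salesperson']       # Zhang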
scrapy
1. Installing the Scrapy crawler:
Python 2.7
pip and setuptools (nowadays pip requires and installs setuptools if it is not already installed)
lxml (most Linux distributions ship prepackaged versions of lxml; otherwise refer to http://lxml.de/installation.html)
sudo apt-get install libxml2-dev libxslt-dev python-dev
sudo pip install Scrapy
To upgrade: sudo pip install Scrapy --upgrade
2. Running the crawler:
Create a crawler project: scrapy startproject pro_name
Adjust the settings, then create a spider file under the spiders directory, e.g. dianzan.py, with the following contents:
from scrapy.spider import BaseSpider

class Dianzan(BaseSpider):
    name = "dianzan"
    MAX_PAGE = 10
    start_urls = [
        'http://beijing.baixing.com/'
    ]
    def parse(self, response):
        # save each downloaded page to a file named after a URL segment
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)
Start the crawl: scrapy crawl dianzan
3. Installing spidermonkey:
1. sudo apt-get install pkg-config
2. sudo apt-get install python2.7-dev
3. sudo apt-get install libnspr4-dev
4. sudo easy_install python-spidermonkey
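A minimal smoke test that the binding works, following the python-spidermonkey README (API assumed from that README):
import spidermonkey

rt = spidermonkey.Runtime()
cx = rt.new_context()
print cx.execute("var x = 3; x * 4;")  # 12 -- JavaScript evaluated from Python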
4. Scrapy logging:
4.1
In settings.py: LOG_LEVEL = "DEBUG"
At the top of the file:
import logging
logger = logging.getLogger()
LOG_FILENAME = "./log_test.txt"
logging.basicConfig(filename=LOG_FILENAME, level=logging.DEBUG)
To log a message:
logger.debug(string)
4.2
From the console: scrapy crawl ganji --logfile=view.log
The log file is created automatically.
To log a message: self.debug(string)
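Side note (not from the snippet above): old-style Scrapy spiders such as BaseSpider also expose a built-in self.log() helper that routes through Scrapy's own logging, e.g.:
def parse(self, response):
    self.log("parsed %s" % response.url)  # goes to the file given via --logfile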