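I've been learning Python for a while now, so I'm crawling a couple of web pages for practice. The source code below crawls all the articles on Han Han's blog and saves the article links to the local disk. For downloading the plain text of the posts, see my other article (link at the end of this post).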
# -*- coding: utf-8 -*-
import urllib

page = 1
url = [' '] * 350  # slots for the article links
i = 1
while page <= 7:  # the article list spans pages 1 to 7
    menu = "http://blog.sina.com.cn/s/articlelist_1191258123_0_" + str(page) + ".html"
    print menu
    conn = urllib.urlopen(menu).read()  # read the article list page
    #print conn
    title = conn.find(r'
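
For anyone on Python 3, where `urllib.urlopen` has moved to `urllib.request.urlopen`, here is a rough sketch of the same idea: walk the 7 list pages, pull out the article links, and write them to a file. It is not the original script. The link pattern (`http://blog.sina.com.cn/s/blog_<id>.html`), the UTF-8 decoding, the helper names and the output file name `links.txt` are all my own assumptions for illustration, so check them against the real page source before relying on them.

```python
import re
import urllib.request

# List-page URL taken from the script above; only the page number changes.
LIST_URL = "http://blog.sina.com.cn/s/articlelist_1191258123_0_{page}.html"

# ASSUMPTION: article links look like http://blog.sina.com.cn/s/blog_<id>.html
ARTICLE_LINK = re.compile(r'href="(http://blog\.sina\.com\.cn/s/blog_\w+\.html)"')


def list_page_urls(first=1, last=7):
    """Build the URLs of the article-list pages (pages 1 to 7 by default)."""
    return [LIST_URL.format(page=p) for p in range(first, last + 1)]


def extract_links(html):
    """Return the article links found in one list page, in order, without duplicates."""
    links = []
    for link in ARTICLE_LINK.findall(html):
        if link not in links:
            links.append(link)
    return links


def fetch(url):
    """Download one page; decoding as UTF-8 is an assumption about the site."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def save_links(links, path):
    """Write one link per line to a local text file."""
    with open(path, "w", encoding="utf-8") as f:
        for link in links:
            f.write(link + "\n")


def crawl(path="links.txt"):
    """Collect the links from every list page and save them to `path`."""
    links = []
    for url in list_page_urls():
        links.extend(extract_links(fetch(url)))
    save_links(links, path)
    return links
```

Calling `crawl()` does the actual downloading. Keeping `extract_links` separate from `fetch` means the parsing can be tried on a saved HTML string without touching the network.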
http://blog.csdn.net/hpu_a/article/details/51518990