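Learning Python: Web Crawlers (小甲鱼)

This follows the tutorial's example closely.

Image URLs are located with plain string searches, then the images are downloaded.

Images are saved to the current directory.

Downloaded GIFs do not animate.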
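The string-search idea can be tried on an in-memory snippet before pointing it at a real page. The sketch below is only an illustration: the helper names find_between and find_image_urls, the 'img src="' marker, and the example.com URLs are all assumptions, not taken from the actual page markup.

```python
def find_between(text, start_marker, end_marker, begin=0):
    """Return (substring, end_position) for the first text found between
    start_marker and end_marker at or after begin, or (None, -1)."""
    start = text.find(start_marker, begin)
    if start == -1:
        return None, -1
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        return None, -1
    return text[start:end], end


def find_image_urls(html):
    """Collect every URL that follows an 'img src="' marker (assumed markup)."""
    urls = []
    pos = 0
    while True:
        url, pos = find_between(html, 'img src="', '"', pos)
        if url is None:
            break
        urls.append(url)
    return urls


# Hypothetical sample markup, used only to exercise the helpers
sample = ('<p><img src="http://example.com/a.jpg" />'
          '<img src="http://example.com/b.gif" /></p>')
print(find_image_urls(sample))
```

Each search resumes from where the previous match ended, so the page is scanned once from left to right.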

import urllib.request
import time

def open_url(url):
    # Fetch url and return the raw response body as bytes
    print(url)
    req = urllib.request.Request(url)
    # Send a browser-like User-Agent header with the request
    req.add_header("User-Agent","Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36")
    response = urllib.request.urlopen(req)
    return response.read()

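def getInitialpage():
    # Return the number of pages as a string, read from the current-page marker
    url = "http://jandan.net/ooxx"
    html = open_url(url)
    html = html.decode("utf-8")
    # The page number appears in square brackets after this span
    index = html.find("span class=\"current-comment-page\"")
    beginindex = html.find("[" , index)
    # Look for the closing bracket after the opening one
    endindex = html.find("]" , beginindex)
    initialpage = html[(beginindex+1) : endindex]
    return initialpage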

def getpiclist(pageurl):
    html = open_url(pageurl)
    html = html.decode("utf-8")
    piclist = list()
    for i in range(html.count("[查看原图]

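Addendum:

The urllib.request module provides urlretrieve() for downloading files. It can replace the code in savepic() above, and animated GIFs then display correctly.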
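A minimal sketch of that replacement follows. The original savepic() is not shown here, so the signature below (an image URL plus a target folder) is hypothetical, as is the install_user_agent helper. urlretrieve() goes through the globally installed opener, so installing one with a User-Agent header is one way to keep sending the same header that open_url() sets.

```python
import os
import urllib.request


def install_user_agent(user_agent):
    """Make later urlopen()/urlretrieve() calls send this User-Agent header."""
    opener = urllib.request.build_opener()
    opener.addheaders = [("User-Agent", user_agent)]
    urllib.request.install_opener(opener)


def savepic(picurl, folder="."):
    """Download picurl into folder, naming the file after the last URL segment.

    Hypothetical signature: the original savepic() may take other arguments.
    """
    filename = os.path.join(folder, picurl.split("/")[-1])
    # urlretrieve writes the response body directly to the given file path
    urllib.request.urlretrieve(picurl, filename)
    return filename
```

Calling install_user_agent() once at startup and then savepic(picurl) for each entry in the picture list keeps the rest of the script unchanged.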
