Source Code for a Simple Web Crawler

A not-so-serious video tutorial

import re
import time

import requests

# Send a browser-like User-Agent header with every request
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36'
}

# Send the request and keep the page HTML as text
response = requests.get("https://mm.enterdesk.com/bizhi/62854-340754.html", headers=headers)
html = response.text
# print(html)

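# Parse the HTML: capture the src of each <img> inside a "pics_pics" link.
# \s+ replaces the hard-coded line breaks and indentation, so the pattern
# no longer depends on the exact whitespace of the page source.
pattern = (
    r'class="pics_pics "\s+'
    r'src=".*?"\s+'
    r'href=".*?"><img\s+'
    r'src="(.*?)"\s+'
    r'title="双马尾美女青春迷人写真"/></a></div>\s+'
    r'<div class=".*?"><a'
)
urls = re.findall(pattern, html)
print(urls)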

# Download and save each image
for url in urls:
    time.sleep(30)  # pause 30 seconds between downloads
    # Use the last segment of the URL as the file name
    file_name = url.split('/')[-1]
    response = requests.get(url, headers=headers)
    with open(file_name, 'wb') as f:  # open the file in binary write mode
        f.write(response.content)
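
The pattern in the script is tied to one page: it hard-codes the image title and the surrounding tags. As a rough sketch of a more general alternative, the snippet below pulls the src of every <img> tag and derives a file name from the URL path. The names IMG_SRC, extract_image_urls and file_name_from_url are hypothetical helpers, not part of the original script, and the sample HTML is made up for illustration. Pages that load images lazily or use single-quoted attributes would need a different pattern or a real HTML parser.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Hypothetical pattern: the src attribute of any <img> tag.
# [^>]*? stays inside the tag and also spans line breaks.
IMG_SRC = re.compile(r'<img\b[^>]*?\ssrc="([^"]+)"', re.IGNORECASE)


def extract_image_urls(html):
    """Return the src value of every <img> tag in html, in page order."""
    return IMG_SRC.findall(html)


def file_name_from_url(url):
    """Derive a local file name from the URL path, ignoring any query string."""
    return posixpath.basename(urlsplit(url).path)


# Made-up HTML fragment, used only to exercise the helpers offline.
sample = '''
<div class="swiper-slide"><a class="pics_pics"
    href="/bizhi/1.html"><img
        src="https://example.com/img/a.jpg?x=1"
        title="demo"/></a></div>
<div><img alt="b" src="https://example.com/img/b.png"></div>
'''

urls = extract_image_urls(sample)
print(urls)
print([file_name_from_url(u) for u in urls])
```

In the script above, extract_image_urls(html) could stand in for the re.findall call, and file_name_from_url(url) for url.split('/')[-1], assuming every image on the page is wanted.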

