Scraping Zhaopin Job Listings with Python

Search for “大数据分析” (big data analysis) on the Zhaopin website.
[Image 1: Scraping Zhaopin with Python]
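Press F12 to open the browser's developer tools, inspect the page's requests, and find the one that returns the JSON data.
[Image 2: Scraping Zhaopin with Python]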
Copy the request URL.
[Image 3: Scraping Zhaopin with Python]
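
A request URL copied this way carries several query parameters. As a small sketch, the standard library can split them out so it is easier to see what the request actually sends. The URL below is a shortened placeholder that reuses the parameter names of the captured request; the values are invented for illustration, and `query_params` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder URL: parameter names follow the captured request, values are invented.
sample_url = (
    "https://fe-api.zhaopin.com/c/i/sou"
    "?_v=0.1&x-zp-page-request-id=REQUEST_ID&x-zp-client-id=CLIENT_ID&MmEwMD=TOKEN"
)


def query_params(url):
    """Return the query string of `url` as a dict of single values."""
    parsed = urlsplit(url)
    # parse_qs maps each name to a list of values; keep the first value of each.
    return {name: values[0] for name, values in parse_qs(parsed.query).items()}


params = query_params(sample_url)
for name, value in params.items():
    print(f"{name} = {value}")
```

Judging only by their names, parameters such as `x-zp-page-request-id` and `MmEwMD` look tied to one browser session, so a URL copied today may stop working later. That is a guess from the names, not documented behaviour.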

import requests
import pandas as pd  # for displaying the data as a DataFrame
import time  # for pausing between requests
url = r'https://fe-api.zhaopin.com/c/i/sou?_v=0.15071971&x-zp-page-request-id=eb7282843faf4a2d827002757bcdd769-1575511230999-129076&x-zp-client-id=60004a6b-bf5e-41c0-9a37-e2aa003360c8&MmEwMD=4CkD_v9XYKkRPQD7x5kBhxL5rYE8kibi89tA7lNM3gE_o.GmRqE0CXdzn7hhrlUY4sfi_Dh2K.E.lZ7V2ohekVswQoqKdW8i4H5r5Ie9ke9hETN2dUGVd.VplT9huTmAvE69pF0udgsddmqOiGqOkO55_wyaV0lLV4uTkR41VcMsb5qV3UPIOoHl160CiBmNryjs5bk7Z6RLlZWkhRxVpXb0DbEzTXLv7T8Xst6dQ1dj1fDP.q2ERisCklLcxyRy.WgM8Xtk6mLcYkInmzFbxrmw_wz1SCOi1co_5brw8ntInYCL74N6Cv2TGFU_fxfjlKGQtShNRQliBXrk8w1ULqTTUTMk2TcdyLC3PGDCSHZajRKH7xzeulcZ0jwm9B03YBTixrfpodjXKIs0wH6brVp2Q'
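headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}  # send a browser-like User-Agent so the request is less likely to be blocked
response = requests.get(url, headers=headers)
datas = response.json()  # parse the response body as JSON
datas  # in a notebook this displays the parsed JSON; use print(datas) in a script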
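
The script imports `time` for pausing but never uses it. When more than one request URL is fetched, one possible way to space the calls out is sketched below. `fetch_all` is a hypothetical helper, and the list of URLs is an assumption: the source does not show how the site paginates. The fetch function is passed in, so the demonstration runs without network access.

```python
import time


def fetch_all(urls, fetch, delay=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing `delay` seconds between calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # pause before every request except the first
        results.append(fetch(url))
    return results


# Demonstration with a stand-in fetch function, so no request is actually sent.
pages = fetch_all(["page-1", "page-2"], fetch=lambda u: {"url": u}, delay=0.01)
print(pages)
```

With `requests`, the fetch argument could be something like `lambda u: requests.get(u, headers=headers, timeout=10).json()`.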

The response contains the JSON content:
[Image 4: Scraping Zhaopin with Python]
Parse the contents of the JSON.
[Image 5: Scraping Zhaopin with Python]
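
When the JSON is large, listing its key paths can make it easier to see where the job records sit. The following is only a sketch: `key_paths` is a hypothetical helper, and the sample dictionary is made up to resemble the shape used in this post, not copied from a real response.

```python
def key_paths(obj, prefix="", max_depth=3):
    """Yield dotted key paths found in nested dicts and lists, down to max_depth."""
    if max_depth == 0:
        return
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield path
            yield from key_paths(value, path, max_depth - 1)
    elif isinstance(obj, list) and obj:
        # Look only at the first element; list items usually share one shape.
        yield from key_paths(obj[0], f"{prefix}[0]", max_depth)


# Made-up sample with the same nesting as datas['data']['results'].
sample = {"code": 200, "data": {"results": [{"jobName": "analyst", "salary": "10K-15K"}]}}
for path in key_paths(sample):
    print(path)
```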
Write a loop over ['data']['results'] as follows to pull each field out of the JSON:

    for i in datas['data']['results']:
        city = i['city']['items'][0]['name']
        company = i['company']['name']
        company_type = i['company']['type']['name']
        size = i['company']['size']['name']
        eduLevel = i['eduLevel']['name']
        jobname = i['jobName']
        salary = i['salary']
        workingExp = i['workingExp']['name']
        welfare = i['welfare']
        number = i['number'][0]
        # gather the fields of one posting into a single row
        item = [city, company, company_type, size, eduLevel, jobname, salary, workingExp, welfare, number]
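
Direct indexing like the loop above raises `KeyError` or `IndexError` as soon as one posting lacks a field. A minimal sketch of a more forgiving lookup follows. `dig` is a hypothetical helper, and the sample posting is invented: it mirrors the keys used in the loop but is not real data.

```python
def dig(record, *keys, default=""):
    """Follow keys and indexes into nested dicts and lists.

    Returns `default` as soon as any step is missing.
    """
    current = record
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Invented posting; note that company has no "type" entry.
posting = {
    "city": {"items": [{"name": "Beijing"}]},
    "company": {"name": "Example Co", "size": {"name": "100-499"}},
    "jobName": "Data Analyst",
}

city = dig(posting, "city", "items", 0, "name")
company_type = dig(posting, "company", "type", "name", default="unknown")
print(city, company_type)
```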
    

Finally, save the results to CSV.

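    import csv

    # `item` is the list of fields for one posting; mode 'a' appends one row per run.
    with open('zhilian.csv', 'a', encoding='gbk', newline='') as csv_file:
        writer = csv.writer(csv_file)
        try:
            writer.writerow(item)
        except Exception as e:
            # for example, a character that cannot be encoded as gbk
            print('writer error:', e)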
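
An alternative worth considering is to collect all postings first and write them in one pass with a header row. The sketch below uses `csv.DictWriter` from the standard library. The field list, the `write_rows` helper, and the sample rows are assumptions made for illustration, and the demonstration writes to an in-memory buffer instead of a file.

```python
import csv
import io

# Assumed subset of the fields extracted in the loop.
FIELDS = ["city", "company", "jobname", "salary"]


def write_rows(csv_file, rows, fieldnames=FIELDS):
    """Write a header line followed by one line per dict in `rows`."""
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Invented rows, standing in for the scraped postings.
rows = [
    {"city": "Beijing", "company": "Example Co", "jobname": "Data Analyst", "salary": "10K-15K"},
    {"city": "Shanghai", "company": "Sample Ltd", "jobname": "BI Engineer", "salary": "15K-20K"},
]

buffer = io.StringIO()
write_rows(buffer, rows)
print(buffer.getvalue())
```

To write a real file, pass `open('zhilian.csv', 'w', encoding='utf-8-sig', newline='')` instead of the buffer. The `utf-8-sig` codec adds a byte order mark that Excel recognizes, and it avoids the `UnicodeEncodeError` that `gbk` raises for characters outside its range.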

Part of the output is shown below:
[Image 6: Scraping Zhaopin with Python]
