Crawler Tip 1: Get the Cookies and Headers Your Crawler Needs in 6.6 Seconds


  • Example site: Anjuke's second-hand housing listings
  • https://wenzhou.anjuke.com/sale/rd1/
  1. Press F12 to open the developer tools and switch to the Network tab. Right-click any request in the list and choose "Copy as cURL (bash)".

Converter tool: https://www.lddgo.net/convert/curl-to-code

  2. Paste the copied command into the converter and select your programming language to turn it into crawler code.

  3. Append the following two lines to the generated code:
html_text = response.text
print(html_text)
They print the page's HTML source directly.

import requests

cookies = {
    'aQQ_ajkguid': '6B111FB5-B268-4EFC-940F-C99FA180148E',
    'sessid': 'C7FAAE9F-D4B2-4764-981E-340D0E753E45',
    'ajk-appVersion': '',
    'ctid': '106',
    'id58': 'CrIfp2VpfXWwN9qRHY1NAg==',
    '58tj_uuid': 'bef0a89a-5177-4b84-9389-c645a1d63e34',
    '_ga': 'GA1.2.1675396380.1701572517',
    'als': '0',
    'new_uv': '2',
    '_ga_DYBJHZFBX2': 'GS1.2.1701577359.2.0.1701577359.0.0.0',
    'twe': '2',
    'fzq_h': 'dab1587be89b7a7bba9e41107430db6d_1702261024042_b6d3496d28fa457eb5c60fa0b8915fb9_1944888259',
    'fzq_js_anjuke_ershoufang_pc': '7bcc203a66e9d32ed7f574544405a0fb_1702261047085_24',
    'obtain_by': '1',
    'xxzl_cid': '7e0d615d9b5242e486a4754c39da4d51',
    'xxzl_deviceid': 'pbTJ0ZylrCzDJmZ1NEVQTfp70hmDzG+EhzjIj1glnWRw/1eV8Yc6n3M8j3Ziur9q',
}

headers = {
    'authority': 'wenzhou.anjuke.com',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
    'cache-control': 'max-age=0',
    # Requests sorts cookies= alphabetically
    # 'cookie': 'aQQ_ajkguid=6B111FB5-B268-4EFC-940F-C99FA180148E; sessid=C7FAAE9F-D4B2-4764-981E-340D0E753E45; ajk-appVersion=; ctid=106; id58=CrIfp2VpfXWwN9qRHY1NAg==; 58tj_uuid=bef0a89a-5177-4b84-9389-c645a1d63e34; _ga=GA1.2.1675396380.1701572517; als=0; new_uv=2; _ga_DYBJHZFBX2=GS1.2.1701577359.2.0.1701577359.0.0.0; twe=2; fzq_h=dab1587be89b7a7bba9e41107430db6d_1702261024042_b6d3496d28fa457eb5c60fa0b8915fb9_1944888259; fzq_js_anjuke_ershoufang_pc=7bcc203a66e9d32ed7f574544405a0fb_1702261047085_24; obtain_by=1; xxzl_cid=7e0d615d9b5242e486a4754c39da4d51; xxzl_deviceid=pbTJ0ZylrCzDJmZ1NEVQTfp70hmDzG+EhzjIj1glnWRw/1eV8Yc6n3M8j3Ziur9q',
    'referer': 'https://wenzhou.anjuke.com/',
    'sec-ch-ua': '"Microsoft Edge";v="119", "Chromium";v="119", "Not?A_Brand";v="24"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'same-origin',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0',
}

params = {
    'kw': '',
}

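# Send the request with the copied cookies and headers; the timeout keeps a stalled connection from hanging the script.
response = requests.get('https://wenzhou.anjuke.com/sale/rd1/', params=params, cookies=cookies, headers=headers, timeout=10)
html_text = response.text
print(html_text)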
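
To make the conversion less of a black box, here is a rough sketch of what a cURL-to-code converter does with the copied command: it tokenizes the command and sorts the `-H` values into headers and cookies. The function name `parse_curl` is hypothetical and this is not the converter site's actual implementation. It assumes the URL comes right after `curl` (as in the browser's output) and does not handle bash `$'...'` quoting.

```python
import shlex


def parse_curl(command: str):
    """Split a 'Copy as cURL (bash)' command into (url, headers, cookies)."""
    # Join bash line continuations, then tokenize using shell quoting rules.
    tokens = shlex.split(command.replace("\\\n", " "))
    url, headers, cookies = None, {}, {}

    def add_cookies(cookie_string):
        # A Cookie header looks like "name1=value1; name2=value2".
        for pair in cookie_string.split(";"):
            name, _, value = pair.strip().partition("=")
            if name:
                cookies[name] = value

    i = 1  # skip the leading "curl"
    while i < len(tokens):
        token = tokens[i]
        if token in ("-H", "--header") and i + 1 < len(tokens):
            name, _, value = tokens[i + 1].partition(":")
            if name.strip().lower() == "cookie":
                add_cookies(value.strip())
            else:
                headers[name.strip()] = value.strip()
            i += 2
        elif token in ("-b", "--cookie") and i + 1 < len(tokens):
            add_cookies(tokens[i + 1])
            i += 2
        else:
            # Assumption: the first non-option token is the request URL.
            if url is None and not token.startswith("-"):
                url = token
            i += 1
    return url, headers, cookies
```

The returned `headers` and `cookies` dictionaries have the same shape as the ones in the script above, so they could be passed straight to `requests.get`.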

Finally, if Anjuke rate-limits your IP, send requests through a proxy or add a delay between them.
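
As a minimal sketch of the delay idea, the helper below pauses for a random interval between consecutive requests. The name `crawl_with_delay`, the `fetch` callback, and the 2 to 5 second default range are all assumptions chosen for illustration, not values known to satisfy Anjuke's limits.

```python
import random
import time


def crawl_with_delay(urls, fetch, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing a random interval between calls.

    fetch is any callable that returns the page text for a URL; sleep is
    injectable so the pacing can be checked without actually waiting.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Randomized pause so requests do not arrive at a fixed rhythm.
            sleep(random.uniform(min_delay, max_delay))
        pages[url] = fetch(url)
    return pages
```

With the script above, `fetch` could be `lambda url: requests.get(url, cookies=cookies, headers=headers, proxies=proxies, timeout=10).text`, where `proxies` is a dictionary such as `{"https": "http://127.0.0.1:8080"}` (a placeholder address, to be replaced with a proxy you actually control).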
