The passport login redirect that appears when scraping a JD (京东) product page

The MOOC example for scraping a JD product page:

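import requests

url = "https://item.jd.com/100002795959.html"
try:
    r = requests.get(url)
    r.raise_for_status()              # raise on 4xx/5xx responses
    r.encoding = r.apparent_encoding  # guess the encoding from the body
    print(r.text[:1000])
except requests.RequestException:     # catch only errors raised by requests
    print("Scrape failed")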

Running it prints only a redirect to the login page:

<script>window.location.href='https://passport.jd.com/uc/login?ReturnUrl=http://item.jd.com/100007413580.html'</script>
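Note that the script did not print the failure message: the response was not an HTTP error, so raise_for_status() passed and the redirect page was printed as if it were the product page. A minimal sketch of a check for this case follows. The helper name is_login_redirect is hypothetical, and the two substrings it looks for are taken from the output above rather than from any documented JD behavior.

```python
def is_login_redirect(html: str) -> bool:
    """Return True if the body looks like the passport login redirect.

    Hypothetical helper: the markers are assumptions based on the
    response shown above, not a documented contract.
    """
    return "passport.jd.com" in html and "window.location.href" in html


# The body returned by the first version of the script
sample = (
    "<script>window.location.href='https://passport.jd.com/uc/login"
    "?ReturnUrl=http://item.jd.com/100007413580.html'</script>"
)

if is_login_redirect(sample):
    print("Got the login redirect instead of the product page")
```

Calling such a check on r.text before parsing makes the failure visible instead of silently processing the wrong page.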

Add a statement that sets a browser-like User-Agent in the request headers:

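import requests

url = "https://item.jd.com/100002795959.html"
try:
    kv = {'user-agent': 'Mozilla/5.0'}  # present the request as coming from a browser
    r = requests.get(url, headers=kv)
    r.raise_for_status()              # raise on 4xx/5xx responses
    r.encoding = r.apparent_encoding  # guess the encoding from the body
    print(r.text[:1000])
except requests.RequestException:     # catch only errors raised by requests
    print("Scrape failed")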

The scrape now succeeds:

<!DOCTYPE HTML>
<html lang="zh-CN">
<head>
    <!-- shouji -->
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>【华为P30 Pro】华为 HUAWEI P30 Pro 超感光徕卡四摄10倍混合变焦麒麟980芯片屏内指纹 8GB+128GB极光色全网通版双4G手机【行情 报价 价格 评测】-京东</title>
    <meta name="keywords" content="HUAWEIP30 Pro,华为P30 Pro,华为P30 Pro报价,HUAWEIP30 Pro报价"/>
    <meta name="description" content="【华为P30 Pro】京东JD.COM提供华为P30 Pro正品行货,并包括HUAWEIP30 Pro网购指南,以及华为P30 Pro图片、P30 Pro参数、P30 Pro评论、P30 Pro心得、P30 Pro技巧等信息,网购华为P30 Pro上京东,放心又轻松" />
    <meta name="format-detection" content="telephone=no">
    <meta http-equiv="mobile-agent" content="format=xhtml; url=//item.m.jd.com/product/100002795959.html">
    <meta http-equiv="mobile-agent" content="format=html5; url=//item.m.jd.com/product/100002795959.html">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <link rel="canonical" href="//item.jd.com/100002795959.html"/>
        <link rel="dns-prefetch" href="//misc.360buyimg.com"/>
    <link rel="dns-prefet
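The only difference between the two runs is the User-Agent header. By default requests identifies itself as python-requests/<version>, and the site presumably uses that to recognize a script (an assumption; the exact rule JD applies is not known). The sketch below builds both requests without sending them, so the header each one would carry can be inspected offline. The variable names are illustrative.

```python
import requests

url = "https://item.jd.com/100002795959.html"
session = requests.Session()

# prepare_request() merges the session defaults into the request
# but does not send anything over the network.
default_req = session.prepare_request(requests.Request("GET", url))
custom_req = session.prepare_request(
    requests.Request("GET", url, headers={"user-agent": "Mozilla/5.0"})
)

# Header lookup is case-insensitive, so "user-agent" overrides "User-Agent".
default_ua = default_req.headers["User-Agent"]
custom_ua = custom_req.headers["User-Agent"]

print("default:", default_ua)
print("custom: ", custom_ua)
```

After a real request, the headers that were actually sent can be read the same way from r.request.headers.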
