Python crawler: collecting URLs from the runoob (菜鸟) website

A tribute to the runoob (菜鸟) website, where I taught myself Python, HTML, JavaScript, and more.

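import requests
from bs4 import BeautifulSoup

def main():
    base_url = "http://www.runoob.com/"
    url = "http://www.runoob.com/python3/python3-string.html"
    # Download the page and parse it with the lxml parser.
    resp = requests.get(url)
    soup = BeautifulSoup(resp.text, 'lxml')
    # Collect the links inside the element with id="leftcolumn".
    leftcolumn = soup.find(id='leftcolumn')
    # href=True skips any <a> tag without an href, which would raise KeyError below.
    for item in leftcolumn.find_all('a', href=True):
        if "http" in item['href']:
            # Already an absolute URL: print it unchanged.
            print(item['href'])
        else:
            # Relative link: prepend the site root. Plain concatenation leaves a
            # double slash when the href itself starts with "/".
            print(base_url + item['href'])

if __name__ == '__main__':
    main()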

Output:

http://www.runoob.com//python3/python3-tutorial.html
http://www.runoob.com/python3-install.html
http://www.runoob.com/python3-basic-syntax.html
http://www.runoob.com/python3-data-type.html
http://www.runoob.com//python3/python3-interpreter.html
http://www.runoob.com//python3/python3-comment.html
http://www.runoob.com/python3-basic-operators.html
http://www.runoob.com//python3/python3-number.html
http://www.runoob.com//python3/python3-string.html
http://www.runoob.com//python3/python3-list.html
http://www.runoob.com/python3-tuple.html
http://www.runoob.com/python3-dictionary.html
http://www.runoob.com/python3-set.html
http://www.runoob.com//python3/python3-step1.html
http://www.runoob.com//python3/python3-conditional-statements.html
http://www.runoob.com//python3/python3-loop.html
http://www.runoob.com/python3-iterator-generator.html
http://www.runoob.com//python3/python3-function.html
http://www.runoob.com//python3/python3-data-structure.html
http://www.runoob.com//python3/python3-module.html
http://www.runoob.com//python3/python3-inputoutput.html
http://www.runoob.com/python3-file-methods.html
http://www.runoob.com/python3-os-file-methods.html
http://www.runoob.com//python3/python3-errors-execptions.html
http://www.runoob.com//python3/python3-class.html
http://www.runoob.com//python3/python3-stdlib.html
http://www.runoob.com//python3/python3-examples.html
http://www.runoob.com//quiz/python-quiz.html
http://www.runoob.com//python3/python3-reg-expressions.html
http://www.runoob.com//python3/python3-cgi-programming.html
http://www.runoob.com/python-mysql-connector.html
http://www.runoob.com//python3/python3-mysql.html
http://www.runoob.com//python3/python3-socket.html
http://www.runoob.com//python3/python3-smtp.html
http://www.runoob.com//python3/python3-multithreading.html
http://www.runoob.com//python3/python3-xml-processing.html
http://www.runoob.com//python3/python3-json.html
http://www.runoob.com//python3/python3-date-time.html
http://www.runoob.com//python3/python3-built-in-functions.html
http://www.runoob.com//python3/python-mongodb.html
http://www.runoob.com//python3/python-uwsgi.html
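
A note on the output above: links whose href starts with "/" come out with a double slash, and links given relative to the current page (such as python3-install.html) end up attached to the site root instead of the /python3/ directory. One possible way to avoid both is to resolve each href against the page URL with the standard library's urljoin. The following is only a minimal sketch: the helper name resolve_links and the example.com link are made up for illustration, and the first two sample hrefs are inferred from the output above rather than copied from the page source.

```python
from urllib.parse import urljoin

# Hypothetical helper (not part of the original script): resolve every href
# against the URL of the page it was found on.
def resolve_links(page_url, hrefs):
    return [urljoin(page_url, href) for href in hrefs]

page_url = "http://www.runoob.com/python3/python3-string.html"
hrefs = [
    "/python3/python3-tutorial.html",  # root-relative href
    "python3-install.html",            # href relative to the current page
    "https://example.com/page.html",   # illustrative absolute URL, returned unchanged
]

for link in resolve_links(page_url, hrefs):
    print(link)
```

In the script itself, this would amount to replacing the if/else with print(urljoin(url, item['href'])), since urljoin leaves absolute URLs as they are.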
