Part 1
An introduction about mechanize
import mechanize
br = mechanize.Browser()
br.open("http://www.example.com/")
>>
response1 = br.follow_link()
assert br.viewing_html()
print br.title()
IANA — IANA-managed Reserved Domains
print response1.geturl()
https://www.iana.org/domains/reserved
print response1.info() #headers
Date: Fri, 13 Jul 2018 14:51:52 GMT
X-Frame-Options: SAMEORIGIN
Referrer-Policy: origin-when-cross-origin
Content-Security-Policy: upgrade-insecure-requests
Vary: Accept-Encoding
Last-Modified: Tue, 21 Jul 2015 00:49:48 GMT
Cache-control: public, s-maxage=900, max-age=7202
Expires: Fri, 13 Jul 2018 16:51:52 GMT
Content-Type: text/html; charset=UTF-8
Server: Apache
Strict-Transport-Security: max-age=48211200; preload
X-Cache-Hits: 18
Accept-Ranges: bytes
Content-Length: 10225
Connection: close
content-type: text/html; charset=utf-8
print response1.read() #body
To get the response code from a website, you can the response.code
Part 2
An example to show how to search data in mechanize
from mechanize import Browser
browser = Browser()
browser.set_handle_robots(False) #ignore the robots.txt
response = browser.open("https://www.google.com")
print response.code
200
get all forms from the website
import mechanize
br = mechanize.Browser()
br.set_handle_robots(False) #ignore the robots.txt
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')] # Google demands a user-agent that isn't a robot
br.open("https://www.google.com")
>>
for f in br.forms():
print f
>
Select the search box and search for 'foo'
By default 'f' would represent the name of the form. 'q' would be one of the inputs of the form whose name is set to 'q'
br.select_form('f')
br.form['q'] = 'foo'
get the search results
br.submit()
>>
Find the link to foofighters.com
resp = None
import re
for link in br.links():
siteMatch = re.compile('https://foofighters.com/').search(link.url)
if siteMatch:
resp = br.follow_link(link)
break
print the site
content = resp.get_data()
print content
Part 3 Cheart Sheet
mechanize cheat sheet