Scrapy wiki resource roundup

See also: Scrapy homepage, Official documentation, Scrapy snippets on Snipplr

Getting started

If you're new to Scrapy, start by reading Scrapy at a glance.

Google Summer of Code

  • GSoC 2015
  • GSoC 2014

Articles & blog posts

These are guides contributed by the Scrapy community. If you know of a guide that is not included here, please feel free to add it.

  • Building a web crawler with Scrapy
  • Scrapy after the tutorials
  • How to do basic web scraping using Scrapy on a Windows Azure virtual machine
  • Scraping iTunes Charts Using Scrapy
  • SearchHub: Indexing web sites in Solr with Scrapy
  • Using Parsley extraction language with Scrapy
  • Running Scrapy on Amazon EC2
  • How to automatically search and download torrents with Python and Scrapy
  • Scraping Craigslist with Scrapy (includes video) - Nov 5, 2012
  • How to Install Scrapy 0.14 in a 64 bit Windows 7 Environment
  • Using Scrapy with different/many proxies
  • Scrape multi-pages content with Scrapy
  • Calling Scrapy from a Python script
  • Scrapy and Django (1)
  • Scrapy and Django (2)
  • Scrapy and Django (3)
  • Scraping Google Scholar with Scrapy and MongoDB
  • Recursively scraping a blog with Scrapy
  • Setup Macports Python and Scrapy successfully
  • Crawl a website with Scrapy
  • How to use Scrapy with TOR (scrapy-users message)
  • Convert relative paths to absolute paths
  • How to use Scrapy, Tor with multiple user agents
  • (Russian, 2011) Собираем данные с помощью Scrapy
  • How to Run Scrapy Spiders on Cloud Using Heroku and Redis
  • Web Scraping With Scrapy and MongoDB

Videos

  • Scrapy: it GETs the web - PyCon US 2013 talk
  • Installing Scrapy on Windows (video tutorial)
  • Recursively scraping Craigslist (includes video) - Nov 8, 2012
  • Scraping the Web with Scrapy
  • Karthik Ananth: Scrapy Workshop
  • Scrapy / Python playlist on Youtube channel

Slides

English slides:

  • Scrapy - a flexible crawler to power your search - given by Shane Evans at the Feb 2013 Cambridge Search Meetup
  • Web Crawling & Metadata Extraction in Python
  • Crawling the web for fun and profit
  • Scrapy for dummies
  • Web scraping 1 2-3 with python + scrapy (Summer BarCampHK 2012 version)
  • Collecting web information with open source tools
  • When big data meet python @ COSCUP 2012
  • How to scrape any website's content using Scrapy

Spanish slides:

  • Scrapy workshop
  • Scrapy - an agile framework for web spider development

Chinese slides:

  • Scrapy+WebKit+MySQL+Redis integration

Portuguese slides:

  • Scrapy in five minutes

Projects, tools and libraries using Scrapy

  • Django Dynamic Scraper - a web application (written in Django) for running and controlling Scrapy spiders
  • Slybot - A supervised learning crawler based on Scrapely
  • scrapy-sentry - Logs Scrapy exceptions into Sentry
  • ScrapyGraphite - Outputs Scrapy statistics to Carbon/Graphite
  • scrapy-mongo - A pipeline to store Scrapy items in a MongoDB database
  • scrapy-boilerplate - A small set of utilities to simplify writing low-complexity spiders
  • scrapy-inline-requests - Provides a decorator for writing spider callbacks that perform multiple requests without needing a separate callback for each request
  • scrapy-redis - Provides Redis-backed components for Scrapy
  • scrapyz - Create simple spiders easily.
  • Scrapy-related libraries on PyPI
  • Scrapy_cn - Provides a demo for solving encoding problems (UTF-8).
  • elite-proxies-scrapy-middleware - get new proxies from your EliteProxies account
  • scrapydo - Crochet-based blocking API for Scrapy.

Companies using Scrapy

See http://scrapy.org/companies/

Release Notes

  • See the release notes in the official documentation

Developer documentation

  • Scrapy Release Procedure
  • PY3: Twisted Dependencies
  • Python-3-Porting

Scrapy Enhancement Proposals

  • SEPs are available in scrapy/sep.
