Crawler Python API
Getting started with Crawler is easy.
The main class you need to care about is Crawler.
crawler.main
class crawler.main.Crawler(url, delay, ignore)
Main Crawler object.
Example:
    from crawler.main import Crawler

    c = Crawler('http://example.com')
    c.crawl()
Parameters:
- url – The URL to crawl
- delay – Number of seconds to wait between requests
- ignore – Paths to ignore
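To illustrate how the delay and ignore parameters might be used, here is a minimal sketch. It does not import the real package: SketchCrawler is a hypothetical stand-in, and both the default values and the prefix-matching rule for ignore are assumptions made for illustration, not documented behavior of crawler.main.Crawler.

```python
import time
from urllib.parse import urlparse


class SketchCrawler:
    """Hypothetical stand-in for crawler.main.Crawler (not the real class)."""

    def __init__(self, url, delay=1.0, ignore=()):
        self.url = url
        self.delay = delay           # seconds to wait between requests
        self.ignore = tuple(ignore)  # assumed: path prefixes to skip

    def should_visit(self, url):
        """Return False when the URL's path starts with an ignored prefix."""
        path = urlparse(url).path or "/"
        return not any(path.startswith(prefix) for prefix in self.ignore)

    def wait(self, sleep=time.sleep):
        """Pause for `delay` seconds; `sleep` is injectable for testing."""
        sleep(self.delay)


c = SketchCrawler("http://example.com", delay=2, ignore=["/private", "/tmp"])
print(c.should_visit("http://example.com/docs/index.html"))  # True
print(c.should_visit("http://example.com/private/keys"))     # False
```

Under these assumptions, a crawl loop would call should_visit() before each request and wait() after it.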
crawl()
Crawl the URL set up in the crawler.
This is the main entry point, and will block while it runs.
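Because crawl() blocks, a caller that needs to stay responsive may want to run it on a separate thread. The sketch below shows one way to do that with the standard threading module. StubCrawler and crawl_in_background are hypothetical names, and the stub only simulates work with a short sleep; the same pattern should apply to any object with a blocking crawl() method.

```python
import threading
import time


class StubCrawler:
    """Hypothetical stand-in exposing a blocking crawl() method."""

    def __init__(self, url):
        self.url = url
        self.finished = False

    def crawl(self):
        time.sleep(0.1)  # simulate blocking work
        self.finished = True


def crawl_in_background(crawler):
    """Start crawler.crawl() on a daemon thread and return the thread."""
    worker = threading.Thread(target=crawler.crawl, daemon=True)
    worker.start()
    return worker


c = StubCrawler("http://example.com")
worker = crawl_in_background(c)
# The caller is free to do other work here while the crawl runs.
worker.join()  # wait for the crawl to finish
print(c.finished)  # True
```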
get(url)
Get a specific URL, log its response, and return its content.
Parameters:
- url – The fully qualified URL to retrieve
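The fetch, log, and return behavior described for get() can be sketched with the standard library alone. This is an assumption about the general shape of such a method, not the library's actual code: get_url, the opener parameter, the logger name, and FakeResponse are all hypothetical and exist only so the example runs without network access.

```python
import io
import logging
import urllib.request

log = logging.getLogger("crawler.sketch")  # hypothetical logger name


def get_url(url, opener=urllib.request.urlopen):
    """Fetch url, log the response status, and return the body as bytes."""
    with opener(url) as response:
        content = response.read()
        status = getattr(response, "status", None)
        log.info("GET %s -> %s (%d bytes)", url, status, len(content))
    return content


class FakeResponse(io.BytesIO):
    """In-memory response so the sketch runs without network access."""

    status = 200


body = get_url("http://example.com/", opener=lambda url: FakeResponse(b"<html></html>"))
print(body)  # b'<html></html>'
```

Leaving opener at its default makes get_url() perform a real request through urllib.request.urlopen.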
crawler.main.run_main()
A small wrapper that is used for running as a CLI script.