HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.
2.0.0-chaika
Hierarchical Cluster Engine Python language binding
Public Member Functions
def __init__(self, url="")
def set_url(self, url)
def read(self)
def parse(self, lines)
def can_fetch(self, user_agent, url, syntax=GYM2008)
def mtime(self)
def modified(self)
Public Member Functions inherited from dc_crawler.OwnRobots.RobotExclusionRulesParser
def __init__(self)
def source_url(self)
def response_code(self)
def sitemap(self)
def sitemaps(self)
def is_expired(self)
def is_allowed(self, user_agent, url, syntax=GYM2008)
def get_crawl_delay(self, user_agent)
def fetch(self, url, timeout=None)
def parse(self, s)
def __str__(self)
def __unicode__(self)
Public Attributes
last_checked
Public Attributes inherited from dc_crawler.OwnRobots.RobotExclusionRulesParser
user_agent
use_local_time
expiration_date
Private Attributes
_user_provided_url
A drop-in replacement for the Python standard library's RobotFileParser that retains all of the features of RobotExclusionRulesParser.
Definition at line 671 of file OwnRobots.py.
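Since the class is described as a drop-in replacement for the standard library's RobotFileParser, the calling pattern it mirrors can be sketched with the standard library class itself. This is a minimal sketch, not code from OwnRobots.py: it assumes Python 3 (where the class lives in urllib.robotparser; Python 2 exposes it in the robotparser module), and the robots.txt text, the example.com URLs, and the "ExampleBot" agent name are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, supplied inline so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# set_url() records where the rules would be fetched from if read() were called.
parser.set_url("http://example.com/robots.txt")
# parse() takes the file as a list of lines, matching parse(self, lines) above.
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch() answers whether the given user agent may request the URL.
allowed = parser.can_fetch("ExampleBot", "http://example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "http://example.com/private/data.html")
```

If RobotFileParserLookalike behaves as its description suggests, the same call sequence should work with it in place of RobotFileParser. Note that its can_fetch additionally accepts a syntax argument (default GYM2008), which the standard library method does not have.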
def dc_crawler.OwnRobots.RobotFileParserLookalike.__init__(self, url="")
Definition at line 675 of file OwnRobots.py.
def dc_crawler.OwnRobots.RobotFileParserLookalike.can_fetch(self, user_agent, url, syntax=GYM2008)
Definition at line 698 of file OwnRobots.py.
def dc_crawler.OwnRobots.RobotFileParserLookalike.modified(self)
Definition at line 706 of file OwnRobots.py.
def dc_crawler.OwnRobots.RobotFileParserLookalike.mtime(self)
Definition at line 702 of file OwnRobots.py.
def dc_crawler.OwnRobots.RobotFileParserLookalike.parse(self, lines)
Definition at line 694 of file OwnRobots.py.
def dc_crawler.OwnRobots.RobotFileParserLookalike.read(self)
Definition at line 690 of file OwnRobots.py.
def dc_crawler.OwnRobots.RobotFileParserLookalike.set_url(self, url)
Definition at line 684 of file OwnRobots.py.
dc_crawler.OwnRobots.RobotFileParserLookalike._user_provided_url [private]
Definition at line 678 of file OwnRobots.py.
dc_crawler.OwnRobots.RobotFileParserLookalike.last_checked |
Definition at line 679 of file OwnRobots.py.
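The mtime(), modified(), and last_checked members documented above share their names with members of the standard library's RobotFileParser. How they relate there can be sketched as follows. This shows the standard library's behavior under Python 3 only; that RobotFileParserLookalike follows the same convention is an assumption based on its "drop-in replacement" description, not something this page states.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()

# Before robots.txt has been checked, the standard library reports 0.
before = parser.mtime()

# modified() stamps the parser with the current time ...
parser.modified()

# ... and mtime() returns that stamp, which is stored in last_checked.
after = parser.mtime()
```

A crawler can compare mtime() against the current time to decide when a cached robots.txt is old enough to fetch again.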