HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.
2.0.0-chaika
Hierarchical Cluster Engine Python language binding
Public Member Functions
def __init__(self)
def __str__(self)
def __unicode__(self)
def add_robot_name(self, bot)
def add_allow_rule(self, path)
def add_disallow_rule(self, path)
def is_not_empty(self)
def is_default(self)
def does_user_agent_match(self, user_agent)
def is_url_allowed(self, url, syntax=GYM2008)
Public Attributes
robot_names
rules
crawl_delay
Static Public Attributes
int ALLOW = 1
int DISALLOW = 2
_Ruleset represents a set of allow/disallow rules (and possibly a crawl delay) that apply to a set of user agents. Users of this module don't need this class. It is exposed at the module level only because RobotExclusionRulesParser() instances can't be pickled if _Ruleset isn't visible at the module level.
Definition at line 187 of file OwnRobots.py.
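The description above can be illustrated with a minimal sketch of such a rule set. This is not the code from OwnRobots.py: SketchRuleset is a hypothetical stand-in that reuses the member names listed on this page, and its matching behaviour (case-insensitive substring match on the robot name, plain path-prefix matching, first matching rule wins, no syntax argument) is an assumption made for illustration only.

```python
ALLOW = 1      # same values as the static attributes documented on this page
DISALLOW = 2


class SketchRuleset:
    """Hypothetical stand-in for _Ruleset; not the OwnRobots.py implementation."""

    def __init__(self):
        self.robot_names = []    # user-agent tokens this rule set applies to
        self.rules = []          # (rule_type, path) pairs, in insertion order
        self.crawl_delay = None  # delay between requests, if one was declared

    def add_robot_name(self, bot):
        self.robot_names.append(bot)

    def add_allow_rule(self, path):
        self.rules.append((ALLOW, path))

    def add_disallow_rule(self, path):
        self.rules.append((DISALLOW, path))

    def is_not_empty(self):
        return bool(self.rules) and bool(self.robot_names)

    def is_default(self):
        # Assumption: "*" marks the rule set that applies to every robot.
        return "*" in self.robot_names

    def does_user_agent_match(self, user_agent):
        # Assumption: case-insensitive substring match on each robot name.
        agent = user_agent.lower()
        return self.is_default() or any(
            name.lower() in agent for name in self.robot_names
        )

    def is_url_allowed(self, url_path):
        # Assumption: plain prefix matching, first matching rule wins,
        # and a path that matches no rule is allowed.
        for rule_type, path in self.rules:
            if path and url_path.startswith(path):
                return rule_type == ALLOW
        return True


ruleset = SketchRuleset()
ruleset.add_robot_name("ExampleBot")
ruleset.add_allow_rule("/private/public-page")
ruleset.add_disallow_rule("/private/")

print(ruleset.does_user_agent_match("Mozilla/5.0 (compatible; ExampleBot/1.0)"))
print(ruleset.is_url_allowed("/private/secret"))
```

Under these assumptions the first call prints True and the second prints False. The real is_url_allowed additionally takes a syntax argument (GYM2008 by default), which this sketch does not model.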
def dc_crawler.OwnRobots._Ruleset.__init__(self)
Definition at line 198 of file OwnRobots.py.
def dc_crawler.OwnRobots._Ruleset.__str__(self)
def dc_crawler.OwnRobots._Ruleset.__unicode__(self)
def dc_crawler.OwnRobots._Ruleset.add_allow_rule(self, path)
def dc_crawler.OwnRobots._Ruleset.add_disallow_rule(self, path)
def dc_crawler.OwnRobots._Ruleset.add_robot_name(self, bot)
Definition at line 222 of file OwnRobots.py.
def dc_crawler.OwnRobots._Ruleset.does_user_agent_match(self, user_agent)
Definition at line 237 of file OwnRobots.py.
def dc_crawler.OwnRobots._Ruleset.is_default(self)
Definition at line 234 of file OwnRobots.py.
def dc_crawler.OwnRobots._Ruleset.is_not_empty(self)
Definition at line 231 of file OwnRobots.py.
def dc_crawler.OwnRobots._Ruleset.is_url_allowed(self, url, syntax=GYM2008)
int dc_crawler.OwnRobots._Ruleset.ALLOW = 1 [static]
Definition at line 195 of file OwnRobots.py.
dc_crawler.OwnRobots._Ruleset.crawl_delay |
Definition at line 201 of file OwnRobots.py.
int dc_crawler.OwnRobots._Ruleset.DISALLOW = 2 [static]
Definition at line 196 of file OwnRobots.py.
dc_crawler.OwnRobots._Ruleset.robot_names |
Definition at line 199 of file OwnRobots.py.
dc_crawler.OwnRobots._Ruleset.rules |
Definition at line 200 of file OwnRobots.py.