HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.
2.0.0-chaika
Hierarchical Cluster Engine Python language binding
Variables
int | FETCHER_TIME_LIMIT_MAX = 100 |
float | CONNECTION_TIMEOUT = 1.0 |
int | MAX_HTTP_REDIRECTS_LIMIT = 5 |
int | MAX_HTTP_SIZE_UNLIMIT = 0 |
int | MAX_HTML_REDIRECTS_LIMIT = 1 |
string | DB_SITES = "dc_sites" |
string | DB_URLS = "dc_urls" |
string | RTC_FINALIZER_APP_NAME = "rtc-finalizer" |
string | RTC_PREPROCESSOR_APP_NAME = "rtc-preprocessor" |
list | pubdateFeedNames = ["pubdate", "published", "pubDate", "published_parsed", "updated_parsed"] |
string | pubdateRssFeedHeaderName = "X-pubdateRssFeed" |
string | rssFeedUrlHeaderName = "X-feed_url" |
string | baseUrlHeaderName = "X-base_url" |
int | HTTP_CODE_200 = 200 |
int | HTTP_CODE_304 = 304 |
int | HTTP_CODE_400 = 400 |
int | HTTP_CODE_403 = 403 |
list | REDIRECT_HTTP_CODES = [301, 302, 303, 304] |
list | REDIRECT_HEADER_FIELDS_FOR_REMOVE = ['referer', 'content-type', 'Location', 'cookie'] |
dictionary | charsetDetectorMap |
dictionary | standardEncodings |
HCE project, Python bindings, Distributed Tasks Manager application.
Event objects definitions.

@package: dc
@file Constants.py
@author Oleksii <developers.hce@gmail.com>
@author madk <developers.hce@gmail.com>
@link: http://hierarchical-cluster-engine.com/
@copyright: Copyright © 2013-2014 IOIX Ukraine
@license: http://hierarchical-cluster-engine.com/license/
@since: 0.1
string dc_crawler.Constants.baseUrlHeaderName = "X-base_url" |
Definition at line 34 of file Constants.py.
dictionary dc_crawler.Constants.charsetDetectorMap |
Definition at line 45 of file Constants.py.
float dc_crawler.Constants.CONNECTION_TIMEOUT = 1.0 |
Definition at line 17 of file Constants.py.
string dc_crawler.Constants.DB_SITES = "dc_sites" |
Definition at line 24 of file Constants.py.
string dc_crawler.Constants.DB_URLS = "dc_urls" |
Definition at line 25 of file Constants.py.
int dc_crawler.Constants.FETCHER_TIME_LIMIT_MAX = 100 |
Definition at line 16 of file Constants.py.
int dc_crawler.Constants.HTTP_CODE_200 = 200 |
Definition at line 36 of file Constants.py.
int dc_crawler.Constants.HTTP_CODE_304 = 304 |
Definition at line 37 of file Constants.py.
int dc_crawler.Constants.HTTP_CODE_400 = 400 |
Definition at line 38 of file Constants.py.
int dc_crawler.Constants.HTTP_CODE_403 = 403 |
Definition at line 39 of file Constants.py.
int dc_crawler.Constants.MAX_HTML_REDIRECTS_LIMIT = 1 |
Definition at line 22 of file Constants.py.
int dc_crawler.Constants.MAX_HTTP_REDIRECTS_LIMIT = 5 |
Definition at line 19 of file Constants.py.
int dc_crawler.Constants.MAX_HTTP_SIZE_UNLIMIT = 0 |
Definition at line 20 of file Constants.py.
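The name MAX_HTTP_SIZE_UNLIMIT suggests that a size limit of 0 is treated as "no limit". The following is a minimal sketch under that assumption. The helper size_allowed is hypothetical and is not part of Constants.py; only the constant value is taken from this page.

```python
# Value copied from the constant documented above.
MAX_HTTP_SIZE_UNLIMIT = 0


def size_allowed(content_length, max_size=MAX_HTTP_SIZE_UNLIMIT):
    """Hypothetical helper: decide whether a response body fits the limit.

    Assumption: a limit equal to MAX_HTTP_SIZE_UNLIMIT (0) disables the check.
    """
    if max_size == MAX_HTTP_SIZE_UNLIMIT:
        return True
    return content_length <= max_size


if __name__ == "__main__":
    print(size_allowed(10_000_000))          # no limit configured
    print(size_allowed(2048, max_size=1024)) # body larger than the limit
```

Comparing against the named constant rather than a literal 0 keeps the "unlimited" convention in one place if the sentinel value ever changes.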
list dc_crawler.Constants.pubdateFeedNames = ["pubdate", "published", "pubDate", "published_parsed", "updated_parsed"] |
Definition at line 31 of file Constants.py.
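One plausible use of pubdateFeedNames is as an ordered list of candidate keys to probe when looking for a publication date in a parsed feed entry. The sketch below assumes the entry behaves like a dictionary and that the list order expresses priority; the function find_pubdate is hypothetical and does not appear in Constants.py.

```python
# Value copied from the constant documented above.
pubdateFeedNames = ["pubdate", "published", "pubDate",
                    "published_parsed", "updated_parsed"]


def find_pubdate(entry):
    """Hypothetical helper: return (field_name, value) for the first
    candidate field that is present and non-empty, or None if none match.

    Assumption: `entry` is a dict-like object keyed by feed field names.
    """
    for name in pubdateFeedNames:
        value = entry.get(name)
        if value:
            return name, value
    return None


if __name__ == "__main__":
    entry = {"title": "Example", "published": "Mon, 06 Jan 2014 10:00:00 GMT"}
    print(find_pubdate(entry))
```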
string dc_crawler.Constants.pubdateRssFeedHeaderName = "X-pubdateRssFeed" |
Definition at line 32 of file Constants.py.
list dc_crawler.Constants.REDIRECT_HEADER_FIELDS_FOR_REMOVE = ['referer', 'content-type', 'Location', 'cookie'] |
Definition at line 42 of file Constants.py.
list dc_crawler.Constants.REDIRECT_HTTP_CODES = [301, 302, 303, 304] |
Definition at line 41 of file Constants.py.
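REDIRECT_HTTP_CODES and REDIRECT_HEADER_FIELDS_FOR_REMOVE are presumably used together: the first to recognise a redirect response, the second to drop request headers before following it. The sketch below illustrates that idea only. Both helper functions are hypothetical, and the case-insensitive header comparison is an assumption, since the list itself mixes lower-case names with 'Location'.

```python
# Values copied from the constants documented above.
REDIRECT_HTTP_CODES = [301, 302, 303, 304]
REDIRECT_HEADER_FIELDS_FOR_REMOVE = ['referer', 'content-type', 'Location', 'cookie']


def is_redirect(status_code):
    """Hypothetical helper: True if the status code is in REDIRECT_HTTP_CODES."""
    return status_code in REDIRECT_HTTP_CODES


def strip_redirect_headers(headers):
    """Hypothetical helper: return a copy of `headers` without the listed fields.

    Assumption: header names are compared case-insensitively.
    """
    blocked = {name.lower() for name in REDIRECT_HEADER_FIELDS_FOR_REMOVE}
    return {k: v for k, v in headers.items() if k.lower() not in blocked}


if __name__ == "__main__":
    request_headers = {"Referer": "http://example.com/", "User-Agent": "demo"}
    if is_redirect(302):
        print(strip_redirect_headers(request_headers))
```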
string dc_crawler.Constants.rssFeedUrlHeaderName = "X-feed_url" |
Definition at line 33 of file Constants.py.
string dc_crawler.Constants.RTC_FINALIZER_APP_NAME = "rtc-finalizer" |
Definition at line 27 of file Constants.py.
string dc_crawler.Constants.RTC_PREPROCESSOR_APP_NAME = "rtc-preprocessor" |
Definition at line 28 of file Constants.py.
dictionary dc_crawler.Constants.standardEncodings |
Definition at line 52 of file Constants.py.