HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.  2.0.0-chaika
Hierarchical Cluster Engine Python language binding
dc_crawler.Fetcher Namespace Reference

Classes

class  BaseFetcher
 
class  ContentFetcher
 
class  RequestsFetcher
 
class  Response
 
class  SeleniumFetcher
 
class  SimpleCharsetDetector
 
class  URLLibFetcher
 

Functions

def checkRedirectsHook (r, *args, **kwargs)
 

Variables

 logger = logging.getLogger(APP_CONSTS.LOGGER_NAME)
 
int MAX_CONTENT_SIZE_FOR_CHARDET = 5000000
 

Detailed Description

HCE project, Python bindings, Distributed Tasks Manager application.
Web page fetchers.

@package: dc
@file Fetcher.py
@author madk, bgv <developers.hce@gmail.com>
@link: http://hierarchical-cluster-engine.com/
@copyright: Copyright © 2013-201 IOIX Ukraine
@license: http://hierarchical-cluster-engine.com/license/
@since: 0.1

Function Documentation

◆ checkRedirectsHook()

def dc_crawler.Fetcher.checkRedirectsHook (r, *args, **kwargs)

Definition at line 155 of file Fetcher.py.

def checkRedirectsHook(r, *args, **kwargs):
  # Response hook: log the response URL, the extra arguments, and a dump of the response object.
  logger.debug('r.url = ' + str(r.url))
  logger.debug('args = ' + str(args))
  logger.debug('kwargs = ' + str(kwargs))
  logger.debug('type(r): %s, r = %s', str(type(r)), varDump(r))


# # Fetcher based on the requests module; cannot execute JavaScript
#
#
Referenced function:
def varDump(obj, stringify=True, strTypeMaxLen=256, strTypeCutSuffix='...', stringifyType=1, ignoreErrors=False, objectsHash=None, depth=0, indent=2, ensure_ascii=False, maxDepth=10)
Definition: Utils.py:410
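The signature (r, *args, **kwargs) matches the callback form that the requests library uses for response hooks, which are registered through the hooks={'response': ...} argument. The sketch below is a minimal stand-alone variant, not the project's code: it omits varDump, uses a hypothetical logger name and an in-memory log handler, and calls the hook directly with a stand-in object rather than a real requests.Response.

```python
import logging
from types import SimpleNamespace

# Hypothetical logger name; the real module uses APP_CONSTS.LOGGER_NAME.
logger = logging.getLogger("fetcher_hook_demo")


def check_redirects_hook(r, *args, **kwargs):
    """Log the response URL and any extra arguments passed to the hook."""
    logger.debug("r.url = %s", str(r.url))
    logger.debug("args = %s", str(args))
    logger.debug("kwargs = %s", str(kwargs))
    # Returning None tells requests to keep the response unchanged.
    return None


class ListHandler(logging.Handler):
    """Collect log messages in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

# Stand-in for a requests.Response; only the attribute the hook reads is set.
fake_response = SimpleNamespace(url="http://example.com/final")
hook_result = check_redirects_hook(fake_response, timeout=5)
```

With requests installed, the same callable could be registered as requests.get(url, hooks={'response': check_redirects_hook}); requests then invokes it once for each response it receives.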

Variable Documentation

◆ logger

dc_crawler.Fetcher.logger = logging.getLogger(APP_CONSTS.LOGGER_NAME)

Definition at line 44 of file Fetcher.py.
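
Because the module logs through a named logger, its debug messages only become visible once the application configures that logger. The following is a minimal sketch using the standard logging module; the name "hce" is a placeholder, since the value of APP_CONSTS.LOGGER_NAME is not shown on this page, and the in-memory stream is used only to keep the example self-contained.

```python
import io
import logging

LOGGER_NAME = "hce"  # placeholder; the real value is APP_CONSTS.LOGGER_NAME

# Route the named logger's records to an in-memory stream at DEBUG level.
stream = io.StringIO()
stream_handler = logging.StreamHandler(stream)
stream_handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

app_logger = logging.getLogger(LOGGER_NAME)
app_logger.setLevel(logging.DEBUG)
app_logger.addHandler(stream_handler)

# Any module calling logging.getLogger(LOGGER_NAME) receives this same logger object.
logging.getLogger(LOGGER_NAME).debug("r.url = %s", "http://example.com/")
output = stream.getvalue()
```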

◆ MAX_CONTENT_SIZE_FOR_CHARDET

int dc_crawler.Fetcher.MAX_CONTENT_SIZE_FOR_CHARDET = 5000000

Definition at line 47 of file Fetcher.py.
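
The name suggests that this constant caps how much content is handed to charset detection, although the code that uses it is not shown on this page. As a sketch of one plausible guard, assuming oversized bodies are simply skipped, the hypothetical helper below takes the detector as a parameter so that it does not depend on any particular detection library.

```python
MAX_CONTENT_SIZE_FOR_CHARDET = 5000000  # value from Fetcher.py


def detect_charset_if_small(content, detector, default=None):
    """Run detector(content) only when content is within the size limit.

    Hypothetical helper: detector is any callable that maps bytes to an
    encoding name; default is returned when the content is too large.
    """
    if len(content) > MAX_CONTENT_SIZE_FOR_CHARDET:
        return default
    return detector(content)


def toy_detector(content):
    """Toy detector for the demo: report utf-8 if the bytes decode as UTF-8."""
    try:
        content.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "latin-1"


small = detect_charset_if_small("héllo".encode("utf-8"), toy_detector)
large = detect_charset_if_small(b"a" * (MAX_CONTENT_SIZE_FOR_CHARDET + 1), toy_detector)
```

Passing the detector in keeps the size check testable on its own; a real detection function could be substituted for toy_detector without changing the guard.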