HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.  2.0.0-chaika
Hierarchical Cluster Engine Python language binding
dc.EventObjects.SiteURL Class Reference
Inherits dc.EventObjects.URL.

Public Member Functions

def __init__ (self, siteId, url, stateField=None, normalizeMask=URL.URL_NORMALIZE_MASK)

- Public Member Functions inherited from dc.EventObjects.URL
def __init__ (self, siteId, url, state=STATE_ENABLED, urlUpdate=None, normalizeMask=URL_NORMALIZE_MASK)
def getURL (self, normalizeMask=URL_NORMALIZE_MASK)

- Public Member Functions inherited from app.Utils.JsonSerializable
def __init__ (self)
def toJSON (self)
 

Public Attributes

userId

- Public Attributes inherited from dc.EventObjects.URL
siteId
url
type
state
status
siteSelect
crawled
processed
urlMd5
contentType
requestDelay
processingDelay
httpTimeout
charset
batchId
errorMask
crawlingTime
processingTime
totalTime
httpCode
UDate
CDate
httpMethod
size
linksI
linksE
freq
depth
rawContentMd5
parentMd5
lastModified
eTag
mRate
mRateCounter
tcDate
maxURLsFromPage
contentMask
tagsMask
tagsCount
pDate
contentURLMd5
priority
urlUpdate
urlPut
chainId
classifierMask
attributes
 

Additional Inherited Members

- Static Public Member Functions inherited from app.Utils.JsonSerializable
def json_serial (obj)

- Static Public Attributes inherited from dc.EventObjects.URL
int STATE_ENABLED = 0
int STATE_DISABLED = 1
int STATE_ERROR = 2
int STATUS_UNDEFINED = 0
int STATUS_NEW = 1
int STATUS_SELECTED_CRAWLING = 2
int STATUS_CRAWLING = 3
int STATUS_CRAWLED = 4
int STATUS_SELECTED_PROCESSING = 5
int STATUS_PROCESSING = 6
int STATUS_PROCESSED = 7
int STATUS_SELECTED_CRAWLING_INCREMENTAL = 8
int CONTENT_EMPTY = 0
int CONTENT_STORED_ON_DISK = 1 << 0
int TYPE_REGULAR = 0
int TYPE_SINGLE = 1
int TYPE_REGULAR_EXT = 2
int TYPE_NEW_SITE = 3
int TYPE_FETCHED = 4
int TYPE_REAL_TIME_CRAWLER = 5
int TYPE_CHAIN = 6
int SITE_SELECT_TYPE_EXPLICIT = 0
int SITE_SELECT_TYPE_AUTO = 1
int SITE_SELECT_TYPE_QUALIFY_URL = 2
int SITE_SELECT_TYPE_NONE = 3
string CONTENT_TYPE_TEXT_HTML = "text/html"
string CONTENT_TYPE_UNDEFINED = ""
URL_NORMALIZE_MASK = UrlNormalizator.NORM_DEFAULT
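
The integer constants above can be read back into names when inspecting a record. The following is a minimal sketch only: the STATUS_* and CONTENT_* values are copied from this page, but the holder class URLConstants and the helpers status_name and is_stored_on_disk are hypothetical and not part of dc.EventObjects. Treating contentMask as a bit field is an assumption suggested by the `1 << 0` form of CONTENT_STORED_ON_DISK.

```python
class URLConstants(object):
    """Hypothetical holder; values copied from the static attributes listed above."""
    STATUS_UNDEFINED = 0
    STATUS_NEW = 1
    STATUS_SELECTED_CRAWLING = 2
    STATUS_CRAWLING = 3
    STATUS_CRAWLED = 4
    STATUS_SELECTED_PROCESSING = 5
    STATUS_PROCESSING = 6
    STATUS_PROCESSED = 7
    STATUS_SELECTED_CRAWLING_INCREMENTAL = 8
    CONTENT_EMPTY = 0
    CONTENT_STORED_ON_DISK = 1 << 0


def status_name(value):
    """Return the STATUS_* name for an integer value, or None if unknown."""
    for name, candidate in vars(URLConstants).items():
        if name.startswith("STATUS_") and candidate == value:
            return name
    return None


def is_stored_on_disk(content_mask):
    """Assumption: contentMask is a bit field, so test the flag with a bitwise AND."""
    return bool(content_mask & URLConstants.CONTENT_STORED_ON_DISK)


print(status_name(URLConstants.STATUS_CRAWLED))
print(is_stored_on_disk(URLConstants.CONTENT_STORED_ON_DISK))
```

With the real classes, the same lookups would presumably read the constants from dc.EventObjects.URL directly instead of a copied holder.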
 

Detailed Description

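SiteURL extends dc.EventObjects.URL with one additional attribute, userId, which the constructor initializes to None.

Definition at line 565 of file EventObjects.py.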

Constructor & Destructor Documentation

◆ __init__()

def dc.EventObjects.SiteURL.__init__ (   self,
  siteId,
  url,
  stateField = None,
  normalizeMask = URL.URL_NORMALIZE_MASK 
)

Definition at line 567 of file EventObjects.py.

567   def __init__(self, siteId, url, stateField=None, normalizeMask=URL.URL_NORMALIZE_MASK):
568     super(SiteURL, self).__init__(siteId, url, stateField, normalizeMask=normalizeMask)
569 
570     self.userId = None
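
The constructor passes stateField as the third positional argument, which lines up with the state parameter of dc.EventObjects.URL.__init__. The sketch below illustrates one consequence using stand-in classes that copy only the two signatures shown on this page. It assumes the base constructor stores state unchanged; the real URL.__init__ body is not reproduced here and may treat None differently. The string used for URL_NORMALIZE_MASK is a placeholder, since the value of UrlNormalizator.NORM_DEFAULT is not shown.

```python
class URL(object):
    """Stand-in for dc.EventObjects.URL; only the signature is taken from this page."""
    STATE_ENABLED = 0
    STATE_DISABLED = 1
    URL_NORMALIZE_MASK = "NORM_DEFAULT"  # placeholder, real value not shown on this page

    def __init__(self, siteId, url, state=STATE_ENABLED, urlUpdate=None,
                 normalizeMask=URL_NORMALIZE_MASK):
        self.siteId = siteId
        self.url = url
        self.state = state  # assumption: stored as passed
        self.urlUpdate = urlUpdate


class SiteURL(URL):
    """Stand-in mirroring lines 567-570 of EventObjects.py."""

    def __init__(self, siteId, url, stateField=None,
                 normalizeMask=URL.URL_NORMALIZE_MASK):
        super(SiteURL, self).__init__(siteId, url, stateField,
                                      normalizeMask=normalizeMask)
        self.userId = None


plain = URL("site-1", "http://example.com/")
default_site_url = SiteURL("site-1", "http://example.com/")
disabled_site_url = SiteURL("site-1", "http://example.com/",
                            stateField=URL.STATE_DISABLED)

# Under the stated assumption, the SiteURL default (None) replaces
# the base-class default (STATE_ENABLED) instead of falling back to it.
print(plain.state, default_site_url.state, disabled_site_url.state)
```

If the enabled state is wanted, passing stateField=URL.STATE_ENABLED explicitly avoids relying on how the base class handles None.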

Member Data Documentation

◆ userId

dc.EventObjects.SiteURL.userId

Definition at line 570 of file EventObjects.py.
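
Because userId is not a constructor parameter, it has to be assigned after the object is created. A rough sketch of that pattern follows, using a hypothetical stand-in class (SiteURLStandIn) that models only three of the documented attributes. Its toJSON body, which dumps the instance attributes with json.dumps, is an assumption; the actual app.Utils.JsonSerializable.toJSON implementation is not shown on this page.

```python
import json


class SiteURLStandIn(object):
    """Hypothetical stand-in; not the real dc.EventObjects.SiteURL."""

    def __init__(self, siteId, url):
        self.siteId = siteId
        self.url = url
        self.userId = None  # same default as line 570 of EventObjects.py

    def toJSON(self):
        # Assumption: serialize the instance attributes as a JSON object.
        return json.dumps(self.__dict__, sort_keys=True)


item = SiteURLStandIn("site-1", "http://example.com/")
before = json.loads(item.toJSON())

item.userId = 42  # assigned after construction
after = json.loads(item.toJSON())

print(before["userId"], after["userId"])
```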


The documentation for this class was generated from the file EventObjects.py.