HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.  2.0.0-chaika
Hierarchical Cluster Engine Python language binding
dc_db.URLHistoryTask.URLHistoryTask Class Reference

Public Member Functions

def __init__ (self, keyValueStorageDir, rawDataDir, dBDataTask)
 
def process (self, urlHistories, queryCallback)
 
def fetchLogsFromDB (self, urlHistory, queryCallback, logCriterions=None)
 
- Public Member Functions inherited from dc_db.BaseTask.BaseTask
def isSiteExist (self, siteId, queryCallback, userId=None)
 
def generateCriterionSQL (self, criterions, additionWhere=None, siteId=None)
 
def fetchByCriterions (self, criterions, queryCallback)
 
def dbLock (self, mutexName, queryCallback, sleepTime=1, mutexLockTTL=Constants.DEFAULT_LOCK_TTL)
 
def dbUnlock (self, mutexName, queryCallback)
 
def createUrlsInsertQuery (self, siteId, localKeys, localValues)
 
def copyUrlsToDcUrls (self, siteId, queryCallback)
 
def statisticLogUpdate (self, localObj, urlMd5, siteId, status, queryCallback, isInsert=False)
 
def calculateMd5FormUrl (self, url, urlType, useNormilize=False)
 

Public Attributes

 uRLCleanUpTask
 

Static Public Attributes

string SQL_LOG_TEMPLATE = "SELECT * FROM %s WHERE `URLMd5`='%s'"
 
string SQL_LOG_TEMPLATE_SHORT = "SELECT * FROM %s"
 

Additional Inherited Members

- Static Public Member Functions inherited from dc_db.BaseTask.BaseTask
def readValueFromSiteProp (siteId, propName, queryCallback, urlMd5=None)
 

Detailed Description

Definition at line 19 of file URLHistoryTask.py.

Constructor & Destructor Documentation

◆ __init__()

def dc_db.URLHistoryTask.URLHistoryTask.__init__(self, keyValueStorageDir, rawDataDir, dBDataTask)

Definition at line 26 of file URLHistoryTask.py.

  def __init__(self, keyValueStorageDir, rawDataDir, dBDataTask):
    super(URLHistoryTask, self).__init__()
    self.uRLCleanUpTask = URLCleanUpTask(keyValueStorageDir, rawDataDir, dBDataTask)

Member Function Documentation

◆ fetchLogsFromDB()

def dc_db.URLHistoryTask.URLHistoryTask.fetchLogsFromDB(self, urlHistory, queryCallback, logCriterions=None)

Definition at line 69 of file URLHistoryTask.py.

  def fetchLogsFromDB(self, urlHistory, queryCallback, logCriterions=None):
    tableName = Constants.DC_LOG_TABLE_NAME_TEMPLATE % urlHistory.siteId
    if logCriterions is None:
      query = self.SQL_LOG_TEMPLATE % (tableName, urlHistory.urlMd5)
    else:
      additionWere = "`URLMd5` = '%s'"
      additionWere = (additionWere % urlHistory.urlMd5)
      query = self.SQL_LOG_TEMPLATE_SHORT % tableName
      query += self.generateCriterionSQL(logCriterions, additionWere)
    ret = queryCallback(query, Constants.LOG_DB_ID, Constants.EXEC_NAME)
    if ret is not None:
      for elem in ret:
        if "CDate" in elem:
          elem["CDate"] = str(elem["CDate"])
        if "ODate" in elem:
          elem["ODate"] = str(elem["ODate"])
    return ret
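The last step of the listing above converts the `CDate` and `ODate` fields of every returned row to strings, presumably so the rows can be serialized. A minimal standalone sketch of that step follows. The helper name `stringify_dates` and the stub `fake_query_callback` are hypothetical and not part of the HCE code base; the stub only assumes that `queryCallback` returns a list of dict-like rows, as the listing suggests.

```python
import datetime


def stringify_dates(rows, fields=("CDate", "ODate")):
    """Convert the named fields of each row dict to str, in place."""
    if rows is not None:
        for row in rows:
            for field in fields:
                if field in row:
                    row[field] = str(row[field])
    return rows


# Hypothetical stand-in for queryCallback: returns rows as dicts.
def fake_query_callback(query, db_id, exec_type):
    return [{
        "URLMd5": "abc",
        "CDate": datetime.datetime(2015, 1, 2, 3, 4, 5),
        "ODate": None,
    }]


rows = stringify_dates(fake_query_callback("SELECT 1", "log_db", "exec"))
print(rows[0]["CDate"])  # 2015-01-02 03:04:05
print(rows[0]["ODate"])  # None (now the string "None", not the None object)
```

Note that, as in the original loop, a field holding `None` becomes the string `"None"`; callers that need to distinguish missing dates would have to check for that value.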

◆ process()

def dc_db.URLHistoryTask.URLHistoryTask.process(self, urlHistories, queryCallback)
Definition at line 36 of file URLHistoryTask.py.

  def process(self, urlHistories, queryCallback):
    uRLHistoryResponses = []
    for urlHistory in urlHistories:
      uRLHistoryResponse = None
      if urlHistory is not None:
        localMd5s = []
        if urlHistory.urlMd5 is None:
          if urlHistory.urlCriterions is not None:
            localMd5s = self.uRLCleanUpTask.extractUrlByCriterions(urlHistory.siteId, False,
                                                                   urlHistory.urlCriterions, queryCallback)
        else:
          localMd5s.append(urlHistory.urlMd5)
        logger.debug(">>> [URLHistoryTask] localUrls size = " + str(len(localMd5s)))
        for localMd5 in localMd5s:
          try:
            urlHistory.urlMd5 = localMd5
            res = self.fetchLogsFromDB(urlHistory, queryCallback, urlHistory.logCriterions)
            if uRLHistoryResponse is None:
              uRLHistoryResponse = dc.EventObjects.URLHistoryResponse([], urlHistory.siteId)
            if res is not None and len(res) > 0:
              uRLHistoryResponse.logRows.extend(res)
          except Exception as ex:
            logger.debug(">>> [URLHistoryTask] Some Type Exception = " + str(type(ex)) + " " + str(ex))
      uRLHistoryResponses.append(uRLHistoryResponse)
    return uRLHistoryResponses
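The control flow of `process()` can be easier to follow without the database dependencies. The sketch below mirrors it under stated assumptions: `UrlHistoryRequest`, `UrlHistoryResponse`, `process_sketch`, `fetch_logs` and `extract_md5s` are hypothetical stand-ins for the request objects, `dc.EventObjects.URLHistoryResponse`, `fetchLogsFromDB()` and `uRLCleanUpTask.extractUrlByCriterions()`. It is an illustration, not the project's implementation.

```python
import logging

logger = logging.getLogger("url_history_sketch")


class UrlHistoryRequest(object):
    """Hypothetical stand-in for one element of urlHistories."""
    def __init__(self, siteId, urlMd5=None, urlCriterions=None):
        self.siteId = siteId
        self.urlMd5 = urlMd5
        self.urlCriterions = urlCriterions


class UrlHistoryResponse(object):
    """Hypothetical stand-in for dc.EventObjects.URLHistoryResponse."""
    def __init__(self, logRows, siteId):
        self.logRows = logRows
        self.siteId = siteId


def process_sketch(requests, fetch_logs, extract_md5s):
    """Return one response (or None) per request, in request order."""
    responses = []
    for req in requests:
        response = None
        if req is not None:
            if req.urlMd5 is not None:
                md5s = [req.urlMd5]          # explicit MD5 wins
            elif req.urlCriterions is not None:
                md5s = extract_md5s(req)     # resolve MD5s by criterions
            else:
                md5s = []                    # nothing to look up
            for md5 in md5s:
                try:
                    rows = fetch_logs(req.siteId, md5)
                    if response is None:
                        response = UrlHistoryResponse([], req.siteId)
                    if rows:
                        response.logRows.extend(rows)
                except Exception as ex:
                    # Like the original, log the failure and continue.
                    logger.debug("fetch failed: %s %s", type(ex), ex)
        responses.append(response)
    return responses


logs = {"a": [{"URLMd5": "a"}], "b": []}
reqs = [UrlHistoryRequest("site1", urlMd5="a"), None, UrlHistoryRequest("site1")]
out = process_sketch(reqs, lambda site, md5: logs.get(md5), lambda req: [])
```

Two properties carry over from the listing: the result list always has the same length as the input, and an entry stays `None` when the request is `None` or when no MD5 could be determined for it.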

Member Data Documentation

◆ SQL_LOG_TEMPLATE

string dc_db.URLHistoryTask.URLHistoryTask.SQL_LOG_TEMPLATE = "SELECT * FROM %s WHERE `URLMd5`='%s'"
static

Definition at line 21 of file URLHistoryTask.py.

◆ SQL_LOG_TEMPLATE_SHORT

string dc_db.URLHistoryTask.URLHistoryTask.SQL_LOG_TEMPLATE_SHORT = "SELECT * FROM %s"
static

Definition at line 22 of file URLHistoryTask.py.
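As a rough illustration of how the two templates are used in `fetchLogsFromDB()`, the sketch below builds both query variants with plain `%` formatting. The function `build_log_query`, the table name `log_site1` and the `criterion_sql` string are assumptions made for this example: the real table name comes from `Constants.DC_LOG_TABLE_NAME_TEMPLATE`, and the real WHERE clause comes from `generateCriterionSQL()`, whose exact output format is not shown on this page.

```python
SQL_LOG_TEMPLATE = "SELECT * FROM %s WHERE `URLMd5`='%s'"
SQL_LOG_TEMPLATE_SHORT = "SELECT * FROM %s"


def build_log_query(table_name, url_md5, criterion_sql=None):
    """Pick the full template, or the short one plus a prebuilt clause."""
    if criterion_sql is None:
        return SQL_LOG_TEMPLATE % (table_name, url_md5)
    return SQL_LOG_TEMPLATE_SHORT % table_name + criterion_sql


md5 = "d41d8cd98f00b204e9800998ecf8427e"

# No log criterions: the MD5 filter is part of the template itself.
q1 = build_log_query("log_site1", md5)

# With log criterions: the clause (hypothetical format) is appended.
q2 = build_log_query("log_site1", md5,
                     " WHERE `URLMd5` = '%s' LIMIT 10" % md5)

print(q1)
print(q2)
```

Because both templates interpolate values directly into the SQL text, the table name and MD5 are expected to come from trusted, already validated input.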

◆ uRLCleanUpTask

dc_db.URLHistoryTask.URLHistoryTask.uRLCleanUpTask

Definition at line 28 of file URLHistoryTask.py.


The documentation for this class was generated from the following file: