HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.  2.0.0-chaika
Hierarchical Cluster Engine Python language binding
dc_processor.SourceTemplateExtractor.SourceTemplateExtractor Class Reference

Public Member Functions

def __init__ (self)
 
def scheduleCalc (self, schedule, additionData)
 
def loadTemplateFromSource (self, templateSource, additionData=None, rawContent=None, url=None)
 
def resolveTemplateByHTTP (self, templateSourceElement)
 
def replacePostRawContent (self, post)
 

Public Attributes

 TemplHash
 
 macroDict
 

Static Public Attributes

string SOURCE_NAME_FILE = "file"
 
string SOURCE_NAME_HTTP = "http"
 
list POST_BUFF_MACROS = ["RAW_CONTENT", "URL"]
 

Detailed Description

Definition at line 23 of file SourceTemplateExtractor.py.

Constructor & Destructor Documentation

◆ __init__()

def dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.__init__ (   self)

Definition at line 33 of file SourceTemplateExtractor.py.

  def __init__(self):
    self.TemplHash = {}
    self.macroDict = {}

Member Function Documentation

◆ loadTemplateFromSource()

def dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.loadTemplateFromSource (   self,
  templateSource,
  additionData = None,
  rawContent = None,
  url = None 
)

Definition at line 91 of file SourceTemplateExtractor.py.

  def loadTemplateFromSource(self, templateSource, additionData=None, rawContent=None, url=None):
    ret = []
    self.macroDict = {}
    if rawContent is not None:
      self.macroDict["RAW_CONTENT"] = rawContent
    if url is not None:
      self.macroDict["URL"] = url
    templateSourceStruct = None
    try:
      templateSourceStruct = json.loads(templateSource)
    except Exception as excp:
      logger.debug(">>> Wrong while json loads from templateSource; err=" + str(excp))
    # if templateSourceStruct is not None and type(templateSourceStruct) is types.ListType:
    if templateSourceStruct is not None and isinstance(templateSourceStruct, types.ListType):
      for templateSourceElement in templateSourceStruct:
        addedElement = None
        try:
          if "schedule" in templateSourceElement and templateSourceElement["schedule"] is not None and \
             self.scheduleCalc(templateSourceElement["schedule"], additionData):
            if templateSourceElement["source"] == SourceTemplateExtractor.SOURCE_NAME_FILE:
              with open(templateSourceElement["request"], "rb") as fd:
                addedElement = json.loads(fd.read())
            elif templateSourceElement["source"] == SourceTemplateExtractor.SOURCE_NAME_HTTP:
              addedElement = self.resolveTemplateByHTTP(templateSourceElement)

          if addedElement is not None:
            if isinstance(addedElement, types.ListType) and len(addedElement) > 0:
              ret.append(addedElement[0])
            elif isinstance(addedElement, types.DictType):
              ret.append(addedElement)
        except Exception as excp:
          logger.debug(">>> Something wrong with templateSourceElement procession; err=" + str(excp))
    return ret
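For orientation, the listing above expects templateSource to be a JSON-encoded list of objects carrying "schedule", "source", and "request" keys, and it silently skips elements that have no schedule. The sketch below builds such a string and applies the same selection rules without evaluating the schedule itself. The file paths, the URL, and the helper name select_sources are illustrative assumptions and not part of the module; the code targets Python 3, whereas the listing is Python 2.

```python
import json

SOURCE_NAME_FILE = "file"
SOURCE_NAME_HTTP = "http"

# Hypothetical descriptor list; the paths and URL are placeholders.
template_source = json.dumps([
    {"source": "file", "request": "/tmp/template.json", "schedule": {"type": 1}},
    {"source": "http", "request": "http://example.com/template", "post": "",
     "schedule": {"type": 1}},
    {"source": "file", "request": "/tmp/ignored.json"},  # no schedule: skipped
])


def select_sources(template_source_json):
    """Return (source, request) pairs for elements that carry a schedule."""
    try:
        elements = json.loads(template_source_json)
    except ValueError:
        # Malformed JSON yields an empty result, as in the listing.
        return []
    if not isinstance(elements, list):
        return []
    selected = []
    for element in elements:
        if isinstance(element, dict) and element.get("schedule") is not None:
            if element.get("source") in (SOURCE_NAME_FILE, SOURCE_NAME_HTTP):
                selected.append((element["source"], element["request"]))
    return selected
```

Calling select_sources(template_source) returns the first two descriptors and drops the third, which mirrors how an element without a "schedule" key never reaches the file or HTTP branch.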

◆ replacePostRawContent()

def dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.replacePostRawContent (   self,
  post 
)

Definition at line 175 of file SourceTemplateExtractor.py.

  def replacePostRawContent(self, post):
    ret = post
    for elem in self.POST_BUFF_MACROS:
      if post.find("%" + elem + "%") >= 0:
        if elem in self.macroDict:
          ret = post.replace("%" + elem + "%", self.macroDict[elem])
        else:
          ret = post.replace("%" + elem + "%", "")
    return ret
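The macro substitution above can be sketched as a standalone function. As listed, each replace call starts from the original post, so when a body contains both %RAW_CONTENT% and %URL% only the last substitution appears to reach the return value. The sketch below chains the replacements instead. That change, the Python 3 target, and the function name replace_post_macros are assumptions of this example rather than the module's behavior.

```python
POST_BUFF_MACROS = ["RAW_CONTENT", "URL"]


def replace_post_macros(post, macro_dict):
    """Replace every %NAME% macro; macros without a value become empty strings."""
    result = post
    for name in POST_BUFF_MACROS:
        # Replace on the running result so earlier substitutions are kept.
        result = result.replace("%" + name + "%", macro_dict.get(name, ""))
    return result
```

For example, replace_post_macros("u=%URL%&c=%RAW_CONTENT%", {"URL": "http://a", "RAW_CONTENT": "x"}) fills in both placeholders in a single pass.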

◆ resolveTemplateByHTTP()

def dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.resolveTemplateByHTTP (   self,
  templateSourceElement 
)

Definition at line 131 of file SourceTemplateExtractor.py.

  def resolveTemplateByHTTP(self, templateSourceElement):
    ret = None
    requestString = None
    contentTypeHeader = None
    if "headers" in templateSourceElement:
      contentTypeHeader = json.loads(templateSourceElement["headers"])  # {"Content-Type": "application/json"}
    if templateSourceElement["request"].startswith("http://") or \
       templateSourceElement["request"].startswith("https://"):
      requestString = templateSourceElement["request"]
    else:
      pass
    if requestString is not None:
      if templateSourceElement["post"] is None or templateSourceElement["post"] == "":
        templateHash = hashlib.md5(requestString).hexdigest()
        if templateHash in self.TemplHash:
          ret = self.TemplHash[templateHash]
        else:
          ret = requests.get(requestString, headers=contentTypeHeader)
          self.TemplHash[templateHash] = ret
      else:
        templateHash = hashlib.md5(requestString + templateSourceElement["post"]).hexdigest()
        replacedPost = self.replacePostRawContent(templateSourceElement["post"])
        if templateHash in self.TemplHash:
          ret = self.TemplHash[templateHash]
        else:
          replacedPost = replacedPost.encode("utf-8")
          logger.debug(">>> POST Data: requestString:\n" + str(requestString) + \
                       "\ntemplateSourceElement:\n" + str(templateSourceElement["post"]) + \
                       "\nreplacedPost:\n" + str(replacedPost) + "\nheaders:\n" + str(contentTypeHeader))
          ret = requests.post(requestString, replacedPost, headers=contentTypeHeader)
          self.TemplHash[templateHash] = ret
      if ret is not None and ret.status_code == 200 and ret.text is not None:
        ret = json.loads(ret.text)
      else:
        logger.debug(">>> Something wrong with HTTP request, Response code == " + str(ret.status_code) +
                     "content == " + str(ret.text))
    return ret
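The caching step in the listing keys each response on an MD5 digest of the request URL, plus the POST template when one is present. The digest is taken from the template before macro substitution, so two calls that differ only in RAW_CONTENT or URL would seem to share one cache entry. A minimal sketch of that keying scheme follows, written for Python 3, where hashlib.md5 requires bytes. The names template_cache_key and ResponseCache and the fetch callable are hypothetical and stand in for the requests.get / requests.post calls.

```python
import hashlib


def template_cache_key(request_url, post_body=None):
    """Return an MD5 hex digest built from the URL and, if present, the POST body."""
    key_source = request_url if not post_body else request_url + post_body
    # hashlib.md5 needs bytes on Python 3, so encode explicitly.
    return hashlib.md5(key_source.encode("utf-8")).hexdigest()


class ResponseCache(object):
    """Tiny in-memory cache; fetch is any zero-argument callable returning a response."""

    def __init__(self):
        self._store = {}

    def get_or_fetch(self, request_url, post_body, fetch):
        key = template_cache_key(request_url, post_body)
        if key not in self._store:
            # Only hit the network (or whatever fetch does) on a cache miss.
            self._store[key] = fetch()
        return self._store[key]
```

Passing the network call in as a callable keeps the sketch testable without an HTTP server; in the listing the equivalent store is the TemplHash dictionary on the instance.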

◆ scheduleCalc()

def dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.scheduleCalc (   self,
  schedule,
  additionData 
)

Definition at line 43 of file SourceTemplateExtractor.py.

  def scheduleCalc(self, schedule, additionData):
    ret = False
    scheduleStorageData = None
    curdatetime = datetime.datetime.now()
    if "file" in schedule:
      with open(schedule["file"], "r") as fd: scheduleStorageData = json.loads(fd.read())
    if schedule["type"] == 0:
      if additionData["parentMD5"] == "":
        ret = True
    elif schedule["type"] == 1:
      ret = True
    elif schedule["type"] == 2:
      if scheduleStorageData is not None:
        atTime = datetime.datetime.strptime(schedule["at"], "%Y-%m-%d %H:%M")
        saveAtTime = None
        if "saveAtTime" in scheduleStorageData and scheduleStorageData["saveAtTime"] is not None:
          saveAtTime = datetime.datetime.strptime(scheduleStorageData["saveAtTime"], "%Y-%m-%d %H:%M")
        if saveAtTime != atTime:
          scheduleStorageData["tCount"] = 0
          scheduleStorageData["saveAtTime"] = atTime.strftime("%Y-%m-%d %H:%M")
        if curdatetime > atTime:
          if scheduleStorageData["tCount"] == 0:
            ret = True
          scheduleStorageData["tCount"] += 1
    elif schedule["type"] == 3:
      if scheduleStorageData is not None:
        if "saveNowTime" in scheduleStorageData and scheduleStorageData["saveNowTime"] is not None:
          atTime = datetime.datetime.strptime(scheduleStorageData["saveNowTime"], "%Y-%m-%d %H:%M")
        else:
          atTime = datetime.datetime.strptime(schedule["at"], "%Y-%m-%d %H:%M")

        if curdatetime > (atTime + datetime.timedelta(minutes=int(schedule["step"]))):
          scheduleStorageData["saveNowTime"] = curdatetime.strftime("%Y-%m-%d %H:%M")
          ret = True
    if scheduleStorageData is not None:
      scheduleStorageData["datetime"] = curdatetime.strftime("%Y-%m-%d %H:%M")
      with open(schedule["file"], "w") as fd:
        fd.write(json.dumps(scheduleStorageData))
    return ret
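Reading the branches above, the schedule "type" values appear to mean: 0 fires only when additionData["parentMD5"] is empty, 1 always fires, 2 fires once after the "at" time, and 3 fires again each time "step" minutes have elapsed since the last saved run. The sketch below isolates the type 3 interval check as a pure function, with the current time passed in and the stored state held in a plain dict instead of the JSON file. The function name interval_due and the TIME_FORMAT constant are assumptions of this example, which targets Python 3.

```python
import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M"


def interval_due(now, schedule, storage):
    """Return True when 'step' minutes have passed since the last saved run."""
    # Fall back to the configured start time until a run has been recorded.
    last = storage.get("saveNowTime") or schedule["at"]
    base = datetime.datetime.strptime(last, TIME_FORMAT)
    if now > base + datetime.timedelta(minutes=int(schedule["step"])):
        # Record this run so the next interval is measured from here.
        storage["saveNowTime"] = now.strftime(TIME_FORMAT)
        return True
    return False
```

With schedule = {"type": 3, "at": "2020-01-01 00:00", "step": "10"} and an empty storage dict, a call at 00:05 is not due, a call at 00:11 is due and records "2020-01-01 00:11", and the next due time is then measured from that recorded value.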

Member Data Documentation

◆ macroDict

dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.macroDict

Definition at line 35 of file SourceTemplateExtractor.py.

◆ POST_BUFF_MACROS

list dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.POST_BUFF_MACROS = ["RAW_CONTENT", "URL"]
static

Definition at line 28 of file SourceTemplateExtractor.py.

◆ SOURCE_NAME_FILE

string dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.SOURCE_NAME_FILE = "file"
static

Definition at line 26 of file SourceTemplateExtractor.py.

◆ SOURCE_NAME_HTTP

string dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.SOURCE_NAME_HTTP = "http"
static

Definition at line 27 of file SourceTemplateExtractor.py.

◆ TemplHash

dc_processor.SourceTemplateExtractor.SourceTemplateExtractor.TemplHash

Definition at line 34 of file SourceTemplateExtractor.py.


The documentation for this class was generated from the following file: