HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.  2.0.0-chaika
Hierarchical Cluster Engine Python language binding
dc_crawler.RTCPreprocessor.RTCPreprocessor Class Reference

The RTCPreprocessor class implements the main functionality of the preprocessor for real-time crawling. It inherits from foundation.CementApp.


Classes

class  Meta
 

Public Member Functions

def __init__ (self)
 constructor
 
def setup (self)
 setup application
 
def run (self)
 run application
 
def getBatchFromInput (self)
 
def cutBatch (self)
 
def split (self, arr, count)
 
def sendBatch (self)
 
def getEnvVars (self)
 
def process (self)
 

Public Attributes

 logger
 
 batch
 
 exitCode
 
 pickled_object
 
 envVars
 

Static Public Attributes

string MSG_ERROR_PARSE_CMD_PARAMS = "Error parse command line parameters."
 Error message constants used in the class.
 
string MSG_ERROR_EMPTY_CONFIG_FILE_NAME = "Config file name is empty."
 
string MSG_ERROR_WRONG_CONFIG_FILE_NAME = "Config file name is wrong"
 
string MSG_ERROR_LOAD_APP_CONFIG = "Error loading application config file."
 
string MSG_ERROR_READ_LOG_CONFIG = "Error read log config file."
 
string DRCE_NODE_NUMBER = "DRCE_NODE_NUMBER"
 Constants used in the class.
 
string DRCE_NODES_TOTAL = "DRCE_NODES_TOTAL"
 
int ERROR_EMPTY_ENV_VARS = 2
 Numeric constants for exit codes.
 
string PREPROCESSOR_OPTION_LOG = "log"
 Constants for options read from the config file.
 

Private Member Functions

def __initApp (self)
 initialize application from config files
 
def __loadAppConfig (self, configName)
 load application config file
 
def __loadLogConfig (self, configName)
 load log config file
 

Detailed Description

The RTCPreprocessor class implements the main functionality of the preprocessor for real-time crawling. It inherits from foundation.CementApp.

Definition at line 34 of file RTCPreprocessor.py.

Constructor & Destructor Documentation

◆ __init__()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.__init__ (   self)

constructor

Definition at line 60 of file RTCPreprocessor.py.

    def __init__(self):
        # call base class __init__ method
        foundation.CementApp.__init__(self)

        self.logger = None
        self.batch = None
        self.exitCode = APP_CONSTS.EXIT_SUCCESS
        self.pickled_object = None
        self.envVars = {self.DRCE_NODES_TOTAL: 1,
                        self.DRCE_NODE_NUMBER: 1}

Member Function Documentation

◆ __initApp()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.__initApp (   self)
private

initialize application from config files

Parameters
- None
Returns
- None

Definition at line 97 of file RTCPreprocessor.py.

    def __initApp(self):
        if self.pargs.config:
            self.__loadLogConfig(self.__loadAppConfig(self.pargs.config))
        else:
            raise Exception(self.MSG_ERROR_LOAD_APP_CONFIG)

◆ __loadAppConfig()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.__loadAppConfig (   self,
  configName 
)
private

load application config file

Parameters
configName - name of the application config file
Returns
- log config file name

Definition at line 108 of file RTCPreprocessor.py.

    def __loadAppConfig(self, configName):
        #variable for result
        confLogFileName = ""

        try:
            config = ConfigParser.ConfigParser()
            config.optionxform = str

            readOk = config.read(configName)

            if len(readOk) == 0:
                raise Exception(self.MSG_ERROR_WRONG_CONFIG_FILE_NAME + ": " + configName)

            if config.has_section(APP_CONSTS.CONFIG_APPLICATION_SECTION_NAME):
                confLogFileName = str(config.get(APP_CONSTS.CONFIG_APPLICATION_SECTION_NAME, self.PREPROCESSOR_OPTION_LOG))

        except Exception, err:
            raise Exception(self.MSG_ERROR_LOAD_APP_CONFIG + ' ' + str(err))

        return confLogFileName
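
The lookup performed by __loadAppConfig can be sketched in Python 3 as follows. This is a minimal illustration, not the project's code: it uses the Python 3 configparser module in place of Python 2's ConfigParser, the section name "Application" is an assumed stand-in for APP_CONSTS.CONFIG_APPLICATION_SECTION_NAME (whose value is not shown on this page), and the function name and file contents are hypothetical.

```python
import configparser
import os
import tempfile

SECTION = "Application"  # assumed section name, for illustration only


def load_log_config_name(config_name):
    """Return the value of the "log" option, or "" if the section is absent."""
    config = configparser.ConfigParser()
    config.optionxform = str  # keep option names case-sensitive
    if not config.read(config_name):  # read() returns the list of files it parsed
        raise IOError("Config file name is wrong: " + config_name)
    if config.has_section(SECTION):
        return config.get(SECTION, "log")
    return ""


# Write a throwaway config file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as handle:
    handle.write("[Application]\nlog = preprocessor_log.ini\n")
try:
    print(load_log_config_name(handle.name))
finally:
    os.remove(handle.name)
```

As in the listing, a file that cannot be read is reported as an error, while a missing section silently yields an empty name, which the log config loader then rejects.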

◆ __loadLogConfig()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.__loadLogConfig (   self,
  configName 
)
private

load log config file

Parameters
configName - name of the log rtc-finalizer config file
Returns
- None

Definition at line 134 of file RTCPreprocessor.py.

    def __loadLogConfig(self, configName):
        try:
            if isinstance(configName, str) and len(configName) == 0:
                raise Exception(self.MSG_ERROR_EMPTY_CONFIG_FILE_NAME)

            logging.config.fileConfig(configName)

            #call rotation log files and initialization logger
            self.logger = Utils.MPLogger().getLogger()

        except Exception, err:
            raise Exception(self.MSG_ERROR_READ_LOG_CONFIG + ' ' + str(err))
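
A rough Python 3 sketch of the same step, assuming a small INI-style logging configuration. The file contents, the function name and the use of logging.getLogger() are illustrative assumptions: the real method obtains its logger from the project's Utils.MPLogger, which is not reproduced here.

```python
import logging
import logging.config
import os
import tempfile

# Hypothetical logging configuration in the fileConfig() INI format.
LOG_CONFIG = """\
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=plain

[logger_root]
level=DEBUG
handlers=console

[handler_console]
class=StreamHandler
level=DEBUG
formatter=plain
args=(sys.stderr,)

[formatter_plain]
format=%(levelname)s %(message)s
"""


def load_log_config(config_name):
    """Configure logging from an INI file and return the root logger."""
    if isinstance(config_name, str) and len(config_name) == 0:
        raise ValueError("Config file name is empty.")
    logging.config.fileConfig(config_name)
    return logging.getLogger()  # stand-in for Utils.MPLogger().getLogger()


with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as handle:
    handle.write(LOG_CONFIG)
try:
    logger = load_log_config(handle.name)
    logger.info("logging configured")
finally:
    os.remove(handle.name)
```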

◆ cutBatch()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.cutBatch (   self)

Definition at line 152 of file RTCPreprocessor.py.

    def cutBatch(self):
        self.batch = (pickle.loads(self.pickled_object))
        self.logger.info("Before id:%s items: %s", str(self.batch.id), str(len(self.batch.items)))
        self.logger.debug("self.batch: %s", varDump(self.batch))
        items = self.batch.items
        if len(items) > 1:
            splitted_items = self.split(self.batch.items, int(self.envVars[self.DRCE_NODES_TOTAL]))
            self.logger.debug("Input items: %s", str(self.batch.items))
            self.logger.debug("Splitted items: %s", str(splitted_items))
            self.batch.items = splitted_items[int(self.envVars[self.DRCE_NODE_NUMBER]) - 1]
            self.logger.debug("Output items: %s", str(self.batch.items))
            self.logger.debug("Output batch: %s", varDump(self.batch))
            self.pickled_object = pickle.dumps(self.batch)

        self.logger.info("After id:%s items: %s", str(self.batch.id), str(len(self.batch.items)))
def varDump(obj, stringify=True, strTypeMaxLen=256, strTypeCutSuffix='...', stringifyType=1, ignoreErrors=False, objectsHash=None, depth=0, indent=2, ensure_ascii=False, maxDepth=10)
Definition: Utils.py:410
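
The reduction done by cutBatch can be illustrated with a standalone Python 3 sketch. It assumes a plain dict with "id" and "items" keys as a stand-in for the real batch object (whose class is not shown on this page), and the function name cut_batch is hypothetical.

```python
import pickle


def cut_batch(pickled, nodes_total, node_number):
    """Return a pickled batch that holds only this node's share of the items."""
    batch = pickle.loads(pickled)
    if len(batch["items"]) > 1:
        # Deal the items round-robin into nodes_total shares.
        shares = [batch["items"][i::nodes_total] for i in range(nodes_total)]
        batch["items"] = shares[node_number - 1]  # node numbers are 1-based
        pickled = pickle.dumps(batch)
    return pickled


source = pickle.dumps({"id": 42, "items": ["u1", "u2", "u3", "u4", "u5"]})
for node in (1, 2):
    share = pickle.loads(cut_batch(source, 2, node))
    print(node, share["items"])
```

With two nodes, node 1 keeps u1, u3 and u5 and node 2 keeps u2 and u4. A batch with a single item is passed through unchanged, so every node processes it.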

◆ getBatchFromInput()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.getBatchFromInput (   self)

Definition at line 148 of file RTCPreprocessor.py.

    def getBatchFromInput(self):
        self.pickled_object = sys.stdin.read()

◆ getEnvVars()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.getEnvVars (   self)

Definition at line 177 of file RTCPreprocessor.py.

    def getEnvVars(self):
        for key in self.envVars.keys():
            if key in os.environ and os.environ[key] != "":
                self.envVars[key] = os.environ[key]
                self.logger.debug("os.environ[%s]: set to <<%s>>" % (key, self.envVars[key]))
            else:
                self.logger.debug("os.environ[%s]: not set. Use default value: <<%s>>" % (key, self.envVars[key]))
                self.exitCode = self.ERROR_EMPTY_ENV_VARS
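
The override rule in getEnvVars (a non-empty environment value replaces the default, and any missing or empty value is flagged) can be sketched as a pure function. The function name and the returned flag are illustrative assumptions; the real method records the condition by setting self.exitCode instead.

```python
def read_env_vars(environ, defaults):
    """Return (values, all_set): defaults overridden by non-empty entries of environ."""
    values = dict(defaults)
    all_set = True
    for key in defaults:
        if environ.get(key, "") != "":
            values[key] = environ[key]  # environment values arrive as strings
        else:
            all_set = False  # the class sets exitCode = ERROR_EMPTY_ENV_VARS here
    return values, all_set


DEFAULTS = {"DRCE_NODES_TOTAL": 1, "DRCE_NODE_NUMBER": 1}

print(read_env_vars({"DRCE_NODES_TOTAL": "4", "DRCE_NODE_NUMBER": "2"}, DEFAULTS))
print(read_env_vars({"DRCE_NODES_TOTAL": "4"}, DEFAULTS))
```

Passing a mapping instead of reading os.environ directly keeps the sketch easy to test; os.environ can be supplied as the first argument in real use.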

◆ process()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.process (   self)

Definition at line 187 of file RTCPreprocessor.py.

    def process(self):
        try:
            self.getBatchFromInput()
            self.getEnvVars()
            if self.exitCode != self.ERROR_EMPTY_ENV_VARS:
                self.logger.info("The batch possible will be reduced")
                self.cutBatch()
            else:
                self.logger.info("The batch will not be reduced")
            self.sendBatch()
        except Exception:
            self.exitCode = APP_CONSTS.EXIT_FAILURE
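
The overall flow of process (read the batch, reduce it only when both node variables are set, always forward it, and turn any exception into a failure code) might look like this as a single Python 3 function. This is a sketch under stated assumptions: the streams and environment are passed in explicitly, the batch is a plain dict, and the values 0 and 1 for EXIT_SUCCESS and EXIT_FAILURE are guesses, since APP_CONSTS is not shown on this page.

```python
import io
import pickle

EXIT_SUCCESS = 0  # assumed value of APP_CONSTS.EXIT_SUCCESS
EXIT_FAILURE = 1  # assumed value of APP_CONSTS.EXIT_FAILURE
ERROR_EMPTY_ENV_VARS = 2  # value documented for this class


def process(stdin, stdout, environ):
    """Read a pickled batch, reduce it if both node variables are set, write it out."""
    exit_code = EXIT_SUCCESS
    try:
        pickled = stdin.read()
        total = environ.get("DRCE_NODES_TOTAL", "")
        number = environ.get("DRCE_NODE_NUMBER", "")
        if total != "" and number != "":
            batch = pickle.loads(pickled)
            if len(batch["items"]) > 1:
                # Same selection as split(items, total)[number - 1].
                batch["items"] = batch["items"][int(number) - 1::int(total)]
                pickled = pickle.dumps(batch)
        else:
            exit_code = ERROR_EMPTY_ENV_VARS  # batch is forwarded unchanged
        stdout.write(pickled)
    except Exception:
        exit_code = EXIT_FAILURE
    return exit_code


payload = pickle.dumps({"id": 7, "items": ["a", "b", "c", "d"]})
out = io.BytesIO()
code = process(io.BytesIO(payload), out,
               {"DRCE_NODES_TOTAL": "2", "DRCE_NODE_NUMBER": "2"})
print(code, pickle.loads(out.getvalue())["items"])
```

Note that, as in the listing, a missing variable is not fatal: the batch still reaches the output, and only the exit code records the condition.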

◆ run()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.run (   self)

run application

Definition at line 79 of file RTCPreprocessor.py.

    def run(self):
        # call base class run method
        foundation.CementApp.run(self)

        # call initialization application
        self.__initApp()

        # call internal processing
        self.process()

        # Finish logging
        self.logger.info(APP_CONSTS.LOGGER_DELIMITER_LINE)

◆ sendBatch()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.sendBatch (   self)

Definition at line 172 of file RTCPreprocessor.py.

    def sendBatch(self):
        print self.pickled_object
        sys.stdout.flush()
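
The listing relies on the Python 2 print statement, which appends a newline to the pickled data. A Python 3 port would more likely write the bytes to a binary stream. The sketch below is an assumption about how that could look, not the project's code; send_batch is a hypothetical name, and an in-memory buffer stands in for sys.stdout.buffer.

```python
import io
import pickle


def send_batch(pickled, stream):
    """Write the pickled bytes to a binary stream and flush it."""
    stream.write(pickled)
    stream.flush()


payload = pickle.dumps({"id": 7, "items": ["a", "b"]})
sink = io.BytesIO()  # stands in for sys.stdout.buffer
send_batch(payload, sink)
print(sink.getvalue() == payload)  # True: nothing is added to the payload
```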

◆ setup()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.setup (   self)

setup application

Definition at line 73 of file RTCPreprocessor.py.

    def setup(self):
        # call base class setup method
        foundation.CementApp.setup(self)

◆ split()

def dc_crawler.RTCPreprocessor.RTCPreprocessor.split (   self,
  arr,
  count 
)

Definition at line 168 of file RTCPreprocessor.py.

    def split(self, arr, count):
        return [arr[i::count] for i in range(count)]
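
The extended slice arr[i::count] takes every count-th element starting at index i, so split deals the items round-robin into count sublists. A standalone sketch, written as a free function for illustration:

```python
def split(arr, count):
    """Deal the items of arr round-robin into count sublists."""
    return [arr[i::count] for i in range(count)]


print(split(list(range(1, 8)), 3))  # [[1, 4, 7], [2, 5], [3, 6]]

# With more sublists than items, the trailing sublists are empty.
print(split(["a", "b"], 3))  # [['a'], ['b'], []]
```

The sublist sizes differ by at most one, and a node whose number exceeds the item count receives an empty share.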

Member Data Documentation

◆ batch

dc_crawler.RTCPreprocessor.RTCPreprocessor.batch

Definition at line 65 of file RTCPreprocessor.py.

◆ DRCE_NODE_NUMBER

string dc_crawler.RTCPreprocessor.RTCPreprocessor.DRCE_NODE_NUMBER = "DRCE_NODE_NUMBER"
static

Constants used in the class.

Definition at line 44 of file RTCPreprocessor.py.

◆ DRCE_NODES_TOTAL

string dc_crawler.RTCPreprocessor.RTCPreprocessor.DRCE_NODES_TOTAL = "DRCE_NODES_TOTAL"
static

Definition at line 45 of file RTCPreprocessor.py.

◆ envVars

dc_crawler.RTCPreprocessor.RTCPreprocessor.envVars

Definition at line 68 of file RTCPreprocessor.py.

◆ ERROR_EMPTY_ENV_VARS

int dc_crawler.RTCPreprocessor.RTCPreprocessor.ERROR_EMPTY_ENV_VARS = 2
static

Numeric constants for exit codes.

Definition at line 48 of file RTCPreprocessor.py.

◆ exitCode

dc_crawler.RTCPreprocessor.RTCPreprocessor.exitCode

Definition at line 66 of file RTCPreprocessor.py.

◆ logger

dc_crawler.RTCPreprocessor.RTCPreprocessor.logger

Definition at line 64 of file RTCPreprocessor.py.

◆ MSG_ERROR_EMPTY_CONFIG_FILE_NAME

string dc_crawler.RTCPreprocessor.RTCPreprocessor.MSG_ERROR_EMPTY_CONFIG_FILE_NAME = "Config file name is empty."
static

Definition at line 38 of file RTCPreprocessor.py.

◆ MSG_ERROR_LOAD_APP_CONFIG

string dc_crawler.RTCPreprocessor.RTCPreprocessor.MSG_ERROR_LOAD_APP_CONFIG = "Error loading application config file."
static

Definition at line 40 of file RTCPreprocessor.py.

◆ MSG_ERROR_PARSE_CMD_PARAMS

string dc_crawler.RTCPreprocessor.RTCPreprocessor.MSG_ERROR_PARSE_CMD_PARAMS = "Error parse command line parameters."
static

Error message constants used in the class.

Definition at line 37 of file RTCPreprocessor.py.

◆ MSG_ERROR_READ_LOG_CONFIG

string dc_crawler.RTCPreprocessor.RTCPreprocessor.MSG_ERROR_READ_LOG_CONFIG = "Error read log config file."
static

Definition at line 41 of file RTCPreprocessor.py.

◆ MSG_ERROR_WRONG_CONFIG_FILE_NAME

string dc_crawler.RTCPreprocessor.RTCPreprocessor.MSG_ERROR_WRONG_CONFIG_FILE_NAME = "Config file name is wrong"
static

Definition at line 39 of file RTCPreprocessor.py.

◆ pickled_object

dc_crawler.RTCPreprocessor.RTCPreprocessor.pickled_object

Definition at line 67 of file RTCPreprocessor.py.

◆ PREPROCESSOR_OPTION_LOG

string dc_crawler.RTCPreprocessor.RTCPreprocessor.PREPROCESSOR_OPTION_LOG = "log"
static

Constants for options read from the config file.

Definition at line 51 of file RTCPreprocessor.py.


The documentation for this class was generated from the following file: