HCE Project Python language Distributed Tasks Manager Application, Distributed Crawler Application and client API bindings.  2.0.0-chaika
Hierarchical Cluster Engine Python language binding
scraper_json_viewer Namespace Reference

Functions

def processBatch ()
 

Variables

string config_db_dir = "../data/dc_dbdata"
 

Detailed Description

HCE project, Python bindings, Distributed Tasks Manager application.
Event objects definitions.

@package: dc
@file scraper_json_viewer.py
@author Oleksii <developers.hce@gmail.com>
@link: http://hierarchical-cluster-engine.com/
@copyright: Copyright © 2013-2014 IOIX Ukraine
@license: http://hierarchical-cluster-engine.com/license/
@since: 0.1

Function Documentation

◆ processBatch()

def scraper_json_viewer.processBatch ( )

Definition at line 27 of file scraper_json_viewer.py.

def processBatch():
  json = None
  # read pickled batch object from stdin and unpickle it
  input_pickled_object = sys.stdin.read()
  # print input_pickled_object
  input_data = (pickle.loads(input_pickled_object)).items[0]
  # print("Batch item: siteId: %s, urlId: %s" %(input_data.siteId, input_data.urlId))
  if len(input_data.siteId):
    db_name = config_db_dir + "/" + input_data.siteId + ".db"
  else:
    db_name = config_db_dir + "/0.db"
  con = lite.connect(db_name)
  with con:
    cur = con.cursor()
    query = "SELECT `data` FROM `articles` WHERE `id`='%s' order by `CDate` DESC LIMIT 1" % (input_data.urlId)
    cur.execute(query)
    json = cur.fetchone()
  print decode(json[0])

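The query in processBatch() is built with % string formatting, so the value of urlId is spliced directly into the SQL text, and json[0] raises a TypeError when no row matches. A minimal sketch of the same lookup with a bound parameter and an explicit empty-result check follows. It is not the project's code: it assumes that lite refers to the standard sqlite3 module (the import is not shown on this page), it is written for Python 3, and the helper name fetch_latest_data and the in-memory table with its column types are hypothetical, chosen only for illustration.

```python
import sqlite3


def fetch_latest_data(con, url_id):
    """Return the newest `data` value for url_id, or None if no row matches.

    Hypothetical helper; the table and column names are taken from the
    query string in processBatch().
    """
    cur = con.cursor()
    # The "?" placeholder lets sqlite3 bind url_id as a value, not as SQL text.
    cur.execute(
        "SELECT `data` FROM `articles` WHERE `id` = ? "
        "ORDER BY `CDate` DESC LIMIT 1",
        (url_id,),
    )
    row = cur.fetchone()
    return row[0] if row is not None else None


# Illustrative in-memory table; the real schema is not shown on this page.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE articles (id TEXT, data TEXT, CDate TEXT)")
con.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [
        ("u1", "old", "2014-01-01 00:00:00"),
        ("u1", "new", "2014-02-01 00:00:00"),
    ],
)

latest = fetch_latest_data(con, "u1")     # row with the most recent CDate
missing = fetch_latest_data(con, "nope")  # no matching row
```

Returning None for a missing row leaves it to the caller to decide whether an absent article is an error.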
Here is the call graph for this function:

Variable Documentation

◆ config_db_dir

string scraper_json_viewer.config_db_dir = "../data/dc_dbdata"

Definition at line 25 of file scraper_json_viewer.py.
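processBatch() combines config_db_dir with the siteId of the batch item to choose the database file, and falls back to 0.db when siteId is empty. The rule can be sketched as a small standalone function; the name db_path_for_site is hypothetical and does not exist in the module, and the sketch targets Python 3.

```python
config_db_dir = "../data/dc_dbdata"


def db_path_for_site(site_id, db_dir=config_db_dir):
    """Build the per-site database path the way processBatch() does.

    An empty site_id selects the default database file "0.db".
    """
    name = site_id if len(site_id) else "0"
    return db_dir + "/" + name + ".db"


site_path = db_path_for_site("abc123")
default_path = db_path_for_site("")
```

Because config_db_dir is a relative path, the file that is actually opened depends on the working directory of the process.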