Changelog:
Release 1.4.0 (2015-02-20)
- Added support for system resource usage balancing modes.
- Added support for fast accumulated updates of node properties.
- Added support for flexible DRCE task planning.
- Extended the hce-node management PHP CLI utilities with support for configuration changes and state checks.
- Fixed the DRCE functionality for task state modification and management.
- Fixed several bugs in the DRCE functional object.
- Made many fixes and additions to the Python API to support the new DRCE functionality.
- Updated “Distributed Crawler” DC application:
  - Support for four types of asynchronous task processes: crawl, process (scraping), age, and purge.
  - Support for a multi-threaded re-crawl process.
  - Completely separated crawling and processing, with the ability to configure all options and schedules and to manage limitations.
  - Improved real-time crawling processing, and updated the post-processing procedure and state management.
  - Improved support for processing algorithms, including selection and usage of common unified algorithms.
  - Improved scraping algorithm usage and estimation of result indicators, including tag quality estimation.
  - Extended the management automation scripts that start, check the state of, and stop the service so that they also check the task queues and wait for tasks to actually finish.
- Updated “Distributed Tasks Manager” DTM application:
  - Improved task management and state definitions, including re-scheduling, retrying, and garbage removal at service start/stop.
  - Extended the client and management tools with the ability to get the task queue with the complete set of fields at run time.
  - Extended the management automation scripts that start, check the state of, and stop the service so that they also check the task queues and wait for tasks to actually finish.
- Fixed several bugs related to handling specific task states in the execution environment.