Demo Test Suite (DTS) based on end-user API bindings for PHP
This is readme.txt from DTS package archive hce-node-tests.zip.
Copyright (c) 2013 IOIX Ukraine, http://hierarchical-cluster-engine.com
Table of Contents
- 1. Introduction
- 1.1. About the demo set and test suite
- 1.2. Hierarchical Cluster Engine (HCE) usage and demo set
- 1.2.1. General description of cluster data modes and structure
- 1.2.2. Cluster start and stop
- 1.2.3. Cluster diagnostic tools
- 1.3. Hierarchical Cluster Engine (HCE) usage and test suite
- 1.3.1. Localhost 1-1-2 cluster complex functional tests
- 1.3.2. Localhost 1-1-2 cluster Sphinx search tests
- 1.3.3. Localhost 1-1-2 cluster DRCE exec tests
- 2. Installation
- 2.1. First time installation
- 2.2. Upgrade
- 2.3. Deinstallation and cleanup
- 3. Usage
- 3.1. Cluster structure models
- 3.2. Sphinx indexes and data schema
- 3.3. Cluster start and shutdown
- 3.4. Cluster state and management
- 3.5. Test suite for Sphinx indexation and search
- 3.5.1. Functional test ft01 – replica data nodes mode
- 3.5.2. Functional test ft02 – replica data nodes mode
- 3.5.3. Functional test ft02 – shard data nodes mode
Chapter 1. Introduction
Table of Contents
- 1.1. About the demo set and test suite
- 1.2. Hierarchical Cluster Engine (HCE) usage and demo set
- 1.2.1. General description of cluster data modes and structure
- 1.2.2. Cluster start and stop
- 1.2.3. Cluster diagnostic tools
- 1.3. Hierarchical Cluster Engine (HCE) usage and test suite
- 1.3.1. Localhost 1-1-2 cluster complex functional tests
- 1.3.2. Localhost 1-1-2 cluster Sphinx search tests
- 1.3.3. Localhost 1-1-2 cluster DRCE exec tests
1.1. About the demo set and test suite
The demo set and test suite for the PHP language is a mix of bash scripts, PHP scripts that use the PHP API for the HCE project, and some data prepared for the tests. Its main aim is to provide a simple and lightweight way to try the HCE functionality, to construct and bring up a simple network cluster, and to test it for productivity and stability in a single localhost server configuration.
The demo mode can be used to check common usage patterns and configuration settings, as well as to experiment with network- and CPU-dependent parameters.
Directory structure:
- /api/php/bin/ – PHP utilities, executable PHP scripts for the bash environment
- /api/php/cfg/ – bash include configuration definitions for tests sets
- /api/php/data/ – data files for Sphinx indexation and other data for the test suite
- /api/php/doc/ – documentation
- /api/php/inc/ – PHP language API includes, used by php utilities and external interfaces
- /api/php/ini/ – ini files for nodes
- /api/php/log/ – demo test suite run-time logs
- /api/php/manage/ – manage bash scripts to start, stop cluster and more
- /api/php/tests/ – demo test suite bash scripts, the main executable parts for functional operations of clusters with different structures
- /data/ – index data directory for Sphinx search indexes representation
- /etc/ – configuration templates and related files for Sphinx search functionality
- /log/ – logs directory of Sphinx searchd process instances
- /run/ – pid directory of Sphinx searchd process instances
1.2. Hierarchical Cluster Engine (HCE) usage and demo set
1.2.1 General description of cluster data modes and structure
The hce-node application can be tested on a simple pre-configured demo cluster running on one physical host server. The pre-configured set of settings, configuration files and parameters is provided by this demo test suite archive.
The demo set defines the basic cluster architecture 1-1-2-x. This means one router, one data (shard or replica) manager and two data (shard or replica) nodes that manage Sphinx indexes in the cluster:
[router node]
|
[data manager node]
/ \
[data node 1] [data node 2] [data node 3] [data node 4]
The cluster entry point is a router node. It uses the zmq API to receive client connections and requests and acts as a service with an internal network protocol. Many clients can connect to the router and make requests simultaneously.
The 1-1-2 structure can be used to simulate two types of data node sharding models – proportional and replication.
The proportional type assumes sharding between two nodes, each holding a unique set of document indexes for the Sphinx search engine. It also assumes multicast dispatching of requests to both data nodes. This simulates a cluster unit for a huge index that needs to be split into several parts. The search messages for one request are processed in parallel. The manager node collects results from all nodes and performs the reduce task, which for Sphinx search includes merging, sorting and duplicate removal. The unique results are returned as a response to the router node.
The replication type assumes mirroring of data between two nodes, i.e. complete duplication of data. It assumes round-robin (cyclic) dispatching of requests between the two data nodes. In this case the manager node does the same job as for the proportional type, but the number of responses collected is always one.
To switch between these two principal modes, the configuration parameter “MANAGER_MODE” can be set to the “smanager” or “rmanager” value in the
~/hce-node-tests/api/php/cfg/c112_cfg.sh
After the type of data node sharding model has been changed, the cluster needs to be restarted.
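A minimal sketch of the setting inside c112_cfg.sh (the exact layout of the shipped file may differ):
# replication: both data nodes hold mirrored indexes
MANAGER_MODE="rmanager"
# proportional sharding: documents are split between the data nodes
#MANAGER_MODE="smanager"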
1.2.2 Cluster start and stop
After installation of the hce-node package the main executable application is ready to use, but it needs correct parameters to start and construct a proper cluster structure. The demo test suite contains a complete set of bash scripts to manage it. They are located in the /manage directory:
- c112_localhost_start_all.sh – start cluster at localhost
- c112_localhost_stop_all.sh – stop cluster at localhost
Once the cluster has been started, it runs several instances of the hce-node application; for the localhost 1-1-2 cluster this is six instances (see section 1.2.3). The total number of instances at a given time depends on the cluster structural schema and state.
After the start script has finished, the state of each node can be checked via the logs located in the
~/hce-node-tests/api/php/log/n***.log
directory. The 1-1-2 cluster’s nodes are named as follows:
- n000 – router
- n010 – manager (shard or replica)
- n011 – data node admin 1
- n012 – data node admin 2
- n013 – data node searcher 1
- n014 – data node searcher 2
If everything is okay with TCP and all required TCP ports are available, the log file contains information about binding and connections as well as periodic statistics of the main indicators of node activity.
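For example, the router node's log can be inspected right after a start (a sketch; the exact file name follows the n***.log pattern above):
tail -n 50 ~/hce-node-tests/api/php/log/n000.log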
The TCP ports used for building the cluster architecture are defined in the configuration file for the corresponding cluster structure schema, for example, for 1-1-2:
~/hce-node-tests/api/php/cfg/c112_localhost_cfg.sh
The list of LAN ports is separated into three types:
- Router ports: ROUTER_SERVER_PORT, ROUTER_ADMIN_PORT and ROUTER_CLIENT_PORT
- Shard/replica manager node(s) ports: SHARD_MANAGER_ADMIN_PORT and SHARD_MANAGER_SERVER_PORT
- Replica node(s) ports: REPLICAS_MANAGE_ADMIN_PORTS and REPLICAS_POOL_ADMIN_PORTS
The replica node admin ports are separated into manage and pool ports. Manage ports are used for data index management commands, like Sphinx index commands, and for general node management. Pool ports are used for state verification and management of the searcher (load-balancing) node pool.
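An illustrative excerpt of these definitions, with port values taken from the default 1-1-2 schema shown in section 1.2.3 (the exact format of the list variables is an assumption):
ROUTER_ADMIN_PORT=5546
ROUTER_SERVER_PORT=5557
ROUTER_CLIENT_PORT=5556
SHARD_MANAGER_ADMIN_PORT=5549
SHARD_MANAGER_SERVER_PORT=5558
REPLICAS_MANAGE_ADMIN_PORTS="5540,5541"
REPLICAS_POOL_ADMIN_PORTS="5530,5531"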
The cluster stop script sends the shutdown command to all nodes via the admin port, but does not wait for the shutdown to finish. The current cluster state can be verified with any diagnostic tool script.
By default the 1-1-2 cluster is configured to have six nodes. See the chart schema diagnostic tool in section 1.2.3.
After all Demo Test Suite dependencies for the PHP language have been installed and configured properly (please see “Install Demo Test Suite Environment for PHP language” in the documentation section of the main site:
http://hierarchical-cluster-engine.com/install/
), the cluster can be started with the start-all-components script:
cd ~/hce-node-tests/api/php/manage/
./c112_localhost_start_all.sh
Before the node instances are started, the log directory is cleaned of logs from the previous start and all in-memory node processes are killed by name.
A started cluster can be stopped with the stop-all-components script:
./c112_localhost_stop_all.sh
The stop script calls the shutdown command for each node and starts the diagnostic script tool c112_localhost_status.sh. Each node instance starts an hce-node process and uses from one to three ports depending on its role. After a successful cluster stop, all node instances are shut down and the ports are freed.
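A typical start/verify/stop session might look like the following sketch (assuming the status script is run from the same manage directory):
cd ~/hce-node-tests/api/php/manage/
./c112_localhost_start_all.sh
./c112_localhost_status.sh      # every node instance should report its state
./c112_localhost_stop_all.sh
./c112_localhost_status.sh      # after shutdown no node instance should respond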
Some parts of the cluster can be started and shut down separately, making it possible to manage them during the cluster session. This can be used for tests and demo simulation of the state of particular cluster nodes in the router, shard manager or replica role. The dedicated management start scripts are:
c112_localhost_start_replicas_manage.sh
c112_localhost_start_replicas_pool.sh
c112_localhost_start_replicas.sh
c112_localhost_start_router.sh
c112_localhost_start_shard_manager.sh
and stop scripts are:
c112_localhost_stop_replicas.sh
c112_localhost_stop_router.sh
c112_localhost_stop_shard_manager.sh
For more detailed start/stop management, please see the manual for the PHP utility manager.php.
1.2.3. Cluster diagnostic tools
The demo test suite includes several diagnostic tools:
- c112_localhost_loglevel.sh – get and set log level of all nodes
- c112_localhost_manager_mode.sh – get or set the shard manager node mode
- c112_localhost_properties.sh – get properties of all handlers of all nodes
- c112_localhost_schema_chart.sh – get cluster schema in ASCII text chart format
- c112_localhost_schema_json.sh – get cluster schema in json format
- c112_localhost_status.sh – check status of all nodes instances processes
These tool scripts can be used to get additional information at run time.
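For example, a quick run-time check could be performed like this (a sketch; the diagnostic scripts are assumed to live next to the start/stop scripts in the manage directory, and all of them can be run without arguments):
cd ~/hce-node-tests/api/php/manage/
./c112_localhost_schema_chart.sh    # ASCII chart of the detected cluster structure
./c112_localhost_loglevel.sh        # current log level of all nodes
./c112_localhost_properties.sh      # full handlers properties dump in json format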
Logging of node state and message processing information can be done at three levels:
- 0 – minimal, includes only initialization and periodic statistical information.
- 1 – middle, also includes the request data contexts as well as additional information about handler states.
- 2 – maximal, also includes the complete data of all message fields and states, as well as additional information about the state of functional objects, like the Sphinx Search Functional Object.
The properties information is the set of handler-specific field values. Each field is either informational or a state field. State fields can be changed by additional API calls; many of them, like TCP ports, data mode, logging level and so on, can be changed by dedicated API and manager commands at run time. The properties information can be used for diagnostic and tracking purposes.
The cluster schema is the set of structural relations between all nodes detected at run time. It can be used to construct, check, verify and log the cluster structure. In the future this information can be used to restore the structure after faults, to create mirrors and so on. The ASCII chart schema for the default cluster configuration looks like:
n000
router
localhost
A[*:5546]
S[*:5557]
C[*:5556]
|
n010
rmanager
localhost
A[*:5549]
S[*:5558]
C[localhost:5557]
|
____________________
| |
n013 n014
replica replica
localhost localhost
A[*:5530] A[*:5531]
C[localhost:5558] C[localhost:5558]
n011 n012
replica replica
localhost localhost
A[*:5540] A[*:5541]
The first line of each node item is the node name.
The second line is the node mode.
The third line is the host of the scanned admin port; for the LAN cluster version, if the diagnostic tool is used in the same OS shell session, it is always “localhost”. The next lines A[], S[] and C[] represent the Admin, Server and Client connection ports, used to listen on or to connect to depending on the node role.
In the example displayed above, the node instances in all roles use the Admin ports 5546, 5549, 5530, 5531, 5540 and 5541 to listen for manager connections and admin command requests, like management, diagnostics or administration of a node or a Sphinx index. The S[] and C[] connection ports, however, are used differently depending on the role.
The router node uses the S[] port 5557 to listen for the rmanager node connection, and the C[] port 5556 to listen for connections with data command requests like Sphinx search or DRCE exec. So a node in router mode binds and listens on all three ports.
The rmanager node uses the S[] connection port to listen for replica node connections, and the C[] connection port to connect to the router node.
A replica node from the pool set uses the C[] connection port to connect to the rmanager or smanager node. A replica from the admin manage set does not use any connection ports other than A[].
So, the S[] and C[] connection ports are used to create and hold the cluster structure, while the A[] connection ports are used for management and diagnostics.
The status information tool c112_localhost_status.sh gets only the “Admin” handler’s fields. This is a bit faster than checking all handlers' fields with the properties information tool. This tool can be used to quickly check the nodes' state, for example after a start, a shutdown, a test set finish or for other reasons.
The complete run-time properties of the nodes' handlers can be fetched with the c112_localhost_properties.sh script. It gets all handlers' data from all cluster nodes and stores them in json format in the log file. This log file can be used to fetch properties one by one for a tracking tool like the Zabbix (c) system. For this purpose the fetcher tool zabbix_fetch_indicator.php is included. A test script to check that the fetcher works is also provided – zabbix_fetch_indicator.sh. It fetches the value of the “requestsTotal” property indicator of the Admin handler from the node at localhost:5541.
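A sketch of this tracking workflow (assuming both scripts are run from their location inside the package):
./c112_localhost_properties.sh   # stores all handlers properties in the json log file
./zabbix_fetch_indicator.sh      # test fetch of "requestsTotal" of the Admin handler at localhost:5541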
Any diagnostic tool script can be started at any time without command line arguments. Information is displayed on the console output or stored in the corresponding log file in the ~/hce-node-tests/api/php/log/ directory.
1.3. Hierarchical Cluster Engine (HCE) usage and test suite
1.3.1. Localhost 1-1-2 cluster complex functional tests
The demo test suite contains several minimal functional test sets to check cluster functionality, productivity on a concrete hardware platform, Sphinx search task usage, API tests and usage checks, and so on.
The functional tests are combined into complete sequences of actions that reflect typical usage of distributed data processing, like document indexation and search, as well as administrative commands and diagnostic tools.
The test suite contains functional test sets located in the
~/hce-node-tests/api/php/tests/*ft*.sh
files.
The *ft01*.sh test is a complete lifetime simulation of the Sphinx search cluster with replicated data nodes. The replication model is fully mirrored, so, independently of the manager mode (shard or replica), this test fills both data nodes with the same documents. This is important for understanding the search results and productivity values. The corresponding mode of the data nodes manager for this test is “replica manager”.
The life cycle of “ft01” includes execution of the following operations:
- create new index,
- fill index with branches document source files,
- fill index with schema,
- rebuild Sphinx indexes of all branches,
- merge all branches indexes to the trunk index,
- start index usage for Sphinx search,
- full text search tests,
- state diagnostic
And, after testing, cleanup actions like:
- stop index usage,
- remove index (including all data files and directories that were created)
After the cluster has been started, the “ft01” test suite unit can be started by executing the corresponding bash script; for example, for the 1-1-2 cluster it is:
~/hce-node-tests/api/php/tests/c112_localhost_ft01.sh
After it finishes, the execution logs can be checked. The logs are located in:
~/hce-node-tests/api/php/log/c112_localhost_ft01.sh.log
After the tests are done, any diagnostic tools or search tests can be started. When all kinds of tests or other actions have finished, the ft01 cleanup script needs to be executed once to clean up the index data from disk and return the cluster to its initial state; for example, for the 1-1-2 localhost cluster:
~/hce-node-tests/api/php/log/c112_localhost_ft01_cleanup.sh
As with the test suite unit results, the cleanup execution results can be checked in the logs:
~/hce-node-tests/api/php/log/c112_localhost_ft01_cleanup.log
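Putting the steps together, a complete ft01 session might look like the following sketch (paths as listed above):
cd ~/hce-node-tests/api/php/manage/
./c112_localhost_start_all.sh
cd ~/hce-node-tests/api/php/tests/
./c112_localhost_ft01.sh
less ~/hce-node-tests/api/php/log/c112_localhost_ft01.sh.log
# ...run any diagnostics or search tests here...
~/hce-node-tests/api/php/log/c112_localhost_ft01_cleanup.sh
less ~/hce-node-tests/api/php/log/c112_localhost_ft01_cleanup.log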
The *ft02*.sh test is a complete lifetime simulation of the Sphinx search cluster with the data node indexes filled from a data source directory; it supports both the shard and the replication data modes. The sharding method depends on how the cluster was started, i.e. which data mode the manager node uses. If the manager node uses shard mode (smanager), the indexes are filled sequentially, one xml document file after another, and all documents are distributed between several indexes, each managed by its own data node. If the manager node uses replica mode, the indexes are filled with the same documents as complete mirrors on all data nodes. The other operations of “ft02” are the same as for “ft01”. After “ft02” has finished, the logs can be checked to see how each stage finished. In the same way, after “ft02” has completed, all kinds of diagnostics or searches can be executed. To clean up the index data and return the nodes to their initial state, the “ft02” cleanup script needs to be executed.
After the cleanup script has executed, the data nodes are back in their initial state, no indexes exist any more and searches will always return empty results. Any other operations and commands for the Sphinx search index will return an error.
1.3.2. Localhost 1-1-2 cluster Sphinx search tests
The demo tests include the Sphinx index search tests. These tests cover two processing models – single-client and multi-client. The single-client model uses one client connection to execute a set of search requests. The searcher.php tool utility is used to execute the search, with several required parameters set by the bash script.
To execute a set of searches in single-client mode, the script:
~/hce-node-tests/api/php/tests/c112_localhost_search_test.sh
can be executed. The search results are located in the log file:
~/hce-node-tests/api/php/log/c112_localhost_search_test.sh.log
Different parameters, like the searched keyword, number of requests, number of results, filters and so on, can be changed inside this bash script via the QUERY_STRING_DEFAULT, REQUESTS, RESULTS and LOG_LEVEL variables.
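Illustrative values of these variables inside c112_localhost_search_test.sh (the defaults in the shipped script may differ):
QUERY_STRING_DEFAULT="engine"   # keyword to search for
REQUESTS=100                    # number of search requests to execute
RESULTS=10                      # max results returned per request
LOG_LEVEL=0                     # logging verbosity, 0 is minimal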
A multi-client (multiple thread) search can be started by:
~/hce-node-tests/api/php/tests/c112_localhost_search_test_multiclient.sh
The optional first parameter is the searched string; the default value is “for”.
The default number of clients is 16, requests per client 1000, max results per search 10, and log level 0. These settings are defined in the file as variables:
QUERIES=1000
RESULTS=10
CLIENTS=16
TIMEOUT=5000
SEARCH_STRING="for"
and can be changed before a run.
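For example, to run the multi-client search with a custom keyword (“engine” here is just an illustrative search string):
cd ~/hce-node-tests/api/php/tests/
./c112_localhost_search_test_multiclient.sh "engine"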
1.3.3. Localhost 1-1-2 cluster DRCE exec test #1
The demo tests for DRCE include a set of prepared requests in json format for different target execution environments (script programming languages, bash and binary executables).
If a request requires source code or a binary executable, it is stored as a separate file with the same name. It is read and placed inside the request json via the macro definition “READ_FROM_FILE:”. If the file is binary, it needs to be base64 encoded, which is indicated by setting the highest bit in the “action” mask. Please see the protocol specification document DRCE_Functional_object_protocol.pdf.
Three types of demo test scripts make it possible to execute a request in a single thread (from one run up to N iterations), to execute all prepared requests sequentially, and to execute one of the prepared requests in multi-threaded parallel mode.
The set of prepared requests is located in the:
~/hce-node-tests/api/php/data/c112_localhost_drce_ft01
directory. Each txt file contains a prepared request in json format according to the DRCE specification DRCE_Functional_object_protocol.pdf.
Each prepared request is enumerated by the suffix of its .txt file and can be addressed by this number; for example, the file c112_localhost_drce_json00.txt can be addressed as request 00, c112_localhost_drce_json01a.txt as 01a, and so on.
After the cluster has been started with the default configuration in balanced mode (shard manager mode is “rmanager”), a single prepared request, for example “00”, can be executed once in a single environment by:
cd ~/hce-node-tests/api/php/tests/
./c112_localhost_drce_test.sh 00
The default log level is 4 (maximal) and a log file with the corresponding name – c112_localhost_drce_test.sh.log – will be stored in the log directory. In case of successful execution the complete response message structure is logged, including the execution result stdout, the execution time and so on…
To execute all prepared requests use:
./c112_localhost_drce_test_all.sh
The execution time depends on the power of the hardware platform and the OS environment settings. The log file c112_localhost_drce_test_all.sh.log will contain all test responses in the same format as the single test.
To execute one prepared request several times and get a productivity report for a single thread, the parameter
REQUESTS=1
located in the file c112_localhost_drce_test.sh needs to be changed, for example to:
REQUESTS=100
to execute the specified prepared request 100 times sequentially. For multiple sequential executions the log level can be set to 0 via the LOG_LEVEL parameter.
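A single-thread productivity run might then look like this sketch (prepared request “00” is used as an example):
cd ~/hce-node-tests/api/php/tests/
# with REQUESTS=100 and LOG_LEVEL=0 set in c112_localhost_drce_test.sh:
./c112_localhost_drce_test.sh 00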
To execute one prepared request by multiple clients, use:
./c112_localhost_drce_test_multiclient.sh 00
The default number of clients is 16, requests per client 1000, and log level 0. This starts 16 processes of the DRCE client utility drce.php. The execution state can be evaluated via cluster properties statistics, node logs and the CPU utilization level. Each instance of drce.php creates its own log file, for example drce_client.1.log, drce_client.2.log and so on, up to drce_client.16.log. The logs contain the same information as the single-thread result. The number of requests, number of clients, timeout delay and log level can be set by the variables:
QUERIES=1000
CLIENTS=16
TIMEOUT=5000
LOG_LEVEL=0
This test can be used to evaluate the platform's power and its ability to process parallel tasks with the target execution engine.
1.3.4. Localhost 1-1-2 cluster DRCE exec test #2
This test set is a list of algorithms taken from the project: http://benchmarksgame.alioth.debian.org/
Languages covered:
- C
- C++
- PHP
- Python
- Ruby
- Perl
- Java
Algorithms list:
Dependencies:
BC Math:
sudo apt-get install bc
~/hce-node-tests/api/php/data/c112_localhost_drce_ft02/binarytrees.gcc-7.c
Requires the libapr1-dev package, which can be installed by:
sudo apt-get install libapr1-dev
~/hce-node-tests/api/php/data/c112_localhost_drce_ft02/pidigits.c
sudo apt-get install libgmp-dev
~/hce-node-tests/api/php/data/c112_localhost_drce_ft02/pidigits.gpp-3.c++
sudo apt-get install libboost-dev
~/hce-node-tests/api/php/data/c112_localhost_drce_ft02/pidigits.php-5.php
sudo apt-get install php5-gmp
Add extension=gmp.so to the /etc/php/php.ini file.
~/hce-node-tests/api/php/data/c112_localhost_drce_ft02/pidigits.python3-2.py
sudo apt-get install python-setuptools
sudo easy_install pip
sudo pip install virtualenv
sudo apt-get install python-dev
sudo apt-get install libmpfr-dev
sudo apt-get install libmpc-dev
sudo pip install gmpy2 --global-option=build_ext --global-option=-Ddir=/home/case/local
sudo apt-get install python-gmpy
~/hce-node-tests/api/php/data/c112_localhost_drce_ft02/pidigits.perl-4.perl
sudo perl -MCPAN -e 'install Net::SSH::Perl'
to be continued…