Monitoring Sample

This sample script shows a Python REST client for Commander and walks through the workflow of monitoring and repair. To run it on a schedule, add it to crontab.

One important use case of Commander is monitoring an HTTP server cluster. Because Commander runs as a web service that can quickly send requests to the whole cluster, you can write a simple script that invokes a Commander command to get the health status of the cluster, then set up a crontab job to run that script regularly.
Assume we have a group of HTTP servers on hundreds of machines, where each machine reports its real-time health status in response to the same HTTP request (see "send same request to different servers using REST Commander"). We want to use Commander to quickly find the unhealthy machines in the group and apply an appropriate remediation (kill the process, restart the server, etc.) to recover them.

3.1 Define the Request

In the example above, you first need to define the following parameters in the HTTP request sent to the REST Commander web service:

1. The target server group, which contains all the machines to be monitored;
2. The HTTP request used to access the health status page on each server;
3. A regular expression to extract the health status from the server response.

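The composed request looks like the following. The value of "newAggregationExpression" is the URL-encoded form of the regular expression .*<td>Server-Is-Healthy</td>\s*<td>(.*?)</td>[\s\S]*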
POST URL: http://localhost:9000/commands/generateUpdateSendAgentCommandToAdhocNodeGroup
POST BODY:
{
   "targetNodes":[
      "www.restcommander.com",
      "www.jeffpei.com",
      "www.yangli907.com"
   ],
   "willAggregateResponse":true,
   "useNewAggregation":true,
   "agentCommandType":"GET_VALIDATE_INTERNAL",
   "aggregationType":"PATTERN_PARSE_MONITOR_HEALTH",
   "newAgentCommandLine":"GET_VALIDATE_INTERNAL GET http 80 /validateInternals.html 0 0 5000 SUPERMAN_GLOBAL",
   "newAggregationExpression":".%2A%3Ctd%3EServer-Is-Healthy%3C%2Ftd%3E%5Cs%2A%3Ctd%3E%28.%2A%3F%29%3C%2Ftd%3E%5B%5Cs%5CS%5D%2A",
   "useNewAgentCommand":"true"}
            
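The request body above can also be assembled in Python rather than written by hand. The following is a minimal sketch: the helper name build_health_request is hypothetical, and the field values are copied from the example body. It keeps the regular expression in readable form and percent-encodes it with urllib.parse.quote, which reproduces the encoded string shown above.

```python
import json
from urllib.parse import quote

# Endpoint taken from the example above; adjust host and port for your setup.
COMMANDER_URL = (
    "http://localhost:9000/commands/"
    "generateUpdateSendAgentCommandToAdhocNodeGroup"
)

# Readable form of the aggregation regex; the group captures the health value.
HEALTH_REGEX = r".*<td>Server-Is-Healthy</td>\s*<td>(.*?)</td>[\s\S]*"


def build_health_request(target_nodes):
    """Return the health-check request body as a dict (hypothetical helper)."""
    return {
        "targetNodes": list(target_nodes),
        "willAggregateResponse": True,
        "useNewAggregation": True,
        "agentCommandType": "GET_VALIDATE_INTERNAL",
        "aggregationType": "PATTERN_PARSE_MONITOR_HEALTH",
        "newAgentCommandLine": (
            "GET_VALIDATE_INTERNAL GET http 80 /validateInternals.html "
            "0 0 5000 SUPERMAN_GLOBAL"
        ),
        # safe="" also encodes "/", matching the %2F in the example body.
        "newAggregationExpression": quote(HEALTH_REGEX, safe=""),
        # Kept as a string because the example body sends it as a string.
        "useNewAgentCommand": "true",
    }


body = build_health_request(
    ["www.restcommander.com", "www.jeffpei.com", "www.yangli907.com"]
)
print(json.dumps(body, indent=3))
```

Keeping the regular expression unencoded in the script makes it easier to change the attribute being matched without re-encoding by hand.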

3.2 Send Request to Aggregate Cluster Health Status

The next step is to send the request defined above to the REST Commander web service and extract the list of unhealthy servers from the Commander response.
Server response, aggregated on the "Server-Is-Healthy" attribute:
{
   "aggregationMap":{
      "False":"1",
      "True":"2"
   },
   "aggregationValueToNodesList":[
      {
         "isError":false,
         "nodeList":[
            "www.yangli907.com"
         ],
         "value":"False"
      },
      {
         "isError":false,
         "nodeList":[
            "www.jeffpei.com",
            "www.restcommander.com"
         ],
         "value":"True"
      }
   ]
}
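
Sending the request and reading this response could look like the sketch below. Both function names are hypothetical. It assumes Commander accepts the body as JSON with a Content-Type of application/json, and it treats any group whose value is not "True", or whose isError flag is set, as unhealthy; adjust both assumptions to your deployment.

```python
import json
import urllib.request


def send_to_commander(url, body, timeout=30):
    """POST the JSON body to Commander and return the parsed JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        # Assumption: the service expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def unhealthy_nodes(reply, healthy_value="True"):
    """Collect the nodes from every group that is not reported healthy."""
    nodes = []
    for group in reply.get("aggregationValueToNodesList", []):
        # Assumption: error groups count as unhealthy too.
        if group.get("isError") or group.get("value") != healthy_value:
            nodes.extend(group.get("nodeList", []))
    return sorted(nodes)


# The sample response shown above, used here instead of a live call.
sample_reply = {
    "aggregationMap": {"False": "1", "True": "2"},
    "aggregationValueToNodesList": [
        {"isError": False, "nodeList": ["www.yangli907.com"], "value": "False"},
        {
            "isError": False,
            "nodeList": ["www.jeffpei.com", "www.restcommander.com"],
            "value": "True",
        },
    ],
}

print(unhealthy_nodes(sample_reply))  # prints ['www.yangli907.com']
```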
          

3.3 Recover the Unhealthy Servers

Once the unhealthy servers have been identified, we can use the Commander web service to send another HTTP request to recover them, for example by killing the problematic process or restarting the server. The process of defining this request and sending it to the REST Commander web service is similar to that of the request above.
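
The overall check-then-recover flow, as a crontab job would run it once per invocation, can be sketched as follows. The function run_monitor_cycle and its two callable parameters are hypothetical. The recovery request itself is not specified here, so it is left to a caller-supplied function, and stand-in functions replace the real Commander calls in the demonstration.

```python
def run_monitor_cycle(check_health, recover):
    """Run one monitoring pass and return the unhealthy nodes found.

    check_health: callable returning a list of unhealthy node names.
    recover: callable that receives that list and sends the recovery request.
    """
    bad_nodes = check_health()
    if bad_nodes:
        # Only contact Commander again when there is something to repair.
        recover(bad_nodes)
    return bad_nodes


# Stand-ins for the real Commander calls, for demonstration only.
recovered = []


def fake_check():
    return ["www.yangli907.com"]


def fake_recover(nodes):
    recovered.extend(nodes)


found = run_monitor_cycle(fake_check, fake_recover)
print(found, recovered)
```

Passing the two steps in as callables keeps the scheduling logic independent of the exact recovery command, which depends on how the servers are managed.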