
Swarm Watchdog tool

Description

The watchdog is a Perl script that launches and maintains Whisperers attached to any service of a Docker Swarm cluster.

  • Runs in the background and creates/checks a lock file in /var/lock
  • Reads the configuration file given as its first argument and continuously checks it for updates
  • Checks the swarm services to monitor and restarts missing Whisperers every minute
  • Outputs progress and errors to standard output
  • Can easily be run from cron every x minutes (it is idempotent, and the lock file ensures a single instance)
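The single-instance behaviour described above can be illustrated with a minimal Python sketch. It is not the Perl script's actual code: the function name `acquire_lock` is hypothetical, and using an advisory `flock` is an assumption about how such a lock file can be enforced on Linux.

```python
import fcntl
import os
import tempfile


def acquire_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "a+")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


# Demo with a temporary path (the real script uses /var/lock).
demo_path = os.path.join(tempfile.mkdtemp(), "watchdog-demo.lock")
first = acquire_lock(demo_path)   # first launch gets the lock
second = acquire_lock(demo_path)  # simulates a second cron launch
print("first instance holds lock:", first is not None)
print("second instance refused:", second is None)
first.close()  # closing the file releases the lock
```

With this pattern a cron job can start the watchdog as often as desired: a launch that finds the lock held simply exits, which is what makes repeated scheduling safe.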

Configuration

{
  "dockerTasks": [
    {
      "master": "node-1.streetsmart.sit1",
      "name": "itproduction_gateway",
      "whispererMesh": "SIT1.json",
      "version": "latest"
    },
    {
      "master": "node-1.streetsmart.xct3",
      "name": "terminal_devicemonitoring_service",
      "whispererMesh": "XCT3-DM.json",
      "version": "latest"
    },
    {
      "master": "node-1.streetsmart.spt1",
      "name": "itproduction_gateway",
      "whisperers": [
        "spt1-w1-config.json",
        "spt1-w2-config.json",
        "spt1-w3-config.json",
        "spt1-w4-config.json",
        "spt1-w5-config.json",
        "spt1-w6-config.json",
        "spt1-w7-config.json"
      ]
    }
  ]
}

First, specify the service to monitor:

  • master: Manager swarm node to connect to
  • name: Name of service to monitor

Then, specify whether all replicas of the service should be monitored with the same Whisperer configuration or with different ones.
The second option lets you monitor only one, or a limited number of, instances, which is useful for high-load services.

  • whispererMesh: Whisperer config file to use when a single Whisperer configuration covers all service replicas
  • whisperers: Array of config files to use when only a limited number of replicas should be monitored
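To illustrate the limited-replica mode, here is a small Python sketch of how unmonitored replicas could be paired with unused config files. The page does not document the script's actual assignment logic, so the function `plan_launches` and its matching rule are assumptions made for illustration only.

```python
def plan_launches(containers, monitored_by, configs):
    """Pair unmonitored containers with config files that are not in use.

    containers:   container IDs of the service replicas
    monitored_by: dict mapping container ID -> config file already attached
    configs:      config files listed under "whisperers"
    """
    in_use = set(monitored_by.values())
    free_configs = [c for c in configs if c not in in_use]
    unmonitored = [c for c in containers if c not in monitored_by]
    # zip stops at the shorter list, so replicas beyond the number of
    # available config files stay unmonitored.
    return list(zip(unmonitored, free_configs))


# Three replicas, two config files, one replica already covered.
plan = plan_launches(
    containers=["57d505221501", "e6d04ffe222e", "bc5105f37eb7"],
    monitored_by={"e6d04ffe222e": "spt1-w2-config.json"},
    configs=["spt1-w1-config.json", "spt1-w2-config.json"],
)
print(plan)  # [('57d505221501', 'spt1-w1-config.json')]
```

The number of entries in the array caps the number of monitored replicas, which is the point of this mode for high-load services.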

Finally, you may specify the Whisperer version.

  • latest selects the most recent version
  • this setting is used to roll out new versions gradually
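The configuration rules above can be checked before handing a file to the watchdog. The following Python sketch is a hypothetical helper, not part of the tool. It assumes that `master` and `name` are required and that exactly one of `whispererMesh` or `whisperers` must be present, which is how the description reads but is not stated as a strict rule.

```python
import json


def check_tasks(config_text):
    """Parse the config and return one (name, mode, version) tuple per task."""
    config = json.loads(config_text)
    summary = []
    for task in config["dockerTasks"]:
        for key in ("master", "name"):
            if key not in task:
                raise ValueError(f"task is missing required key {key!r}")
        has_mesh = "whispererMesh" in task
        has_list = "whisperers" in task
        # Assumption: the two monitoring modes are mutually exclusive.
        if has_mesh == has_list:
            raise ValueError(
                f"{task['name']}: set exactly one of whispererMesh or whisperers"
            )
        mode = "mesh" if has_mesh else "limited"
        # version is optional; None means it was not specified.
        summary.append((task["name"], mode, task.get("version")))
    return summary


sample = """
{"dockerTasks": [
  {"master": "node-1.streetsmart.sit1", "name": "itproduction_gateway",
   "whispererMesh": "SIT1.json", "version": "latest"},
  {"master": "node-1.streetsmart.spt1", "name": "itproduction_gateway",
   "whisperers": ["spt1-w1-config.json", "spt1-w2-config.json"]}
]}
"""
for row in check_tasks(sample):
    print(row)
```

Running a check like this first surfaces typos in key names early, since the watchdog itself only reports "Parsing config... Ok" or an error at run time.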

In the same folder, you must place the Whisperer API key file, which you download from the Installation tab of Whisperers in the UI.

InstallationTab.png

Whisperers can be attached to API gateways or to a single service, depending on which communications you want to capture: edge communications, intra-cluster traffic, or service-specific traffic (to databases, for instance).

Limitations

For now the script has some limitations:

  • It doesn't kill previous instances of a Whisperer when that Whisperer is reassigned to a different service
    • When you change the service to monitor, you must kill the old Whisperers yourself or restart the service
  • It doesn't change the version of a Whisperer by itself

Execution

$ ./whispererService.pl config.json

The script reports what it does during execution:

Putting lock file /var/lock/whispererService.lock... Ok

---------------------------------------------------------------------------
Running at: 2023-07-31 14:42:32
Opening config... Ok
Parsing config... Ok

Task to monitor: node-1.streetsmart.sit1 - itproduction_gateway
Found 3 instance(s) on 3 node(s)
Node node-1.streetsmart.sit1 - Container itproduction_gateway.x1jfnkiiyvs6hor9eq6i963px.85h180e33baejhq7c2bh9xuym (8624aa285fd2)
Node node-2.streetsmart.sit1 - Container itproduction_gateway.qfzjxm8c4oxivkrwulzaeam3a.mcb4use277kiepo2guoeq325e (fc1d1a4b60d5)
Node node-3.streetsmart.sit1 - Container itproduction_gateway.ukjz1hn1e8rvdl09kpnyi5pg1.y34hadcappcq2lo2snce4o96w (cc353475bd9d)

Unlimited instances with same conf
Launching Whisperer for itproduction_gateway.x1jfnkiiyvs6hor9eq6i963px.85h180e33baejhq7c2bh9xuym (8624aa285fd2)
Loading conf SIT1-Kube.json
Getting list of services
126 services found
Launching Whisperer on node-1.streetsmart.sit1
7b45125579250d73b6a89962bde0c6bc2683dce9b8fa82766d951ef0e1f733aa
Launching Whisperer for itproduction_gateway.qfzjxm8c4oxivkrwulzaeam3a.mcb4use277kiepo2guoeq325e (fc1d1a4b60d5)
Loading conf SIT1-Kube.json
Getting list of services
126 services found
Launching Whisperer on node-2.streetsmart.sit1
d88cff764d8cdd7014f34b48488bffe5cdbd1ceb9db01a62480efcb887d170f9
Launching Whisperer for itproduction_gateway.ukjz1hn1e8rvdl09kpnyi5pg1.y34hadcappcq2lo2snce4o96w (cc353475bd9d)
Loading conf SIT1-Kube.json
Getting list of services
126 services found
Launching Whisperer on node-3.streetsmart.sit1
40360831adf94fe63cc6d4028457b8d54d247023733f5667dcd62c7409e669eb

Task to monitor: node-1.streetsmart.spt1 - itproduction_gateway
Found 6 instance(s) on 6 node(s)
Node node-1.streetsmart.spt1 - Container itproduction_gateway.l0r90zvxf5xgml1fh0s1f64pc.lqs5uae9pp1x9sldfsuxibf1e (57d505221501)
-> Monitored by 3bed80b004a7 (SPT1.json)
Node node-2.streetsmart.spt1 - Container itproduction_gateway.n55csr2ls6ljp6h24duu7v03t.p6apc9x3ud8gxgnztnf23q87j (e6d04ffe222e)
-> Monitored by a9bbfb557262 (spt1-w2-config.json)
Node node-3.streetsmart.spt1 - Container itproduction_gateway.9tvh7e5w7bp4e9l7swb88enpx.svqx66boxszem5t3owp45tfle (bc5105f37eb7)
-> Monitored by e0b262ffcd02 (spt1-w3-config.json)
Node node-4.streetsmart.spt1 - Container itproduction_gateway.ug40repzf53z2knwfl9d6kds8.h2lch4dmnrwsklrpu0m2q8ejb (a4aee3d09953)
-> Monitored by f71acbca17c6 (spt1-w4-config.json)
Node node-5.streetsmart.spt1 - Container itproduction_gateway.573oigdg4l01yqqm0um2sedpp.lw2uqaf94gzz2fyovr1sqokel (5ccce065919e)
-> Monitored by 4cec348d9a35 (spt1-w5-config.json)
Node node-6.streetsmart.spt1 - Container itproduction_gateway.fxuit9okiqn5z4xrbmhqxe7fi.npxdnyoz8nvm04m8xjq7v7tkw (962a7aae90e1)
-> Monitored by ea63f2916bf2 (spt1-w1-config.json)

No new Whisperer to launch.

All configs done, waiting for 2 min for new check

...
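Because the watchdog writes everything to standard output, its log can be post-processed. The Python sketch below parses the line formats shown in the run above ("Task to monitor", "Node ... Container", "-> Monitored by") and reports which containers had a Whisperer attached at the time of the check. The function name `summarize` and the regular expressions are illustrative and rely only on the sample output; other log lines the script may print are ignored.

```python
import re

TASK_RE = re.compile(r"^Task to monitor: (\S+) - (\S+)$")
NODE_RE = re.compile(r"^Node (\S+) - Container \S+ \((\w+)\)$")
MONITORED_RE = re.compile(r"^-> Monitored by (\w+) \((\S+)\)$")


def summarize(log_text):
    """Return {(master, service): {container_id: config file or None}}."""
    result = {}
    current = None
    last_container = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if m := TASK_RE.match(line):
            current = result.setdefault((m.group(1), m.group(2)), {})
            last_container = None
        elif (m := NODE_RE.match(line)) and current is not None:
            last_container = m.group(2)
            current[last_container] = None  # unmonitored until proven otherwise
        elif (m := MONITORED_RE.match(line)) and last_container is not None:
            current[last_container] = m.group(2)
    return result


sample_log = """Task to monitor: node-1.streetsmart.spt1 - itproduction_gateway
Found 2 instance(s) on 2 node(s)
Node node-1.streetsmart.spt1 - Container itproduction_gateway.aaa.bbb (57d505221501)
-> Monitored by 3bed80b004a7 (SPT1.json)
Node node-2.streetsmart.spt1 - Container itproduction_gateway.ccc.ddd (e6d04ffe222e)
"""
for task, containers in summarize(sample_log).items():
    print(task, containers)
```

A container mapped to `None` was listed without a "Monitored by" line. In the sample run that is the state of the SIT1 containers just before their Whisperers are launched.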

Source code

Gitlab