Servers status

Description

This dashboard provides a status of the servers hosting the cluster and its datastores: CPU, RAM, Disk…

Screenshot

ServersScreen.png

Content

Applicative nodes CPU usage - timed graph

Shows the CPU usage of each node involved in the applicative cluster.

  • Can exceed 100% when a node has multiple cores
  • Stability is key
  • Same usage on each node is preferred
  • Target is below 75% * number of cores

When using Kubernetes, Spider may not be the only application running in the cluster. This graph shows the CPU used by all running applications, on all nodes.
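The "75% * number of cores" target can be worked out per node. The sketch below is only an illustration of that arithmetic: the function names and the sample figures are hypothetical, and the core count is assumed to be known for each node.

```python
def cpu_target_percent(cores: int, per_core_target: float = 75.0) -> float:
    """Return the CPU target, in percent, for a node with `cores` cores.

    The 75% per-core figure comes from the guideline above; the function
    name and signature are hypothetical.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return per_core_target * cores


def is_within_target(usage_percent: float, cores: int) -> bool:
    """True when the measured usage stays below the target for that node."""
    return usage_percent < cpu_target_percent(cores)


if __name__ == "__main__":
    # Hypothetical readings: (number of cores, CPU usage in percent).
    for cores, usage in [(4, 250.0), (4, 320.0), (8, 320.0)]:
        status = "ok" if is_within_target(usage, cores) else "above target"
        print(f"{cores} cores at {usage:.0f}%: "
              f"target {cpu_target_percent(cores):.0f}% -> {status}")
```

With these sample values, a 4-core node has a 300% target, so 250% is within target and 320% is not, while the same 320% is fine on an 8-core node.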

ServersScreen-ApplicativeNodesCPU.png

Applicative nodes free RAM - timed graph

Shows the free RAM of each node involved in the applicative cluster.

  • This includes the operating system caching mechanism, so free RAM could be rather low
  • Stability is preferred
  • Same usage on each node is preferred

ServersScreen-ApplicativeNodesRAM.png

Servers items - aggregated grid

Shows major metrics for each cluster node, and their trend:

  • CPU
  • RAM
  • Disk (when on Swarm)
  • Count of warning and error logs

This grid presents raw figures and is mainly designed for exporting data.

ServersScreen-ServersItems.png

Services CPU usage (sum of all replicas) - timed graph

Shows the summed CPU usage of all replicas for each service.

  • Makes it easy to find the most demanding services and scale them
  • Helps track unusual behavior

We can see that the most demanding ones are:

  • WebWrite, which aggregates the packets of a TCP session and parses them
  • PackRead, which gives packets to WebWrite
  • PackWrite, which receives and parses packets from Whisperers
  • TcpUpdate, which provides TCP sessions to parse and updates them after each parsing
  • TcpWrite, which receives TCP sessions from Whisperers

ServersScreen-ServicesCPU.png

Services average RAM usage - timed graph

Shows the average RAM usage of all replicas for each service.

  • Most services are using between 90 and 150 MB of RAM per replica
  • Stability is the target

ServersScreen-ServicesRAM.png

Dockers CPU usage (sum of all replicas) - timed graph

Shows the summed CPU usage of all Docker containers for each service.

It provides a view of the cluster usage that is not limited to services. You'll also find:

  • Elasticsearch
  • Redis
  • Traefik
  • Metricbeat
  • Filebeat

It makes any unexpected usage visible.

For instance, Filebeat uses a lot of CPU because there is one instance on each node, and it tracks all logs, not only Spider's.

ServersScreen-DockersCPU.png

Dockers average RAM usage - timed graph

Shows the average RAM usage of all Docker containers for each service.

Unsurprisingly, Elasticsearch uses far more than all the others.

ServersScreen-DockersRAM.png

Services items - aggregated grid

Shows major metrics for each service instance / pod, and their trend:

  • CPU
  • RAM
  • Count of warning and error logs

This grid presents raw figures and is mainly designed for exporting data.

ServersScreen-ServicesItems.png

Redis used RAM - timed graph

Shows the memory usage of Redis over time.

  • When the processing by pollers and parsers is too slow, Redis accumulates data and can reach its maximum (1 GB by default)
  • The more stable, the better.
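As a rough illustration of watching how close Redis gets to its maximum, the sketch below computes the fraction of the limit in use from a used-memory reading. It assumes the 1 GB default is 1024³ bytes, and the 80% warning threshold and the function names are hypothetical, not Spider settings.

```python
# Assumption: the "1 GB" default maximum is taken as 1024**3 bytes.
DEFAULT_MAX_BYTES = 1024 ** 3


def memory_usage_ratio(used_bytes: int, max_bytes: int = DEFAULT_MAX_BYTES) -> float:
    """Return the fraction of the Redis maximum currently in use."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    if used_bytes < 0:
        raise ValueError("used_bytes cannot be negative")
    return used_bytes / max_bytes


def is_near_limit(used_bytes: int, threshold: float = 0.8) -> bool:
    """True when usage exceeds the (hypothetical) warning threshold."""
    return memory_usage_ratio(used_bytes) > threshold


if __name__ == "__main__":
    # Hypothetical readings in bytes.
    for used in (512 * 1024 ** 2, 900 * 1024 ** 2):
        ratio = memory_usage_ratio(used)
        flag = "near limit" if is_near_limit(used) else "ok"
        print(f"{used} bytes used: {ratio:.0%} of maximum ({flag})")
```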

ServersScreen-RedisRAM.png

ES nodes free RAM - timed graph

Shows the free RAM for each Elasticsearch node / pod.

  • The lower, the better.
  • The more stable, the better.

ServersScreen-ESNodesRAM.png

Redis CPU usage - timed graph

Shows the CPU usage of the Redis database instances.

  • Nothing special to say... it is so small!
  • I usually observe a roughly proportional ratio of 1% CPU per 1000 req/s of load.
  • The number of instances and what they host is configurable
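The observed ratio above gives a quick way to estimate the expected Redis CPU for a given load. This is a minimal sketch of that proportional rule only; the function name is hypothetical, and the ratio is an observation rather than a guarantee.

```python
def estimated_redis_cpu_percent(requests_per_second: float,
                                percent_per_1000_rps: float = 1.0) -> float:
    """Estimate Redis CPU usage from load, using the observed
    ratio of about 1% CPU per 1000 req/s."""
    if requests_per_second < 0:
        raise ValueError("requests_per_second cannot be negative")
    return requests_per_second / 1000.0 * percent_per_1000_rps


if __name__ == "__main__":
    # Hypothetical loads in requests per second.
    for load in (500, 1000, 25000):
        cpu = estimated_redis_cpu_percent(load)
        print(f"{load} req/s -> about {cpu:.1f}% CPU")
```

A reading far above this estimate on the graph would be a reason to look more closely at what the Redis instances are doing.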

ServersScreen-RedisCPU.png

ES nodes CPU usage - timed graph

Shows the CPU usage of Elasticsearch inside each node / pod of the Elasticsearch cluster.

  • Maximum at 100% for each node.
  • Stability is key

ServersScreen-ESNodesCPU.png

Elasticsearch heap used - timed graph

Shows the JVM heap used by Elasticsearch on each node.

  • Should not reach the limit (configurable)

ServersScreen-ESNodesHeap.png

Elasticsearch disk used - timed graph

Shows the disk used on each ES node.

  • Should not reach the limit (configurable)

ServersScreen-ESNodesDisk.png