Skip to main content

Whisperers status

Description

This dashboard provides a status of Whisperers clients: state, uploaded data, quality of parsing, cpu, ram, queues, circuit breakers…

Screenshot

WhisperersScreen.png

Content

Whisperers status - timed chart

Shows the status of all Whisperers connected to the server.

  • Statuses values are:
    • Starting - Whisperer just got deployed
    • Recording - Capture is in progress
    • Stopped - Capture is paused
    • Invalid_Config - Configuration needs a fix to allow Whisperer to start
    • Internal_Error - You found a bug!
    • Server_Down - Whisperer can't get configuration

WhisperersScreen-WhisperersCurrentStatus.png

Whisperers upload to server - timed chart

Shows data uploaded from the Whisperers to the server, in MB, and split by Whisperer.

Allows to quickly find Whisperers that upload most data.

WhisperersScreen-DataUploaded.png

Whisp current status - items grid

Shows current Whisperers status for all Whisperers and Instances.

  • Whisperer and instance name
    • Record is marked aggregated when the status record is the aggregated result of all instances
  • Whisperer version
  • Whisperer start, host monitored and uptime
  • Session start and duration
  • Last update
  • CPU, RAM
  • Payload sent and errors

This grid allows checking that all Whisperers are up-to-date, well connected, well behaving and if errors are present.

WhisperersScreen-WhispCurrentStatus.png

Tcp parsing status - timed chart

Shows the count of Tcp sessions, grouped by their parsing status, overt time.

Tcp session may be in:

  • Waiting or Pending, when they are queued for parsing
  • Incomplete when parsing is in progress, and session not closed
  • Warning when parsing failed but there is to be a second trial
  • Error or OK, when parsing is finished

The less red, the better ! :)
Errors could have many factors, but mainly: CPU contention on clients or servers, resulting in missing packets when parsing is done.

WhisperersScreen-TcpParsingStatus.png

Whisperers - items grid

Lists all Whisperers in the system and give access to their raw configuration.

WhisperersScreen-Whisperers.png

Whisperers total CPU usage - timed chart

Shows the sum of CPU usage of all replicas of the same Whisperer, over time.

WhisperersScreen-WhisperersTotalCpuUsage.png

Whisperers avg used RAM - timed chart

Shows the average RAM used for each Whisperer, across all replicas.

WhisperersScreen-WhisperersAvgUsedRAM.png

Instances CPU usage - timed chart

Shows CPU usage of all connected Whisperers instances.

  • Should be low ;)
  • The more packets captured and parsed, the more CPU usage.
    • Captured packets can be limited by PCAP filter
    • Parsed packets can be limited by Hostname blacklisting in configuration
    • A circuit breaker on CPU usage can be set to pause Whisperers when too high load
  • Classic usage: between 3 and 10%

WhisperersScreen-InstancesUsedCPU.png

Instances used RAM - timed chart

Shows RAM usage of all connected Whisperers instances.

Classic usage:

  • 120 MB when capturing and server responding
  • 80 MB when stopped

WhisperersScreen-InstancesUsedRAM.png

Queues length - timed chart

Shows the evolution of the sending queue of Whisperers.

  • 2 queues: Packets and Tcp sessions
  • Whisperer may send in // to Spider server (configuration).
  • When a Whisperer has too many requests to send to server, it stores them in a queue, waiting for next available slot for sending.
  • When items are in the queue, it means either:
    • The server is getting slow and has issues
    • The Whisperer is under high pressure of packets to capture

WhisperersScreen-QueuesLength.png

Queues overflow - timed chart

Tracks sending queues overflow over time.

  • 2 queues: Packets and Tcpsessions
  • When the queue is full, new items are discarded and never sent.
    • This causes parsing issues due to missing packets (most often)

This won't happen if the Whisperers and Servers are correctly scaled ;)

WhisperersScreen-QueuesOverflow.png

Services speed from Whisperers - timed chart

Shows the evolution of response time of Spider endpoints, as seen from the Whisperers point of view.

  • 30 ms is what you would expect on local network / same Availability Zone.
  • The lower, the better.

If the response time is bad add more service replicas or server nodes as needed.

WhisperersScreen-ServicesSpeedFromWhisperer.png

Active circuit breakers - timed chart

Shows when Whisperers have active circuit breakers.

  • When a Whisperer cannot connect to the server, or fails sending data (time out, mostly), a circuit breaker opens, and the Whisperer stops trying for some time.
    • Data is lost
  • This can happen when:
    • CPU on the host the Whisperer is in is heavy loaded
    • There are network issues
    • Server is not scaled big enough
    • Server is partially down
      • When server is completely down, the Whisperer stops its capture and waits for it to get back up again

WhisperersScreen-ActiveCircuitBreakers.png

Whisperers status - items grid

Shows all status sent by Whisperers

  • Items are pre filtered on those having errors

WhisperersScreen-WhisperersStatus.png

Hosts items - items grid

Lists hosts resources sent by Whisperers.

  • Hosts resources tracks the name resolving of Hosts seen by Whisperers
    • Start and stop of capture for each host
    • Dns names
    • Custom names set by users on UI or by parsing configuration
    • Position on map (if fixed)
  • An host resource is updated at regular interval, and a new one is created only when an host changes IP or Dns name

This grid is mostly used for debugging.

WhisperersScreen-HostsItems.png

Hosts stats items - aggregated grid

Shows statistic on Hosts resources for each Whisperer over the period.

  • If, over a couple of hours, a Whisperer has too many Hosts records, with a very short average duration, it means that:
    • Names of hosts is not stable
      • For instance Docker Swarm had a bug many years ago in reverse DNS of hosts. Often, the id of the Docker is returned instead of the name of the service replica.
    • Name resolving of IPs on the UI may fail
      • The UI limit its load to 99 Hosts resources at once.

WhisperersScreen-HostsStatsItems.png