Configuring a Whisperer

A Whisperer is configured through two tabs:

  • Capture config, for the agent configuration
  • Parsing config, for the server parsing configuration

Settings.png

Whisperer Type

Whisperers come in different types.
The type is set at creation or just after, and cannot be changed afterward.

Available types:

| Whisperer type | Description |
| --- | --- |
| UPLOAD | Whisperers dedicated to manually uploaded data (from the UI). They have a longer retention duration. |
| INTERFACE | Whisperers dedicated to real-time streaming capture of data. |
| FILE | Whisperers that behave like real-time capture but read data from local pcap files. Used mostly for debugging. |

The Whisperer type is set in the Capture Config tab, in the Source / Mode field:

WhispererTypeConfig.png

Default values, advanced settings and inline help

Default values and advanced settings

A Whisperer configuration is composed of many settings.

  • Most have reasonable default values
  • Main settings are always visible to the user
  • Other, more detailed 'advanced' settings are hidden by default, and only visible after clicking Show advanced

Color code

A color code is used to show values:

  • Default values are shown in blue
  • While editing, changed values are shown in
    • orange during editing
    • green when valid
    • red when erroneous
  • Custom values are shown in black once saved

Inline help

Next to each settings group title, clicking the Info.png icon unfolds the inline help.

Import / Export configuration

You may EXPORT the configuration of a Whisperer, and IMPORT it later when editing a configuration.

The export is in JSON, and may also be edited manually.

Copy configuration between Whisperers

When editing, you may click the COPY button to copy the configuration of another Whisperer onto yours.

It's even faster than export/import!

Capture Config

CaptureConfig.png

Settings types

The following types are used in this documentation:

| Type | Description | Pattern | Examples |
| --- | --- | --- | --- |
| Duration | The expected value is a duration in ISO notation. | PT\d+[SMH] | PT10S for 10s, PT5M for 5min |
| Boolean | The expected value is a boolean. Represented as a checkbox in the UI. | true or false | |
| List | The expected values are predefined and shown in a list. | | |
| String | The expected value is a string... | | |
| Pattern | The expected value is a regular expression. | | |
| Integer | The expected value is an integer. | \d+ | |
| Float | The expected value is a float. | \d*.?\d+ | |
| Fixed | Fixed by the system, cannot be changed. If the value is wrong, contact your support. | | |

The type may also be an array of types.
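
As an illustration, the patterns above can be checked with a few lines of Python. This is only a sketch: the names TYPE_PATTERNS and is_valid are hypothetical, the duration pattern is assumed to accept several digits, and the float pattern is assumed to escape the decimal point. Spider's own validation may differ.

```python
import re

# Patterns adapted from the types table. Assumptions: durations accept
# one or more digits, and the decimal point of floats is escaped.
TYPE_PATTERNS = {
    "Duration": re.compile(r"PT\d+[SMH]"),
    "Boolean": re.compile(r"true|false"),
    "Integer": re.compile(r"\d+"),
    "Float": re.compile(r"\d*\.?\d+"),
}


def is_valid(setting_type: str, value: str) -> bool:
    """Return True when the whole value matches the pattern of its type."""
    return TYPE_PATTERNS[setting_type].fullmatch(value) is not None


print(is_valid("Duration", "PT10S"))  # True
print(is_valid("Duration", "10s"))    # False
```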

Source

Configures the source for packet capture.

Source.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Mode | The type of the Whisperer | List | INTERFACE, FILE or UPLOAD |
| Network interface | The network interface used for capture | List | any is a virtual interface on Linux that aggregates all others. lo is localhost. ethx are often the interfaces mounted by Kubernetes. |
| Pcap filter | The pcap filter that selects the network traffic you want to capture | String | Very important to set right. You often limit to tcp, or tcp port 80, or tcp portrange 3000-3010 |
| Server host | The host of Spider, so that communications going to it are not captured. This avoids infinite loops! | Fixed | |
| Capture buffer | Capture buffer of pcap in the kernel, in kB | Integer | Should often be several thousands. |

Attachments

Configure attachments made with this Whisperer by a Spider Controller.

Attachments.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Time to live | When set, defines the maximum time an attachment of this Whisperer may be active. | Duration in S or M | |

Circuit breakers

Circuit breaker settings limit the CPU and RAM usage of Whisperers on the host they are capturing from.
This limits their footprint in case of a network traffic burst.

CircuitBreakers.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| CPU circuit breaker | Whether the Whisperer should stop the capture when its CPU usage is too high. | Boolean | Circuit breakers last for the duration of stats collection. |
| CPU limit | Threshold for the CPU circuit breaker, as a percentage. | Float | |
| RAM circuit breaker | Whether the Whisperer should stop the capture when its RAM usage is too high. | Boolean | |
| RAM limit | Threshold for the RAM circuit breaker, in MB. | Integer | |

Packets sending

Settings to control the emission of packets.

PcapSending.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| BO time out | Timeout threshold when calling the POST /packets API. | Duration in S | |
| Size before sending | Size threshold, in kB, of the packet buffer before it is sent to the back end. | Integer | |
| Sending frequency | Duration threshold after which the packet buffer is sent even if not full. | Duration in S | |
| Parallel sending | How many buffers may be sent in parallel. | Integer | |
| Sending queue size | If no slot for parallel sending is available, the Whisperer queues the message. When the queue is full, new messages are discarded. | Integer | |

When the server throughput may not be fast enough for the capture load, you should increase the Parallel sending setting.
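
The interplay between Size before sending and Sending frequency can be pictured with the sketch below. It is not the Whisperer's actual code: the PacketBuffer class and its method names are hypothetical, sizes are in bytes, and the age check only runs when a packet arrives, whereas a real agent would most likely also flush on a timer.

```python
from typing import List, Optional


class PacketBuffer:
    """Hypothetical buffer flushed on a size threshold or an age threshold."""

    def __init__(self, max_size_bytes: int, max_age_s: float) -> None:
        self.max_size_bytes = max_size_bytes  # plays the role of "Size before sending"
        self.max_age_s = max_age_s            # plays the role of "Sending frequency"
        self._packets: List[bytes] = []
        self._size = 0
        self._started_at: Optional[float] = None

    def add(self, packet: bytes, now: float) -> Optional[List[bytes]]:
        """Store a packet; return the batch to send when a threshold is reached."""
        if self._started_at is None:
            self._started_at = now
        self._packets.append(packet)
        self._size += len(packet)
        too_big = self._size >= self.max_size_bytes
        too_old = now - self._started_at >= self.max_age_s
        if too_big or too_old:
            batch = self._packets
            self._packets, self._size, self._started_at = [], 0, None
            return batch
        return None
```

Each returned batch would then be handed to one of the parallel sending slots, or queued when none is free.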

caution

These values have a direct effect on the maximum Whisperer memory size:

  • queue size × size before sending = maximum queue size in memory
  • With the defaults: 500 kB × 100 = 50 MB
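
The same arithmetic as a small helper, handy when trying other values. The function name is hypothetical, and 1 MB is taken as 1000 kB to match the figures above.

```python
def max_queue_memory_mb(size_before_sending_kb: int, queue_size: int) -> float:
    """Upper bound of the memory held by queued packet buffers, in MB (1 MB = 1000 kB)."""
    return size_before_sending_kb * queue_size / 1000


print(max_queue_memory_mb(500, 100))  # 50.0
```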

DNS

Configures the IP reverse resolving feature.

DNS.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Resolve IPs | Whether the Whisperer should reverse resolve IP addresses to host names. | Boolean | |
| Custom DNS Server | Whether the Whisperer uses the system resolution or a custom DNS server. | Boolean | |
| - DNS host | Host of the custom DNS server. | String | |
| - DNS port | Port of the custom DNS server. | Integer | |
| TTL | How long a hostname resolution is kept in cache. | Duration in H | |
| Refresh rate | How often a cached hostname resolution is refreshed. | Duration in S | |
| Hosts to resolve | List of hostnames to pre-resolve at start and keep in cache. | Array of Strings | |
| Hosts resolving rate | Rate for re-resolving those hostnames. | Duration in S | |
| Send full delay | How often the global list of hostnames and IPs is sent to the back end. | Duration in M | |
| Send update delay | How often an update of the list of hostnames and IPs is sent to the back end. | Duration in S | |
| BO time out | Timeout threshold when calling the POST /hosts API. | Duration in S | |
| Purge delay | How often the Whisperer cleans the cache of resolutions whose TTL has passed. | Duration in H | |

The Hosts to resolve option is useful when:

  • you want to capture from the start of the Whisperer but also need filtering on hosts.
  • your DNS does not provide reverse resolving (PTR query).
  • the hosts are Virtual IPs with no PTR records (like in Docker Swarm).

Tcp Sessions

Configures the tracking and sending of Tcp sessions.

Tcp Sessions.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Track TCP sessions | Whether the Whisperer should track TCP sessions. If not, it will only send raw packets. | Boolean | Usually, you want this 😊 |
| BO timeout | Timeout threshold when calling the POST /sessions API. | Duration in S | |
| Sending frequency | Duration threshold after which the buffer of sessions is sent. | Duration in S | |
| Tcp sessions TTL | Time to live after which an inactive, but not closed, Tcp session is removed from memory. | Duration in M | |
| Send only data packets | When set, a specific (complex) pcap filter is added to capture only data packets and packets required for Tcp sessions tracking. Only packets with data are sent. Reduces Spider CPU usage. Please note that TCP sessions statistics will then be erroneous (IP payload etc.). | Boolean | |
| Sessions sent at once | Maximum count of Tcp sessions sent per API request. | Integer | |
| Parallel sending | How many buffers may be sent in parallel. | Integer | |
| Sending queue size | If no slot for parallel sending is available, the Whisperer queues the message. When the queue is full, new messages are discarded. | Integer | |
| Max packetLot size | Maximum size, in bytes, of contiguous Tcp payload authorized. Usually means a maximum size of request or response. Packets above are sent to Spider but are not taken into account when parsing. Avoids blowing up the servers' memory. | Integer | |

Filtering

Configures hostname filtering: which hosts to capture packets from and to, and which hosts to ignore.

Filtering.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Track by default | Whether the Whisperer tracks (all) IP addresses by default. | Boolean | Set to false when you want to only track certain hostnames |
| Wait for resolving | Whether the Whisperer waits until an IP address is resolved before tracking its packets. | Boolean | |
| Track unresolved IP | Whether the Whisperer tracks IP addresses it could not resolve. | Boolean | |
| Hosts to track | List of regular expressions (or strings) for valid hostnames to track. | Array of Patterns | A whitelist |
| Host to ignore | List of regular expressions (or strings) for hostnames to avoid tracking. | Array of Patterns | A blacklist |

An IP resolved to a hostname to track is not re-resolved until the host has not been seen for a long time (DNS.ttl).
An IP resolved to a hostname not to track is re-resolved every DNS.refreshRate.

When tracking by default, a host is not tracked as soon as it matches a pattern to ignore.
When not tracking by default, a host is tracked as soon as it matches a pattern to track.
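
The two rules above can be sketched as a small decision function. This is an illustration under assumptions, not Spider's implementation: the function names are hypothetical, patterns are applied with re.search, and whether the real matching is anchored or case-sensitive is not specified here.

```python
import re
from typing import Iterable


def matches_any(hostname: str, patterns: Iterable[str]) -> bool:
    """True when at least one regular expression matches the hostname."""
    return any(re.search(pattern, hostname) for pattern in patterns)


def should_track(
    hostname: str,
    track_by_default: bool,
    hosts_to_track: Iterable[str],
    hosts_to_ignore: Iterable[str],
) -> bool:
    """Apply the whitelist / blacklist rules to a resolved hostname."""
    if track_by_default:
        # Track everything except hosts matching a pattern to ignore.
        return not matches_any(hostname, hosts_to_ignore)
    # Track nothing except hosts matching a pattern to track.
    return matches_any(hostname, hosts_to_track)


print(should_track("monitoring.prod", True, [], [r"^monitoring"]))  # False
print(should_track("orders.prod", False, [r"^orders"], []))         # True
```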

tip

Hostnames filtering is a very powerful feature to limit the volume of data and the CPU usage.

  • When you don't need to track packets from or to a hostname, use this.
  • When you know the IP of the hosts to ignore, rather use a pcap filter.

Duplicates

Duplicates.png

When capturing with 2 instances of the same Whisperer from both sides of the same communication, Spider captures packets and TCP sessions twice.
The following options avoid duplicated communication resources by creating common ids and adding uniqueness checks on the server side.

To avoid duplicated packets (when you save them), choose the relevant option in the parsing part of the configuration.

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Avoid duplicated communications | Whether the Whisperer should generate unique Ids to identify duplicates of packets | Boolean | |

VxLan protocol

vxlan.png

VxLan is a UDP protocol used (at least) by VMware and Docker Swarm to encapsulate packets of virtualized network infrastructure over UDP.
The pcap filter must be set to track UDP on the corresponding port (4789 for Swarm).

These settings allow capturing and decoding the inner packets of a virtualized network when capturing from outside that network.

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Decapsulate internal packet | When capturing VxLan, track and send the inner packet. | Boolean | |
| Keep original packet | When capturing VxLan, track and send the outer packet. | Boolean | |

Status sending

Configures the monitoring.

StatusSending.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Send status | Whether to send Whisperer statistics to the back end. | Boolean | Crucial for monitoring. |
| BO timeout | Timeout threshold when calling the POST /status API. | Duration in S | |
| Sending frequency | Frequency to send statuses. | Duration in S | 10 / 20s |

Dump packets on client

You may ask the Whisperer to dump .pcap files of packets when capturing.
Used only for internal debugging.

DumpPackets.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Dump packets | Whether the Whisperer should dump packets to files | Boolean | |
| Buffer size | Size in kB of the files to dump (approx) | Integer | |
| Output path | Output path to store the dumps | String | |

Configuration polling

ConfigurationPolling.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Frequency | How often the Whisperer checks for a configuration change | Duration in S | |

Common parsing settings

ParsingConfig.png

caution

Parsing options are cached for 5 minutes for INTERFACE Whisperers and 1 minute for UPLOAD Whisperers.

Data storage policy

In the Spider configuration, you may define Customisable Data Store Policies.
These define the different retention periods available for Whisperers' data.

DSP.png

The Data storage policy of the Whisperer defines how long you want to keep its data.

  • The default value is default (definitive lack of imagination here)
  • It may be changed at any time in the lifetime of a Whisperer, but existing data remains associated with, and managed by, the policy active at the time of indexing
  • When the setting is set to a policy that does not exist (any more), the default one is used.

Packets saving

Configures if packets should be saved for analysis.

SavePackets.png

Spider is built to parse packets in streaming.
You do not need to save packets unless you want to analyze at low packet level.

  • When saving packets,

    • Packets are saved to Elasticsearch.
    • It requires much CPU and space.
    • You may choose to avoid duplicated packets when capturing both sides of the same communication.
      • This is costly for ES, as it then has to check whether the packet already exists when saving. Activate it only when needed.
  • When not saving packets,

    • Packets are only kept in memory until they are parsed.
    • This prevents reconstructing payload data for TCP or HTTP once the parsing is finished.
    • You should definitely activate 'Save content' HTTP option, or else, you won't see the data ;)

Do not save packets:

  • when you filter out part of the parsed data, and you don't want the data to be reconstructed,
  • or when you don't need to keep packets, to save space and ES resources.

You may activate saving packets only temporarily when needed.

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Save Packets | Whether Spider should save packets from this Whisperer. | Boolean | |
| - Avoid duplicated packets | Whether Spider should check for duplicated packets to avoid saving the same packet twice. | Boolean | |

Tcp sessions saving

You may choose not to save TCP sessions in Spider (once the parsing is done). Use this:

  • when your parsing quality is good,
  • when you don't care about TCP sessions.

TcpSave.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Save TCP sessions | Whether Spider should save Tcp sessions from this Whisperer. | Boolean | |

tip

When not saving packets, saving Tcp sessions is of little interest.

Host names

When a Whisperer resolves IP addresses to capture hostnames, it can automatically determine a short name to display, based on the FQDN provided by the DNS.

For this, specify one or several regular expressions that extract parts of the FQDN, which are then joined with '.'. Patterns are executed in order, and the first one that matches is selected.
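
A minimal sketch of this extraction, assuming the parts to keep are the capture groups of the pattern. The short_name function is hypothetical, and two behaviours are guesses made for the example: using the whole match when a pattern has no group, and returning None when no pattern matches.

```python
import re
from typing import Optional, Sequence


def short_name(fqdn: str, patterns: Sequence[str]) -> Optional[str]:
    """Return the short name built from the first matching pattern, or None."""
    for pattern in patterns:
        match = re.search(pattern, fqdn)
        if match:
            parts = [group for group in match.groups() if group]
            # Assumption: fall back to the whole match when nothing is captured.
            return ".".join(parts) if parts else match.group(0)
    return None


patterns = [r"^([^.]+)\.[^.]+\.svc\.cluster\.local$"]
print(short_name("myservice.mynamespace.svc.cluster.local", patterns))  # myservice
```

With a second capture group around the namespace, the same call would yield myservice.mynamespace.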

HostNames.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Name patterns | List of patterns to transform a FQDN into something shorter, but meaningful. | Array of Patterns | |

tip

This is particularly useful to have shorter names on the UI.

  • You don't need to see: myservice.mynamespace.svc.cluster.local
  • Seeing myservice is enough.

The FQDN is kept alongside the short name, don't worry:

HostId.png

HTTP protocol parsing settings

HTTP parser provides the most useful, and the most complex parsing options.
They allow filtering, but most importantly extracting business knowledge to show it on Spider.

HTTP Parsing

Configures which TCP sessions should be parsed for Http communications.

HttpParsing.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Parse for HTTP | Whether this Whisperer is capturing HTTP communications that should be parsed. | Boolean | |
| Parse by default | Whether Spider should parse all ports by default. | Boolean | |
| Ports to parse | Defines the server ports exposing an HTTP API. The TCP sessions on those ports will be parsed for HTTP communications. You may define lists of ports or ranges: 80, 8080, 9200, 3000-3100. | Array of Integers or Integer ranges | |
| Ports to ignore | Defines the server ports not to parse. | Array of Integers or Integer ranges | |
| Save parsing log | Defines whether the parsing tracking resource should be saved. | Boolean | For Spider troubleshooting only |
| - Keep whole log | Defines whether the whole log should be kept (may be HUGE) | Boolean | For Spider troubleshooting only |
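
To make the port list notation concrete, here is a sketch that expands a list such as 80, 8080, 9200, 3000-3100 and decides whether a port is parsed. Both function names are hypothetical, and the way Parse by default combines with the two lists is an assumption that mirrors the hostname filtering rules.

```python
from typing import Set


def expand_ports(spec: str) -> Set[int]:
    """Expand a comma-separated list of ports and inclusive ranges into a set."""
    ports: Set[int] = set()
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            low, high = (int(part) for part in item.split("-", 1))
            ports.update(range(low, high + 1))
        else:
            ports.add(int(item))
    return ports


def should_parse(port: int, parse_by_default: bool,
                 ports_to_parse: Set[int], ports_to_ignore: Set[int]) -> bool:
    """Assumed rule: the ignore list applies when parsing by default, the parse list otherwise."""
    if parse_by_default:
        return port not in ports_to_ignore
    return port in ports_to_parse


ports = expand_ports("80, 8080, 9200, 3000-3100")
print(should_parse(3050, False, ports, set()))  # True
print(should_parse(443, False, ports, set()))   # False
```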

HTTP filtering

Configures filtering to remove headers, content or whole communication for certain URIs:

  • When it is sensitive information (and in production)
  • When it is not required to save them (like monitoring requests, healthcheck requests etc.), to save space.

Configures if Spider should save reassembled communication content.

HttpFiltering.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Headers to filter | Defines headers to remove from the parsed object (like the Basic auth header 😉). You may filter by header name or value content. It expects patterns to apply to the HTTP header line. | Array of Patterns | |
| URIs to filter | Filters out communications by URIs. For sensitive communications like auth calls. | Array of Patterns | |
| Save content | Defines whether the content should be saved within the HTTP resource. Important when not saving packets. | Boolean | |
| - URIs to filter request body | Filters out request body contents by URI patterns. | Array of Patterns | |
| - URIs to filter response body | Filters out response body contents by URI patterns. | Array of Patterns | |
| Save raw headers | Defines whether the raw headers should be saved within the HTTP resource. | | |
| - URIs to filter request raw headers | Filters out request raw headers by URI patterns. | Array of Patterns | |
| - URIs to filter response raw headers | Filters out response raw headers by URI patterns. | Array of Patterns | |

tip

Regular expressions are matched case-insensitively.

tip

Save content when not saving packets.

tip

Save raw headers when you need to see what was really exchanged (and you don't filter any headers).

tip

Think about filtering ^Authorization: Basic headers to avoid saving passwords in Spider on production.
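
As an illustration of header filtering, the sketch below drops every header line matched by one of the patterns, case-insensitively. The filter_headers function and the sample headers are made up for the example. Spider applies its own logic on the server side.

```python
import re
from typing import List, Sequence


def filter_headers(header_lines: Sequence[str], patterns: Sequence[str]) -> List[str]:
    """Keep only the header lines that match none of the patterns (case-insensitive)."""
    compiled = [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    return [
        line for line in header_lines
        if not any(regex.search(line) for regex in compiled)
    ]


headers = [
    "Host: api.example.com",
    "authorization: Basic dXNlcjpwYXNz",
    "Accept: */*",
]
print(filter_headers(headers, [r"^Authorization: Basic"]))
# ['Host: api.example.com', 'Accept: */*']
```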

Request templates

On the parsed HTTP communications, you may define regular expression patterns to name the request.

The name is saved in the communication metadata and is used to aggregate and filter requests.
You define:

  • The name of the request:
    Ex: Creation of order, createAnOrder, POST /orders/{id}, depending on your style.
  • The parts of the request to parse to identify it. They will be concatenated, as in their raw format.
    • Method
    • URI, mandatory, with querystring
    • Headers
    • Request body
  • The regular expression to execute, with /ms flags.

Many regular expressions may be defined. They are matched in order, and the first match is kept.
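
The first-match rule can be sketched as follows. This is only an illustration: name_request is a hypothetical helper, only the method and URI parts are used, and joining them with a single space is an assumption about the raw format. Placeholders for captured groups are left out, since their syntax is not described here.

```python
import re
from typing import Optional, Sequence, Tuple

Template = Tuple[str, str]  # (name, regular expression)


def name_request(method: str, uri: str, templates: Sequence[Template]) -> Optional[str]:
    """Return the name of the first template whose pattern matches, else None."""
    subject = f"{method} {uri}"  # assumed concatenation of the selected parts
    for name, pattern in templates:
        # re.MULTILINE | re.DOTALL correspond to the /ms flags.
        if re.search(pattern, subject, re.MULTILINE | re.DOTALL):
            return name
    return None


templates = [
    ("POST /orders", r"^POST /orders(\?.*)?$"),
    ("GET /orders/{id}", r"^GET /orders/[^/?]+"),
]
print(name_request("GET", "/orders/42?expand=items", templates))  # GET /orders/{id}
```

Because the first match wins, the most specific patterns should come first in the list.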

To create many at once, you may prepare the list in a spreadsheet (like Google docs) with name, regexp and options in columns, then copy and paste it (without headers).

RequestsTemplates.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Name | Name of the template. May include placeholders for captured groups 💪. | String | |
| Pattern | Regular expression. When using placeholders, use ^ and $ to enclose the pattern and avoid surprises. | Pattern | |
| Method | Whether the method should be matched. | Boolean | |
| URI | Whether the URI should be matched. | Boolean | |
| Headers | Whether the headers should be matched. | Boolean | |
| Body | Whether the body should be reassembled, transfer decoded, decompressed and matched. | Boolean | |

tip

This is one of the most useful features of parsing.

Obviously this requires CPU, as do the next two.
But the cost is quite reasonable and optimised.
And the value is well worth it!

Request tags

On the parsed HTTP communications, you may define regular expression patterns to tag the request.

The tag name and value(s) are saved in the communication metadata and may be used when searching.
You define:

  • The name of the tag: Ex: 'ClientName', 'ApiVersion', 'Authenticated'...
  • The parts of the request to parse to tag it. They will be concatenated, as in their raw format.
    • Method
    • URI with querystring
    • Headers
    • Request body
  • The regular expression to execute, with /msg flags.

If the regular expression includes capture group(s), the captured data will be stored as tag value(s).
Without capture group, the tag value will be 'true' (text).
Many regular expressions may be defined. They are all executed, and all the tags are kept.

Alongside the tag value, the following are also saved:

  • The tag count
    • How often it matched
  • The tag cardinality
    • How often it matched with different values

This allows counting (and searching) the number of items in a response, the number of different customers...
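
Here is a sketch of how values, count and cardinality could be derived from a pattern run with the equivalent of the /msg flags. The extract_tag function and the sample body are hypothetical, and counting one value per captured group is an assumption about how Spider counts.

```python
import re
from typing import Dict, List


def extract_tag(pattern: str, text: str) -> Dict[str, object]:
    """Collect every captured value, plus its count and cardinality."""
    regex = re.compile(pattern, re.MULTILINE | re.DOTALL)  # /ms flags
    values: List[str] = []
    for match in regex.finditer(text):  # finditer plays the role of the /g flag
        groups = [group for group in match.groups() if group is not None]
        # Without capture group, the tag value is the text 'true'.
        values.extend(groups if groups else ["true"])
    return {"values": values, "count": len(values), "cardinality": len(set(values))}


body = '{"items":[{"customer":"acme"},{"customer":"acme"},{"customer":"globex"}]}'
print(extract_tag(r'"customer":"([^"]+)"', body))
# {'values': ['acme', 'acme', 'globex'], 'count': 3, 'cardinality': 2}
```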

To create many at once, you may prepare the list in a spreadsheet (like Google docs) with name, regexp and options in columns, then copy and paste it (without headers).

RequestsTags.png

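| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Name | Name of the tag. | String | |
| Pattern | Regular expression to capture the tag. | Pattern | |
| Method | Whether the method should be matched. | Boolean | |
| URI | Whether the URI should be matched. | Boolean | |
| Headers | Whether the headers should be matched. | Boolean | |
| Body | Whether the body should be reassembled, transfer decoded, decompressed and matched. | Boolean | |

tip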

Together with response tags, these are the most value-adding features of parsing!

Response tags

On the parsed HTTP communications, you may define regular expression patterns to tag the response.

It is the same process as for the request, but on the response.

ResponseTags.png

| Setting | Description | Type | Comment |
| --- | --- | --- | --- |
| Name | Name of the tag. | String | |
| Pattern | Regular expression to capture the tag. | Pattern | |
| Status | Whether the status should be matched. | Boolean | |
| Headers | Whether the headers should be matched. | Boolean | |
| Body | Whether the body should be reassembled, transfer decoded, decompressed and matched. | Boolean | |

tip

When defining the same tag in request and response, or the same tag many times in either, all extracted values are associated with the same tag and counted for tag count and cardinality.