Configuring a Whisperer
Configuring a Whisperer is done through two tabs:
- Capture config, for the agent configuration
- Parsing config, for the server parsing configuration
Whisperer Type
Whisperers exist in different types.
The type is defined at the creation or just after, and cannot be changed afterward.
Available types:
Whisperer type | Description |
---|---|
UPLOAD | Whisperers dedicated to manually uploaded data (from the UI). They have a higher retention duration. |
INTERFACE | Whisperers dedicated to real time streaming capture of data. |
FILE | Whisperers acting like real time but reading data from local pcap files. Used mostly for debugging. |
The Whisperer type is set in the Capture Config tab, in the Source / Mode field.
Default values, advanced settings and inline help
Default values and advanced settings
Whisperers configuration is composed of many different settings.
- Most have reasonable default values
- Main settings are always visible to the user
- Other detailed settings, 'advanced', are hidden by default, and only visible when clicking Show advanced
Color code
A color code is used to show values:
- Default values are shown in blue
- When editing, changed values are shown in
  - orange during edition
  - green when valid
  - red when erroneous
- Custom values are in black once saved
Inline help
Next to each settings group title, an icon unfolds the inline help when you click on it.
Import / Export configuration
You may EXPORT the configuration of a Whisperer to IMPORT it when editing the configuration.
The export is in JSON, and may also be edited manually.
Copy configuration between Whisperers
When editing, you may click on the COPY button to copy the configuration of another Whisperer onto yours.
It's even faster than export/import!
Capture Config
Settings types
The following types are used in this documentation:
Type | Description | Pattern | Examples |
---|---|---|---|
Duration | The expected value is a duration in ISO 8601 notation. | PT\d+[SMH] | PT10S for 10 s, PT5M for 5 min |
Boolean | The expected value is a boolean. Represented as a checkbox on the UI. | true\|false | |
List | The expected values are predefined and shown in a list. | | |
String | The expected value is a string. | | |
Pattern | The expected value is a regular expression. | | |
Integer | The expected value is an integer. | \d+ | |
Float | The expected value is a float. | \d*\.?\d+ | |
Fixed | Fixed by the system, cannot be changed. If the value is wrong, contact your support. | | |
The type may also be an array of types.
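As an illustration, the patterns of the table can be used to check a value before typing it in the UI or in an exported configuration. This is only a sketch: the names `SETTING_PATTERNS` and `is_valid` are made up for the example, and the Duration pattern accepts several digits so that PT10S is valid.

```python
import re

# Patterns adapted from the settings types table.
# The dictionary and function names are illustrative, not part of Spider.
SETTING_PATTERNS = {
    "Duration": re.compile(r"PT\d+[SMH]"),
    "Boolean": re.compile(r"true|false"),
    "Integer": re.compile(r"\d+"),
    "Float": re.compile(r"\d*\.?\d+"),
}


def is_valid(setting_type: str, value: str) -> bool:
    """Return True when the whole value matches the pattern of its type."""
    pattern = SETTING_PATTERNS[setting_type]
    return pattern.fullmatch(value) is not None


print(is_valid("Duration", "PT10S"))  # True
print(is_valid("Duration", "10s"))    # False
```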
Source
Configure the source for packets capture.
Setting | Description | Type | Comment |
---|---|---|---|
Mode | The type of the Whisperer | List | INTERFACE, FILE or UPLOAD |
Network interface | The network interface used for capture | List | any is a virtual interface on Linux that aggregates all others. lo is localhost. ethX are often the interfaces mounted by Kube. |
Pcap filter | The pcap filter selecting the network traffic you want to capture | String | Very important to set right. You often limit to tcp, or tcp port 80, or tcp portrange 3000-3010 |
Method | Method used to capture the network. By default AFPACKET_THEN_LIBPCAP is used, which tries AFPACKET with a fallback to LIBPCAP. AFPACKET is slightly faster, but may not always be available, depending on your kernel. | List | LIBPCAP, AFPACKET or AFPACKET_THEN_LIBPCAP |
Server host | The host of Spider. Communications going to it are not captured, to avoid infinite loops! | Fixed | |
Capture buffer | Capture buffer of pcap in the kernel, in kB | Integer | Should often be several thousands. |
Attachments
Configure attachments made with this Whisperer by a Spider Controller.
Setting | Description | Type | Comment |
---|---|---|---|
Time to live | When set, defines the maximum time an attachment of this Whisperer may be active. | Duration in Sec or Min |
Circuit breakers
Circuit breaker settings limit the CPU and RAM usage of Whisperers on the host they are capturing on, thus limiting their footprint in case of a network traffic burst.
Setting | Description | Type | Comment |
---|---|---|---|
CPU circuit breaker | Tells if the Whisperer should stop the capture when its CPU usage is too high. | Boolean | Circuit breakers last for the duration of stats collection. |
CPU limit | Set the threshold for the circuit breaker. In percentage. | Float | |
RAM circuit breaker | Tells if the Whisperer should stop the capture when its RAM usage is too high. | Boolean | |
RAM limit | Set the threshold for the circuit breaker. In MB. | Integer | |
Throughput limiter | Tells if the Whisperer should stop the capture when captured traffic is too high. | Boolean | |
Throughput limit | Set the threshold for the throughput. In MB/min. | Integer |
For LocalAgents Whisperers, only admins may change the Throughput limit, set to 5MB/min by default.
Packets sending
Settings to control the emission of packets.
Setting | Description | Type | Comment |
---|---|---|---|
BO time out | Time out threshold when calling POST /packets API. | Duration in S | |
Size before sending | Size threshold in kB of the buffer of packets before sending to back end. | Integer | |
Sending frequency | Duration threshold after which the buffer of packets is sent even if not full. | Duration in S | |
Parallel sending | How many parallel sending of buffers are authorized | Integer | |
Sending queue size | If no slot for parallel sending is available, the Whisperer queues the message. When the queue is full, new messages are discarded. | Integer |
When server throughput may not be fast enough for the capture load, you should increase the Parallel sending setting.
Those values have a direct effect on the maximum Whisperer memory size:
Sending queue size x Size before sending = maximum queue size
- Defaults to 500 kB * 100 = 50 MB
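The memory bound and the discard behaviour can be sketched as follows. This is a simplified model, not the Whisperer implementation: `max_queue_memory_mb` and `SendingQueue` are hypothetical names, and 1 MB is taken as 1000 kB to match the default figure.

```python
from collections import deque


def max_queue_memory_mb(size_before_sending_kb: int, sending_queue_size: int) -> float:
    """Upper bound of the queued packet buffers, in MB (1 MB = 1000 kB here)."""
    return size_before_sending_kb * sending_queue_size / 1000


class SendingQueue:
    """Bounded queue: when it is full, new buffers are discarded."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.items: deque = deque()
        self.discarded = 0

    def offer(self, buffer: bytes) -> bool:
        """Queue a buffer, or count it as discarded when the queue is full."""
        if len(self.items) >= self.max_size:
            self.discarded += 1
            return False
        self.items.append(buffer)
        return True


print(max_queue_memory_mb(500, 100))  # 50.0
```

Raising either setting raises the bound proportionally, so it is worth recomputing it before changing the defaults on a constrained host.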
DNS
Configures the IP reverse resolving feature.
Setting | Description | Type | Comment |
---|---|---|---|
Resolve IPs | Whether the Whisperer should reverse resolve IP addresses to host names. | Boolean | |
Custom DNS Server | Whether the Whisperer uses the system resolution or a custom DNS server. | Boolean | |
- DNS host | Host of the custom DNS server. | String | |
- DNS port | Port of the custom DNS server. | Integer | |
TTL | How long hostname resolution is kept in cache. | Duration in H | |
Refresh rate | How often a cached hostname resolution is refreshed. | Duration in S | |
Hosts to resolve | List of hostnames to pre-resolve at start and keep in cache. | Array of Strings | |
Hosts resolving rate | Rate for re-resolving those hostnames. | Duration in S | |
Send full delay | How often the global list of hostnames and IPs is sent to backend. | Duration in M | |
Send update delay | How often an update of the list of hostnames and IPs is sent to backend. | Duration in S | |
BO time out | Time out threshold when calling POST /hosts API. | Duration in S | |
Purge delay | How often the Whisperer cleans the cache of resolutions whose TTL has passed. | Duration in H |
The option Hosts to resolve is useful when:
- you want to capture from the start of the Whisperer but also need filtering on hosts.
- your DNS does not provide reverse resolving (PTR query).
- the hosts are Virtual IPs with no PTR records (like in Docker Swarm).
Tcp Sessions
Configures the tracking and sending of Tcp sessions.
Setting | Description | Type | Comment |
---|---|---|---|
Track TCP sessions | Whether the Whisperer should track TCP sessions. If not, it will only send raw packets. | Boolean | Usually, you want this 😊 |
BO timeout | Time out threshold when calling POST /sessions API. | Duration in S | |
Sending frequency | Duration threshold after which the buffer of sessions is sent. | Duration in S | |
Tcp sessions TTL | Time to Live after which an inactive - but not closed - Tcp session is removed from memory. | Duration in M | |
Send only data packets | When set, a specific - complex - pcap filter is added to capture only data packets & packets required for Tcp sessions tracking. Only packets with data are sent. Reduces Spider CPU usage. Please note that TCP sessions statistics will then be erroneous (IP payload etc.). | Boolean | |
Sessions sent at once | Max count of Tcp sessions sent by API request. | Integer | |
Parallel sending | How many parallel sending of buffers are authorized. | Integer | |
Sending queue size | If no slot for parallel sending is available, the Whisperer queues the message. When the queue is full, new messages are discarded. | Integer | |
Max packetLot size | Maximum size of contiguous Tcp payload authorized. Usually means a maximum size of request or response. Packets above are sent to Spider but are not taken into account when parsing. Avoids blowing up the servers' memory. In bytes. | Integer |
Filtering
Configures filtering on hostnames to capture packets (from and to) or to avoid.
Setting | Description | Type | Comment |
---|---|---|---|
Track by default | Whether the Whisperer tracks (all) IP addresses by default. | Boolean | Set to false when you want to only track certain hostnames |
Wait for resolving | Whether the Whisperer waits for the IP address to be resolved before tracking its packets. | Boolean | |
Track unresolved IP | Whether the Whisperer tracks IP addresses it could not resolve. | Boolean | |
Hosts to track | List of regular expressions (or string) for valid hostnames to track. | Array of Patterns | A whitelist |
Host to ignore | List of regular expressions (or string) for hostnames to avoid tracking. | Array of Patterns | A blacklist |
An IP resolved to a hostname to track is not re-resolved until the host has not been seen for a long time (DNS.ttl).
An IP resolved to a hostname not to track is re-resolved every DNS.refreshRate.
When tracking by default, a host is not tracked as soon as it matches a pattern to ignore.
When not tracking by default, a host is tracked as soon as it matches a pattern to track.
Hostnames filtering is a very powerful feature to limit the volume of data and the CPU usage.
- When you don't need to track packets from or to a hostname, use this.
- When you know the IP of the hosts to ignore, rather use a pcap filter.
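The tracking rules above can be summarised by a small decision function. It is a sketch of the documented rules only: the function name and parameters are hypothetical, patterns are applied with a plain regular expression search, and an unresolved IP is represented by `None`.

```python
import re
from typing import Optional, Sequence


def should_track(
    hostname: Optional[str],
    track_by_default: bool,
    hosts_to_track: Sequence[str],
    hosts_to_ignore: Sequence[str],
    track_unresolved: bool = True,
) -> bool:
    """Decide whether packets from or to a host are tracked."""
    if hostname is None:
        # The IP could not be resolved: 'Track unresolved IP' decides.
        return track_unresolved

    def matches(patterns: Sequence[str]) -> bool:
        return any(re.search(p, hostname) for p in patterns)

    if track_by_default:
        # Tracking by default: a host to ignore is not tracked.
        return not matches(hosts_to_ignore)
    # Not tracking by default: only hosts to track are tracked.
    return matches(hosts_to_track)


print(should_track("db.internal", True, [], [r"^db\."]))    # False
print(should_track("api.internal", False, [r"^api\."], []))  # True
```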
Duplicates
When capturing with 2 instances of the same Whisperer from both sides of the same communication, Spider captures packets and TCP sessions twice.
The following options avoid duplicated communication resources by creating common ids and adding uniqueness checks on the server side.
To avoid duplicated packets (when you save them), choose the relevant option in the parsing part of the configuration.
Setting | Description | Type | Comment |
---|---|---|---|
Avoid duplicated communications | Whether the Whisperer should generate unique Ids to identify duplicates of packets | Boolean |
Tls Keys
Whisperers can track TLS keys together with a Gocipher agent.
The Whisperer will look for TLS information in the TCP session it captures, and the Gocipher will capture the session TLS secrets.
Due to technical constraints, TLS keys are captured on ALL PORTS. Pcap filter has no effect on TLS capture.
However, you may limit which ports are linked by:
- Not capturing their packets
- Listing the ports to link in the Parsing Config
Setting | Description | Type | Comment |
---|---|---|---|
Track TLS Keys | Whether the Whisperer & Gociphers should track and send TLS secrets to the back office. | Boolean |
VxLan protocol
VxLan is a UDP protocol used (at least) by VMWare and Docker Swarm to encapsulate packets of virtualized network infrastructure over UDP.
Pcap filter must be set to track UDP on corresponding port (4789 for Swarm).
These settings allow capturing, and decoding the inner packets of a virtualized network, captured outside the network.
Setting | Description | Type | Comment |
---|---|---|---|
Decapsulate internal packet | When capturing VxLan, track and send the inner packet. | Boolean | |
Keep original packet | When capturing VxLan, track and send the outer packet. | Boolean |
Status sending
Configures the monitoring.
Setting | Description | Type | Comment |
---|---|---|---|
Send status | Whether to send Whisperer statistics to the back end. | Boolean | Crucial for monitoring. |
BO timeout | Time out threshold when calling POST /status API. | Duration in S | |
Sending frequency | Frequency to send statuses. | Duration in S | 10 / 20s |
Dump packets on client
You may ask the Whisperer to dump .pcap files of packets when capturing.
Used only for internal debugging.
Setting | Description | Type | Comment |
---|---|---|---|
Dump packets | Whether the Whisperer should dump packets to files | Boolean | |
Buffer size | Size in kB of the files to dump (approx) | Integer | |
Output path | Output path to store the dumps | String |
Configuration polling
Setting | Description | Type | Comment |
---|---|---|---|
Frequency | Tells how often the Whisperer calls to check a configuration change | Duration in sec. |
Common parsing settings
Parsing options are cached for 5 minutes for INTERFACE Whisperers and 1 minute for UPLOAD Whisperers.
Timezone
You may associate a timezone to each Whisperer.
When the Whisperer is selected on the UI, the times of the captured communications are displayed according to this time zone - depending on the settings.
Data storage policy
In Spider configuration, you may define Customisable Data Store Policies.
These define the different retention periods available for Whisperers data.
The Data storage policy of the Whisperer defines how long you want to keep its data.
- The default value is default (definitive lack of imagination here).
- It may be changed at any time in the lifetime of a Whisperer, but existing data will still be associated with and managed by the policy active at the time of indexing.
- When the setting is set to a policy that does not exist (any more), the default one is used.
Packets saving
Configures if packets should be saved for analysis.
Spider is built to parse packets in streaming.
You do not need to save packets unless you want to analyze at low packet level.
- When saving packets,
  - Packets are saved to Elasticsearch.
  - It requires much CPU and space.
  - You may choose to avoid duplicated packets when capturing the two sides of the same communication.
  - This is costly for ES as it then has to check whether the packet already exists when saving. Activate it only when needed.
- When not saving packets,
  - Packets are only kept in memory until they are parsed.
  - This prevents reconstructing payload data for TCP or HTTP once the parsing is finished.
  - You should definitely activate the 'Save content' HTTP option, or else you won't see the data ;)
Do not save packets:
- when you filter out part of the parsed data, and you don't want the data to be reconstructed,
- or when you don't need to keep packets, to save space and ES resources.
You may activate saving packets only temporarily when needed.
Setting | Description | Type | Comment |
---|---|---|---|
Save Packets | Whether Spider should save packets from this Whisperer. | Boolean | |
- Avoid duplicated packets | Whether Spider should check for duplicated packets to avoid saving twice the same. | Boolean |
Tcp sessions saving
You may wish not to save TCP sessions in Spider (once the parsing is done). Use this:
- when your parsing quality is good,
- when you don't care about TCP sessions.
Setting | Description | Type | Comment |
---|---|---|---|
Save TCP sessions | Whether Spider should save Tcp sessions from this Whisperer. | Boolean |
When not saving packets, saving Tcp sessions has little interest.
TLS Keys
You may wish to link or not captured TLS secrets and keys to TCP sessions.
You may define if link should be done by default, or if it should be restricted to certain ports.
Setting | Description | Type | Comment |
---|---|---|---|
Link TLS Keys to TCP sessions | Whether the TLS keys captured for this Whisperer should be associated with its TCP sessions. | Boolean | |
Link by default | Whether Spider should link on all ports by default. | Boolean | |
Ports to link | Define server ports using TLS encryption. The TCP sessions on those ports will be linked to TLS secrets captured. You may define lists of ports or ranges: 80, 8080, 9200, 3000-3100. | Array of Integers or Integers ranges | |
Ports to ignore | Define server ports not to link. | Array of Integers or Integers ranges |
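To make the port settings concrete, here is a sketch of how a list such as 80, 8080, 9200, 3000-3100 could be interpreted. The helper names are hypothetical, and the way Link by default combines with the two lists is an assumption mirroring the hostname filtering rules, not a documented behaviour.

```python
from typing import List, Sequence, Tuple

PortRange = Tuple[int, int]  # inclusive (low, high)


def parse_ports(spec: str) -> List[PortRange]:
    """Parse '80, 8080, 3000-3100' into inclusive (low, high) ranges."""
    ranges: List[PortRange] = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        low, _, high = item.partition("-")
        # A single port is stored as a range of one port.
        ranges.append((int(low), int(high or low)))
    return ranges


def port_in(port: int, ranges: Sequence[PortRange]) -> bool:
    return any(low <= port <= high for low, high in ranges)


def should_link(
    port: int,
    link_by_default: bool,
    ports_to_link: Sequence[PortRange],
    ports_to_ignore: Sequence[PortRange],
) -> bool:
    """Assumed semantics: ignore list when linking by default, link list otherwise."""
    if link_by_default:
        return not port_in(port, ports_to_ignore)
    return port_in(port, ports_to_link)


ports = parse_ports("80, 8080, 9200, 3000-3100")
print(should_link(3050, False, ports, []))  # True
```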
Host names
When Whisperer resolves IP addresses to capture hostnames, it can determine automatically a shortname to display based on the FQDN provided by the DNS.
For this, specify one or several regular expression(s) that are used to extract parts of the FQDN, which are then concatenated by '.'. Patterns are executed in order. First one that matches is selected.
Setting | Description | Type | Comment |
---|---|---|---|
Name patterns | List of patterns to transform a FQDN into something shorter, but meaningful. | Array of Patterns |
This is particularly useful to have shorter names on the UI.
- You don't need to see: myservice.mynamespace.svc.cluster.local
- Seeing myservice is enough.
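A minimal sketch of this extraction, assuming that the captured groups of the first matching pattern are joined with '.', and that the FQDN is kept unchanged when no pattern matches. The function name and the example pattern are illustrative only.

```python
import re
from typing import Sequence


def short_name(fqdn: str, name_patterns: Sequence[str]) -> str:
    """Return the short name given by the first pattern that matches."""
    for pattern in name_patterns:
        match = re.search(pattern, fqdn)
        if match:
            # Keep the captured parts and join them with '.'.
            parts = [group for group in match.groups() if group]
            return ".".join(parts) if parts else match.group(0)
    # Assumption: without any match, the FQDN is displayed as is.
    return fqdn


patterns = [r"^([^.]+)\.[^.]+\.svc\.cluster\.local$"]
print(short_name("myservice.mynamespace.svc.cluster.local", patterns))  # myservice
```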
The FQDN is kept along with the short name, don't worry.
HTTP protocol parsing settings
HTTP parser provides the most useful, and the most complex parsing options.
They allow filtering, but most importantly extracting business knowledge to show it on Spider.
HTTP Parsing
Configures which TCP sessions should be parsed for Http communications.
Setting | Description | Type | Comment |
---|---|---|---|
Parse for HTTP | Whether this Whisperer is capturing HTTP communications that should be parsed. | Boolean | |
Parse by default | Whether Spider should parse all ports by default. | Boolean | |
Ports to parse | Define server ports exposing HTTP API. The TCP sessions on those ports will be parsed for HTTP communications. You may define lists of ports or ranges: 80, 8080, 9200, 3000-3100. | Array of Integers or Integers ranges | |
Ports to ignore | Define server ports not to parse. | Array of Integers or Integers ranges | |
Save parsing log | Define if the parsing tracking resource should be saved. | Boolean | For Spider troubleshooting only |
- Keep whole log | Define if we should keep the whole log (may be HUGE) | Boolean | For Spider troubleshooting only |
HTTP filtering
Configures filtering to remove headers, content or whole communication for certain URIs:
- When it is sensitive information (and in production)
- When it is not required to save them (like monitoring requests, healthcheck requests etc.), to save space.
Configures if Spider should save reassembled communication content.
Setting | Description | Type | Comment |
---|---|---|---|
Headers to filter | Define headers to remove in parsed object (like Basic auth header 😉). You may filter by header name or value content. It expects patterns to apply on the HTTP header line. | Array of Patterns | |
URIs to filter | Filter out communications by URIs. For sensitive communications like auth calls. | Array of Patterns | |
Save content | Define if the content should be saved within the HTTP resource. Important when not saving packets. | Boolean | |
- URIs to filter request body | Filter out request body contents by URIs patterns. | Array of Patterns | |
- URIs to filter response body | Filter out response body contents by URIs patterns. | Array of Patterns | |
Save raw headers | Define if the raw headers should be saved within the HTTP resource. | Boolean | |
- URIs to filter request raw headers | Filter out request raw headers by URIs patterns. | Array of Patterns | |
- URIs to filter response raw headers | Filter out response raw headers by URIs patterns. | Array of Patterns |
Regular expressions are matched case-insensitively.
Save content when not saving packets.
Save raw headers when you need to see what was really exchanged (and you don't filter any headers).
Think about filtering ^Authorization: Basic headers to avoid saving passwords in Spider in production.
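As an illustration of header filtering, the sketch below removes every header line matching one of the patterns, case-insensitively. It only models the documented behaviour: `filter_headers` is a hypothetical name and the header values are made up.

```python
import re
from typing import List, Sequence


def filter_headers(header_lines: Sequence[str], patterns: Sequence[str]) -> List[str]:
    """Drop the header lines that match any of the patterns (case-insensitive)."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [
        line
        for line in header_lines
        if not any(regex.search(line) for regex in compiled)
    ]


headers = [
    "Host: example.org",
    "authorization: Basic dXNlcjpwYXNz",
    "Accept: */*",
]
print(filter_headers(headers, [r"^Authorization: Basic"]))
# ['Host: example.org', 'Accept: */*']
```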
Request templates
On the parsed HTTP communications, you may define regular expression patterns to name the request.
The name is saved in the communication metadata and is used to aggregate and filter requests.
You define:
- The name of the request. Ex: Creation of order, createAnOrder, POST /orders/{id}, depending on your style.
- The parts of the request to parse to identify it. They will be concatenated, as in their raw format.
  - Method
  - URI, mandatory, with querystring
  - Headers
  - Request body
- The regular expression to execute, with /ms flags.
Many regular expressions may be defined. They are matched in order, and the first match is kept.
To create many at once, you may prepare the list in a spreadsheet like Google docs with name, regexp and options in columns, and then copy and paste it (without the header row).
Setting | Description | Type | Comment |
---|---|---|---|
Name | Name of the template. May include placeholder for captured groups 💪. | String | |
Pattern | Regular expression. When using placeholders, use ^ and $ to enclose the pattern and avoid surprises. | Pattern | |
Method | Should method be matched. | Boolean | |
URI | Should URI be matched. | Boolean | |
Headers | Should headers be matched. | Boolean | |
Body | Should body be reassembled, transfer decoded, decompressed and matched. | Boolean |
This is one of the most useful features of parsing.
Obviously this requires CPU, as do the next two.
But it is quite reasonable and optimised.
And the value is so worth it!!
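The first-match rule can be sketched like this. Several points are assumptions made for the example: the `$1` placeholder syntax for captured groups, the function name, and the raw request being reduced to the method and URI. The /ms flags are mapped to `re.MULTILINE` and `re.DOTALL`.

```python
import re
from typing import Optional, Sequence, Tuple

Template = Tuple[str, str]  # (name, pattern)


def name_request(raw_request: str, templates: Sequence[Template]) -> Optional[str]:
    """Return the name of the first template whose pattern matches."""
    for name, pattern in templates:
        match = re.search(pattern, raw_request, re.MULTILINE | re.DOTALL)
        if match:
            # Hypothetical placeholder syntax: $1, $2... for captured groups.
            return re.sub(
                r"\$(\d+)",
                lambda placeholder: match.group(int(placeholder.group(1))) or "",
                name,
            )
    return None


templates = [
    ("Get order $1", r"^GET /orders/(\d+)$"),
    ("Any order call", r"/orders"),
]
print(name_request("GET /orders/42", templates))  # Get order 42
print(name_request("POST /orders", templates))    # Any order call
```

Since the first match is kept, the most specific patterns should come first in the list.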
Request tags
On the parsed HTTP communications, you may define regular expression patterns to tag the request.
The tag name and value(s) are saved in the communication metadata and may be used when searching.
You define:
- The name of the tag: Ex: 'ClientName', 'ApiVersion', 'Authenticated'...
- The parts of the request to parse to tag it. They will be concatenated, as in their raw format.
  - Method
  - URI with querystring
  - Headers
  - Request body
- The regular expression to execute, with /msg flags.
If the regular expression includes capture group(s), the captured data will be stored as tag value(s).
Without capture group, the tag value will be 'true' (text).
Many regular expressions may be defined. They are all executed, and all the tags are kept.
Alongside the tag value, are also saved:
- The tag count: how often it matched
- The tag cardinality: how often it matched with different values
This allows counting (and searching) the number of items in a response, the number of different customers...
To create many at once, you may prepare the list in a spreadsheet like Google docs with name, regexp and options in columns, and then copy and paste it (without the header row).
Setting | Description | Type | Comment |
---|---|---|---|
Name | Name of the tag. | String | |
Pattern | Regular expression to capture the tag. | Pattern | |
Method | Should method be matched. | Boolean | |
URI | Should URI be matched. | Boolean | |
Headers | Should headers be matched. | Boolean | |
Body | Should body be reassembled, transfer decoded, decompressed and matched. | Boolean |
Together with response tags, these are the most value-adding features of parsing!
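Tag extraction, with its count and cardinality, could be modelled as below. This is a sketch under assumptions: `extract_tag` and the returned keys are hypothetical, and the /msg flags are approximated by iterating over all matches with `re.MULTILINE` and `re.DOTALL`.

```python
import re
from typing import Dict, List


def extract_tag(raw_request: str, pattern: str) -> Dict[str, object]:
    """Collect tag values, count (matches) and cardinality (distinct values)."""
    regex = re.compile(pattern, re.MULTILINE | re.DOTALL)
    values: List[str] = []
    count = 0
    for match in regex.finditer(raw_request):
        count += 1
        groups = [group for group in match.groups() if group is not None]
        # Without capture group, the tag value is the text 'true'.
        values.extend(groups if groups else ["true"])
    distinct = sorted(set(values))
    return {"values": distinct, "count": count, "cardinality": len(distinct)}


raw = "GET /items?client=acme&client=acme&client=globex"
print(extract_tag(raw, r"client=(\w+)"))
# {'values': ['acme', 'globex'], 'count': 3, 'cardinality': 2}
```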
Response tags
On the parsed HTTP communications, you may define regular expression patterns to tag the response.
It is the same process as for the request, but on the response.
Setting | Description | Type | Comment |
---|---|---|---|
Name | Name of the tag. | String | |
Pattern | Regular expression to capture the tag. | Pattern | |
Status | Should status be matched. | Boolean | |
Headers | Should headers be matched. | Boolean | |
Body | Should body be reassembled, transfer decoded, decompressed and matched. | Boolean |
When defining the same tag in request or response, or many times the same tag in either, all extracted values will be associated to the same tag and counted for tag count and cardinality.