Data Storage Policies

· 3 min read

Ever wanted to keep the data of some Whisperers longer than that of others?
Spider introduces Data Storage Policies to achieve this!

Overview

Previously, the data of all Whisperers was purged on the same schedule, using Elasticsearch rotating indices managed by ILM (index lifecycle management).

Now, you may define independent Data Storage Policies (DSPs) to manage data TTL per Whisperer.
For example, one Whisperer may keep its data for 7 days, another for 20 days, and a third for a single day.

You may:

  • Define as many DSPs as you want
  • Define, for each, its own number of shards and replicas, TTL and rollover delay
  • Associate as many Whisperers as you want with any DSP
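
To make the relationship concrete, here is a minimal Python sketch of the model described above. It is not Spider's actual schema: the class, the field names, the policy names and the Whisperer ids are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStoragePolicy:
    """Hypothetical in-memory model of one DSP (field names are assumptions)."""
    name: str
    shards: int
    replicas: int
    ttl_days: int       # how long data is kept before being purged
    rollover_days: int  # how often a new index is started


# Three illustrative policies, echoing the 7 / 20 / 1 day example.
policies = {
    "default": DataStoragePolicy("default", shards=2, replicas=1, ttl_days=7, rollover_days=1),
    "long": DataStoragePolicy("long", shards=2, replicas=1, ttl_days=20, rollover_days=2),
    "short": DataStoragePolicy("short", shards=1, replicas=0, ttl_days=1, rollover_days=1),
}

# Any number of Whisperers may point at the same policy.
whisperer_policy = {
    "whisperer-a": "default",
    "whisperer-b": "long",
    "whisperer-c": "short",
    "whisperer-d": "long",
}


def ttl_for(whisperer: str) -> int:
    """Return the retention, in days, applied to a Whisperer's new data."""
    return policies[whisperer_policy[whisperer]].ttl_days
```

With this data, `ttl_for("whisperer-b")` and `ttl_for("whisperer-d")` both return 20, since the two Whisperers share the same policy.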

In the UI, DSPs are invisible: data is searched, uploaded and purged transparently, across all DSPs.
Only one setting appears, in the Whisperer parsing configuration.

note

The original evolution request for DSPs was made by Jeremy M. and is the first official evolution request for Spider.

DSP management

Creating DSPs

Data Storage Policies are created at setup, in the values.yaml file (see Indices setup).
See the documentation for more details.

DSP config.png

Updating a DSP

You may update a DSP by changing values in the setup.

  • The associated ILM and templates will be regenerated
  • New indices will be created for the new data
  • Existing indices holding old data will stay as they are
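
As an illustration of what "regenerated" could mean, the sketch below builds an Elasticsearch ILM policy body and a composable index template body from DSP settings. The body shapes follow the public Elasticsearch formats, but the function names, parameters and the mapping from DSP fields to these bodies are assumptions, not Spider's actual init script.

```python
def ilm_policy_body(ttl_days: int, rollover_days: int) -> dict:
    """Build an ILM policy body: roll over in the hot phase, delete after the TTL."""
    return {
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_age": f"{rollover_days}d"}}},
                "delete": {"min_age": f"{ttl_days}d", "actions": {"delete": {}}},
            }
        }
    }


def index_template_body(pattern: str, policy_name: str, alias: str,
                        shards: int, replicas: int) -> dict:
    """Build a composable index template body bound to an ILM policy."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
                "index.lifecycle.name": policy_name,
                "index.lifecycle.rollover_alias": alias,
            }
        },
    }


# Hypothetical names for a 20-day policy rolling over every 2 days.
policy = ilm_policy_body(ttl_days=20, rollover_days=2)
template = index_template_body("packets-long-*", "dsp-long", "packets-long-write",
                               shards=2, replicas=1)
```

Index templates are only applied when an index is created, which fits the behaviour listed above: changed shard or replica counts show up in new indices only.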

Removing a DSP

You may remove a DSP from the setup.

  • The corresponding Elasticsearch objects won't be removed
  • Indices will be removed after their TTL
  • New data will be indexed using the default DSP

Associating a DSP with a Whisperer

By default, each Whisperer is associated with the default DSP.

You may associate it with any other DSP in the Whisperer parsing configuration.

DSP.png

You may change the DSP of a Whisperer at any time during its lifetime.
Only newly captured data is associated with the new DSP. Existing data keeps the TTL of the DSP that was in effect when it was indexed.

Behind the scenes

How does it work?
Simple but smart.

  1. Helm and the ES init script generate a specific ILM policy, templates and indices for each data resource of each DSP.
  2. The list of indices of each DSP is injected into the Pollers configuration.
  3. The Pollers route the data to the correct DSP based on the (cached) Whisperer configuration.
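
The routing step can be sketched as follows. This is a simplified illustration, not the Pollers' real code: the class, the index names, the cache duration and the lookup callback are all hypothetical. It also falls back to the default DSP when a Whisperer points at a policy that is not configured, matching the removal behaviour described earlier.

```python
import time


class DspRouter:
    """Pick the target index for a Whisperer, caching its DSP for a while."""

    def __init__(self, dsp_indices, fetch_whisperer_dsp,
                 cache_ttl_s=60.0, clock=time.monotonic):
        self._dsp_indices = dsp_indices    # DSP name -> {resource: index name}
        self._fetch = fetch_whisperer_dsp  # Whisperer id -> DSP name or None
        self._cache_ttl_s = cache_ttl_s
        self._clock = clock
        self._cache = {}                   # Whisperer id -> (DSP name, expiry)

    def _dsp_for(self, whisperer_id):
        now = self._clock()
        cached = self._cache.get(whisperer_id)
        if cached and cached[1] > now:
            return cached[0]
        dsp = self._fetch(whisperer_id) or "default"
        if dsp not in self._dsp_indices:
            dsp = "default"  # unknown or removed DSP: use the default one
        self._cache[whisperer_id] = (dsp, now + self._cache_ttl_s)
        return dsp

    def index_for(self, whisperer_id, resource):
        """Return the index that should receive this Whisperer's data."""
        return self._dsp_indices[self._dsp_for(whisperer_id)][resource]


# Hypothetical injected configuration and Whisperer settings.
dsp_indices = {
    "default": {"packets": "packets-default-write"},
    "long": {"packets": "packets-long-write"},
}
configured = {"w1": "long", "w2": "removed-policy"}
router = DspRouter(dsp_indices, configured.get)
```

Here `router.index_for("w1", "packets")` returns `"packets-long-write"`, while `"w2"` and any unknown Whisperer are routed to `"packets-default-write"`. Because of the cache, a DSP change would only take effect once the cached entry expires.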

Monitoring

As Jeremy stated in his request, he wanted to be able to monitor the total size stored for each of these Whisperer storage policies.
Using another alias trick, and changing nothing but the configuration of my monitoring, this is now available! :-)
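
The post does not detail the alias trick, so the following is only one possible reading: if each DSP had an alias covering its indices, the stored size per DSP could be obtained by summing index sizes per alias. The function below is a hypothetical sketch that assumes inputs shaped like the responses of `GET _alias` and `GET _cat/indices?format=json&bytes=b`; the alias and index names are invented.

```python
from collections import defaultdict


def size_per_alias(alias_response: dict, cat_rows: list) -> dict:
    """Sum the store size, in bytes, of the indices behind each alias."""
    size_by_index = {row["index"]: int(row["store.size"]) for row in cat_rows}
    totals = defaultdict(int)
    for index, info in alias_response.items():
        for alias in info.get("aliases", {}):
            totals[alias] += size_by_index.get(index, 0)
    return dict(totals)


# Sample data with invented names and sizes.
alias_response = {
    "packets-long-000001": {"aliases": {"dsp-long": {}}},
    "packets-long-000002": {"aliases": {"dsp-long": {}}},
    "packets-default-000001": {"aliases": {"dsp-default": {}}},
}
cat_rows = [
    {"index": "packets-long-000001", "store.size": "1000"},
    {"index": "packets-long-000002", "store.size": "500"},
    {"index": "packets-default-000001", "store.size": "200"},
]

totals = size_per_alias(alias_response, cat_rows)
# totals == {"dsp-long": 1500, "dsp-default": 200}
```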

DSPMonitoring.png