New polling alert
During its first month on Kubernetes, Spider had two downtimes because the polling queues filled up with tens of thousands of items, and no alert was raised.
Cause
The system was still working, or at least parsing was, but saving to Elasticsearch was not keeping up.
The poller could not consume the new items fast enough.
I first tried scaling the pollers, but to no avail.
In fact, the issue was Elasticsearch throughput, which was too low. I spotted it thanks to the monitoring UI, and raised the Elasticsearch CPU resource limits by changing one digit in setup.yml :) !
New alert
That solved the problem, but the fact remained that I had not been alerted.
So I decided to add an alert for when the queues fill up and do not drain.
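To give an idea of the rule, here is a minimal sketch in Python. It is not Spider's actual alert code: the class name, the threshold of 10,000 items and the window of 3 checks are all assumptions made for illustration. The idea is simply that a queue has to be above the threshold and not shrinking for several checks in a row before the alert is raised, so a short burst that drains by itself stays silent.

```python
from collections import deque


class QueueBacklogAlert:
    """Hypothetical rule: raise when a queue stays filled and does not drain."""

    def __init__(self, threshold=10_000, window=3):
        # threshold and window are illustrative values, not Spider's settings.
        self.threshold = threshold
        # Keep only the last `window` queue lengths.
        self.samples = deque(maxlen=window)

    def check(self, queue_length):
        """Record one sample; return True when the alert should be raised."""
        self.samples.append(queue_length)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough history yet

        values = list(self.samples)
        # Every recent sample is above the threshold...
        filled = all(v >= self.threshold for v in values)
        # ...and the length never decreased between two consecutive checks.
        not_draining = all(b >= a for a, b in zip(values, values[1:]))
        return filled and not_draining
```

Calling `check()` once per monitoring tick with the current queue length would be enough: with the defaults, three ticks in a row at 12,000, 15,000 and 18,000 items would raise the alert, while 12,000, 15,000 and 9,000 would not.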
Implementing this alert from design to production took me less than one hour :) !!
That felt so great!
The framework for alerting is simple but efficient!