Stuck gocipher

Description

This alert fires when a single gocipher instance is persistently failing to send its captured TLS secrets to the server, while the rest of the fleet is healthy.

A gocipher sends secrets to the hub over a pooled keep-alive HTTP connection. If a pooled connection goes half-dead (for example, when a proxy in front of the hub silently closes its side), that one instance can keep reusing the dead connection and receive repeated 502 Bad Gateway responses (GCPH-OUT-204), dropping its secrets until the pod is restarted.

Because the failure is confined to one pod out of potentially dozens, its handful of errors is easily hidden in the fleet-wide tooManyLogs noise. This probe inspects the errors per instance, so a single wedged gocipher is surfaced on its own.
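The difference between a fleet-wide count and a per-instance count can be illustrated with a minimal sketch. The record layout (`instance` and `code` fields), the instance names, and the helper `failures_per_instance` are assumptions made for this example; they are not the probe's actual schema or implementation.

```python
from collections import Counter

# Hypothetical log records. The field names ("instance", "code") are
# assumptions for this sketch, not the real log schema.
LOGS = [
    {"instance": "gocipher-a", "code": "GCPH-OUT-204"},
    {"instance": "gocipher-b", "code": "OTHER"},
    {"instance": "gocipher-a", "code": "GCPH-OUT-204"},
    {"instance": "gocipher-c", "code": "OTHER"},
    {"instance": "gocipher-a", "code": "GCPH-OUT-204"},
]


def failures_per_instance(records, code="GCPH-OUT-204"):
    """Count secret-send failures separately for each instance."""
    return Counter(r["instance"] for r in records if r["code"] == code)


if __name__ == "__main__":
    # Most affected instances first.
    for instance, count in failures_per_instance(LOGS).most_common():
        print(instance, count)
```

Grouped this way, three failures from one pod stand out even though they are a small share of the fleet's total log volume.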

info

The gocipher retries each send on a fresh connection before logging GCPH-OUT-204. A sustained run of these errors from a single instance therefore means that the instance could not reach the server on any connection, and its decrypted traffic is at risk.
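The retry-then-log behaviour described above might look roughly like the following sketch. It is an illustration only: the function `send_secrets`, its injected callables, and the choice of exactly one retry are assumptions, since the number of retries and the real gocipher code are not documented here.

```python
def send_secrets(payload, send_pooled, send_fresh, log_error):
    """Send on the pooled connection; on a 502, retry on a fresh one.

    send_pooled and send_fresh are hypothetical callables that take the
    payload and return an HTTP status code. log_error receives the error
    code only when the fresh-connection retry also fails.
    """
    status = send_pooled(payload)
    if status == 502:
        # The pooled connection may be half-dead, so try a new one.
        status = send_fresh(payload)
        if status == 502:
            # Both attempts failed: this is the case that gets logged.
            log_error("GCPH-OUT-204")
    return status
```

Under this assumption, a failure caused purely by a stale pooled connection is absorbed by the retry and never logged, which is why a logged GCPH-OUT-204 is treated as significant.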

Default configuration

"stuckGocipher": {
  "active": true,
  "minErrorsPerMin": 0.2,
  "delayWhenInactive": "PT5M",
  "delayWhenActive": "PT1M"
}
  • minErrorsPerMin: an instance is considered stuck when it produces at least this many secret-send failures per minute over the evaluation window, and at least two failures in that window, so that a single transient hiccup does not fire the alert.
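The two conditions on minErrorsPerMin can be written as a small check. This is a sketch under stated assumptions: the function name `is_stuck` is hypothetical, the rate is assumed to be the failure count divided by the window length in minutes, and the 5-minute window is taken from the `windowMin` value in the sample mail rather than from a documented setting.

```python
def is_stuck(count, window_min, min_errors_per_min=0.2, min_failures=2):
    """Return True when an instance meets both stuck conditions.

    count       -- secret-send failures seen in the window
    window_min  -- window length in minutes (assumed, see lead-in)
    """
    rate_per_min = count / window_min
    return count >= min_failures and rate_per_min >= min_errors_per_min


if __name__ == "__main__":
    print(is_stuck(12, 5))  # figures from the sample mail
    print(is_stuck(1, 5))   # one isolated failure
```

With the default threshold and a 5-minute window, a single failure already yields 1 / 5 = 0.2 failures per minute, which is likely why the separate two-failure floor is needed.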

Mail

Content

The mail lists every gocipher instance considered stuck, with its host, the number of failures and the rate over the period.

Sample

{
  "endpoint": "https://...",
  "name": "stuckGocipher",
  "status": "ACTIVE",
  "since": "2026-06-19T08:30:25.969Z",
  "stuckInstances": [
    {
      "instance": "local-gocipher-wxx65",
      "host": "ip-192-168-98-185.eu-west-1.compute.internal",
      "count": 12,
      "ratePerMin": 2.4,
      "statusCode": 502,
      "msg": "Error received from server while sending secrets"
    }
  ],
  "windowMin": 5
}