The vulnerability upgrade agent - 44 components, 0 CVEs in 3h30
Spider is 40+ Node.js services, 10 Go services, 3 Go clients, and 3 GUIs.
Each one has its own package.json or go.mod with its own dependency tree, its own vulnerable transitive packages, and its own breaking-change risk on every upgrade.
To manage technical debt, I wrote a Claude Code agent that does the entire run on its own.
Result:
- Total elapsed time: about 3h30 for 43 active components.
- Output: 43 fully fixed, 0 failed, 0 CVEs left where a fix existed.
What it does
The agent is a Claude Code orchestrator session that drives per-component sub-agents.
Its job, end-to-end:
- For each component in the manifest, run `npm audit` / `npm outdated` (Node) or `govulncheck` / `go list -u -m all` (Go)
- Triage each finding: patch / minor → apply, major isolated → attempt, major-with-cascade → skip with reason, no-fix-available → record
- Apply the dependency changes (`npm audit fix`, `go get pkg@ver && go mod tidy`)
- Adjust source code where the upgrade broke a call site (e.g. `gobreaker` v2.4.0 changed `done(bool)` to `done(error)`)
- For Go services: rebuild the Alpine binary
- Deploy the change to a local k3s cluster via the Helm chart's `mountSrc` mechanism
- Run the relevant e2e test suite from `Tests/ServicesTests/`
- Retry on failure up to 5 times - diagnose, patch, redeploy, retest
- Commit on success, revert on persistent failure
- Restore the Helm chart's `mountSrc` flag to its pre-run state regardless of outcome
- Report back to the orchestrator with a structured per-component result
- The orchestrator writes a final report at `Tools/vuln-agent/reports/YYYY-MM-DD-vuln-run.md`
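The triage rules live in the sub-agent prompt rather than in code, so the following is only a rough illustration of the decision they describe. The `Finding` type, its fields, and the returned labels are hypothetical names, and comparing major version numbers is an assumed stand-in for however the agent actually tells a minor upgrade from a major one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    package: str
    current: str              # installed version, e.g. "7.0.4" or "v0.48.0"
    fixed: Optional[str]      # first fixed version, or None when no fix exists
    cascades: bool = False    # True if the upgrade forces other major bumps


def major(version: str) -> int:
    """Return the major component of a dotted version string."""
    return int(version.lstrip("v").split(".")[0])


def triage(f: Finding) -> str:
    if f.fixed is None:
        return "record"       # no fix available: keep it visible in the report
    if major(f.fixed) == major(f.current):
        return "apply"        # patch or minor upgrade
    if f.cascades:
        return "skip"         # major upgrade with cascade: skip with a reason
    return "attempt"          # isolated major upgrade
```

Under these assumptions, a jump such as 7.0.4 → 8.0.7 with no cascade is classified as `attempt`, while a finding with no fixed version is only recorded.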
The whole thing runs in five phases (0 to 4) on the orchestrator side:
- Phase 0 - pre-flight: a Bash + Python script hashes every JavaScript file in every component and emits a JSON map of `{ sha1 → [(component, relpath), …] }` for files present in 2+ components. About 54% of the JS files in Node services are shared (loggers, ES stores, JWT helpers, circuit breakers, utility modules).
- Phase 1 - parse the component manifest, build a work queue
- Phase 2 - dispatch sub-agents in batches of 5, simultaneously. Wait for the batch to finish before launching the next.
- Phase 3 - propagation: when a canonical component fixes a shared source file, the orchestrator `cp`s the fixed file to the other components in the same family, runs `npm install`, and commits separately for each recipient. The recipient does not need its own sub-agent run.
- Phase 4 - write the final report
package.json and go.mod are never propagated. Each service owns its own dependency tree.
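The Phase 0 map could be built along the following lines. This is a minimal sketch, not the actual pre-flight script: the function name `shared_js_files`, the `components` argument (component name → root directory), and the decision to skip `node_modules` are all assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def shared_js_files(components: dict[str, Path]) -> dict[str, list[tuple[str, str]]]:
    """Map sha1 -> [(component, relpath), ...] for JS files found in 2+ components."""
    by_hash: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for name, root in sorted(components.items()):
        for path in sorted(root.rglob("*.js")):
            rel = path.relative_to(root)
            if "node_modules" in rel.parts:
                continue  # assumption: installed dependencies are not source files
            digest = hashlib.sha1(path.read_bytes()).hexdigest()
            by_hash[digest].append((name, rel.as_posix()))
    # Keep only content that appears in at least two distinct components.
    return {
        digest: entries
        for digest, entries in by_hash.items()
        if len({component for component, _ in entries}) >= 2
    }
```

Passing the result to `json.dumps` would give the JSON form described above, with each tuple serialized as a two-element array.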
Why this design
A few decisions shaped the approach:
Parallelism by sub-agent batching.
A single Claude Code session walking 44 services in series would take days and run out of context long before the end.
Five sub-agents in parallel cap the orchestrator's working memory and complete a full pass in a few hours.
Five is also where local resources (build CPU, k3s pod count, Elasticsearch index pressure) start to push back.
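The batching itself is done by the Claude Code orchestrator, not by a thread pool, but the control flow can be sketched in ordinary code. In this illustration `run_component` is a hypothetical stand-in for "dispatch one sub-agent and wait for its result".

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator

BATCH_SIZE = 5


def batches(queue: list[str], size: int = BATCH_SIZE) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` components."""
    for start in range(0, len(queue), size):
        yield queue[start:start + size]


def run_all(queue: list[str], run_component: Callable[[str], str]) -> dict[str, str]:
    results: dict[str, str] = {}
    for batch in batches(queue):
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            # Iterating pool.map blocks until each result is ready, so the
            # next batch starts only after this one has fully finished.
            for name, status in zip(batch, pool.map(run_component, batch)):
                results[name] = status
    return results
```

With 43 components and a batch size of 5, `batches` yields 9 batches, the last one holding 3 components.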
Shared-file propagation as a first-class step.
With ~54% of JS files duplicated across Node services, naively running 44 sub-agents would have meant ~44× the work for any infrastructure fix (logger, ES store, JWT helper). The pre-flight hash map lets one canonical sub-agent fix a shared file once, and the orchestrator copies the result to siblings without rerunning the audit / install / test loop on them for the same change.
Each sibling still gets its own dep upgrade pass, since package.json is per-service.
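A propagation step in this spirit might look like the sketch below. The function name `propagate` and its arguments are hypothetical; the `npm install` and per-recipient commit that the orchestrator performs afterwards are deliberately left out.

```python
import shutil
from pathlib import Path

NEVER_PROPAGATED = {"package.json", "go.mod"}


def propagate(fixed: tuple[str, str],
              siblings: list[tuple[str, str]],
              roots: dict[str, Path]) -> list[Path]:
    """Copy one fixed shared file over its duplicates; return the files written."""
    src_component, src_rel = fixed
    source = roots[src_component] / src_rel
    written: list[Path] = []
    for component, relpath in siblings:
        if component == src_component:
            continue  # the canonical component already has the fix
        if Path(relpath).name in NEVER_PROPAGATED:
            continue  # each service owns its own dependency tree
        target = roots[component] / relpath
        shutil.copy2(source, target)
        written.append(target)
    return written
```

The `siblings` list is meant to be one entry of the pre-flight hash map, that is, every `(component, relpath)` pair that shared the file's original hash.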
Manifest-driven, not code-driven.
Adding or removing a component, marking one as deprecated, or pointing the agent at a new test pattern is a YAML edit. No code change. The schema is small enough to read end-to-end: name, path, lang, deploymentName, setupYamlKey, testPattern (or testCommand / testNotes), skip. Deprecated components (the original Node.js versions of services now rewritten in Go) carry skip: true so they stay in the manifest as documentation without being processed.
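To make the schema concrete, here is the same set of fields modelled as a small data structure, with the work queue derived from it. The field names come from the manifest schema above; the two entries, their paths, and the `lang` values are invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    name: str
    path: str
    lang: str                          # assumed values: "node" or "go"
    deploymentName: str
    setupYamlKey: str
    testPattern: Optional[str] = None  # or testCommand / testNotes instead
    testCommand: Optional[str] = None
    testNotes: Optional[str] = None
    skip: bool = False                 # deprecated components stay listed but idle


def work_queue(manifest: list[Component]) -> list[Component]:
    """Return only the components that should be processed."""
    return [c for c in manifest if not c.skip]


# Hypothetical entries, not taken from the real manifest.
manifest = [
    Component("example-go", "Services/example-go", "go",
              "example-go", "exampleGo", testPattern="example*"),
    Component("example-node", "Services/example-node", "node",
              "example-node", "exampleNode", skip=True),
]
```

Here `work_queue(manifest)` keeps only `example-go`; the skipped entry remains in the manifest as documentation.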
Real cluster, real tests.
On top of the services' internal tests, every fix is validated against the actual e2e suite, against a deployed pod, on a real k3s cluster.
No mocked databases, no synthetic verification.
If a dependency upgrade silently broke something, the test catches it before the commit lands.
If the test environment itself is wedged, the agent flags it as PARTIAL and moves on rather than masking the issue.
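The validate-and-retry loop can be sketched as follows. All callables are hypothetical placeholders for work the sub-agent does itself, and two points are assumptions: that "up to 5 times" means five attempts in total, and that a wedged environment leaves the working tree untouched (the text does not say whether a PARTIAL result is committed or reverted).

```python
from typing import Callable

MAX_ATTEMPTS = 5


def validate(deploy_and_test: Callable[[], bool],
             diagnose_and_patch: Callable[[], None],
             commit: Callable[[], None],
             revert: Callable[[], None],
             environment_ok: Callable[[], bool] = lambda: True) -> str:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if not environment_ok():
            return "PARTIAL"      # flag the wedged environment and move on
        if deploy_and_test():
            commit()              # tests passed against the deployed pod
            return "FIXED"
        if attempt < MAX_ATTEMPTS:
            diagnose_and_patch()  # adjust the fix before the next attempt
    revert()                      # persistent failure: undo the change
    return "FAILED"
```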
Always-restored setup.
The Helm chart's mountSrc flag is what tells the deployment to mount the local source directory into the running pod (avoiding make image push for the iteration loop).
Sub-agents enable it before deploying, then always comment it back out at the end - pass or fail. The agent leaves the working tree in the state it found it for setup.yaml, regardless of outcome.
Because every run starts from a working state, each component can be updated and tested in isolation, and a failure in one does not block the others.
Once all services are upgraded and working, a complete test run is performed after build to ensure everything is functioning as expected.
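The "always restored" guarantee is essentially a try/finally around the deploy-and-test loop. A minimal sketch, assuming the flag appears in setup.yaml as a commented-out line of the form `# mountSrc: true` (the exact key layout is not given in the text, so both strings are placeholders):

```python
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def mount_src_enabled(setup_yaml: Path,
                      commented: str = "# mountSrc: true",
                      enabled: str = "mountSrc: true"):
    """Enable mountSrc for the duration of the block, then restore the file."""
    original = setup_yaml.read_text()
    # Uncomment the first occurrence of the flag (assumed line format).
    setup_yaml.write_text(original.replace(commented, enabled, 1))
    try:
        yield
    finally:
        # Runs on success and on failure alike: put back the exact original.
        setup_yaml.write_text(original)
```

Restoring the saved content, rather than re-commenting the line, is a design choice of this sketch: it guarantees the file ends up byte-for-byte as it was found.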
What a real run looked like
The 2026-05-11 run is the canonical example:
| Status | Count |
|---|---|
| Fixed | 43 |
| Partial | 0 |
| Clean | 0 |
| Failed | 0 |
| No tests | 5 |
43 components processed across 9 batches of up to 5.
Cross-cutting changes that applied to most of the stack:
- Go toolchain 1.25 → 1.26.3 on every Go service, including Dockerfile base image updates
- gobreaker v2.4.0 API change (`Allow()` callback signature shifted from `func(bool)` to `func(error)`) - applied across every Go service that used the breaker
- Patches for `GO-2025-3595` (`golang.org/x/net` HTML tokenizer XSS) and `GO-2025-3553` (`golang-jwt/jwt/v5` excessive memory on header parse) - both rolled out to every Go service in the same run
- Node security upgrades on ~36 services: `koa` 3.2.0 (Host Header Injection HIGH), `lodash` 4.18.1 (Code Injection via `_.template` HIGH), `undici` 7.25.0 (multiple HIGH including WebSocket overflow + HTTP smuggling), `webpack` 5.106.2, plus a long tail of transitive fixes.
A few notable single-service fixes:
- `mail-sender` - `nodemailer` 7.0.4 → 8.0.7 closed an SMTP command injection HIGH and a recursive `addressparser` DoS
- `maintenance` - `fast-xml-parser` updated for a CRITICAL CVSS 9.3 entity-encoding bypass + RangeError DoS + entity expansion DoS, plus `@elastic/elasticsearch` 8.4.0 → 8.19.1
- `poller` - `protobufjs` 7.5.4 → 8.2.0 closed a CRITICAL CVSS 9.8 arbitrary code execution
- `controller-v3` (Go client) - `golang.org/x/net` v0.48.0 → v0.53.0 patched `GO-2026-4918`, an HTTP/2 infinite loop on a malformed `SETTINGS_MAX_FRAME_SIZE` frame
Where the added value comes from
A few things stand out about doing this as an agent rather than by hand:
Coverage in one pass.
Auditing 43 components by hand, in sequence, with the full e2e test loop, is multi-day work for a single engineer. The agent does it in an afternoon, against a real cluster, with structured reports.
Triage discipline.
The triage rules are written once into the sub-agent prompt and applied identically to every component. A major upgrade that creates a cascade gets skipped with a documented reason; a patch upgrade gets applied without ceremony; a no-fix CVE gets recorded so it shows up in the final report instead of disappearing. No upgrade-fatigue inconsistency between the first service and the forty-fourth.
Source-file fixes survive across siblings.
When gobreaker v2.4.0 changed its callback signature, every Go service needed the same line edit at the same call sites. The agent doesn't apply the same edit 10 times; it applies it once on the canonical component, then the orchestrator `cp`s it to the others.
Reproducible, reviewable output.
Every change lands as a git commit per component, with a structured message. The final report at reports/YYYY-MM-DD-vuln-run.md enumerates every fix, every skip with reason, every test result, every retry, every infrastructure issue encountered. Reviewing a release upgrade pass is reading one markdown file.
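As an illustration of how structured per-component results roll up into the report, the status summary could be produced like this. The function name, the shape of `results` (component name → status), and the fixed list of statuses are assumptions; the real report contains far more detail than a count table.

```python
from collections import Counter

STATUSES = ["Fixed", "Partial", "Clean", "Failed"]


def status_table(results: dict[str, str]) -> str:
    """Render a markdown table counting components per status."""
    counts = Counter(results.values())
    lines = ["| Status | Count |", "|---|---|"]
    # List every known status, including those with a zero count.
    lines += [f"| {status} | {counts.get(status, 0)} |" for status in STATUSES]
    return "\n".join(lines)
```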
The framework, not just the run.
The agent's value is not just one run - it is the framework. Adding a new component is a manifest entry. Running it again next quarter is one reference to orchestrator.md. The same machinery will keep working through future toolchain bumps, future CVEs, future dependency churn, with no per-incident engineering.
What this looks like to run
The agent runs as a standalone Claude Code session. Open a new session, load Tools/vuln-agent/orchestrator.md as the first message, and the orchestrator handles everything from there. It does not need any flag - the manifest at Tools/vuln-agent/vuln-manifest.yaml is the single source of truth for what gets processed.
Closing
Spider has had a CVE backlog story for as long as it has had 40+ dependencies. The vuln-agent run on 2026-05-11 ships this release with that backlog cleared, plus the Go toolchain bumped, plus the cross-cutting gobreaker migration done, plus the Node security patches applied uniformly across every active service. The diff this produced is large; the human time it took to land it was small.
The pattern - manifest + sub-agent prompt + orchestrator + propagation - is general. It is not specific to vulnerability scanning. Any cross-cutting change that touches every component in the monorepo and has a per-component verification step (a dependency bump, a logger migration, a config-key rename, a CI pipeline switch) fits the same shape.
I expect to use it for more than CVEs.