Gossipers
To reduce Whisperers' resource usage, and to be able to introduce new features afterwards, I wanted to migrate Whisperers from
Node.js to a native language.
The trendy languages for this kind of work are Go and Rust.
Since Go is more widely used in the Kubernetes ecosystem and has a nice ecosystem of its own, I decided to go for Go!
Why Go?
- to remove overhead of native <-> interpreted
- to be able to use eBPF later
- to reduce agent footprint
- to scale better (multithreaded)
Go vs Rust
- Go is simpler, with a good community
- Go is used all over Docker, K8s...
Learning Go
Go is a small language, and fast to learn.
I bought a nice book from O'Reilly: Learning Go.
A very nice book indeed, which taught me how to write idiomatic Go in an efficient way!
I read the book in my spare time, then ran through the first exercises until I felt at ease with the syntax...
And then I started coding for real!
Go feedback
- Compiled language with strict typing and strict rules
  - If it compiles, it works
- v1.22 but 15 years old
- Light language, fast to learn
- Compiles in seconds
- Good ecosystem and libraries - but nothing compared to Node.js
- Not Object Oriented, not Function Oriented, not Procedural. Hybrid.
- Threaded model, not Asynchronous
  - Goroutines for threading model - Light threads
  - Channels for inter-thread communication
  - Context management for synchronization and orchestration
- Modern standard library
  - Structs with methods, Maps, Slices
  - Webserver, webclient, JSON, XML, YAML, gRPC marshalling
  - Modern Time library
  - Structured logs
- No exception management
  - All errors must be managed in place
  - All output values must be used
- The complete dev toolbox is included in the Go toolchain
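To make a few of these points concrete, here is a minimal sketch (not taken from Gossiper) showing a goroutine, a channel, a context and in-place error handling together. The `parsePort` and `produce` functions are hypothetical names chosen for the illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strconv"
	"time"
)

// parsePort shows Go's error handling: there are no exceptions, the error
// is a regular return value that the caller checks in place.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}
	if n < 1 || n > 65535 {
		return 0, errors.New("port out of range")
	}
	return n, nil
}

// produce starts a goroutine that sends the values on a channel until it
// has sent them all or the context is cancelled, then closes the channel.
func produce(ctx context.Context, values []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, v := range values {
			select {
			case out <- v:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	// The consumer ranges over the channel until the producer closes it.
	for s := range produce(ctx, []string{"80", "http", "443"}) {
		port, err := parsePort(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("port:", port)
	}
}
```

The producer never shares memory with the consumer: values only travel through the channel, and the context gives a single place to stop the goroutine.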
Differences with Node.js
- Blocking language
- Beware of parallel memory access
  - Memory access has to be safeguarded: no parallel access to the same memory for read and write from different threads
  - In the Node async world, the same memory is accessed by all "workers", which is so much simpler
- Much work on application architecture
  - For non-blocking processing
  - To ensure segregation of thread memory
- But goroutines & channels are great for data flow management
  - Naturally, I ended up with tens of goroutines
  - Which is much less than what Go uses internally: thousands are created per minute!
  - Avoid using mutexes in most cases
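As an illustration of avoiding mutexes, here is a minimal sketch where a single goroutine owns the state and everything else talks to it through channels. The `packet` type and the `aggregate` and `countBytes` functions are hypothetical, not Gossiper's actual code.

```go
package main

import "fmt"

// packet is a hypothetical unit of work flowing through the pipeline.
type packet struct {
	flow string
	size int
}

// aggregate owns the totals map: only this goroutine reads or writes it,
// so no mutex is needed. The result leaves through the returned channel
// once the input channel is closed.
func aggregate(in <-chan packet) <-chan map[string]int {
	out := make(chan map[string]int, 1)
	go func() {
		defer close(out)
		totals := make(map[string]int)
		for p := range in {
			totals[p.flow] += p.size
		}
		out <- totals
	}()
	return out
}

// countBytes feeds the packets to the aggregator and waits for the result.
func countBytes(packets []packet) map[string]int {
	in := make(chan packet)
	result := aggregate(in)
	for _, p := range packets {
		in <- p
	}
	close(in)
	return <-result
}

func main() {
	totals := countBytes([]packet{{"a", 100}, {"b", 40}, {"a", 60}})
	fmt.Println(totals["a"], totals["b"]) // prints "160 40"
}
```

The map is handed over only after the aggregating goroutine has finished with it, so reads and writes never happen in parallel.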
Programming for real... with ChatGPT!
To help me, I bought a GoLand licence from JetBrains, and tried the new AI Assistant.
That was such a good bet!
- It helped me create the file structure and the build files
- It converted whole Javascript files, keeping the algorithms
- It helped me code all the boilerplate and more
AI Assistant is using ChatGPT behind the scenes!
When you stop typing, it tries to figure out what you may want to type next... And it's very good at it!
The combination of Golang and AI Assistant was so great that in less than a week of work, I had a new agent working and capturing network traffic!
IntelliJ AI Assistant
- ChatGPT 4 inside your IDE (for real)
- Always active, it proposes
  - File structure creation for the project, Makefile, Dockerfile, etc.
  - The inner code of functions once you type in the name, inputs and outputs
  - Code once you type a comment
  - Completion of blocks for repeating code
- Lets you ask it to
  - Find problems
  - Generate comments
  - Generate code → Diff
  - Suggest refactoring → Diff
  - Write documentation and Git messages
  - Convert files
Very, very efficient!!
Note: I've seen rather bad comments about AI Assistant on the web.
On the contrary, my experience was great, especially with Go, but I did not compare it with Copilot or others.
Whisperer to Gossiper
The tough part was moving from async programming to a threaded model, while keeping the processing non-blocking to preserve throughput.
- First, I reproduced the internal structure of the Whisperer, using many mutexes to share memory. But this was not idiomatic.
- Then I performed several huge refactorings to transform the whole process and management flow with channels.
It indeed became so much cleaner.
Another tough part was the 'clean stop'. Reproducing the clean stop of Whisperers took me some hours, using Go's WaitGroups and context cancellation features.
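The general pattern behind such a clean stop can be sketched as follows. This is a minimal illustration, not Gossiper's actual code; `runWorkers` and its parameters are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// runWorkers starts n workers, cancels them after runFor and waits for
// all of them to return. It reports how many workers stopped cleanly.
func runWorkers(n int, runFor time.Duration) int {
	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	stopped := make(chan int, n) // buffered so workers never block on exit

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			ticker := time.NewTicker(time.Millisecond)
			defer ticker.Stop()
			for {
				select {
				case <-ctx.Done():
					// Flushing buffers or closing connections would go here.
					stopped <- id
					return
				case <-ticker.C:
					// Regular work would go here.
				}
			}
		}(i)
	}

	time.Sleep(runFor)
	cancel()  // ask every worker to stop
	wg.Wait() // block until they have all returned
	close(stopped)

	count := 0
	for range stopped {
		count++
	}
	return count
}

func main() {
	fmt.Println("workers stopped:", runWorkers(3, 10*time.Millisecond))
}
```

The context carries the stop request down to every goroutine, and the WaitGroup guarantees that the process only exits once each of them has finished its cleanup.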
A last difficulty: I could not find a nice circuit breaker library providing the statistics required to offer the same observability as Whisperers... so I wrote my own.
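For readers unfamiliar with the idea, here is a minimal sketch of a circuit breaker that also exposes counters. It is not Gossiper's implementation: the `Breaker` type, its thresholds and the `Stats` fields are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var (
	errOpen = errors.New("circuit open")
	errBoom = errors.New("boom") // sample failure used in the demo
)

// Stats holds the counters exposed for observability (hypothetical set).
type Stats struct {
	Successes, Failures, Rejected int
}

// Breaker opens after maxFailures consecutive failures and lets a trial
// call through once cooldown has elapsed.
type Breaker struct {
	mu          sync.Mutex
	maxFailures int
	cooldown    time.Duration
	consecutive int
	open        bool
	openedAt    time.Time
	stats       Stats
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// Call runs fn unless the circuit is open, and records the outcome.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.open && time.Since(b.openedAt) < b.cooldown {
		b.stats.Rejected++
		b.mu.Unlock()
		return errOpen
	}
	b.mu.Unlock()

	err := fn() // run outside the lock so slow calls do not block others

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.stats.Failures++
		b.consecutive++
		if b.consecutive >= b.maxFailures {
			b.open = true
			b.openedAt = time.Now()
		}
		return err
	}
	b.stats.Successes++
	b.consecutive = 0
	b.open = false
	return nil
}

// Stats returns a copy of the counters.
func (b *Breaker) Stats() Stats {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.stats
}

func main() {
	b := NewBreaker(2, time.Minute)
	for i := 0; i < 3; i++ {
		fmt.Println(b.Call(func() error { return errBoom }))
	}
	fmt.Printf("%+v\n", b.Stats())
}
```

In the demo, the first two calls fail and open the circuit, so the third one is rejected without running. A mutex is acceptable here because the protected state is tiny and the lock is never held while the wrapped call runs.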
The final result is great!
- No change in Spiders API
- Gossipers seamlessly replace Whisperers
- Memory usage while streaming is 2x less - 70MB -> 30MB
- CPU usage is 3x less - 200m -> 70m
- And Docker image is 4x smaller! 85MB -> 22MB
Moreover, it may now be deployed outside Docker, as a 12MB executable!
Deploying Gossiper
Binary Download
Gossiper releases are available for download on the Spider website.
Infra as Code - Sidecars
The Docker image is available at registry.gitlab.com/spider-analyzer/public-images/gossiper.
You may simply replace the Whisperer image with this one in your IaC to test it.
Sidecar manifest
Here is the YAML to deploy Gossiper as a sidecar
- using the latest version of Gossiper
- injecting its JSON configuration from a secret
- Generate the Whisperer configuration on the UI
- Put it in a Kubernetes secret
apiVersion: v1
kind: Secret
metadata:
  name: spider-config
stringData:
  CONFIG: |
    {
      "whisperer": "...",
      "spiderConfigURI": "...",
      "privatePem": "-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----"
    }
- Add the Whisperer as a sidecar to any POD / deployment
- name: myservice-whisperer
  image: "registry.gitlab.com/spider-analyzer/public-images/gossiper:latest"
  imagePullPolicy: Always
  resources:
    requests:
      cpu: 10m
      memory: 50Mi
    limits:
      cpu: 500m
      memory: 500Mi
  env:
    - name: CONTAINER_NAME
      value: myservice-whisperer
    - name: LOG_LEVEL
      value: info
    - name: DNSCACHE_HOST
      value: local-controller
    - name: DNSCACHE_PORT
      value: "53"
  envFrom:
    - secretRef:
        name: spider-config
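With `envFrom`, the `CONFIG` key of the secret ends up as a `CONFIG` environment variable inside the container. As a minimal sketch of how a Go agent could read it (this is not Gossiper's actual code; the `Config` struct, the `parseConfig` function and the rule that all three keys are required are assumptions):

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// Config mirrors the three keys shown in the secret above.
type Config struct {
	Whisperer       string `json:"whisperer"`
	SpiderConfigURI string `json:"spiderConfigURI"`
	PrivatePem      string `json:"privatePem"`
}

// parseConfig decodes the JSON document injected through the secret.
// Treating all three keys as mandatory is an assumption of this sketch.
func parseConfig(raw string) (Config, error) {
	var cfg Config
	if raw == "" {
		return cfg, errors.New("CONFIG is empty")
	}
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return cfg, fmt.Errorf("invalid CONFIG JSON: %w", err)
	}
	if cfg.Whisperer == "" || cfg.SpiderConfigURI == "" || cfg.PrivatePem == "" {
		return cfg, errors.New("CONFIG is missing a required key")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig(os.Getenv("CONFIG"))
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("configuration loaded for whisperer", cfg.Whisperer)
}
```

Failing early with a clear message makes a misconfigured secret visible in the sidecar logs at startup.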
Gossiper no longer resolves Kubernetes Pod names / IPs by itself.
I deprecated the feature.
It relies completely on the Spider Controller DNS feature to do it, thus reducing the RBAC required to run Spider as a sidecar.
On Kubernetes, you must therefore specify the custom DNSCACHE environment variables (DNSCACHE_HOST and DNSCACHE_PORT).
- You may provide the controller service as a hostname (like above) or as an IP.
- The controller must be accessible from all namespaces, and vice versa
Attachments
Spider UI, server and controllers have been updated slightly to allow choosing the agent when attaching a Whisperer.
That's the easiest way to try Gossipers!
New features
I could not resist bringing new features already in this first release.
New AF Packet capture method
With Gossiper, you may choose which network capture method should be used.
For now, AF Packet and Libpcap may be chosen.
Which one is best?
- Libpcap
  - 'Legacy' way
  - Available everywhere (Linux, Windows, older kernels)
- AF Packet
  - A more recent way, with direct memory sharing between kernel and application
  - Faster than the old Libpcap way
    - But recent versions of Libpcap now use AF Packet
    - Still, using Libpcap requires CGo to interact with the Go application
    - Using AF Packet does not require it
All in all, AF Packet should be better in most cases, and Gossiper tries this one first, with a fallback to Libpcap.
You should see less packet loss with AF Packet than with Libpcap.
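The "try AF Packet first, then fall back to Libpcap" logic can be sketched as below. This is a minimal illustration with simulated openers, not Gossiper's actual code: the `opener` type, the `openCapture` function and the method labels are hypothetical, and real code would open an AF_PACKET socket or a libpcap handle instead.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported simulates a capture method that is not available.
var errUnsupported = errors.New("not supported on this host")

// opener is a hypothetical constructor for one capture method.
type opener struct {
	name string
	open func(iface string) error
}

// openCapture tries each method in order and returns the name of the
// first one that opens, or the joined errors if none does.
func openCapture(iface string, methods []opener) (string, error) {
	if len(methods) == 0 {
		return "", errors.New("no capture method configured")
	}
	var errs []error
	for _, m := range methods {
		if err := m.open(iface); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", m.name, err))
			continue
		}
		return m.name, nil
	}
	return "", errors.Join(errs...)
}

func main() {
	methods := []opener{
		{"AF_PACKET", func(string) error { return errUnsupported }}, // simulated failure
		{"LIBPCAP", func(string) error { return nil }},              // simulated success
	}
	name, err := openCapture("eth0", methods)
	if err != nil {
		fmt.Println("capture failed:", err)
		return
	}
	fmt.Println("capture method:", name)
}
```

Keeping the ordered list explicit also makes it easy to force a single method: pass a list with only that entry.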
All Spider features work the same, except that
- packets are always seen with an Ethernet first layer, even on Linux cooked captures.
- the packet loss count due to libpcap... is always 0.
Choosing the capture method
You may force which capture method to use in the Whisperer Capture configuration.
Viewing the current capture method
The capture method currently in use is reported in the Whisperer's status.
Why Gossiper?
In fact, I brainstormed with ChatGPT to find a new name for a Whisperer in Go... and I really liked this proposal!
It feels so good:
- Gossiper and Whisperer both evoke quiet talk that you want to overhear
- Both sound alike
- And Gossiper... includes 'Go'