Spider's internal cluster gateway was NGINX until this week.
However, NGINX was causing several issues in the current setup:
- To absorb replicas scaling up and down, I had NGINX resolve the IP of each service's VIP on every call. The DNS resolver had a cache of 30s.
- The main issue was that NGINX can't keep persistent connections to the upstreams in this case.
- This made NGINX create a new TCP socket for every request. Soon enough, once all TCP sockets were in use, response times increased by 1 or 2s while Linux recycled sockets.
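
To give a feel for why opening one socket per request hits a wall, here is a rough back-of-the-envelope sketch. It assumes the bottleneck is ephemeral ports held in TIME_WAIT after each connection closes, and it uses common Linux defaults (ports 32768 to 60999, 60s TIME_WAIT). These numbers are assumptions, not values measured on the Spider cluster, and the function name is made up for the example.

```python
# Rough estimate only: the defaults below are assumed Linux values,
# not settings read from the Spider cluster.

def max_new_connections_per_second(port_low=32768, port_high=60999, time_wait_s=60):
    """Upper bound on new outbound connections per second to one destination.

    Each closed connection keeps its local port in TIME_WAIT for
    `time_wait_s` seconds, so the pool of ephemeral ports can only be
    cycled through once per TIME_WAIT period.
    """
    ports = port_high - port_low + 1  # number of usable ephemeral ports
    return ports / time_wait_s


if __name__ == "__main__":
    rate = max_new_connections_per_second()
    print(f"about {rate:.0f} new connections/s per destination")
```

With these assumed defaults the ceiling is roughly 470 new connections per second towards a single upstream address. Persistent connections avoid the problem because the same sockets are reused across requests.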
A change was needed!
Traefik is increasingly used as a gateway for Docker clusters (and others).
It is indeed designed to integrate with Docker and to stay up to date with the cluster state.
So I switched Spider to Traefik this week. And the results are... astonishing!!
Although the response times seen by clients have not changed much, the response times internal to the cluster have improved by 80%!!
The only thing I struggled with was Traefik's configuration for path rewriting. It has fewer options than NGINX in this area.
I had to add some custom rerouting inside the UIs themselves.