GUI errors are caught and sent to server

· 2 min read

Thanks to Error boundaries and a refactoring of error handling on the saga side, I implemented something similar to what sentry.io offers:

  • Client side errors tracking

Indeed, Sentry is right: clients rarely report the errors they encounter, so it is best to collect them yourself! If you want to troubleshoot your GUI errors easily, you first have to know about them.

So this is what I did:

Sending the client side errors

  • Saga errors and React errors are caught globally
  • Once caught, they are enriched with as much metadata as possible:
    • The last request performed, in case of Saga errors
    • Which React component failed, in case of React errors
    • A dump of the current Redux state
      • I reused the 'share' feature :)
    • The last user action performed
      • I had it from my session tracking statistics feature
  • Then the log is sent to the Spider server in the background.
    • If sending fails, nothing more is done, except console logging.
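The flow above can be sketched in a few lines. This is a minimal illustration, not Spider's actual code: the names (buildErrorReport, sendErrorReport, the /ui-logs endpoint) are hypothetical.

```javascript
// Hypothetical sketch of the enrichment + fire-and-forget upload.

function buildErrorReport(error, context) {
  return {
    message: error.message,
    stack: error.stack,
    // Metadata gathered from the app (saga request, failing component, Redux dump...)
    lastRequest: context.lastRequest || null,
    failedComponent: context.failedComponent || null,
    reduxState: context.reduxState || null,
    lastUserAction: context.lastUserAction || null,
    timestamp: new Date().toISOString(),
  };
}

function sendErrorReport(report) {
  // Fire and forget: if the upload fails, only log to the console.
  return fetch('/ui-logs', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(report),
  }).catch((err) => console.error('Could not send UI error log', err));
}
```

The report builder is kept pure so it can be reused from both the saga error handler and the React error boundary.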

Reporting on the client side errors

As I'm already using Elasticsearch to store most resources, storing the UI logs there was peanuts. I then included them in the Spider monitoring UI:

  • I know how many errors occurred and when

  • I can browse them

  • I can open them to find details and troubleshoot

Here, it acts as a safety belt when the timeline interval is too small compared to the time range. I need to rework the timeline to avoid these errors. See, it is useful! :)

  • I can search for similarities
  • And... I can open the user state in Spider UI and reproduce the error, just by copy-pasting the link :-)

Amazing!

New toasts

· One min read

React 16 allowed me to integrate a much better toast library than the one initially offered by Material UI: react-notifications-component

This library offers:

  • Notification types: info, warn, error...
  • Customisation with images, titles and so on
  • Notification stacking
  • Custom placements...

The integration was limited to changing the wrapper around Material UI notifications and offering more types for saga errors. Quite quick :)
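Such a wrapper can be sketched as a small mapper from saga errors to the library's notification options. The option shape (title, message, type, container, dismiss) follows react-notifications-component's addNotification() API; the mapper itself and the severity names are my own illustration, not Spider's code.

```javascript
// Hypothetical mapper from a saga error to react-notifications-component options.

function toToastOptions(sagaError) {
  // Map internal severities to the library's notification types.
  const typeBySeverity = { info: 'info', warning: 'warning', fatal: 'danger' };
  return {
    title: sagaError.title || 'Unexpected error',
    message: sagaError.message,
    type: typeBySeverity[sagaError.severity] || 'danger',
    container: 'top-right',
    dismiss: { duration: 5000 },
  };
}

// Usage (in the saga error handler):
// import { store } from 'react-notifications-component';
// store.addNotification(toToastOptions(error));
```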

Here is an example of notifications when you try to zoom too much (with the scroll):

React Upgrade - Error boundaries, new Toast, new Tabs

· 2 min read

Long time no news! Not that Spider did not progress during these 3 months; rather, I was lazy about writing it up.

There has been a series of improvements, and I'm about to share the most important ones.

All UIs have been migrated to the latest React: 16.8

This was much needed and long postponed, as I was still using React 14. The migration was mostly seamless with the nice tooling offered by the community, and the scripts to convert the code (extract PropTypes, use PureComponents...)

Libraries upgrade

At the same time, I upgraded nearly all libraries. A few libraries could not be upgraded, as the changes were too big:

  • Tabs: the library I used (react-tab-panel) was not maintained, and the repo has been removed. I switched to react-tabs, which offers fewer features and is simpler, but is well maintained.
  • Redux
    • Redux was not upgraded to the latest version, as that one uses the new React context and showed performance issues. I have to wait for the next release.
  • React-select
    • React-select went through a major rework, and the upgrade will need a separate task. Moreover, it did not feel needed.
  • React-dropzone
    • Major interface break as well; lower priority.
  • Material-UI
    • Material UI went through several major version changes, and upgrading would impact much of the UI.
    • I created a separate task for this.

New feature: Error boundaries

Coming with React 16 are improved error boundaries. This feature helps catch rendering errors in the React component tree and handle them gracefully by rendering a fallback when an error is caught.

I encapsulated all branches of the Spider UI into these new error boundaries, so that errors are caught and displayed nicely. This way, removing the source of the error (changing the record, or the timespan) clears it without requiring a full page refresh. The encapsulated branches are:

  • The bottom grid
  • The detail view
  • A detail view tab
  • The map
  • The timeline
  • The menu
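The mechanism behind these boundaries fits in a few lines. The lifecycle hooks below (getDerivedStateFromError, componentDidCatch) are React 16's real API; the Component stand-in and the fallback string are only there to keep this sketch self-contained. In the app, the class extends React.Component and render() returns the fallen-spider fallback JSX or this.props.children.

```javascript
class Component {} // stand-in for React.Component, to keep the sketch self-contained

class ErrorBoundary extends Component {
  constructor(props) {
    super();
    this.props = props;
    this.state = { hasError: false };
  }

  // Called when a child throws during rendering: switch to the fallback state.
  static getDerivedStateFromError() {
    return { hasError: true };
  }

  // Called after the error is caught: a good place to report it.
  componentDidCatch(error, info) {
    console.error('Rendering failed in', info && info.componentStack, error);
  }

  render() {
    return this.state.hasError ? 'fallen-spider-fallback' : this.props.children;
  }
}
```

Each UI branch is then wrapped in its own ErrorBoundary, so one failing branch cannot take down the whole page.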

When an error happens, instead of the expected content, you will see a poor fallen spider ;)

Here is a sample of an error on an HTTP request detail:

Browsers compatibility improved

· One min read

I corrected some JavaScript and CSS, and Spider UI has now been successfully tested on:

  • Chrome
  • Chromium
  • Firefox
  • Opera

I always used the latest versions, of course. There might be some unseen issues; please inform me if you find any.

Edge will not be supported, as it is too limited; and for Safari... I don't have an Apple laptop to test on ;-), sorry!

Improved search input autocompletion and syntax highlighting

· One min read

I did some reverse engineering on the ACE autocompletion and syntax highlighting library, and managed to improve Lucene autocompletion without changing the library :-) Man, I wish there was better documentation!

Improvements

  • Regular expressions are recognized in syntax highlighting
  • Numbers and Elasticsearch operators are recognized as such

  • Autocompletion no longer treats '.' as an identifier delimiter
    • So, no more strange behavior when selecting a sub-subproperty in the completion options

  • Autocompletion no longer proposes inserting identifiers inside values
    • I did not manage to handle all cases, but most are covered
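To give an idea of the highlighting part: ACE modes describe tokens as { token, regex } rules. The rules below are my own rough approximations of the improvements (regexes, numbers, operators), not Spider's actual rule set.

```javascript
// Illustrative token rules in the shape ACE highlight modes use.
// The regexes are approximations, not Spider's real ones.

const luceneExtraRules = [
  // Regular expressions between slashes, e.g. name:/jo.n/
  { token: 'string.regexp', regex: '\\/[^/]*\\/' },
  // Numbers (integers and decimals)
  { token: 'constant.numeric', regex: '\\b\\d+(\\.\\d+)?\\b' },
  // Elasticsearch / Lucene operators
  { token: 'keyword.operator', regex: '\\b(AND|OR|NOT|TO)\\b' },
];
```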

Monitoring - Tracking system load

· One min read

Two new charts have been added to the monitoring dashboards:

Redis load/s

Tracks the load on the Redis instances in the cluster.

Redis is fast: we can see up to 12k requests/s on one instance. Compared to CPU usage... it is almost 1k req/s per 1% of CPU. Impressive!!

Services load/s

Tracks the services load in the cluster.

Here there is no direct relation to CPU usage, as the processing done by each service can be quite different.

Spider is cookie free :)

· One min read

Thanks to the latest knowledge I gained of JavaScript and HTML5, I managed to remove the need for cookies in Spider UI!!

So, Spider is low fat now: no more cookies needed; the JWT token is passed explicitly in all communications with the server.
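The cookie-free pattern boils down to keeping the token on the client (in memory or sessionStorage) and attaching it to every request yourself. A minimal sketch, assuming hypothetical names (authHeaders, authFetch), not Spider's actual implementation:

```javascript
// Attach the JWT explicitly instead of relying on a cookie.

function authHeaders(token, extra = {}) {
  return { ...extra, Authorization: `Bearer ${token}` };
}

function authFetch(url, token, options = {}) {
  return fetch(url, {
    ...options,
    headers: authHeaders(token, options.headers),
  });
}
```

A nice side effect of this pattern is that CSRF protection becomes a non-issue: nothing is sent automatically by the browser.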

API updates

· 2 min read

Hi,

Over the last couple of weeks, I took over the - not so interesting - mission of updating the API specification. It is available here: https://spider-analyzer.io/home/api/

Changes:

  • Specification updated to Open API specification 3.0
  • New objects diagram
  • All APIs are described with:
    • Structure of inputs
    • Structure of responses / resources
    • Examples
    • Quick start guide
  • I improved the usability of the API with:
    • Date parameters available on top of timestamp parameters: startDate can be used instead of startTime
    • Added hostnames in TcpSessions and HttpCommunications resources for easier search on FQDN
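The date-parameter improvement can be illustrated with a tiny query builder. startDate and startTime are the real parameter names from the spec above; the helper itself (toQueryParam) is only an illustration.

```javascript
// Illustrative: the API accepts a human-friendly ISO date (startDate)
// on top of the original timestamp parameter (startTime).

function toQueryParam({ startDate, startTime }) {
  if (startDate) return `startDate=${encodeURIComponent(startDate)}`;
  return `startTime=${startTime}`;
}
```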

During the API review, I found some security issues... All are fixed now ;-) While doing so, I added:

  • The possibility to purge shared Whisperers, for those who have the right (before, it was free for all ;))
  • A protection against brute-force logins: after 5 failed attempts, the account is blocked for some time
  • More security around access to customer details
  • The name of the customer owning the Whisperer in the Whisperer details view
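The login protection is a classic attempt counter with a cool-down. A minimal in-memory sketch of the idea (a real deployment would persist the attempts, e.g. in Redis; the 15-minute cool-down is a hypothetical value):

```javascript
// Block an account after 5 failed logins, for a cool-down period.

const MAX_ATTEMPTS = 5;
const BLOCK_MS = 15 * 60 * 1000; // hypothetical cool-down duration
const attempts = new Map(); // login -> { count, blockedUntil }

function isBlocked(login, now = Date.now()) {
  const entry = attempts.get(login);
  return Boolean(entry && entry.blockedUntil && now < entry.blockedUntil);
}

function recordFailure(login, now = Date.now()) {
  const entry = attempts.get(login) || { count: 0, blockedUntil: 0 };
  entry.count += 1;
  if (entry.count >= MAX_ATTEMPTS) {
    entry.blockedUntil = now + BLOCK_MS;
  }
  attempts.set(login, entry);
}

function recordSuccess(login) {
  attempts.delete(login); // reset the counter on successful login
}
```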

Wilfried already tested it, and... was happy to integrate it in his test result checks ;-)

Cheers

Technical migrations... and how much Spider is fast now !!

· 2 min read

December has been a month of migrations for Spider. And how happy I am to have done them! Read below.

Migration path

  • From NGINX to Traefik on 6/12
  • From Node 7 to Node 8
  • From JavaScript generators to the async/await programming pattern
  • From eTag generation on the full resource to eTags based on id + lastUpdate date (CPU saving)
  • From Node 8 to Node 10 (actually done in January)

The result of all this?

  1. A division by 2 to 5 of microservices' CPU usage!
  2. A division by 4 to 6 of response times as seen from the Whisperers!!
  3. A division by 5 to 12 of internal response times!!!

This was amazing!

I could hardly believe it, but it's proven. The Node team had said async/await was faster than generators, and then Google improved async/await processing speed by 8x in the V8 version embedded in Node 10.

Examples

  • Processing of packets improved from a 484 ms 90th percentile for a 100-packet bulk request to... 69 ms!!
  • Patching TcpSessions improved from a 266 ms 90th percentile to 13 ms!!! Excuse me!
  • CPU usage of the parsing job improved from 172% to 80%, and for packets decoding, from 295% to... 55%!! Amazing :-)
  • Previously Spider needed 4 servers to process 2,500 packets/s; now... I could almost do it on two, and much, much faster! :)

Conclusion

Yes, Node.js is blazingly fast. This was true in callback mode, and now it is back for async/await! :-)

Figures

Source: Google spreadsheet

And Streetsmart?

And you know what? Streetsmart is still almost in the state from before all my migrations. Imagine if the migrations have the same effect on Streetsmart. It would be awesome! :-)

Well, that's part of my plan indeed!!

Change of API gateway / reverse proxy / ingress controller

· 2 min read

NGINX

Spider's internal cluster gateway was, until this week, NGINX. However, NGINX presented various issues in the current setup:

  • In order to absorb the scaling up and down of replicas, I was asking NGINX to resolve the IP of the services' VIP on every call. The DNS resolver had a cache of 30s.
  • The main issue is that NGINX cannot keep persistent connections to the upstreams in this setup.
  • This made NGINX create a new TCP socket for every request. Soon enough, when all TCP sockets were taken, response times increased by 1 or 2s while Linux recycled the sockets.
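For reference, the problematic pattern looks roughly like this (a sketch, not Spider's exact configuration): using a variable in proxy_pass forces a lookup through the resolver on each request, but it also bypasses the upstream block, so keepalive connections to the upstream cannot be used.

```nginx
resolver 127.0.0.11 valid=30s;    # Docker's embedded DNS, with the 30s cache

location /api/ {
    set $backend "http://my-service:3000";   # illustrative service name
    proxy_pass $backend;    # re-resolved per request, no upstream keepalive pool
}
```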

Change was needed!

Traefik

Traefik is more and more used as a gateway for Docker clusters (and others). It is designed to integrate with Docker and to keep itself updated with the cluster state.

So I switched Spider to Traefik this week. And the results are... astonishing!!

Although response times from the clients have not changed much, response times internal to the cluster have improved by 80%!!

Note: I only struggled with Traefik configuration for path rewriting. It has fewer options than NGINX in this area, and I had to add some custom rerouting inside the UIs themselves.
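With Traefik (v1 era), routing is declared as labels on the Docker services themselves, and path rewriting is done with matchers such as PathPrefixStrip. A sketch with illustrative names and ports, not Spider's actual compose file:

```yaml
my-service:
  image: my-service:latest
  labels:
    - "traefik.enable=true"
    - "traefik.port=3000"
    # Matches the prefix and strips it before forwarding --
    # the rewriting area where options are fewer than in NGINX.
    - "traefik.frontend.rule=PathPrefixStrip:/api/my-service"
```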