65 posts tagged with "architecture"

Upgrade to Redis 6.2

April 22, 2021 · One min read

Just wanted to test :) I just upgraded Redis from 5.0 to 6.2...

Nothing to change except systemd loader
Performance is as fast as before (with no change of settings): 10 500 op/s for 7% CPU
Works like a charm

Figures: CPU, load, and processing time.

I'll let it run for some time, then I'll activate the new IO threads to check any improvement.

Later, I'll see about using the new ACL and TLS features ;)

Bundle BO modules with Webpack

April 14, 2021 · 2 min read

To avoid sharing the source code in the Docker images, I decided to bundle the BO source code with Webpack.

Dealing with native libraries made this much more complex. In the end, I decided to keep the node modules outside the bundle, as well as static files such as HTML or configuration files that are loaded with fs.readFile. The Lua scripts for Redis, however, are embedded in the bundle.

Here is the webpack config file needed:

const path = require('path');
const nodeExternals = require('webpack-node-externals');

module.exports = {
    entry: './index.js',
    mode: 'production',
    output: {
        path: path.resolve(__dirname, 'dist'),
        filename: 'index.js'
    },
    externalsPresets: { node: true }, // in order to ignore built-in modules like path, fs, etc.
    externals: [nodeExternals()], // in order to ignore all modules in node_modules folder
    node: {
        __dirname: false,
    },
    module: {
        rules: [{
            test: /\.lua/,
            type: 'asset/source'
        }]
    }
};

Then I use a multi-stage build in the Dockerfile to:

Build the bundle
Copy all necessary files
Then create the simplest image

# Build bundle

FROM node:12-alpine AS build_bundle

RUN apk --no-cache add --virtual build-dependencies python build-base

WORKDIR /app
COPY . /app

ARG registry
RUN npm install --quiet --no-progress --registry=${registry:-https://registry.npmjs.org} \
    && npm run build \
    && npm cache clean --force \
    && npm prune --production

# Build server

FROM node:12-alpine AS build

WORKDIR /app
COPY --from=build_bundle /app/dist /app/
COPY --from=build_bundle /app/node_modules /app/node_modules
COPY docker-entrypoint.sh LICENSE package.json /app/
COPY resources /app/resources

# Final

FROM node:12-alpine

WORKDIR /app
COPY --from=build /app /app/

ARG port
EXPOSE $port
USER 1000:1000
ENTRYPOINT ["./docker-entrypoint.sh"]
CMD ["index.js"]

Login UI upgraded to commons

April 14, 2021 · One min read

A new background for a better look also :)

Monitoring UI update with common components

April 14, 2021 · One min read

Spider Monitoring UI has been upgraded with common components from Network UI, and common application architecture :).

Now Excel export is available on all grids !

Upgraded Impersonate feature. Such a Joy to use for testing !

April 14, 2021 · One min read

While implementing the Team feature, I often needed to switch users and user rights to check all the access rights linked to Teams.

That was getting cumbersome, so I decided to improve the impersonation feature instead!

Now, a new API lets you impersonate a user. It generates a new token from your own, containing:

the impersonated user's identification
their whisperers access rights
their own users rights (optional)

In the UI, select whether you want the selected user's rights, and the UI and services will behave as if this user were calling! However, all audit fields will still be filled with your own user. No cheating ;)
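To make the idea concrete, here is a minimal sketch of how the content of such an impersonation token could be derived from the caller's own. This is not Spider's actual implementation: the function name and every field name (actor, user, whisperers, usersRights) are assumptions made for illustration, and so is the choice to keep the caller's own users rights when the option is not selected.

```javascript
// Hypothetical sketch: builds the payload of an impersonation token.
// Field names are illustrative, not Spider's real token format.
function buildImpersonationClaims(ownClaims, target, options = {}) {
    const withUsersRights = options.withUsersRights === true;
    return {
        // The real caller is kept, so audit fields can still record who acted.
        actor: ownClaims.user,
        // Identity and whisperers rights come from the impersonated user.
        user: target.id,
        whisperers: target.whisperers,
        // Users rights are taken from the target only when requested
        // (assumption: otherwise the caller keeps their own).
        usersRights: withUsersRights ? target.usersRights : ownClaims.usersRights,
    };
}

const me = { user: 'admin', whisperers: ['w1', 'w2'], usersRights: ['manage'] };
const other = { id: 'alice', whisperers: ['w1'], usersRights: [] };
console.log(buildImpersonationClaims(me, other, { withUsersRights: true }));
```

Keeping the caller in a separate field is what lets the services behave as the impersonated user while audit fields still name the real one.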

A shortcut option has also been added to get your own rights back quickly when there are too many users.

It is damn helpful for testing: changing rights and user takes 2 mouse clicks!

More performance with a windowing Grid :)

January 30, 2021 · 2 min read

Rendering the grid was getting slow once hundreds of rows were loaded. It affected window resizing, opening details and resizing columns, as all these actions trigger a grid rendering.

After some thought on how to perform better, and a walkthrough of various windowing libraries, I decided to implement windowing in the grid myself: it would be less rework than integrating a windowing library.

First trial: partial success after 2h30 work

The first trial was to still render all lines, but to render empty lines for those outside the visible range. It proved I was right: windowing greatly improved performance. But there was one drawback:

Rendering too many empty lines when scrolling introduced visible lag.

I tweaked it a bit and managed to get acceptable performance by triggering the rendering only after a certain range of scrolling... But the lag was still noticeable.

Second trial: solution ready with 30min more work

The night brought advice, as usual. I moved the 'not rendering' one step up, to:

Render one big empty line above the visible range, with the cumulative height of all non-rendered rows above
Render one big empty line below the visible range, with the cumulative height of all non-rendered rows below
Render only the visible range (plus a bit more) in between
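The three steps above can be sketched as a small pure function. This is only an illustration under assumptions, not the grid's actual code: the name computeWindow, its parameters, and the default of 5 extra rows on each side are all made up for the example.

```javascript
// Hypothetical sketch: compute which rows to render and the heights
// of the two big empty lines standing in for the non-rendered rows.
function computeWindow(rowHeights, scrollTop, viewportHeight, overscan = 5) {
    let offset = 0;
    let first = 0;
    // Skip rows that end above the top of the viewport.
    while (first < rowHeights.length && offset + rowHeights[first] <= scrollTop) {
        offset += rowHeights[first];
        first++;
    }
    let last = first;
    let bottom = offset;
    // Extend until the bottom of the viewport is covered.
    while (last < rowHeights.length && bottom < scrollTop + viewportHeight) {
        bottom += rowHeights[last];
        last++;
    }
    // Render "a bit more" than the visible range on each side.
    const start = Math.max(0, first - overscan);
    const end = Math.min(rowHeights.length, last + overscan); // exclusive
    const sum = (from, to) => rowHeights.slice(from, to).reduce((a, b) => a + b, 0);
    return {
        start,
        end,
        topSpacerHeight: sum(0, start),
        bottomSpacerHeight: sum(end, rowHeights.length),
    };
}

// 1000 rows of 20px, scrolled down 2000px in a 400px high viewport.
console.log(computeWindow(new Array(1000).fill(20), 2000, 400));
```

Since the two spacers carry the height of everything that is skipped, the total scroll height stays the same as if all rows were rendered, so the scrollbar behaves normally.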

And the result is great :)

You may load thousands of records (if you have the stamina): the grid behavior and speed stay constant.

Resizing columns is fluid
Opening details is almost immediate
Resizing grid is immediate
And scrolling is fluid. If you pay attention, you'll notice a small lag. But, honestly, it does not matter anymore :)

Example with a grid with 1000 rows loaded

Network view UI internal upgrade

December 7, 2020 · 2 min read

Following the work on the Monitoring UI, I upgraded the main UI.

2020-08 - Restructure of code for better maintenance.
2020-11 - Migration finished to latest versions of React, Redux, Material UI v4, and migration to Hooks.
2020-11 - Refactor of all code to have all display and processing related to a protocol in a single folder, loaded by dependency injection in the core of the application: Components, tabs, UI parts, actions, sagas...

Result:

Components are grouped into
- base components, which can be shared across UIs and are agnostic to this specific UI
- specific components, which are coupled to the state structure or really business-specific
All application-specific work is located in the screens folder
- includes the app initialization and state recovery
- includes the screen structure and coupling to the theme
- includes containers to couple base components to the state
- includes a collections folder in which each collection includes the specifics for a resource:
  - details view
  - diff view
  - download package builder
  - upload parser
  - filters definition
  - grid and excel columns definition
  - map nodes and links definition
  - sequence diagram links building rules
  - stats definition
  - timeline configurations
  - settings tab
  - whisperer parsing config options
  - custom sagas if needed

The structure is much better, and so is troubleshooting :) I refactored / rewrote much of the code that dated from my first months with React / Redux. It could still be improved, but it is much better structured and architected! :)

HOW MUCH I LEARNED building Spider !!

Next step: extract the protocols folders into plug-ins. But this is another story!

Alerting

August 10, 2020 · One min read

New independent alert service

After much study of Prometheus alert manager and some other solutions, I decided it would be faster and cheaper in resources and time to implement a basic alerting service to start with :) Especially the security integration...

So I did it. A new service (45MB - 0% CPU) is checking various metrics from the monitoring and sending mail to administrators in case of issues. And it sends again after some time if the problem is not solved.

Complex rules can be written... as it is code ;-) Took a couple of days, and works like a charm! Of course, it is - itself - monitored in the monitoring, & integrated in the automated setup ;)

Implemented probes

ES healthcheck
Redis healthcheck
Change in infrastructure (increase or removal of nodes)
Low ES free space
No new status (Whisperers down)
Too many logs over last minute

Of course, alerts are also sent when a probe cannot run.
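As an illustration of the two behaviors described above (sending again after some time, and alerting when a probe itself fails), here is a minimal sketch. It is not the service's real code: runProbe, nextAlertState, the probe result shape and the resend delay are all assumptions made for the example.

```javascript
// Hypothetical sketch: run a probe; a probe that throws is reported as a failure.
async function runProbe(probe) {
    try {
        return await probe(); // assumed shape: { ok: boolean, message: string }
    } catch (err) {
        return { ok: false, message: `probe could not run: ${err.message}` };
    }
}

// Decide whether a mail must be sent, given the previous state of this probe.
// state.lastSentAt is null when no alert is currently open.
function nextAlertState(state, result, now, resendAfterMs) {
    if (result.ok) {
        // Problem solved (or never happened): reset so the next failure alerts at once.
        return { notify: false, state: { lastSentAt: null } };
    }
    // First failure, or still failing and the resend delay has elapsed.
    const due = state.lastSentAt === null || now - state.lastSentAt >= resendAfterMs;
    return { notify: due, state: { lastSentAt: due ? now : state.lastSentAt } };
}
```

A scheduler would call runProbe for each probe at a fixed interval, feed the result to nextAlertState, and send the mail when notify is true.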

Monitoring GUI upgrade

August 10, 2020 · One min read

I started a BIG and LONG task: removing the technical debt on the UI side:

Libraries updates
Moving React to function components and hooks whenever possible
Refactoring
Material UI upgrade + CSS-in-JS + theming approach

First application to be reworked: Monitoring UI. It let me start with a full application and build common components with NetworkView, without struggling with the complexity of the latter.

Timeline component was refactored too (https://www.npmjs.com/package/@spider-analyzer/timeline), which leads to a much easier maintenance :)

As the result is not very visible (apart from in the code), I took the opportunity to introduce one feature while doing it: dark mode. It was often requested by users, and MUI theming made it easy :)

Here it is:

Dark mode can be activated in the settings.

... Now, let's get this work to NetworkView UI !

Using Elasticsearch asynchronous searches

May 24, 2020 · 4 min read

On May 15th, the Elastic team released Elasticsearch 7.7, which introduces asynchronous searches: a way to get partial results of a search request and to avoid waiting indefinitely for a complete result.

I saw there a way to improve the user experience while loading Spider, and to avoid the last timeouts that still sometimes occur when generating the Timeline or the Network map over many days.

So, here it is: 9 days after the ES 7.7 release, the implementation of async searches in Spider is stable (I hope ;) ) and efficient!

Normal search

Async search

I stumbled a bit at the beginning to find the right way to use this:

When to use partial results

Loading partial results while data was already present meant resetting the existing map or timeline. The result was ugly and disturbing. I decided to limit partial loads to the initial load, whisperer switch, view switch... In other words, to the cases where the result set is empty before searching.

API integration

Although ES does not require clients to resend the query parameters to get the async search follow-up, the Spider API does.

Indeed, the final async result may present a 'next' link to get the next page. This link is built as hypermedia and includes everything necessary to proceed easily to the next page.

As Spider is stateless, the client is required to send all request parameters with every async follow-up, so that Spider can build this 'next' hypermedia link. Spider makes this easy to comply with by providing another hypermedia link, with all parameters included, to call the async follow-up.

Client usage

I also tested several solutions to chain the calls in the client (UI), to finally find that the Elastic team made it really easy:

You may define a timeout (wait_for_completion_timeout) to get the first partial results in the first query.
- If the results are available before the timeout, you get them straight away, as with a normal search.
- Otherwise, you get a result with partial (or no) data.
- On further calls, you may get the progress of the search straight away... or also provide a timeout

The beauty of this is that there is no drawback in using async searches. If you use timeouts, you always get the results as soon as they are available. :)

At first, I implemented it so:

Async search + timeout
Wait 250ms after partial results
Call follow-up
Wait 250ms
...

But with this method, you may get the results later than they are available, which is bad for a real-time UI like Spider.

By using timeouts in a clever way, you combine partial results and an ASAP response:

Async search + timeout
Call followup + timeout (no wait)

With this usage, as soon as the query is over, ES gives you the results. And you can offer an incremental loading experience to your users.
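The second approach can be sketched as a simple loop. This is a sketch under assumptions, not Spider's client code: it targets the Elasticsearch `_async_search` endpoints directly rather than the Spider API (which also expects the query parameters on follow-ups), and asyncSearch, the injected http function and onPartial are hypothetical names. The id, is_running and response fields and the wait_for_completion_timeout parameter are those of the ES async search API.

```javascript
// Hypothetical sketch: chain async search calls with timeouts and no client-side wait.
// `http(method, path, body)` is an assumed helper that resolves to the parsed JSON body.
async function asyncSearch(http, index, query, onPartial, timeout = '1s') {
    // First call: submit the search and wait up to `timeout` for the results.
    let res = await http(
        'POST',
        `/${index}/_async_search?wait_for_completion_timeout=${timeout}`,
        query
    );
    while (res.is_running) {
        // Partial (or empty) data: let the UI display what is already there.
        onPartial(res.response);
        // Follow-up without sleeping: ES answers as soon as the search
        // completes, or after `timeout` with the updated partial results.
        res = await http(
            'GET',
            `/_async_search/${res.id}?wait_for_completion_timeout=${timeout}`
        );
    }
    return res.response; // complete results
}
```

Because the waiting happens on the server side, the final results come back as soon as they exist, while every timeout in between is a chance to refresh the display.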

Implementation results

I implemented async search on UI for the following 'long' queries:

Timeline loading
Timeline quality loading
Network map loading
DNS records loading

This applies to all 3 views (HTTP, TCP, Packet), with 1s timeouts.

The effect is visible only when loading the full period with no filters. Indeed, other queries are way below 1s ;-) On automatic refresh, you won't see the change. The queries are not 'magically' faster: doing time-based aggregation on 30 million communications still takes... 4s :-O

As it is still new, I may have missed some things, so you can deactivate it in the Settings panel to get back to normal search:

Share your feelings with me! :)
