The rendering code checks whether endpoint IPs are part of
cluster-local networks. Due to the prevalence of endpoints -
medium-sized reports can contain many thousands of them - this is
performance critical. Alas, the existing code performs the check via a
linear scan of a list of networks. That is slow when there are more
than a few networks, which is the case with k8s, since there the
probes register service IPs as local /32 networks.
Here we change the representation of the set of networks to a prefix
tree (aka trie), which is well-suited for IP network membership checks
since networks are in fact bitstring prefixes.
The specific representation is a crit-bit tree, but that choice was
purely based on implementation convenience - the chosen library is the
only one I could find that directly supports IP networks.
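
To illustrate why a prefix tree helps here, below is a minimal sketch of a
plain binary trie over IPv4 address bits. It is not the crit-bit library
mentioned above and not the code in this change; the names `Networks`,
`Add` and `Contains` are hypothetical, and IPv6 is left out for brevity.
A lookup walks at most 32 nodes, no matter how many networks are stored.

```go
package main

import (
	"fmt"
	"net/netip"
)

// node is one position in the trie; children are indexed by the next bit.
type node struct {
	child    [2]*node
	terminal bool // true if a stored network prefix ends at this node
}

// Networks is a set of IPv4 networks stored as a binary prefix trie.
// (Hypothetical type, for illustration only.)
type Networks struct{ root node }

// bit returns the i-th most significant bit of a 4-byte address.
func bit(a [4]byte, i int) int {
	return int(a[i/8]>>(7-uint(i%8))) & 1
}

// Add inserts an IPv4 network such as "10.0.0.0/8" or "10.1.2.3/32".
func (n *Networks) Add(cidr string) error {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return err
	}
	if !p.Addr().Is4() {
		return fmt.Errorf("only IPv4 is handled in this sketch: %s", cidr)
	}
	a := p.Masked().Addr().As4()
	cur := &n.root
	for i := 0; i < p.Bits(); i++ {
		b := bit(a, i)
		if cur.child[b] == nil {
			cur.child[b] = &node{}
		}
		cur = cur.child[b]
	}
	cur.terminal = true
	return nil
}

// Contains reports whether ip falls inside any stored network.
func (n *Networks) Contains(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil || !addr.Is4() {
		return false
	}
	a := addr.As4()
	cur := &n.root
	for i := 0; i < 32; i++ {
		if cur.terminal {
			return true // a shorter prefix already covers this address
		}
		cur = cur.child[bit(a, i)]
		if cur == nil {
			return false
		}
	}
	return cur.terminal
}

func main() {
	var local Networks
	for _, c := range []string{"10.0.0.0/8", "192.168.1.7/32"} {
		if err := local.Add(c); err != nil {
			panic(err)
		}
	}
	fmt.Println(local.Contains("10.32.0.5"))   // true
	fmt.Println(local.Contains("192.168.1.7")) // true
	fmt.Println(local.Contains("192.168.1.8")) // false
}
```

With hundreds of /32 entries, a linear scan compares each endpoint against
every network, whereas the walk above is bounded by the address length.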
The rendering code checks whether endpoint IPs are part of
cluster-local networks. Due to the prevalence of endpoints -
medium-sized reports can contain many thousands of them - this is
performance critical. Alas, the existing code performs the check via a
linear scan of a list of networks. That is slow when there are more
than a few networks. Unfortunately, in some common k8s network setups,
e.g. on AWS, a cluster can contain hundreds of networks, due to /32
networks derived from interfaces with multiple IPs.
Here we change the representation of the set of networks to a prefix
tree (aka trie), which is well-suited for IP network membership checks
since networks are in fact bitstring prefixes.
The specific representation is a crit-bit tree, but that choice was
purely based on implementation convenience - the chosen library is the
only one I could find that directly supports IP networks.
...by removing them. It was a ridiculous amount of contorted code to
test some utterly trivial functionality that is largely provided by
the golang stdlib.
* Node details fetching reports at proper timestamp.
* Corrected all the relevant timestamps in the UI.
* Renamed some state variables.
* Time travel works for topologies list.
* Added a whole screen overlay for time travel.
* Polished the backend.
* Make time travel work also with the Resource View.
* Fixed the jest tests.
* Fixed the empty view message for resource view.
* Some naming polishing.
* Addressed the comments.
Pins the d3-transition and d3-drag dependencies, which were previously
pulled in as transitive dependencies of d3-zoom at version 1.1.0 each.
That broke the zoom feature for `npm start`.
Fixes #2545
* Hacky working prototype.
* Operate with time.Duration offset instead of fixed timestamp.
* Polished the backend code.
* Made a nicer UI component.
* Small refactorings of the websockets code.
* Fixed the backend tests.
* Better websocketing and smoother transitions.
* Small styling refactoring.
* Detecting empty topologies.
* Improved error messaging.
* Addressed some of David's comments.
* Moved nodesDeltaBuffer to a global state to fix the paused status rendering bug.
* Small styling changes.
* Changed the websocket global state variables a bit.
* Polishing & refactoring.
* More polishing.
* Final refactoring.
* Addressed a couple of bugs.
* Hidden the timeline control behind Cloud context and a feature flag.
* Addressed most of @davkal's comments.
* Added mixpanel tracking.
the information is constant and already present in the id, so we can
extract it from there.
That reduces the report size and improves report encoding/decoding
performance. It should also reduce memory usage and improve report
merging performance.
NB: Probes with this change are incompatible with old apps.