When the scope-app restarts, it no longer has a
reference to the previous node set. Therefore,
the delta update adds *all* nodes but does not
remove legacy ones.
`reset==true` tells the frontend to start fresh.
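A minimal sketch of how a consumer could honour the flag, written in Go for illustration only. The `NodesDelta` type, its field names, and `Apply` are hypothetical and are not taken from the actual frontend code:

```go
package main

import "fmt"

// NodesDelta is a hypothetical shape for a delta update.
// Only the Reset flag corresponds to something named in this change.
type NodesDelta struct {
	Add    []string
	Remove []string
	Reset  bool
}

// Apply merges a delta into the current node set. When Reset is true the
// previous set is discarded first, so nodes the restarted app no longer
// knows about cannot linger.
func Apply(current map[string]struct{}, d NodesDelta) map[string]struct{} {
	if current == nil || d.Reset {
		current = make(map[string]struct{})
	}
	for _, id := range d.Remove {
		delete(current, id)
	}
	for _, id := range d.Add {
		current[id] = struct{}{}
	}
	return current
}

func main() {
	nodes := map[string]struct{}{"legacy": {}}
	nodes = Apply(nodes, NodesDelta{Add: []string{"a", "b"}, Reset: true})
	fmt.Println(len(nodes)) // "legacy" is gone; only "a" and "b" remain
}
```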
Fixes #2708
If the scope-app API unexpectedly restarts, it has no report
at hand (until it gets one from the probe) and sends node
count 0 to the frontend for all topologies. Once the report
arrives, it will send the proper count.
As a result, the frontend hid Processes for a short time until the
node count recovered. This moved the topology selection to the
always-visible Containers (hide_if_empty == false) while keeping
the graph as is.
Once the node count recovered, Processes came back but the
selection was still at Containers.
We now keep the selected topology visible at all times, even if
the API returns a node count of 0. This recovers nicely when
the correct node counts come in. If the user selects a different
topology while the count is still 0 and a backend response arrives,
the empty topology disappears.
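The visibility rule can be sketched as follows. This is an illustration in Go, not the frontend code itself; the `Topology` struct and the `Visible` function are assumed names, and only `hide_if_empty` comes from the description above:

```go
package main

import "fmt"

// Topology is a hypothetical, trimmed-down view of a topology entry.
type Topology struct {
	ID          string
	NodeCount   int
	HideIfEmpty bool // mirrors hide_if_empty
}

// Visible returns the topologies that should be shown. An empty topology
// with HideIfEmpty set is hidden unless it is the currently selected one.
func Visible(topologies []Topology, selectedID string) []Topology {
	var out []Topology
	for _, t := range topologies {
		if t.HideIfEmpty && t.NodeCount == 0 && t.ID != selectedID {
			continue
		}
		out = append(out, t)
	}
	return out
}

func main() {
	all := []Topology{
		{ID: "processes", NodeCount: 0, HideIfEmpty: true},
		{ID: "containers", NodeCount: 0, HideIfEmpty: false},
	}
	fmt.Println(len(Visible(all, "processes")))  // selected, so it stays
	fmt.Println(len(Visible(all, "containers"))) // not selected, so it is hidden
}
```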
Fixes #2646
...when initialising eBPF-based connection tracking.
Previously we were ignoring all eBPF events until we had gathered the
existing connections. That means we could a) miss connections created
during the gathering, and b) fail to forget connections that got
closed during the gathering.
The fix comprises the following changes:
1. pay attention to eBPF events immediately. That way we do not
miss anything.
2. remember connections for which we received a Close event during the
initialisation phase, and subsequently drop gathered existing
connections that match these. That way we do not erroneously consider
a gathered connection as open when it got closed since the gathering.
3. drop gathered existing connections which match connections detected
through eBPF events. The latter typically have more, and more current,
metadata. In particular, PIDs can be missing from the former.
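The reconciliation in steps 2 and 3 can be sketched roughly as below. All type and function names (`ConnKey`, `Conn`, `Reconcile`) are hypothetical and chosen for illustration; the real tracker's data structures may differ:

```go
package main

import "fmt"

// ConnKey identifies a connection by its four-tuple (assumed representation).
type ConnKey struct {
	SrcAddr string
	SrcPort uint16
	DstAddr string
	DstPort uint16
}

// Conn is a connection plus whatever metadata is known; PID 0 means unknown.
type Conn struct {
	Key ConnKey
	PID int
}

// Reconcile merges connections gathered at start-up with those learned from
// eBPF events. Gathered connections are dropped if they were closed during
// initialisation (step 2) or if an event-derived entry already exists (step 3).
func Reconcile(gathered []Conn, closedDuringInit map[ConnKey]struct{}, fromEvents map[ConnKey]Conn) map[ConnKey]Conn {
	open := make(map[ConnKey]Conn, len(fromEvents)+len(gathered))
	for k, c := range fromEvents {
		open[k] = c
	}
	for _, c := range gathered {
		if _, closed := closedDuringInit[c.Key]; closed {
			continue
		}
		if _, seen := open[c.Key]; seen {
			continue // keep the existing entry, which has better metadata
		}
		open[c.Key] = c
	}
	return open
}

func main() {
	k1 := ConnKey{"10.0.0.1", 1000, "10.0.0.2", 80}
	k2 := ConnKey{"10.0.0.1", 1001, "10.0.0.2", 80}
	k3 := ConnKey{"10.0.0.1", 1002, "10.0.0.2", 80}
	gathered := []Conn{{Key: k1}, {Key: k2}, {Key: k3}}
	closed := map[ConnKey]struct{}{k1: {}}
	events := map[ConnKey]Conn{k2: {Key: k2, PID: 7}}
	fmt.Println(len(Reconcile(gathered, closed, events)))
}
```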
Fixes #2689.
Fixes #2700.
Without synchronisation, the isDead() call might return a stale value,
delaying deadness detection potentially indefinitely.
Without the guards / idempotence in .stop(), invoking stop() more than
once could cause a panic, since tracer.Stop() closes a channel, and
closing an already-closed channel panics. Multiple stop() invocations
are rare, but not impossible.
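One way to get both properties is a mutex-guarded flag, sketched below. The `Tracker` type and its fields are assumptions made for this example; only the idea of a synchronised deadness check and an idempotent stop comes from the text above:

```go
package main

import (
	"fmt"
	"sync"
)

// Tracker is a hypothetical stand-in for the eBPF tracker.
type Tracker struct {
	mu   sync.Mutex
	dead bool
	done chan struct{} // stands in for the channel closed by the tracer
}

func NewTracker() *Tracker {
	return &Tracker{done: make(chan struct{})}
}

// IsDead reads the flag under the lock, so callers on other goroutines
// are guaranteed to observe a preceding Stop.
func (t *Tracker) IsDead() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.dead
}

// Stop is idempotent: the channel is closed at most once, because a
// second close on the same channel would panic.
func (t *Tracker) Stop() {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.dead {
		return
	}
	t.dead = true
	close(t.done)
}

func main() {
	t := NewTracker()
	t.Stop()
	t.Stop() // safe: second call is a no-op
	fmt.Println(t.IsDead())
}
```

`sync.Once` would also make the close idempotent, but the mutex additionally covers the read in `IsDead`.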
When we got an fd install event but the pid was dead by the time we
processed it, we would fail to remove the watcher for that pid from
the fdinstall_pids table.
This is a minor, and bounded, leak, since the table only contains pids
that were alive when we initialized ebpf. And this change only plugs
that leak very partially, since we will never remove pids that die
while sitting in accept().
The package version is irrelevant for the build process
and is not read anywhere.
The package is not published and causes confusion if the
bump is forgotten.
We defer starting the ebpf tracer until we've set the global var which
is referenced by the callback functions. Previously the var could be
unset when the callbacks are invoked, resulting in a segfault.
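The ordering can be illustrated with the sketch below. Everything here (`fakeTracer`, `current`, `handleEvent`, `NewTracker`) is a hypothetical stand-in, with a fake tracer that delivers an event as soon as it is started:

```go
package main

import "fmt"

// fakeTracer stands in for the real eBPF tracer; it invokes its callback
// as soon as Start is called, which is the worst case for initialisation.
type fakeTracer struct {
	cb func(event string)
}

func (f *fakeTracer) Start() { f.cb("connect") }

// Tracker collects events delivered through the package-level callback.
type Tracker struct {
	events []string
	tracer *fakeTracer
}

// current is the global referenced by the callback.
var current *Tracker

// handleEvent dereferences the global, so it must be set before any
// event can arrive.
func handleEvent(event string) {
	current.events = append(current.events, event)
}

// NewTracker assigns the global first and only then starts the tracer.
// Swapping the last two statements would dereference a nil pointer.
func NewTracker() *Tracker {
	t := &Tracker{tracer: &fakeTracer{cb: handleEvent}}
	current = t
	t.tracer.Start()
	return t
}

func main() {
	t := NewTracker()
	fmt.Println(len(t.events))
}
```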
Fixes #2687.
* Initial top level control.
* Added the jump buttons.
* Tiny styling adjustments.
* Massive renaming.
* Pause info.
* Added slider marks.
* Improved messaging.
* Freeze all updates when paused.
* Repositioned for Configure button.
* Improved the flow.
* Working browsing through slider.
* Small styling.
* Hide time travel button behind the feature flag.
* Fixed actions.
* Elements positioning corner cases.
* Removed nodes delta buffering code.
* Fixed the flow.
* Fixed almost all API call cases.
* Final touches.
* Fixed the tests.
* Fix resource view updates when time travelling.
* Added some comments.
* Addressed some of @foot's comments.
ProcNet.Next does not allocate Connection structs, for efficiency.
Instead it always returns a *Connection pointing to the same instance.
As a result, any mutations by the caller to struct elements that
aren't actually set by ProcNet.Next, in particular Connection.Proc,
are carried across to subsequent calls.
This had hilarious consequences: connections referencing an inode
which we hadn't come across during proc walking would be associated
with the process corresponding to the last successfully looked up
inode.
The fix is to clear out the garbage left over from previous calls.
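The pitfall and the fix can be reproduced with a small sketch. The `Reader`, `Connection`, and `Proc` types below are simplified stand-ins, not the actual ProcNet code; the key point is that `Next` hands back the same instance on every call:

```go
package main

import "fmt"

type Proc struct {
	PID  int
	Name string
}

type Connection struct {
	Inode uint64
	Proc  Proc // not set by Next; filled in by the caller
}

// Reader mimics an iterator that reuses one Connection to avoid allocations.
type Reader struct {
	inodes []uint64
	pos    int
	c      Connection
}

// Next returns a pointer to the same Connection every time, or nil at the end.
func (r *Reader) Next() *Connection {
	if r.pos >= len(r.inodes) {
		return nil
	}
	r.c.Inode = r.inodes[r.pos]
	r.pos++
	return &r.c
}

// Walk associates each connection with a process by inode. Clearing c.Proc
// first prevents the previous iteration's process from leaking through
// when the inode lookup fails.
func Walk(r *Reader, procs map[uint64]Proc) map[uint64]Proc {
	out := make(map[uint64]Proc)
	for c := r.Next(); c != nil; c = r.Next() {
		c.Proc = Proc{} // clear leftovers from the previous call
		if p, ok := procs[c.Inode]; ok {
			c.Proc = p
		}
		out[c.Inode] = c.Proc
	}
	return out
}

func main() {
	r := &Reader{inodes: []uint64{1, 2}}
	procs := map[uint64]Proc{1: {PID: 42, Name: "nginx"}}
	fmt.Println(Walk(r, procs)[2].PID) // inode 2 is unknown, so no PID is attached
}
```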
Fixes #2638.
The figure is inaccurate since it counts containers across all
hosts. Getting the count correct is non-trivial, so it's better to not
show the figure at all.
NB: the count still shows up on mouse-over of the link, but that is
defensible and not (very) confusing since the link represents the
image, not the image on a particular host, and it's the same count
that shows up as the minor label in the container images view.
Fixes #2681.