Similar to video compression, which uses key-frames and differences
between them: every N publishes we send a full report, but in between
we only send what has changed.
Fairly simple approach in the probe - hold on to the last full report,
and for the deltas remove anything that would be merged in from the
full report.
On the receiving side, the app already merges a set of reports
together to produce the final output for rendering, so provided N is
smaller than that set we don't need to do anything different.
Deltas don't need to represent nodes that have disappeared - an
earlier full report will have that node so it would be merged into the
final output anyway.
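
The key-frame/delta scheme above can be sketched as follows. This is a minimal illustration, not the probe's actual code: `Report` is a hypothetical, heavily simplified stand-in (node ID to key/value metadata), and `delta`, `merge` and `Publisher` are names invented for the example. The merge here simply lets the later report win, which is an assumption about the app-side behaviour.

```go
package main

import "fmt"

// Report is a hypothetical simplification: node ID -> metadata for that node.
type Report map[string]map[string]string

// delta keeps only what merging `full` would not already provide:
// nodes that are new, and keys whose values differ from the full report.
// Nodes that have disappeared are deliberately not represented.
func delta(full, current Report) Report {
	out := Report{}
	for id, node := range current {
		prev, seen := full[id]
		if !seen {
			out[id] = node // new node: send it whole
			continue
		}
		for k, v := range node {
			if old, ok := prev[k]; ok && old == v {
				continue // unchanged: the full report already supplies it
			}
			if out[id] == nil {
				out[id] = map[string]string{}
			}
			out[id][k] = v
		}
	}
	return out
}

// merge combines two reports; on conflict the value from b wins.
// (Assumed semantics for this sketch only.)
func merge(a, b Report) Report {
	out := Report{}
	for _, r := range []Report{a, b} {
		for id, node := range r {
			if out[id] == nil {
				out[id] = map[string]string{}
			}
			for k, v := range node {
				out[id][k] = v
			}
		}
	}
	return out
}

// Publisher sends a full report every n publishes and deltas in between.
type Publisher struct {
	n        int
	count    int
	lastFull Report
}

// Next returns the report to send and whether it is a full one.
// The caller must not mutate `current` afterwards, since it may be retained.
func (p *Publisher) Next(current Report) (Report, bool) {
	isFull := p.lastFull == nil || p.count%p.n == 0
	p.count++
	if isFull {
		p.lastFull = current
		return current, true
	}
	return delta(p.lastFull, current), false
}

func main() {
	p := &Publisher{n: 3}
	full, _ := p.Next(Report{"a": {"state": "up", "cpu": "1"}, "b": {"state": "up"}})
	d, _ := p.Next(Report{"a": {"state": "up", "cpu": "2"}, "c": {"state": "up"}})
	fmt.Println("delta:", d)
	fmt.Println("merged:", merge(full, d))
}
```

In this sketch node `b` is absent from the delta, yet it still appears in the merged result because the earlier full report carries it, which is the behaviour described above.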
This dependency makes it harder to see the structure of the program,
and sometimes complicates compilation.
Mostly just changing the source of strings that are already exported
from the report package. A few new strings have to be moved there,
plus the function `IsPauseImageName()`.
So we save space when writing out empty topologies.
Need to fix up `app_client_internal_test.go` to use Scope's
`test/reflect` package that understands empty==nil, so it no longer
needs the previous workaround.
Remove a similar workaround in `probe_internal_test.go` that isn't
necessary since it's already using that package.
Previously we would merge all reports in a 15-second window.
Now we use a 'quantum' of 3 seconds, similar to the single-user app.
E.g. a 30-node cluster will have 150 individual reports over 15
seconds, but the new code will merge 5 pre-merged reports plus 20-ish
very recent individual ones.
This limits the max heap size used for deserialising, since we only do
3 seconds at once per instance.
Individual reports are still put into the cache, but should get
displaced by the pre-merged ones under LRU.
We observe a slow increase in connections reported, and are unable to
find the root cause, so clear down the data every six hours and start
from a clean sheet.
Delay kernel events by up to 0.2ms, to reduce the chance the ebpf
reporter sends them out-of-order, and allow out-of-order events to
happen up to once a minute without giving up on the ebpf reporter.