Also, 1 packet may be counted in N topologies, so you can't rely on the
sum of all packet counts across topologies having any relation to the
sampling data.
Another implicit invariant in the data model is that edges are always of the
form (local -> remote). That is, the source of an edge must always be a node
that originates from within Scope's domain of visibility. This was evident by
the presence of ingress and egress fields in edge/aggregate metadata.
When building the sniffer, I accidentally and incorrectly violated this
invariant, by constructing distinct edges for (local -> remote) and (remote ->
local), and collapsing ingress and egress byte counts to a single scalar. I
experienced a variety of subtle undefined behavior as a result. See #339.
This change reverts to the old, correct methodology. Consequently the sniffer
needs to be able to find out which side of the sniffed packet is local v.
remote, and to do that it needs access to local networks. I moved the
discovery from the probe/host package into probe/main.go.
As part of that work I discovered that package report also maintains its own,
independent "cache" of local networks. Except it contains only the (optional)
Docker bridge network, if it's been populated by the probe, and it's only used
by the report.Make{Endpoint,Address}NodeID constructors to scope local
addresses. Normally, scoping happens during rendering, and only for pseudo
nodes -- see current LeafMap Render localNetworks. This is pretty convoluted
and should be either be made consistent or heavily commented.
During rendering, RenderableNodes are created by Map funcs, which take only
NodeMetadata. Previously, it was simply assumed that the relevant keys for a
given Map func would be present in the metadata. If that implicit invariant
failed, the returned RenderableNode would be invalid, with e.g. an ID of
"hostid::". That, in turn, would trigger undefined behavior later on in the
rendering workflow.
This bug was detected by creating a partial node metadata for a non-local
endpoint node. That node was detected during the first phase of rendering, and
given an invalid renderable node ID of "myhostname::", which prevented it from
attaching to TheInternet pseudonode. It eventually got removed from the set
of valid nodes, which meant nodes that were adjacent to it suddenly became
orphans, and got filtered out by the FilterUnconnected step of the rendering
pipeline.
With this change, every map func checks for the presence of mandatory fields,
i.e. the fields that compose the resulting renderable node's ID.
Also,
- Add unit tests for LeafMapFuncs
- Topology Validate checks NodeMetadatas must not have nil Metadata
NewNodeMetadata -> MakeNodeMetadata. It doesn't return a pointer, so
Make is more idiomatic.
Invoke MakeNodeMetadata when necessary. The zero value for a
NodeMetadata is no longer valid.
Split MakeNodeMetadata to two constructors. MakeNodeMetadata when you
don't have anything to prepopulate; MakeNodeMetadataWith when you do.
Also, a fix to the tests in app. We unmarshal a RenderableNode struct,
which has a JSON-ignored NodeMetadata field. The zero value is invalid,
so we need to fix that before performing comparisons.
Also:
- build a custom URL matcher to cope with container image names having a encoded forward slash in them.
- escape node ids in the UI when constructing URLs.
- add a test which fetches all the nodes of all topologies, and update report fixture to have slash in container image names.
This causes detailed node lookups for the grouped-by-process-name view to fail. Also, add a test for process walker trimmming whitespace, and a test the process-by-name view gives the right result.
- Move render.Rpt to test.Report
- Remove report/report_fixture_test.go; it was unused
- Remove report fixture in app/ and make tests use test.Report
- Move expected outputs into render/expected, so they can be reused
- Move pidtree to its own module and disaggregate it into tree, walker and reporter.
- Extend testing for probe/process
- Extend process metadata; add command line & # threads.
This makes container image details show the containers (and processes) correctly.
Also:
- introduces a 'test' package, moved Diff function there.
- adds some tests for this new rendered view.