Commit Graph

446 Commits

Author SHA1 Message Date
Matthias Radestock
ae2a5edc18 make nodeSummaryGroupSpecs only include what's needed 2017-06-21 18:24:14 +01:00
Matthias Radestock
b6c886e0d1 cosmetic 2017-06-21 18:19:10 +01:00
Matthias Radestock
a306867610 fast network membership check
The rendering code checks whether endpoint IPs are part of
cluster-local networks. Due to the prevalence of endpoints - medium
sized reports can contain many thousands of endpoints - this is
performance critical. Alas the existing code performs the check via a
linear scan of a list of networks. That is slow when there are more
than a few, which will be the case in the context of k8s, since there
the probes register service IPs as local /32 networks.

Here we change representation of the set of networks to a prefix
tree (aka trie), which is well-suited for IP network membership checks
since networks are in fact a bitstring prefixes.

The specific representation is a crit-bit tree, but that choice was
purely based on implementation convenience - the chosen library is the
only one I could find that directly supports IP networks.
2017-06-21 03:03:49 +01:00
Matthias Radestock
9e75331e9a Revert "fast network membership check"
This reverts commit 98f036359b.
2017-06-20 20:51:27 +01:00
Matthias Radestock
98f036359b fast network membership check
The rendering code checks whether endpoint IPs are part of
cluster-local networks. Due to the prevalence of endpoints - medium
sized reports can contain many thousands of endpoints - this is
performance critical. Alas the existing code performs the check via a
linear scan of a list of networks. That is slow when there are more
than a few. Unfortunately in some common k8s network setups, e.g. on
AWS, a cluster can contain hundreds of networks, due to /32 networks
derived from interfaces with multiple IPs.

Here we change representation of the set of networks to a prefix
tree (aka trie), which is well-suited for IP network membership checks
since networks are in fact a bitstring prefixes.

The specific representation is a crit-bit tree, but that choice was
purely based on implementation convenience - the chosen library is the
only one I could find that directly supports IP networks.
2017-06-20 19:31:11 +01:00
Matthias Radestock
873fac12ac memoize isKnownServices for improved performance 2017-06-19 13:29:43 +01:00
Matthias Radestock
0d0414d348 faster matching of known services
We hit this code *a lot* during rendering.
2017-06-18 16:02:34 +01:00
Mike Lang
f403d01885 Forgot to include daemonsets in renderKubernetesTopologies
Yay, needing to remember 10 different obscure places to add a new topology every time.
2017-06-12 10:13:54 -07:00
Matthias Radestock
afbc1decab drop addr and port from Endpoint.Latest map
the information is constant and already present in the id, so we can
extract it from there.

That reduces the report size and improves report encoding/decoding
performance. It should reduce memory usage too and improve report
merging performance too.

NB: Probes with this change are incompatible with old apps.
2017-06-10 19:19:56 +01:00
Matthias Radestock
912c684e65 optimise memoisation for parallel execution
don't start the same piece of work twice
2017-06-05 10:30:11 +01:00
Matthias Radestock
91d3497f7d parallelise 'reduce' 2017-06-05 08:44:17 +01:00
Matthias Radestock
6eaffb44e0 fix bug: handle short-lived ebpf-tracked connections again
This got broken in #2559.

The problem here is similar to #2551.
2017-06-04 18:42:54 +01:00
Matthias Radestock
30c38a958f remove blatant falsehoods from comments 2017-06-04 16:23:03 +01:00
Matthias Radestock
ebcf9dcf10 refactor: rename ShortLivedConnectionJoin to ConnectionJoin
since it's dealing with *all* connections, not just short-lived ones.
2017-06-04 16:10:21 +01:00
Matthias Radestock
9bc7b30f0f extract and expand endpoint procspied filter
The filter needs to exclude both procspied and eBPF-tracked endpoints,
since both will be picked up by the process topology.
2017-06-04 16:10:21 +01:00
Matthias Radestock
707add13a3 refactor: simplify some filters 2017-06-04 16:10:21 +01:00
Matthias Radestock
ee0736df69 refactor: extract constant mapEndpoint2IP 2017-06-04 16:10:21 +01:00
Matthias Radestock
6697f4a897 refactor: declosure ShortLivedConnectionJoin 2017-06-04 16:10:21 +01:00
Matthias Radestock
ff4a4c08ce refactor: remove pointless optimisation 2017-06-04 10:36:55 +01:00
Mike Lang
3aa4a676dd Add new view for daemonsets 2017-05-19 15:06:53 -07:00
Mike Lang
c60731b043 Add report topology for daemonsets 2017-05-19 15:00:01 -07:00
preston_doster_tc
ed9c369f50 Standardized formatting. 2017-05-15 15:27:48 -05:00
preston_doster_tc
0f1c2f1cb7 Corrected spacing. 2017-05-15 13:32:39 -05:00
preston_doster_tc
df58f55782 Added Azure endpoints so they show up as individual nodes instead of under 'The Internet'. 2017-05-15 13:27:06 -05:00
Mike Lang
51999529a7 Add docker swarm Stack selector ala k8s namespace selector
We have to introduce the kinda hacky concept of a 'No Stack' stack
to reconcile it with the idea of a 'default' k8s namespace. This is important
because swarm services without a stack don't have the same docker labels as ones that do.
Curiously, they still have what appears to be a stack name 'prefix' on their names,
but I can't isolate that name anywhere easily so they'll just have to make do.

I basically copy-pasted updateFilters to make this work, todo go back and refactor
to not duplicate 90% of the code.
2017-04-18 09:08:22 -07:00
Mike Lang
2b208580ab Add new topology view for Docker Swarm services 2017-04-14 17:18:06 -07:00
Mike Lang
9f0f120bc5 Remove explicit listing of api topologies in render/detailed/node specs
Instead, we can infer them from the render topology and the primaryAPITopology map
2017-04-10 15:06:38 -07:00
Mike Lang
3656965ae7 Refactor Map2Parent and family into one function
This greatly improves code reuse while keeping the behaviour flexible
2017-04-10 14:30:53 -07:00
Mike Lang
9c88ad85e9 render/detailed/parents: Refactor for less repeated information
We replace the existing data structure with a simpler one that
only specifies how to get the parent label, which is the only
part of the Parent struct that can't be generated from the node info alone.

Future work: Standardize this concept of a label and put it in the topology instead.
Though that already exists...so just use it?
2017-04-10 14:30:52 -07:00
Mike Lang
2a74883cce If no node summary generator exists for topology, do a sane default
The default sets the node label to the node ID.
This is likely to not look very good, but the intent is that it creates an obvious problem,
ie. that the node ID is being used as the label, rather than a silent omission or more subtle problem.

Possible future work:
* For single-component IDs, extract the component automatically and use that instead.
* Instead of functions, in simple cases just have a LUT by topology with common behaviours like
  'stack = true or false', 'label = this key in node.Latest'
The latter opens up to eventually moving this info inside the report itself ala topology templates,
or at least centralizing it in the source.
2017-04-10 14:30:52 -07:00
Mike Lang
c16becc148 render/detailed: When summarising children, add fallback for unlisted topologies
Currently, if a topology does not have any specific info in nodeSummariesByID,
any children of the node that belong to that topology will be silently omitted.

This change adds a default behaviour for such topologies, with no special columns
but at least it is displayed at all.
Unlisted topologies are displayed after all listed ones, in arbitrary order.

Note that completely bogus or other special cases (eg. topology = Pseudo) still will not
be displayed as report.Topology() will fail.
2017-04-10 14:30:52 -07:00
Mike Lang
14ab5ccceb render: Maintain a list of 'primary' api topologies for each report topology
This gives us a single source of truth in a variety of situations where we want to
know what view to direct a user to in order to 'open' a particular node.

I wanted to put this in app/api_topologies where the views are defined, but that creates
a circular import.
2017-04-10 14:30:52 -07:00
Mike Lang
efb68fb2da api_topologies: Add a selectType field to option groups
This field changes the option group behaviour depending on its value.
Currently only supports two values:
"one" (default): Old behaviour, one option can be selected
"union": Any number of options can be selected, and the filters are OR-ed togther

It is written in such a way as to easily enable a future "intersection" option,
as per union but AND-ing the filters. But this is not done here. YAGNI.
2017-03-27 10:06:56 -07:00
Alfonso Acosta
8814e856e0 Merge pull request #2338 from weaveworks/2324-exclude-pause-from-k8s
Exclude pause containers when rendering k8s topologies
2017-03-23 23:48:17 +01:00
Mike Lang
da8b8d5095 Revert "Revert "Merge pull request #2285 from weaveworks/mike/k8s-ns-in-container-view""
This reverts commit d55c528fe2.
2017-03-20 10:05:10 -07:00
Mike Lang
d55c528fe2 Revert "Merge pull request #2285 from weaveworks/mike/k8s-ns-in-container-view"
This reverts commit 76ddc75fb8, reversing
changes made to 3ade2933eb.

We are rolling this back for now because it's causing a bug where sub-topologies
would have ~3000 repeated cases of the k8s filters, causing performance issues clientside.
2017-03-17 14:00:05 -07:00
Mike Lang
76ddc75fb8 Merge pull request #2285 from weaveworks/mike/k8s-ns-in-container-view
When k8s present, allow filtering of containers by namespace
2017-03-16 14:56:10 -07:00
Mike Lang
b01e890475 When k8s present, allow filtering of containers by namespace
To facilitate this, we replace the existing functionality of updateFilters which
sets k8s topologies to have the filters [namespace, managed], to instead append the namespace filter
to any existing. This lets it apply to both k8s and container topologies without overwriting existing
container filters. We instead set the managed filter in the static definition.

This however has the side effect that the ordering of the namespace filter and the managed filter
in k8s topologies has been reversed, so it reads:
	Show Unmanaged | Hide Unmanaged
	foo | bar | default | baz | All Namespaces
instead of:
	foo | bar | default | baz | All Namespaces
	Show Unmanaged | Hide Unmanaged
2017-03-16 14:21:11 -07:00
Alfonso Acosta
806b27e785 Exclude pause containers when rendering k8s topologies 2017-03-16 12:18:28 +00:00
Iago López Galeiras
9920c4ea48 Add eBPF connection tracking without dependencies on kernel headers
Based on work from Lorenzo, updated by Iago, Alban, Alessandro and
Michael.

This PR adds connection tracking using eBPF. This feature is not enabled by default.
For now, you can enable it by launching scope with the following command:

```
sudo ./scope launch --probe.ebpf.connections=true
```

This patch allows scope to get notified of every connection event,
without relying on the parsing of /proc/$pid/net/tcp{,6} and
/proc/$pid/fd/*, and therefore improve performance.

We vendor https://github.com/iovisor/gobpf in Scope to load the
pre-compiled ebpf program and https://github.com/weaveworks/tcptracer-bpf
to guess the offsets of the structures we need in the kernel. In this
way we don't need a different pre-compiled ebpf object file per kernel.
The pre-compiled ebpf program is included in the vendoring of
tcptracer-bpf.

The ebpf program uses kprobes/kretprobes on the following kernel functions:
- tcp_v4_connect
- tcp_v6_connect
- tcp_set_state
- inet_csk_accept
- tcp_close

It generates "connect", "accept" and "close" events containing the
connection tuple but also pid and netns.
Note: the IPv6 events are not supported in Scope and thus not passed on.

probe/endpoint/ebpf.go maintains the list of connections. Similarly to
conntrack, it also keeps the dead connections for one iteration in order
to report short-lived connections.

The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still
there and still used at start-up because eBPF only brings us the events
and not the initial state. However, the /proc parsing for the initial
state is now done in foreground instead of background, via
newForegroundReader().

NAT resolution on connections from eBPF works in the same way as it did
on connections from /proc: by using conntrack. One of the two conntrack
instances is only started to get the initial state and then it is
stopped since eBPF detects short-lived connections.

The Scope Docker image size comparison:
- weaveworks/scope in current master:  22 MB (compressed),  68 MB
  (uncompressed)
- weaveworks/scope with this patchset: 23 MB (compressed), 69 MB
  (uncompressed)

Fixes #1168 (walking /proc to obtain connections is very expensive)

Fixes #1260 (Short-lived connections not tracked for containers in
shared networking namespaces)

Fixes #1962 (Port ebpf tracker to Go)

Fixes #1961 (Remove runtime kernel header dependency from ebpf tracker)
2017-03-08 22:11:12 +01:00
Filip Barl
2e9255b190 Addressed the comments and fixed the tests. 2017-02-20 11:40:40 +01:00
Filip Barl
f1904a626f Fix filtering issue for uncontained nodes in DNS name view (#2170). 2017-02-20 11:38:21 +01:00
Mike Lang
fad3e88269 Rename ECS Service node ids to be cluster;serviceName
This is important for two reasons:
* It prevents nasty false-equality bugs when two different services from different ECS clusters
  are present in the same report
* It allows us to retrieve the cluster and service name - all the info we need to look up the service -
  using only the node ID. This matters, for example, when trying to handle a control request.
2017-02-03 13:45:18 -08:00
Alfonso Acosta
0a135e6330 Check for known services before external IPs
Known services can be internal (e.g. same VPC in AWS)
2017-01-31 15:37:57 +00:00
Mike Lang
fca76d661e Merge pull request #2145 from weaveworks/mike/render/fix-ecs-detailed-parents
render.detailed: Add ECS topologies to detailed parents conversion
2017-01-23 13:23:45 -08:00
Mike Lang
19b6d5b1c4 Merge pull request #2146 from weaveworks/mike/render/fix-ecs-detailed-children
render.detailed: Teach detailed node view how to display ECS topologies as children
2017-01-23 13:23:27 -08:00
Mike Lang
eb44e95922 render.detailed: Teach detailed node view how to display ECS topologies as children
Previously, they were omitted.
2017-01-20 18:07:54 -08:00
Mike Lang
39e735f99e render.detailed: Add ECS topologies to detailed parents conversion
so they will correctly show up in the details view.
Currently, since ECS topologies aren't considered, the ECS parents of nodes are silently dropped.
2017-01-20 17:12:40 -08:00
Filip Barl
d3466b5454 Refactored the table component/model and wrote the tests
Backward-compatibility fix
2017-01-16 17:05:36 +01:00
Filip Barl
31be525bd2 Created generic table model on backend
Replaced MetadataRow with generic Row in Table model

Sending through multicolumn tables from the backend
2017-01-16 12:22:10 +01:00