We have to introduce the kinda hacky concept of a 'No Stack' stack
to reconcile it with the idea of a 'default' k8s namespace. This is important
because swarm services without a stack don't have the same docker labels as ones that do.
Curiously, they still have what appears to be a stack name 'prefix' on their names,
but I can't isolate that name anywhere easily so they'll just have to make do.
I basically copy-pasted updateFilters to make this work, todo go back and refactor
to not duplicate 90% of the code.
We want the middle ground between a small compression size, a fast
compression time and a fast decompression time.
Tests suggest that the default compression level is better than the
maximum compression level: although the reports are 4% bigger and
decompress slower, they compress 33% faster.
See discussion on https://github.com/weaveworks/scope/issues/1457#issuecomment-293288682
Since d60874aca8 `connectionTracker` can
fallback when the `EbpfTracker` died. Hence we only have to stop the
`tracer` in `stop()`.
This commit is also a fixup for d60874aca8
where we do a gentle fallback but never actually stop the tracer to stop
polling.
Based on work from Lorenzo, updated by Iago, Alban, Alessandro and
Michael.
This PR adds connection tracking using eBPF. This feature is not enabled by default.
For now, you can enable it by launching scope with the following command:
```
sudo ./scope launch --probe.ebpf.connections=true
```
This patch allows scope to get notified of every connection event,
without relying on the parsing of /proc/$pid/net/tcp{,6} and
/proc/$pid/fd/*, and therefore improve performance.
We vendor https://github.com/iovisor/gobpf in Scope to load the
pre-compiled ebpf program and https://github.com/weaveworks/tcptracer-bpf
to guess the offsets of the structures we need in the kernel. In this
way we don't need a different pre-compiled ebpf object file per kernel.
The pre-compiled ebpf program is included in the vendoring of
tcptracer-bpf.
The ebpf program uses kprobes/kretprobes on the following kernel functions:
- tcp_v4_connect
- tcp_v6_connect
- tcp_set_state
- inet_csk_accept
- tcp_close
It generates "connect", "accept" and "close" events containing the
connection tuple but also pid and netns.
Note: the IPv6 events are not supported in Scope and thus not passed on.
probe/endpoint/ebpf.go maintains the list of connections. Similarly to
conntrack, it also keeps the dead connections for one iteration in order
to report short-lived connections.
The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still
there and still used at start-up because eBPF only brings us the events
and not the initial state. However, the /proc parsing for the initial
state is now done in foreground instead of background, via
newForegroundReader().
NAT resolution on connections from eBPF works in the same way as it did
on connections from /proc: by using conntrack. One of the two conntrack
instances is only started to get the initial state and then it is
stopped since eBPF detects short-lived connections.
The Scope Docker image size comparison:
- weaveworks/scope in current master: 22 MB (compressed), 68 MB
(uncompressed)
- weaveworks/scope with this patchset: 23 MB (compressed), 69 MB
(uncompressed)
Fixes#1168 (walking /proc to obtain connections is very expensive)
Fixes#1260 (Short-lived connections not tracked for containers in
shared networking namespaces)
Fixes#1962 (Port ebpf tracker to Go)
Fixes#1961 (Remove runtime kernel header dependency from ebpf tracker)
* Add options to hide args and env vars
To allow for use of weave-scope in an unauthenticated environment,
add options to the probe to hide comand line arguments and
environment variables, which might contain secret data.
Fixes#2222
* Change docker.NewRegistry arguments to be a struct
* Remove redundant declarations of default values
* Move registry options outside to improve readability
This is important for two reasons:
* It prevents nasty false-equality bugs when two different services from different ECS clusters
are present in the same report
* It allows us to retrieve the cluster and service name - all the info we need to look up the service -
using only the node ID. This matters, for example, when trying to handle a control request.
With net.netfilter.nf_conntrack_acct = 1, conntrack adds the following
fields in the output: packets=3 bytes=164
And with SELinux (e.g. Fedora), conntrack adds: secctx=...
The parsing with fmt.Sscanf introduced in #2095 was unfortunately
rejecting lines with those fields. This patch fixes that by adding more
complicated parsing in decodeFlowKeyValues() with FieldsFunc and SplitN.
Fixes#2117
Regression from #2095