644 Commits

Author SHA1 Message Date
Matthias Radestock
afbc1decab drop addr and port from Endpoint.Latest map
the information is constant and already present in the id, so we can
extract it from there.

That reduces the report size and improves report encoding/decoding
performance. It should reduce memory usage too and improve report
merging performance too.

NB: Probes with this change are incompatible with old apps.
2017-06-10 19:19:56 +01:00
Matthias Radestock
c8f97878d2 re-target app clients when name resolution changes
Fixes #2578.
2017-06-09 12:30:26 +01:00
Matthias Radestock
fb735b65c4 cosmetic: correct comment 2017-06-09 11:31:20 +01:00
Matthias Radestock
d0b40ee4b9 correct type for "Observed Gen."
It's a number. This enables numeric sorting of Observed Gen in the
table mode of the Deployment and Replicaset views.
2017-06-08 04:27:10 +01:00
Roland Schilter
56cb02675b Back off upon errored kubernetes api requests (#2562)
closes #1009
2017-06-06 16:19:41 +02:00
Matthias Radestock
59f777a066 don't read all of /proc when probe.proc.spy=false
Previously we were doing the reading even though we weren't looking at
the result.
2017-06-02 14:01:25 +01:00
Matthias Radestock
b52b2078ca refactor: remove unnecessary conditional
we always have a flowWalker when not using ebpf
2017-05-25 23:04:45 +01:00
Matthias Radestock
b80a51bc39 cosmetic: remove outdated comment
we now do correctly fall back to proc scanning when eBPF fails
2017-05-25 23:04:45 +01:00
Matthias Radestock
a6cc8ece4f simplify connection tracker initialization
- eliminate the code duplication when falling back to procfs scanning
- trim some superfluous comments

Also fix a bug in the procvess: when falling back to procfs scanning
in ReportConnections, the scanner was given a "--any-nat" param, which
is wrong.
2017-05-25 23:02:19 +01:00
Alfonso Acosta
0aec26653b Guard against null DaemonSet store 2017-05-24 11:36:26 +00:00
Mike Lang
c0751cd4e2 probe/kubernetes: Propagate errors in getting label selectors 2017-05-19 15:06:53 -07:00
Mike Lang
d4a5360d4c k8s probe: Collect info on daemonsets for new DaemonSet topology 2017-05-19 15:06:51 -07:00
Alban Crequy
d715ccc391 ebpf: handle fd_install events from tcptracer-bpf
Since https://github.com/weaveworks/tcptracer-bpf/pull/39, tcptracer-bpf
can generate "fd_install" events when a process installs a new file
descriptor in its fd table. Those events must be requested explicitely
on a per-pid basis with tracer.AddFdInstallWatcher(pid).

This is useful to know about "accept" events that would otherwise be
missed because kretprobes are not triggered for functions that were
called before the installation of the kretprobe.

This patch find all the processes that are currently blocked on an
accept() syscall during the EbpfTracker initialization.
feedInitialConnections() will use tracer.AddFdInstallWatcher() to
subscribe to fd_install  events. When a fd_install event is received,
synthesise an accept event with the connection tuple and the network
namespace (from /proc).
2017-05-19 14:49:38 +02:00
Alfonso Acosta
dbdb648ada Merge pull request #2527 from weaveworks/2494-track-non-natted-shortlived-conns
Let conntrack track non-NATed short-lived connections
2017-05-19 01:42:02 +02:00
Alfonso Acosta
2d6034a2e5 Re-enable pod shortcut reports 2017-05-18 10:21:32 +00:00
Alfonso Acosta
7497c7d432 Let conntrack track non-NATed short-lived connections 2017-05-16 23:15:16 +00:00
Alfonso Acosta
5079c114bc Merge pull request #2507 from kinvolk/alban/perf-map-fixes
ebpf connection tracker: perf map fixes
2017-05-16 21:57:42 +02:00
Alban Crequy
9079677873 ebpf tracker: add callback for lost events
Lost events were previously unnoticed. This patch adds an error in the
log and stops the ebpf tracker if an event is lost.
2017-05-10 18:37:32 +02:00
Alfonso Acosta
aec715653a Avoid null dereferences in ECS client (encore) 2017-05-10 16:25:32 +00:00
Alfonso Acosta
26eaadbbaa Avoid null dereferences in ECS client 2017-05-10 15:28:53 +00:00
Alfonso Acosta
88874782be Log specific error when deployments are not supported 2017-05-05 14:30:59 +02:00
Alban Crequy
598c6a0238 proc walker: cache limits and cmdline/name after parsing 2017-05-05 13:05:08 +02:00
Alban Crequy
3a8a09a606 proc walker: optimize readLimits 2017-05-05 13:05:08 +02:00
Alban Crequy
bdb09f5f9d proc walker: optimize readStats 2017-05-05 13:05:08 +02:00
Alban Crequy
640b240469 proc walker: optimize open file counter
Golang's ReadDirNames is expensive and better avoided when we don't care
about the names.
2017-05-02 14:45:12 +02:00
Alfonso Acosta
876bb97539 Merge pull request #2452 from weaveworks/mike/docker-swarm/service-ns-selector
Add docker swarm Stack selector ala k8s namespace selector
2017-04-25 15:57:15 +02:00
Michael Schubert
1d1f7347ce proc_linux: don't exec getNetNamespacePathSuffix() on every walk 2017-04-19 12:49:04 +02:00
Mike Lang
51999529a7 Add docker swarm Stack selector ala k8s namespace selector
We have to introduce the kinda hacky concept of a 'No Stack' stack
to reconcile it with the idea of a 'default' k8s namespace. This is important
because swarm services without a stack don't have the same docker labels as ones that do.
Curiously, they still have what appears to be a stack name 'prefix' on their names,
but I can't isolate that name anywhere easily so they'll just have to make do.

I basically copy-pasted updateFilters to make this work, todo go back and refactor
to not duplicate 90% of the code.
2017-04-18 09:08:22 -07:00
Bryan Boreham
c944225475 Merge pull request #2437 from kinvolk/alban/gzip-compression-level-default
gzip: change compression level to the default
2017-04-18 10:45:38 +01:00
Mike Lang
72bcdba1c3 swarm service: Capture stack namespace and strip it from name 2017-04-17 15:13:50 -07:00
Mike Lang
327b909956 probe/docker: Populate SwarmService topology based on docker labels
This isn't the best way to do it, but it will work well enough for an initial implementation
2017-04-14 12:51:28 -07:00
Mike Lang
460352d2d7 Merge pull request #2436 from weaveworks/mike/easier-added-topologies
Reduce the number of places topologies are explicitly listed
2017-04-14 12:49:12 -07:00
Alban Crequy
a8af81fe20 gzip: change compression level to the default
We want the middle ground between a small compression size, a fast
compression time and a fast decompression time.

Tests suggest that the default compression level is better than the
maximum compression level: although the reports are 4% bigger and
decompress slower, they compress 33% faster.

See discussion on https://github.com/weaveworks/scope/issues/1457#issuecomment-293288682
2017-04-12 17:41:43 +02:00
Mike Lang
18ba2c4e38 ecs: Also make service a parent of task 2017-04-11 10:58:33 -07:00
Mike Lang
75314cb910 Reduce manually listing all topologies in a few places
Prefer WalkTopologies to apply a uniform action to every topology,
reducing need to make multiple changes and risk of errors if you forget one.
2017-04-07 12:57:42 -07:00
Bryan Boreham
515f4b1a47 Make various anonymous fields named
Anonymous fields make any methods on the inner object visible on the
outer, so they should only be used when the outer is-a inner.
2017-04-01 11:35:10 +00:00
Michael Schubert
cd25b8b935 endpoint/ebpf: implement stop
Since d60874aca8 `connectionTracker` can
fallback when the `EbpfTracker` died. Hence we only have to stop the
`tracer` in `stop()`.

This commit is also a fixup for d60874aca8
where we do a gentle fallback but never actually stop the tracer to stop
polling.
2017-03-21 14:42:34 +01:00
Michael Schubert
5572895a2b ebpf_test: tracker set to dead after out of order events 2017-03-17 16:50:25 +01:00
Michael Schubert
5262e0765d reader_linux: only access latestBuf when set
.. and avoid nil pointer dereference. It can happen that
`getWalkedProcPid` is called before the first `performWalk` finished.
2017-03-17 14:43:31 +01:00
Michael Schubert
d60874aca8 Fallback to proc when ebpf timestamps are wrong 2017-03-17 14:43:31 +01:00
Michael Schubert
22ae6c45a0 Implement ebpf proc fallback 2017-03-14 13:59:09 +01:00
Michael Schubert
5f2ba891a4 endpoint/reporter: only stop scanner if not nil 2017-03-14 11:56:04 +01:00
Michael Schubert
ce904fc56c Remove redundant arg from newEbpfTracker 2017-03-14 11:56:04 +01:00
Matthias Radestock
245c2e9149 fall back to /proc/<pid>/comm for process name
when proc/<pid>/cmdline is empty, which is the case for some system
and defunct processes.

Fixes #2315
2017-03-09 14:02:32 +00:00
Iago López Galeiras
9920c4ea48 Add eBPF connection tracking without dependencies on kernel headers
Based on work from Lorenzo, updated by Iago, Alban, Alessandro and
Michael.

This PR adds connection tracking using eBPF. This feature is not enabled by default.
For now, you can enable it by launching scope with the following command:

```
sudo ./scope launch --probe.ebpf.connections=true
```

This patch allows scope to get notified of every connection event,
without relying on the parsing of /proc/$pid/net/tcp{,6} and
/proc/$pid/fd/*, and therefore improve performance.

We vendor https://github.com/iovisor/gobpf in Scope to load the
pre-compiled ebpf program and https://github.com/weaveworks/tcptracer-bpf
to guess the offsets of the structures we need in the kernel. In this
way we don't need a different pre-compiled ebpf object file per kernel.
The pre-compiled ebpf program is included in the vendoring of
tcptracer-bpf.

The ebpf program uses kprobes/kretprobes on the following kernel functions:
- tcp_v4_connect
- tcp_v6_connect
- tcp_set_state
- inet_csk_accept
- tcp_close

It generates "connect", "accept" and "close" events containing the
connection tuple but also pid and netns.
Note: the IPv6 events are not supported in Scope and thus not passed on.

probe/endpoint/ebpf.go maintains the list of connections. Similarly to
conntrack, it also keeps the dead connections for one iteration in order
to report short-lived connections.

The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still
there and still used at start-up because eBPF only brings us the events
and not the initial state. However, the /proc parsing for the initial
state is now done in foreground instead of background, via
newForegroundReader().

NAT resolution on connections from eBPF works in the same way as it did
on connections from /proc: by using conntrack. One of the two conntrack
instances is only started to get the initial state and then it is
stopped since eBPF detects short-lived connections.

The Scope Docker image size comparison:
- weaveworks/scope in current master:  22 MB (compressed),  68 MB
  (uncompressed)
- weaveworks/scope with this patchset: 23 MB (compressed), 69 MB
  (uncompressed)

Fixes #1168 (walking /proc to obtain connections is very expensive)

Fixes #1260 (Short-lived connections not tracked for containers in
shared networking namespaces)

Fixes #1962 (Port ebpf tracker to Go)

Fixes #1961 (Remove runtime kernel header dependency from ebpf tracker)
2017-03-08 22:11:12 +01:00
Alfonso Acosta
052ff39bf1 Merge pull request #2309 from weaveworks/2258-fix-kubelet-access
Fix kubelet failure fallback and make port configurable
2017-03-08 10:15:21 -08:00
Alfonso Acosta
8bf753a51b Revert "Revert "Add options to hide args and env vars (#2306)"" (#2311)
* Revert "Revert "Add options to hide args and env vars (#2306)""

* Make linter happy
2017-03-08 02:16:42 -08:00
Alfonso Acosta
dcc7389127 Revert "Add options to hide args and env vars (#2306)"
This reverts commit 764afb6301.
2017-03-07 17:51:27 +01:00
Mike Bryant
764afb6301 Add options to hide args and env vars (#2306)
* Add options to hide args and env vars

To allow for use of weave-scope in an unauthenticated environment,
add options to the probe to hide comand line arguments and
environment variables, which might contain secret data.

Fixes #2222

* Change docker.NewRegistry arguments to be a struct

* Remove redundant declarations of default values

* Move registry options outside to improve readability
2017-03-07 08:51:18 -08:00
Alfonso Acosta
fb64f1102f Fix tests 2017-03-07 13:53:17 +00:00