201 Commits

Author SHA1 Message Date
Bryan Boreham
b5cdcb9a42 Move DNS name mapping from endpoint to report 2018-02-20 16:14:21 +00:00
Bryan Boreham
6674ff61e5 Fix incorrect comment 2018-02-20 16:14:20 +00:00
Matthias Radestock
5b30b668ae refactor: don't return receiver in Topology.AddNode()
This had little use and was obscuring the mutating nature of
AddNode().
2018-02-19 05:10:04 +00:00
Matthias Radestock
e93b69cf10 remove Node.Edges
It is unused and none of the adjacency mapping code in the renderer
takes any notice of it. Removing this shrinks the report size.

Edges were introduced in #838. At the time we had an experimental
packet sniffer under experimental/sniff/sniffer.go. That got removed
in #1646.

We can resurrect this if we ever decide to add metadata to edges.
2017-12-17 13:28:22 +00:00
Matthias Radestock
1f2247a8c4 move node metadata keys into report package
Both the probe and the app (for rendering) need to know about them.
2017-12-11 20:26:08 +00:00
Matthias Radestock
1865c46368 refactor: introduce a constant for "copy_of"
since it's shared between the probe and renderer
2017-12-09 10:45:59 +00:00
Tobias Klauser
89f3ce2e64 Simplify Utsname string conversion
Use Utsname from golang.org/x/sys/unix which contains byte array
instead of int8/uint8 array members. This simplifies the string
conversions of these members, so the marshal.FromUtsname functions are
no longer needed.
2017-11-02 08:45:54 +01:00
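
A minimal sketch of the kind of conversion 89f3ce2e64 describes, assuming the Utsname members are NUL-padded byte arrays. The helper name cString and the [65]byte stand-in field are hypothetical and used only for illustration; this is not the code from the commit, and it does not import golang.org/x/sys/unix.

```go
package main

import (
	"bytes"
	"fmt"
)

// cString converts a NUL-padded byte buffer to a Go string.
// Hypothetical helper; it cuts at the first NUL byte if there is one.
func cString(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return string(b[:i])
	}
	return string(b)
}

func main() {
	// Stand-in for a byte-array member such as a kernel release field.
	var release [65]byte
	copy(release[:], "4.9.0")
	fmt.Println(cString(release[:]))
}
```

With int8 array members, each element would first have to be converted to a byte, which is the extra step that byte arrays avoid.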
Alban Crequy
9c53653997 EbpfTracker: restart it when it dies
EbpfTracker can die when the tcp events are received out of order. This
can happen with a buggy kernel or apparently in other cases, see:
https://github.com/weaveworks/scope/issues/2650

As a workaround, restart EbpfTracker when an event is received out of
order. This does not seem to happen often, but as a precaution,
EbpfTracker will not restart if the last failure is less than 5 minutes
ago.

This is not easy to test but I added instrumentation to trigger a
restart:

- Start Scope with:
    $ sudo WEAVESCOPE_DOCKER_ARGS="-e SCOPE_DEBUG_BPF=1" ./scope launch

- Request a stop with:
    $ echo stop | sudo tee /proc/$(pidof scope-probe)/root/var/run/scope/debug-bpf
2017-08-17 16:39:27 +02:00
Matthias Radestock
e77d40fc16 refactor: inline connectionTracker.performFlowWalk 2017-07-30 09:23:41 +01:00
Matthias Radestock
b93b19a7c7 refactor: simplify connection polarity reversal 2017-07-30 08:48:13 +01:00
Matthias Radestock
65cebed6c4 get rid of endpoint type indicators
The app stopped paying attention to these some time ago.

Removing them shrinks reports by 3-10%.
2017-07-30 08:38:56 +01:00
Matthias Radestock
e603a28ca4 Merge pull request #2704 from weaveworks/2689-2700-ebpf-init
don't miss, or fail to forget, initial connections

Fixes #2689.
Fixes #2700.
2017-07-13 11:39:31 +01:00
Matthias Radestock
b087e95711 bump tcptracer-bpf version 2017-07-12 07:27:35 +01:00
Matthias Radestock
ebc3cddf01 don't miss, or fail to forget, initial connections
...when initialising eBPF-based connection tracking.

Previously we were ignoring all eBPF events until we had gathered the
existing connections. That means we could a) miss connections created
during the gathering, and b) fail to forget connections that got
closed during the gathering.

The fix comprises the following changes:

1. pay attention to eBPF events immediately. That way we do not
miss anything.

2. remember connections for which we received a Close event during the
initialisation phase, and subsequently drop gathered existing
connections that match these. That way we do not erroneously consider
a gathered connection as open when it got closed since the gathering.

3. drop gathered existing connections which match connections detected
through eBPF events. The latter typically have more / current
metadata. In particular, PIDs can be missing from the former.

Fixes #2689.
Fixes #2700.
2017-07-11 22:50:47 +01:00
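
Steps 2 and 3 of ebc3cddf01 amount to filtering the gathered connections against two sets. The sketch below is illustrative only: the function reconcile, its parameters, and the fields of fourTuple are assumptions, not the actual Scope types.

```go
package main

import "fmt"

// fourTuple identifies a connection. All fields are comparable, so the
// struct can be used directly as a map key. Field names are hypothetical.
type fourTuple struct {
	srcAddr, dstAddr string
	srcPort, dstPort uint16
}

// reconcile returns the gathered connections that should still be
// registered once gathering has finished.
func reconcile(gathered []fourTuple, closedDuringInit, seenViaEvents map[fourTuple]struct{}) []fourTuple {
	var keep []fourTuple
	for _, c := range gathered {
		if _, closed := closedDuringInit[c]; closed {
			continue // step 2: a Close event arrived while we were gathering
		}
		if _, seen := seenViaEvents[c]; seen {
			continue // step 3: the eBPF-detected entry has better metadata
		}
		keep = append(keep, c)
	}
	return keep
}

func main() {
	a := fourTuple{"10.0.0.1", "10.0.0.2", 1000, 80}
	b := fourTuple{"10.0.0.1", "10.0.0.3", 1001, 80}
	c := fourTuple{"10.0.0.1", "10.0.0.4", 1002, 80}
	closed := map[fourTuple]struct{}{a: {}}
	seen := map[fourTuple]struct{}{b: {}}
	fmt.Println(reconcile([]fourTuple{a, b, c}, closed, seen))
}
```

In this example only c survives: a was closed during initialisation and b was already detected through an event.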
Matthias Radestock
d568c50ec4 make EbpfTracker.dead go-routine-safe and .stop() idempotent
Without synchronisation, the isDead() call might return a stale value,
delaying deadness detection potentially indefinitely.

Without the guards / idempotence in .stop(), invoking stop() more than
once could cause a panic, since tracer.Stop() closes a channel (which
panics on a closed channel). Multiple stop() invocations are rare, but
not impossible.
2017-07-11 19:38:07 +01:00
Matthias Radestock
cf6353327a eliminate race in ebpf initialization
We were enabling event processing before feeding in the initial
connections, which results in a non-deterministic outcome.
2017-07-11 19:38:07 +01:00
Matthias Radestock
15215d0c2c prevent concurrent map access in ebpf fd install event handler
which presumably could cause havoc
2017-07-11 19:38:07 +01:00
Matthias Radestock
3883d8f1af fix a minor leak in ebpf fdinstall_pids table
when we got an fd install event but the pid was dead by the time we
processed it, we would fail to remove the watcher for that pid from
the fdinstall_pids table.

This is a minor, and bounded, leak, since the table only contains pids
that were alive when we initialized ebpf. And this change only plugs
that leak very partially, since we will never remove pids that die
while sitting in accept().
2017-07-11 19:38:07 +01:00
Matthias Radestock
e2cbe7ac26 refactor: a bit of inlining 2017-07-11 19:38:06 +01:00
Matthias Radestock
3baeb3d238 refactor: use fourTuple as map key instead of string 2017-07-11 19:38:06 +01:00
Matthias Radestock
ad7b5cdc19 refactor: remove pointless interface
premature abstraction
2017-07-11 19:38:06 +01:00
Matthias Radestock
8a56540648 refactor: eliminate global var 2017-07-11 19:38:06 +01:00
Matthias Radestock
8bd0188537 respect UseConntrack setting in ebpf initialisation 2017-07-11 19:37:11 +01:00
Matthias Radestock
7ea0800f8b refactor: extract helper to get initial flows 2017-07-10 07:34:20 +01:00
Matthias Radestock
07e7adbd63 refactor: make performFlowWalk data flow more obvious 2017-07-10 07:22:12 +01:00
Matthias Radestock
19e45ec248 refactor: eliminate global var 2017-07-07 10:18:43 +01:00
Matthias Radestock
8cf79b2e4a bump tcptracer-bpf version and use it to fix race
We defer starting the ebpf tracer until we've set the global var which
is referenced by the callback functions. Previously the var could be
unset when the callbacks are invoked, resulting in a segfault.

Fixes #2687.
2017-07-07 06:56:28 +01:00
Matthias Radestock
286e481771 Merge pull request #2645 from weaveworks/2644-initial-ebpf-polarity
correct polarity of initial connections

Fixes #2644.
2017-06-26 09:10:46 +01:00
Matthias Radestock
b43003fd2b refactor: remove superfluous pointering 2017-06-25 11:25:51 +01:00
Matthias Radestock
bd6cdc44a8 refactor: extract some common code 2017-06-25 11:22:32 +01:00
Matthias Radestock
181a548122 correct polarity of initial connections
Fixes #2644
2017-06-25 11:08:24 +01:00
Matthias Radestock
30e0444914 ensure connections from /proc/net/tcp{,6} get the right pid
ProcNet.Next does not allocate Connection structs, for efficiency.
Instead it always returns a *Connection pointing to the same instance.
As a result, any mutations by the caller to struct elements that
aren't actually set by ProcNet.Next, in particular Connection.Proc,
are carried across to subsequent calls.

This had hilarious consequences: connections referencing an inode
which we hadn't come across during proc walking would be associated
with the process corresponding to the last successfully looked up
inode.

The fix is to clear out the garbage left over from previous calls.

Fixes #2638.
2017-06-25 10:59:58 +01:00
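
The bug in 30e0444914 is easy to reproduce with any iterator that hands out a pointer to one reused struct. The sketch below uses hypothetical, simplified types (reader, connection, proc) rather than the real ProcNet and Connection, and it assumes the reset happens inside the iterator; the commit message does not say where the actual fix clears the stale data.

```go
package main

import "fmt"

// Hypothetical simplified types; the real ProcNet and Connection differ.
type proc struct{ pid int }

type connection struct {
	inode uint64
	proc  proc // filled in by the caller, not by next()
}

type reader struct {
	inodes []uint64
	conn   connection // single instance reused for every result
}

// next returns a pointer to the same instance on every call. It only
// sets inode itself, so it must reset the caller-owned fields.
func (r *reader) next() *connection {
	if len(r.inodes) == 0 {
		return nil
	}
	r.conn.proc = proc{} // clear leftovers from the previous call
	r.conn.inode = r.inodes[0]
	r.inodes = r.inodes[1:]
	return &r.conn
}

// pidsFor walks the connections and attaches PIDs for known inodes.
func pidsFor(inodes []uint64, byInode map[uint64]int) []int {
	r := &reader{inodes: inodes}
	var pids []int
	for c := r.next(); c != nil; c = r.next() {
		if pid, ok := byInode[c.inode]; ok {
			c.proc.pid = pid
		}
		pids = append(pids, c.proc.pid)
	}
	return pids
}

func main() {
	// Inode 11 is unknown. Without the reset in next() it would inherit
	// pid 42 from the previous connection.
	fmt.Println(pidsFor([]uint64{10, 11}, map[uint64]int{10: 42})) // [42 0]
}
```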
Alfonso Acosta
4006040cc1 Review feedback 2017-06-23 23:52:30 +00:00
Alfonso Acosta
9778be760b Use a global lock instead 2017-06-23 09:18:51 +00:00
Alfonso Acosta
22a31fc5f1 Review feedback 2017-06-22 16:33:36 +00:00
Alfonso Acosta
4139494783 Avoid race conditions in DNSSnooper's cached domains 2017-06-22 12:58:57 +00:00
Alfonso Acosta
62f2c0920f Do not read tcp6 files if TCP version 6 isn't supported 2017-06-15 10:16:14 +00:00
Matthias Radestock
afbc1decab drop addr and port from Endpoint.Latest map
the information is constant and already present in the id, so we can
extract it from there.

That reduces the report size and improves report encoding/decoding
performance. It should reduce memory usage too and improve report
merging performance too.

NB: Probes with this change are incompatible with old apps.
2017-06-10 19:19:56 +01:00
Matthias Radestock
59f777a066 don't read all of /proc when probe.proc.spy=false
Previously we were doing the reading even though we weren't looking at
the result.
2017-06-02 14:01:25 +01:00
Matthias Radestock
b52b2078ca refactor: remove unnecessary conditional
we always have a flowWalker when not using ebpf
2017-05-25 23:04:45 +01:00
Matthias Radestock
b80a51bc39 cosmetic: remove outdated comment
we now do correctly fall back to proc scanning when eBPF fails
2017-05-25 23:04:45 +01:00
Matthias Radestock
a6cc8ece4f simplify connection tracker initialization
- eliminate the code duplication when falling back to procfs scanning
- trim some superfluous comments

Also fix a bug in the process: when falling back to procfs scanning
in ReportConnections, the scanner was given a "--any-nat" param, which
is wrong.
2017-05-25 23:02:19 +01:00
Alban Crequy
d715ccc391 ebpf: handle fd_install events from tcptracer-bpf
Since https://github.com/weaveworks/tcptracer-bpf/pull/39, tcptracer-bpf
can generate "fd_install" events when a process installs a new file
descriptor in its fd table. Those events must be requested explicitly
on a per-pid basis with tracer.AddFdInstallWatcher(pid).

This is useful to know about "accept" events that would otherwise be
missed because kretprobes are not triggered for functions that were
called before the installation of the kretprobe.

This patch finds all the processes that are currently blocked on an
accept() syscall during the EbpfTracker initialization.
feedInitialConnections() will use tracer.AddFdInstallWatcher() to
subscribe to fd_install events. When an fd_install event is received,
synthesise an accept event with the connection tuple and the network
namespace (from /proc).
2017-05-19 14:49:38 +02:00
Alfonso Acosta
dbdb648ada Merge pull request #2527 from weaveworks/2494-track-non-natted-shortlived-conns
Let conntrack track non-NATed short-lived connections
2017-05-19 01:42:02 +02:00
Alfonso Acosta
7497c7d432 Let conntrack track non-NATed short-lived connections 2017-05-16 23:15:16 +00:00
Alban Crequy
9079677873 ebpf tracker: add callback for lost events
Lost events were previously unnoticed. This patch adds an error in the
log and stops the ebpf tracker if an event is lost.
2017-05-10 18:37:32 +02:00
Michael Schubert
1d1f7347ce proc_linux: don't exec getNetNamespacePathSuffix() on every walk 2017-04-19 12:49:04 +02:00
Bryan Boreham
515f4b1a47 Make various anonymous fields named
Anonymous fields make any methods on the inner object visible on the
outer, so they should only be used when the outer is-a inner.
2017-04-01 11:35:10 +00:00
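
The difference 515f4b1a47 refers to can be shown with two small structs. The type names below are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

type store struct{ n int }

func (s *store) Reset() { s.n = 0 }

// cacheEmbedded uses an anonymous field: Reset is promoted, so the
// outer type exposes it as if it were its own method.
type cacheEmbedded struct {
	store
}

// cacheNamed uses a named field: callers must go through inner, and
// the outer type has no Reset method.
type cacheNamed struct {
	inner store
}

func main() {
	e := cacheEmbedded{store{n: 3}}
	e.Reset() // promoted from the embedded store

	c := cacheNamed{inner: store{n: 3}}
	c.inner.Reset() // c.Reset() would not compile

	fmt.Println(e.n, c.inner.n) // 0 0
}
```

Promotion also affects interface satisfaction: *cacheEmbedded implements any interface requiring Reset(), while *cacheNamed does not.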
Michael Schubert
cd25b8b935 endpoint/ebpf: implement stop
Since d60874aca8 `connectionTracker` can
fall back when the `EbpfTracker` has died. Hence we only have to stop the
`tracer` in `stop()`.

This commit is also a fixup for d60874aca8
where we do a gentle fallback but never actually stop the tracer to stop
polling.
2017-03-21 14:42:34 +01:00
Michael Schubert
5572895a2b ebpf_test: tracker set to dead after out of order events 2017-03-17 16:50:25 +01:00