Commit Graph

688 Commits

Author SHA1 Message Date
Matthias Radestock
e603a28ca4 Merge pull request #2704 from weaveworks/2689-2700-ebpf-init
don't miss, or fail to forget, initial connections

Fixes #2689.
Fixes #2700.
2017-07-13 11:39:31 +01:00
Matthias Radestock
b087e95711 bump tcptracer-bpf version 2017-07-12 07:27:35 +01:00
Matthias Radestock
ebc3cddf01 don't miss, or fail to forget, initial connections
...when initialising eBPF-based connection tracking.

Previously we were ignoring all eBPF events until we had gathered the
existing connections. That means we could a) miss connections created
during the gathering, and b) fail to forget connections that got
closed during the gathering.

The fix comprises the following changes:

1. pay attention to eBPF events immediately. That way we do not
miss anything.

2. remember connections for which we received a Close event during the
initialisation phase, and subsequently drop gathered existing
connections that match these. That way we do not erroneously consider
a gathered connection as open when it got closed since the gathering.

3. drop gathered existing connections which match connections detected
through eBPF events. The latter typically have more, and more current,
metadata. In particular, PIDs can be missing from the former.

Fixes #2689.
Fixes #2700.
2017-07-11 22:50:47 +01:00
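
Steps 2 and 3 of the fix above amount to filtering the gathered connections against two sets. The following is a minimal sketch of that filtering, not the code from the commit: the `fourTuple` fields, the `reconcile` function, and the two set parameters are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// fourTuple identifies a TCP connection. The field layout is an assumption
// made for this sketch.
type fourTuple struct {
	fromAddr, toAddr string
	fromPort, toPort uint16
}

// reconcile returns the gathered connections that should still be treated as
// open. It drops any connection that was closed during initialisation
// (step 2) and any connection already reported through eBPF events (step 3).
func reconcile(gathered []fourTuple, closedDuringInit, seenViaEbpf map[fourTuple]struct{}) []fourTuple {
	var kept []fourTuple
	for _, c := range gathered {
		if _, closed := closedDuringInit[c]; closed {
			continue // closed since the gathering; do not report it as open
		}
		if _, seen := seenViaEbpf[c]; seen {
			continue // the eBPF-detected copy carries better metadata
		}
		kept = append(kept, c)
	}
	return kept
}

func main() {
	a := fourTuple{"10.0.0.1", "10.0.0.2", 40000, 80}
	b := fourTuple{"10.0.0.1", "10.0.0.3", 40001, 443}
	c := fourTuple{"10.0.0.1", "10.0.0.4", 40002, 22}

	closed := map[fourTuple]struct{}{b: {}}
	seen := map[fourTuple]struct{}{c: {}}

	fmt.Println(reconcile([]fourTuple{a, b, c}, closed, seen))
}
```

Because eBPF events are processed from the start (step 1), both sets are already populated by the time the gathered connections arrive, so the filter can run in a single pass.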
Matthias Radestock
d568c50ec4 make EbpfTracker.dead go-routine-safe and .stop() idempotent
Without synchronisation, the isDead() call might return a stale value,
delaying deadness detection potentially indefinitely.

Without the guards / idempotence in .stop(), invoking stop() more than
once could cause a panic, since tracer.Stop() closes a channel, and
closing an already-closed channel panics. Multiple stop() invocations
are rare, but not impossible.
2017-07-11 19:38:07 +01:00
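
A minimal sketch of the pattern this commit describes, assuming a mutex-guarded flag. The `tracker` type and its `done` channel are hypothetical stand-ins for EbpfTracker and for the channel closed by the underlying tracer; the real code may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// tracker is a hypothetical stand-in for a tracker with a "dead" flag.
type tracker struct {
	mu   sync.Mutex
	dead bool
	done chan struct{} // stands in for the channel the real tracer closes
}

func newTracker() *tracker {
	return &tracker{done: make(chan struct{})}
}

// isDead reads the flag under the lock, so a goroutine never sees a stale value.
func (t *tracker) isDead() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.dead
}

// stop is idempotent: only the first call closes the channel, so repeated
// calls cannot trigger a "close of closed channel" panic.
func (t *tracker) stop() {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.dead {
		return
	}
	t.dead = true
	close(t.done)
}

func main() {
	t := newTracker()
	t.stop()
	t.stop() // safe: the second call is a no-op
	fmt.Println(t.isDead())
}
```

A `sync.Once` would also make stop() idempotent, but a mutex additionally gives isDead() a synchronised read of the same flag.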
Matthias Radestock
cf6353327a eliminate race in ebpf initialization
We were enabling event processing before feeding in the initial
connections, which results in a non-deterministic outcome.
2017-07-11 19:38:07 +01:00
Matthias Radestock
15215d0c2c prevent concurrent map access in ebpf fd install event handler
which presumably could cause havoc
2017-07-11 19:38:07 +01:00
Matthias Radestock
3883d8f1af fix a minor leak in ebpf fdinstall_pids table
when we got an fd install event but the pid was dead by the time we
processed it, we would fail to remove the watcher for that pid from
the fdinstall_pids table.

This is a minor, and bounded, leak, since the table only contains pids
that were alive when we initialized ebpf. And this change only plugs
that leak very partially, since we will never remove pids that die
while sitting in accept().
2017-07-11 19:38:07 +01:00
Matthias Radestock
e2cbe7ac26 refactor: a bit of inlining 2017-07-11 19:38:06 +01:00
Matthias Radestock
3baeb3d238 refactor: use fourTuple as map key instead of string 2017-07-11 19:38:06 +01:00
Matthias Radestock
ad7b5cdc19 refactor: remove pointless interface
premature abstraction
2017-07-11 19:38:06 +01:00
Matthias Radestock
8a56540648 refactor: eliminate global var 2017-07-11 19:38:06 +01:00
Matthias Radestock
8bd0188537 respect UseConntrack setting in ebpf initialisation 2017-07-11 19:37:11 +01:00
Matthias Radestock
7ea0800f8b refactor: extract helper to get initial flows 2017-07-10 07:34:20 +01:00
Matthias Radestock
07e7adbd63 refactor: make performFlowWalk data flow more obvious 2017-07-10 07:22:12 +01:00
Matthias Radestock
19e45ec248 refactor: eliminate global var 2017-07-07 10:18:43 +01:00
Matthias Radestock
8cf79b2e4a bump tcptracer-bpf version and use it to fix race
We defer starting the ebpf tracer until we've set the global var which
is referenced by the callback functions. Previously the var could be
unset when the callbacks are invoked, resulting in a segfault.

Fixes #2687.
2017-07-07 06:56:28 +01:00
Matthias Radestock
f0ae2bd98c refactor: use inline StringSet constructor 2017-07-04 06:29:19 +01:00
Alfonso Acosta
6c03540b1f Merge pull request #2659 from weaveworks/use-new-k8s-go-client
Use new k8s go client
2017-07-03 23:23:41 +02:00
Alfonso Acosta
84afe9fe70 Fix typo 2017-07-03 20:20:28 +00:00
Alfonso Acosta
34bfc22b4f Fix tests 2017-07-03 20:20:28 +00:00
Alfonso Acosta
7d59936d8c HostNetwork is now inlined in the pod spec 2017-07-03 20:20:28 +00:00
Alfonso Acosta
8bbbf25809 Migrate probe to new kubernetes go-client
This namely involved importing new libraries and using the new Clientset.

Changes worth mentioning:

* The new kubernetes library doesn't provide StoreToLister wrappers, so now I am doing the casting directly.
* Deleting the pods and getting their logs is done in a cleaner way (using the
  Clientset instead of the lower-level RESTclient).
2017-07-03 20:20:27 +00:00
Matthias Radestock
d12603c516 tiny refactor: use inline string set constructor in test 2017-07-03 07:55:17 +01:00
Matthias Radestock
430e74a80a refactor: remove report latest map Delete()
It wasn't used, and is problematic in any case since it introduces
non-monotonicity.
2017-07-03 02:06:21 +01:00
Matthias Radestock
9dc50b5202 refactor: hide "empty set" constants
They are an implementation detail.
2017-07-03 01:26:22 +01:00
Roland Schilter
e6bc1d6ec2 Merge pull request #2649 from weaveworks/1975-honor-docker-host-in-probe
Honor DOCKER_* env variables in probe and app
2017-06-28 09:07:52 +02:00
Mike Lang
889972c48a Display node type on k8s controller nodes
Since there are multiple types in the same topology, displaying the type is important.
We do this in multiple places:

* Add node type to minor label

* Add node type as metadata and include in metadata template.
  Even though this will always be the same for every node of that topology, this was
  the easiest way to add it so it displays in the table view.
  Note we can't control ordering of columns in table view; it's always alphabetical.
2017-06-27 10:19:04 -07:00
Roland Schilter
651e52b5a5 Honor DOCKER_* env variables in probe and app
Changed default for flag `-app.docker` to use the DOCKER_* env variables
instead of the hardcoded /var/run/docker.sock; uses docker's default if
no DOCKER_HOST is defined, for both probe and app.

Fixes #1975
2017-06-27 17:14:49 +02:00
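
The fallback behaviour described above can be sketched as follows. This is an illustration only: `dockerEndpoint` and `defaultDockerEndpoint` are hypothetical names, only DOCKER_HOST is handled here, and the actual change presumably delegates to the Docker client library's own environment handling.

```go
package main

import (
	"fmt"
	"os"
)

// defaultDockerEndpoint is Docker's conventional local socket; it is used
// only when DOCKER_HOST is not set.
const defaultDockerEndpoint = "unix:///var/run/docker.sock"

// dockerEndpoint picks the endpoint from the environment. The lookup
// function is passed in so the logic can be exercised without touching the
// real process environment.
func dockerEndpoint(getenv func(string) string) string {
	if host := getenv("DOCKER_HOST"); host != "" {
		return host
	}
	return defaultDockerEndpoint
}

func main() {
	fmt.Println(dockerEndpoint(os.Getenv))
}
```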
Matthias Radestock
286e481771 Merge pull request #2645 from weaveworks/2644-initial-ebpf-polarity
correct polarity of initial connections

Fixes #2644.
2017-06-26 09:10:46 +01:00
Matthias Radestock
b43003fd2b refactor: remove superfluous pointering 2017-06-25 11:25:51 +01:00
Matthias Radestock
bd6cdc44a8 refactor: extract some common code 2017-06-25 11:22:32 +01:00
Matthias Radestock
181a548122 correct polarity of initial connections
Fixes #2644
2017-06-25 11:08:24 +01:00
Matthias Radestock
30e0444914 ensure connections from /proc/net/tcp{,6} get the right pid
ProcNet.Next does not allocate Connection structs, for efficiency.
Instead it always returns a *Connection pointing to the same instance.
As a result, any mutations by the caller to struct elements that
aren't actually set by ProcNet.Next, in particular Connection.Proc,
are carried across to subsequent calls.

This had hilarious consequences: connections referencing an inode
which we hadn't come across during proc walking would be associated
with the process corresponding to the last successfully looked up
inode.

The fix is to clear out the garbage left over from previous calls.

Fixes #2638.
2017-06-25 10:59:58 +01:00
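
The bug class described above, an iterator that hands out a pointer to one reused instance, can be reproduced in a few lines. The sketch below uses hypothetical types (`reader`, `connection`, `proc`) rather than the real ProcNet code, and it clears the stale field inside the iterator; where the actual fix does the clearing is an assumption.

```go
package main

import "fmt"

type proc struct{ pid int }

type connection struct {
	inode uint64
	proc  proc // filled in by the caller, not by the iterator
}

// reader mimics an iterator that reuses a single connection value across
// calls to avoid allocating.
type reader struct {
	inodes []uint64
	pos    int
	conn   connection
}

// next returns a pointer to the same instance every time, or nil when done.
func (r *reader) next() *connection {
	if r.pos >= len(r.inodes) {
		return nil
	}
	r.conn.inode = r.inodes[r.pos]
	// Clear whatever the caller wrote during the previous iteration. Without
	// this line, a connection with an unknown inode would inherit the pid of
	// the last successfully looked-up one.
	r.conn.proc = proc{}
	r.pos++
	return &r.conn
}

// pidsByInode walks the reader and attaches pids where the inode is known.
func pidsByInode(r *reader, lookup map[uint64]int) map[uint64]int {
	out := map[uint64]int{}
	for c := r.next(); c != nil; c = r.next() {
		if pid, ok := lookup[c.inode]; ok {
			c.proc.pid = pid
		}
		out[c.inode] = c.proc.pid
	}
	return out
}

func main() {
	r := &reader{inodes: []uint64{1, 2}}
	fmt.Println(pidsByInode(r, map[uint64]int{1: 42}))
}
```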
Alfonso Acosta
4006040cc1 Review feedback 2017-06-23 23:52:30 +00:00
Alfonso Acosta
9778be760b Use a global lock instead 2017-06-23 09:18:51 +00:00
Alfonso Acosta
22a31fc5f1 Review feedback 2017-06-22 16:33:36 +00:00
Alfonso Acosta
4139494783 Avoid race conditions in DNSSnooper's cached domains 2017-06-22 12:58:57 +00:00
Matthias Radestock
a7cfd043fc fix fmt string error in test, found by linter 2017-06-21 21:56:34 +01:00
Matthias Radestock
4a54b75419 forgot this one in #2622 2017-06-20 20:43:26 +01:00
Matthias Radestock
4e0065a57d refactor: put all network detection code in one place 2017-06-20 09:23:52 +01:00
Matthias Radestock
19a6551de2 ignore local IPv6 addresses/networks
There is no point in paying attention to them since scope connection
tracking only deals in IPv4.
2017-06-20 09:04:08 +01:00
Alfonso Acosta
43c5ed2aaf Merge pull request #2554 from weaveworks/never-localhost
Use 127.0.0.1 instead of localhost, more
2017-06-19 22:40:37 +02:00
Alfonso Acosta
62f2c0920f Do not read tcp6 files if TCP version 6 isn't supported 2017-06-15 10:16:14 +00:00
Matthias Radestock
afbc1decab drop addr and port from Endpoint.Latest map
the information is constant and already present in the id, so we can
extract it from there.

That reduces the report size and improves report encoding/decoding
performance. It should also reduce memory usage and improve report
merging performance.

NB: Probes with this change are incompatible with old apps.
2017-06-10 19:19:56 +01:00
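
Recovering the address and port from the id, instead of storing them again in the Latest map, might look like the sketch below. The `scope;addr;port` id layout and the `parseEndpointID` helper are assumptions made for illustration; the real id format is not shown in this log.

```go
package main

import (
	"fmt"
	"strings"
)

// parseEndpointID splits a hypothetical "scope;addr;port" endpoint id and
// returns the address and port. ok is false if the id is malformed.
func parseEndpointID(id string) (addr, port string, ok bool) {
	fields := strings.Split(id, ";")
	if len(fields) != 3 {
		return "", "", false
	}
	return fields[1], fields[2], true
}

func main() {
	addr, port, ok := parseEndpointID("host1;10.0.0.1;80")
	fmt.Println(addr, port, ok)
}
```

Deriving constant values from the key trades a small parsing cost at read time for a smaller report, which matches the size and encoding/decoding gains the commit message cites.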
Matthias Radestock
c8f97878d2 re-target app clients when name resolution changes
Fixes #2578.
2017-06-09 12:30:26 +01:00
Matthias Radestock
fb735b65c4 cosmetic: correct comment 2017-06-09 11:31:20 +01:00
Matthias Radestock
d0b40ee4b9 correct type for "Observed Gen."
It's a number. This enables numeric sorting of Observed Gen in the
table mode of the Deployment and Replicaset views.
2017-06-08 04:27:10 +01:00
Roland Schilter
56cb02675b Back off upon errored kubernetes api requests (#2562)
closes #1009
2017-06-06 16:19:41 +02:00
Bryan Boreham
1898b67e1f Use 127.0.0.1 instead of localhost in case that name resolves to something else 2017-06-05 10:31:27 +00:00
Matthias Radestock
59f777a066 don't read all of /proc when probe.proc.spy=false
Previously we were doing the reading even though we weren't looking at
the result.
2017-06-02 14:01:25 +01:00