252 Commits

Author SHA1 Message Date
Bryan Boreham
103ea2095f Fix lint warnings in Go code
All cosmetic.
2020-12-30 18:30:34 +00:00
Bryan Boreham
f0e43aa020 fix(probe): wait for ebpf tracker to stop on exit
We should have done this anyway, but I haven't noticed the error
previously. Possibly Go 1.14 is more aggressive about exiting when
some goroutines are still active.
2020-05-10 15:30:28 +00:00
Bryan Boreham
7dc7215a26 Refactor: improve readability based on review feedback 2020-01-23 15:04:51 +00:00
Bryan Boreham
b7b245ed48 tests: connection subset testing
Utility functions to create fake sets of connections for testing, and
then exercising the subset filtering code to check that quantities
come out as expected.
2020-01-13 08:59:51 +00:00
Bryan Boreham
de3c34ddc6 performance(probe): thin out many connections between the same points
The app will only show one line, regardless of how many connections we
have, so reduce the number to save bandwidth and rendering time.

We filter by choosing a modulus, e.g. send every connection that is a
multiple of 3, or 9, and so on. We avoid multiples of 2 because port
numbers are often a multiple of 2 or 4 for bit-encoding reasons.
2020-01-13 08:53:47 +00:00
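The modulus filtering described in de3c34ddc6 might look roughly like the sketch below. It is not the code from the commit: the `conn` type, the choice of the source port as the filtered quantity, and the `pickModulus` and `thin` helpers are assumptions made for illustration. Powers of 3 are used so that the modulus is never a multiple of 2.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a tracked connection; only the
// source port matters for this illustration.
type conn struct {
	srcPort uint16
}

// pickModulus returns the smallest power of 3 that should bring total
// connections down to roughly limit. A power of 3 is never a multiple of 2.
func pickModulus(total, limit int) int {
	m := 1
	for limit > 0 && total/m > limit {
		m *= 3
	}
	return m
}

// thin keeps only the connections whose source port is a multiple of modulus.
func thin(conns []conn, modulus int) []conn {
	var kept []conn
	for _, c := range conns {
		if int(c.srcPort)%modulus == 0 {
			kept = append(kept, c)
		}
	}
	return kept
}

// consecutivePorts builds n fake connections with consecutive source ports.
func consecutivePorts(first uint16, n int) []conn {
	conns := make([]conn, 0, n)
	for i := 0; i < n; i++ {
		conns = append(conns, conn{srcPort: first + uint16(i)})
	}
	return conns
}

func main() {
	conns := consecutivePorts(32768, 900)
	m := pickModulus(len(conns), 100)
	fmt.Println("modulus:", m, "kept:", len(thin(conns, m)))
}
```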
Bryan Boreham
fc46ea17ee refactor(probe/ebpf): track connections by four-tuple+namespace
The previous code tracked only by four-tuple, which meant that two
connections with the same address/port combinations in different
namespaces would clash and one would get dropped.

Also previously the tuple was duplicated between the map key and
value, so we remove it from the value.

We only add the namespace in the case that the local address is
loopback, which matches how the rest of Scope treats addresses.
2020-01-13 08:53:47 +00:00
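A minimal sketch of the keying scheme from fc46ea17ee, assuming a Go map keyed by a comparable struct. The `fourTuple` and `connKey` types, their field names, and `makeKey` are hypothetical, not the identifiers used in Scope. The namespace ID is stored only when the local address is loopback; otherwise it stays zero, so the same tuple seen from different namespaces maps to one entry.

```go
package main

import (
	"fmt"
	"net"
)

// fourTuple holds addresses in binary form so the struct is comparable
// and can be used directly as part of a map key.
type fourTuple struct {
	fromAddr, toAddr [16]byte
	fromPort, toPort uint16
}

// connKey is the hypothetical map key: the four-tuple plus a namespace ID
// that is left at zero unless the local address is loopback.
type connKey struct {
	tuple fourTuple
	netNS uint32
}

// toArray converts an IP to its 16-byte form; an invalid IP yields all zeros.
func toArray(ip net.IP) [16]byte {
	var a [16]byte
	copy(a[:], ip.To16())
	return a
}

// makeKey parses the textual addresses and builds the key.
func makeKey(from, to string, fromPort, toPort uint16, netNS uint32) connKey {
	fromIP, toIP := net.ParseIP(from), net.ParseIP(to)
	k := connKey{tuple: fourTuple{
		fromAddr: toArray(fromIP),
		toAddr:   toArray(toIP),
		fromPort: fromPort,
		toPort:   toPort,
	}}
	if fromIP.IsLoopback() {
		k.netNS = netNS
	}
	return k
}

func main() {
	seen := map[connKey]bool{}
	seen[makeKey("127.0.0.1", "127.0.0.1", 40000, 80, 1)] = true
	seen[makeKey("127.0.0.1", "127.0.0.1", 40000, 80, 2)] = true
	fmt.Println("distinct loopback connections:", len(seen))
}
```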
Bryan Boreham
9758c81736 comment: add explanatory comment on handleFdInstall() 2020-01-13 08:53:47 +00:00
Bryan Boreham
d57a4df3b2 enhancement(probe): debug message for initial connection 2020-01-13 08:53:47 +00:00
Bryan Boreham
2767e2b319 improvement: only record connections that we have a PID for 2020-01-13 08:53:47 +00:00
Bryan Boreham
a6da810261 refactor(probe): move host/pid encoding into addConnection() function 2020-01-13 08:53:47 +00:00
Bryan Boreham
b9f10e9b73 refactor(probe/ebpf): make ebpf setup safer
It was possible for `t.ebpfTracker` to change underneath this code
while running on a background goroutine, so change it to take
`ebpfTracker` as a parameter.

While we're here, rename the functions to better match what they do.
2019-10-14 11:25:04 +00:00
Bryan Boreham
ae83c6545e fix(probe/ebpf): feed initial connections synchronously on restart
If we run `getInitialState()` async there is some chance we will see
another ebpf failure and call `useProcfs()` before `getInitialState()`
gets to the last line, whereupon it will crash on nil pointer.

Also it seems pointless to call `performEbpfTrack()` without waiting
for something to feed in, so I suspect this is what the original
author had in mind.

It will slow down this one `Report()` on machines with a lot of
processes or connections, but ebpfTracker restart is supposed to be a
rare event.
2019-10-14 08:22:25 +00:00
Bryan Boreham
23d8a418e1 performance: network namespace ID is a 32-bit quantity
This shrinks some data-structures slightly.

Citation: https://github.com/torvalds/linux/blob/6f0d349d922b/include/linux/ns_common.h#L10
2019-10-04 13:11:30 +00:00
Bryan Boreham
c6ce47f87d diagnostics: make fourTuple.String() human-readable 2019-10-04 13:09:50 +00:00
Bryan Boreham
2941850a75 performance: in connection tracker, hold IP addresses in binary rather than strings
This is more compact, and saves effort converting to and from the string format.
2019-09-25 20:15:05 +00:00
Bryan Boreham
9208e08bf3 Change dns snooper timeout to avoid spinning in pcap
The previous code seems to have relied on a 64-bit to 32-bit conversion
working in a certain way; once gopacket was changed to cast the value
explicitly, it started returning immediately from pcap.
2019-09-20 14:32:53 +00:00
Bryan Boreham
5cba126c12 Merge pull request #3600 from weaveworks/expose-probe-metrics
Expose probe metrics to Prometheus
2019-08-20 14:35:06 +01:00
Bryan Boreham
eba9f31f3f fix(probe): restart conntrack handler periodically to clear out data
We observe a slow increase in connections reported, and are unable to
find the root cause, so clear down the data every six hours and start
from a clean sheet.
2019-08-13 16:30:56 +00:00
Bryan Boreham
6e715d2697 fix(probe): Loosen ebpf parameters to reduce restarts
Delay kernel events by up to 0.2ms, to reduce the chance the ebpf
reporter sends them out-of-order, and allow out-of-order events to
happen up to once a minute without giving up on the ebpf reporter.
2019-08-13 16:17:23 +00:00
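The "at most one out-of-order event per minute" tolerance from 6e715d2697 could be expressed along these lines. This is a sketch under assumptions, not the commit's implementation: the `orderChecker` type is hypothetical, and it measures the one-minute window in event timestamps (nanoseconds) rather than wall-clock time to keep the example deterministic.

```go
package main

import (
	"fmt"
	"time"
)

// orderChecker is a hypothetical helper that tolerates occasional
// out-of-order event timestamps.
type orderChecker struct {
	newest        uint64 // newest event timestamp seen so far, in ns
	lastViolation uint64 // value of newest when the last out-of-order event arrived
	violated      bool   // whether any out-of-order event has been seen yet
	window        uint64 // minimum spacing between tolerated violations, in ns
}

// observe records one event timestamp. It returns false when the caller
// should give up: a second out-of-order event arrived within the window.
func (c *orderChecker) observe(ts uint64) bool {
	if ts >= c.newest {
		c.newest = ts
		return true
	}
	tolerated := !c.violated || c.newest-c.lastViolation >= c.window
	c.violated = true
	c.lastViolation = c.newest
	return tolerated
}

func main() {
	sec := uint64(time.Second)
	c := &orderChecker{window: uint64(time.Minute)}
	for _, ts := range []uint64{10 * sec, 9 * sec, 11 * sec, 10 * sec} {
		fmt.Println("timestamp", ts, "keep going:", c.observe(ts))
	}
}
```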
Bryan Boreham
5e57b0dbcf Add metrics for conntrack and ebpf errors 2019-07-09 13:01:47 +00:00
Bryan Boreham
161014857d Merge pull request #3646 from weaveworks/remove-useless-metric
fix: remove unused metric SpyDuration
2019-07-04 17:37:16 +01:00
Bryan Boreham
c1e110ac7b fix: remove unused metric SpyDuration
The call to register this metric was removed in #633 over three years ago.
If it isn't registered then nobody can see the values.
The measurement is duplicated by metrics added in #658.
2019-07-04 13:53:00 +00:00
Bryan Boreham
1e6a6a7bb4 fix: handle errors reported by the conntrack package
In particular, ENOBUFS, which indicates that data has been dropped.
With this change the collector will restart, thus resynchronising with
the OS.
2019-07-04 13:30:46 +00:00
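The restart-on-error behaviour described in 1e6a6a7bb4 can be illustrated with a small supervising loop. This is a hedged sketch only: `supervise` and `fakeCollector` are hypothetical names, and the real conntrack package reports errors through its own API, which is not shown here. The only library facts relied on are `errors.Is` and `syscall.ENOBUFS`.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// fakeCollector returns a hypothetical collect function that fails with a
// wrapped ENOBUFS for the first `failures` calls and then succeeds.
func fakeCollector(failures int) func() error {
	return func() error {
		if failures > 0 {
			failures--
			return fmt.Errorf("reading conntrack events: %w", syscall.ENOBUFS)
		}
		return nil
	}
}

// supervise runs collect again after every error, up to maxRuns attempts.
// It returns how many runs failed and how many of those failures were
// ENOBUFS, which means the kernel dropped events.
func supervise(collect func() error, maxRuns int) (failures, dropped int) {
	for run := 0; run < maxRuns; run++ {
		err := collect()
		if err == nil {
			break
		}
		if errors.Is(err, syscall.ENOBUFS) {
			dropped++ // state is stale; a fresh run re-reads it from the OS
		}
		failures++
	}
	return failures, dropped
}

func main() {
	failures, dropped := supervise(fakeCollector(2), 10)
	fmt.Println("failed runs:", failures, "with dropped data:", dropped)
}
```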
tiriplicamihai
0b4e26ed77 Fix formatting. 2019-04-30 19:24:23 +03:00
tiriplicamihai
92d3c1d5e9 Fix test data. 2019-04-30 19:09:02 +03:00
tiriplicamihai
1fbe648e82 Add newline. 2019-04-30 18:41:16 +03:00
tiriplicamihai
364a7423a5 Add tests for dns snooper. 2019-04-30 18:38:07 +03:00
tiriplicamihai
a6bc6b0148 Use break instead of continue. 2019-01-28 23:06:18 +02:00
tiriplicamihai
1aadf5c606 Fix dnssnooper probe for multiple CNAMEs. 2019-01-28 13:49:16 +02:00
CarlosEDP
4f8fc5e010 Add ARM64 build 2019-01-02 12:08:49 -02:00
Bryan Boreham
c732fee433 Don't add closed connections to 'activeFlows' 2018-11-14 15:34:58 +00:00
Bryan Boreham
95ce2cb1a8 Add build constraint on Linux-only features
Split Reporter into Linux and non-Linux parts, and stubbed it out for
non-Linux targets.
2018-11-14 15:34:58 +00:00
Bryan Boreham
01ef6a104d Eliminate connectionTrackerConfig struct 2018-11-14 15:34:58 +00:00
Bryan Boreham
e3d42676a3 Add back some parts of the original cli code 2018-11-14 15:34:58 +00:00
Bryan Boreham
71c59e87d1 Update comment 2018-11-14 15:34:58 +00:00
Bryan Boreham
f4dc368955 Don't buffer TIME_WAIT flows on conntrack start-up
When the probe first starts we should only be interested in active
connections, and if the loop re-starts it's probably because too many
connections are opening and closing to keep up with, so it's good to
drop any that are already closed then too.

Refactor the code so `handleFlow` is only called on events, and handle
the initial list of connections directly.
2018-11-14 15:34:58 +00:00
Bryan Boreham
c627802664 Refactor: remove some code that is now unnecessary
- don't need another wrapper round `conntrack.Connections()`
- logPipe() was only for the command-line conntrack
- nobody closes the `event` chan now, so no need to pre-check for quit
2018-11-14 15:34:58 +00:00
Bryan Boreham
a29e9fa27a Update to match upstream conntrack library 2018-11-14 15:34:57 +00:00
Bryan Boreham
b9405bcc4b Remove our own copy of the upstream library 2018-11-14 15:34:57 +00:00
Bryan Boreham
73f35fd6d9 Handle nat status from conntrack via netlink
Replacement for the --any-nat command-line parameter
2018-11-14 15:34:57 +00:00
Bryan Boreham
ed6a010330 Decode conntrack status from netlink 2018-11-14 15:34:57 +00:00
Bryan Boreham
3314e1f0c7 Move constants to headers.go to be more like upstream 2018-11-14 15:34:57 +00:00
Bryan Boreham
7a68b5bdb0 Use Nfgenmsg from unix package instead of declaring locally 2018-11-14 15:34:57 +00:00
Bryan Boreham
8b04ef7359 Move conntrack code out to client.go to match upstream 2018-11-14 15:34:57 +00:00
Joseph Glanville
ac63937df7 Switch to new conntrack library 2018-11-14 15:34:57 +00:00
Joseph Glanville
853196f6d1 Import conntrack library 2018-11-14 15:34:57 +00:00
meghalidhoble
625998b91e Change made to the listed files, to enable weaveworks-scope on Power(ppc64le)
1) backend/Dockerfile 2) probe/endpoint/dns_snooper.go
3) client/Dockerfile 4) docker/Dockerfile.cloud-agent
5) probe/process/walker_linux_test.go 6) tools/lint

1) 'backend/Dockerfile': Added a conditional so that cross-compiling is
   done on amd64. Also removed support for sh-lint on ppc64le for now,
   because the version of shfmt named in the Dockerfile is not available
   for ppc64le and the later version doesn't work well with the existing
   application.
2) 'probe/endpoint/dns_snooper.go': Renamed this file so it can be reused
   for ppc64le, and added a build constraint. The file is now built for
   amd64 on Linux and ppc64le on Linux.
3) 'client/Dockerfile': Changed the node base image version from 8.4.0 to
   8.11, as this version supports multiarch.
4) 'docker/Dockerfile.cloud-agent': Changed the golang base image version
   from 1.10.2-stretch to 1.10.2, which supports multiarch.
5) 'probe/process/walker_linux_test.go': Fixed the test to run on ppc64le
   by computing the expected RSSBytes from the architecture's pageSize
   value instead of hard-coded values.
6) 'tools/lint': Modified the file to skip the sh-lint implementation on ppc64le.

PR #3231
2018-08-13 12:45:25 +05:30
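Item 5 above depends on the page size differing between architectures. A minimal sketch of deriving an expected RSS in bytes from a page count, assuming the count comes from a source that reports pages (as `/proc/<pid>/stat` does); the `rssBytes` helper is hypothetical and is not the test code from the PR.

```go
package main

import (
	"fmt"
	"os"
)

// rssBytes converts a resident-set size expressed in pages into bytes,
// using the page size of the machine the program runs on rather than a
// hard-coded 4096.
func rssBytes(pages uint64) uint64 {
	return pages * uint64(os.Getpagesize())
}

func main() {
	fmt.Println("page size:", os.Getpagesize(), "bytes; 2 pages =", rssBytes(2), "bytes")
}
```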
Marc Carré
d46c2266ce Change Sirupsen/logrus to sirupsen/logrus
```
$ git grep -l Sirupsen | grep -v vendor | xargs sed -i '' 's:github.com/Sirupsen/logrus:github.com/sirupsen/logrus:g'
$ gofmt -s -w app
$ gofmt -s -w common
$ gofmt -s -w probe
$ gofmt -s -w prog
$ gofmt -s -w tools
```
2018-07-23 20:10:14 +02:00
Michael Schubert
7bb1e38de3 ebpf: update check for known faulty Ubuntu kernels
With c75700fe04 we added code to detect
Ubuntu Xenial kernels with a regression in the eBPF subsystem in order
to gracefully fall back to procfs scanning on such systems (and not crash
the host system by running eBPF code).

With the latest kernel update for Ubuntu Xenial, the bug was fixed:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1763454

Therefore we can update the added check with an upper limit and make
sure that eBPF connection tracking is disabled only on kernels within
the range that has the bug.

xref: https://github.com/weaveworks/scope/issues/3131
2018-05-23 11:38:04 +02:00
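The bounded check described in 7bb1e38de3 amounts to a half-open version range test. The sketch below is an illustration, not the commit's code: the `version` type and `inFaultyRange` helper are hypothetical, and the two bounds are placeholder values, since the message above does not state the real ones.

```go
package main

import "fmt"

// version is a hypothetical parsed kernel version such as 4.4.0-120,
// where abi is the Ubuntu ABI number after the dash.
type version struct {
	major, minor, patch, abi int
}

// less reports whether v sorts strictly before o, comparing field by field.
func (v version) less(o version) bool {
	a := [4]int{v.major, v.minor, v.patch, v.abi}
	b := [4]int{o.major, o.minor, o.patch, o.abi}
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// inFaultyRange reports whether v lies in [firstBad, firstFixed), that is,
// at or after the first broken kernel and before the first fixed one.
func inFaultyRange(v, firstBad, firstFixed version) bool {
	return !v.less(firstBad) && v.less(firstFixed)
}

func main() {
	// Placeholder bounds for illustration only.
	firstBad := version{4, 4, 0, 119}
	firstFixed := version{4, 4, 0, 127}
	for _, v := range []version{{4, 4, 0, 118}, {4, 4, 0, 120}, {4, 4, 0, 127}} {
		fmt.Printf("%+v faulty: %v\n", v, inFaultyRange(v, firstBad, firstFixed))
	}
}
```

With a check of this shape, eBPF tracking stays enabled both on kernels older than the regression and on kernels that already carry the fix.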
Michael Schubert
5d036c5ac4 ebpf: add tests for isKernelSupported() 2018-04-13 17:17:51 +02:00