The previous code seems to have relied on a 64-bit to 32-bit conversion
working in a certain way; when gopacket was changed to cast the value
explicitly, it started returning immediately from pcap.
We observe a slow increase in connections reported, and are unable to
find the root cause, so clear down the data every six hours and start
from a clean sheet.
Delay kernel events by up to 0.2ms, to reduce the chance that the eBPF
reporter sends them out of order, and allow out-of-order events to
happen up to once a minute without giving up on the eBPF reporter.
The call to register this metric was removed in #633 over three years ago.
If it isn't registered then nobody can see the values.
The measurement is duplicated by metrics added in #658.
When the probe first starts we should only be interested in active
connections, and if the loop re-starts it's probably because too many
connections are opening and closing to keep up with, so it's good to
drop any that are already closed then too.
Refactor the code so `handleFlow` is only called on events, and handle
the initial list of connections directly.
- don't need another wrapper round `conntrack.Connections()`
- logPipe() was only for the command-line conntrack
- nobody closes the `event` chan now, so no need to pre-check for quit
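
Handling the initial list directly could look roughly like the sketch
below. The `flow` type is a simplified stand-in, and treating `TIME_WAIT`
and `CLOSE` as the closed states is an assumption made for illustration:

```go
package main

import "fmt"

// flow is a hypothetical, simplified stand-in for a conntrack entry.
type flow struct {
	ID    int
	State string
}

// isClosed assumes TIME_WAIT and CLOSE mark finished TCP connections.
func isClosed(f flow) bool {
	return f.State == "TIME_WAIT" || f.State == "CLOSE"
}

// activeOnly returns the flows from an initial listing that are still open,
// dropping any that are already closed.
func activeOnly(initial []flow) []flow {
	var out []flow
	for _, f := range initial {
		if !isClosed(f) {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	initial := []flow{{1, "ESTABLISHED"}, {2, "TIME_WAIT"}, {3, "CLOSE"}, {4, "SYN_SENT"}}
	for _, f := range activeOnly(initial) {
		fmt.Println(f.ID, f.State)
	}
}
```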
1) backend/Dockerfile 2) probe/endpoint/dns_snooper.go
3) client/Dockerfile 4) docker/Dockerfile.cloud-agent
5) probe/process/walker_linux_test.go 6) tools/lint
1) 'backend/Dockerfile': Added a conditional so that cross-compiling is only
done on amd64. Also removed sh-lint support for ppc64le for now, because the
shfmt version named in the Dockerfile is not available for ppc64le and the
later version doesn't work with the existing application.
2) 'probe/endpoint/dns_snooper.go': Renamed this file so it can be reused for
ppc64le, and added a build constraint. The file is now built for linux/amd64
and linux/ppc64le.
3) 'client/Dockerfile': Changed the node base image version from 8.4.0 to
8.11, as this version supports multiarch.
4) 'docker/Dockerfile.cloud-agent': Changed the golang base image version
from 1.10.2-stretch to 1.10.2, which supports multiarch.
5) 'probe/process/walker_linux_test.go': Fixed the test to run on ppc64le by
computing the expected RSSBytes from the architecture's pageSize instead of
using hard-coded values.
6) 'tools/lint': Modified the file to skip sh-lint on ppc64le.
PR #3231
With c75700fe04 we added code to detect
Ubuntu Xenial kernels with a regression in the eBPF subsystem, in order
to fall back gracefully to procfs scanning on such systems (and not crash
the host system by running eBPF code).
With the latest kernel update for Ubuntu Xenial, the bug was fixed:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1763454
Therefore we can add an upper limit to that check and make
sure that eBPF connection tracking is disabled only on kernels within
the range that has the bug.
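
A sketch of such a bounded check, assuming the ABI number has already been
parsed from the kernel release string. The lower bound 119 comes from the
affected 4.4.0-119 kernel; the upper bound is passed in as a parameter
because the first fixed ABI number is not stated here, and
`abiInBuggyRange` is a hypothetical name:

```go
package main

import "fmt"

// firstBadABI is the first affected Ubuntu Xenial ABI number (4.4.0-119).
const firstBadABI = 119

// abiInBuggyRange reports whether a 4.4.0 Ubuntu kernel with the given ABI
// number falls in the half-open range [firstBadABI, firstFixedABI).
func abiInBuggyRange(abi, firstFixedABI int) bool {
	return abi >= firstBadABI && abi < firstFixedABI
}

func main() {
	// Placeholder only: NOT the real first fixed ABI number.
	assumedFixedABI := 130

	for _, abi := range []int{116, 119, 129, 130} {
		fmt.Println(abi, abiInBuggyRange(abi, assumedFixedABI))
	}
}
```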
xref: https://github.com/weaveworks/scope/issues/3131
The Ubuntu Xenial update from kernel 4.4.0-116.140 to 4.4.0-119.143
introduced a regression in the eBPF code. A basic `bpf_map_lookup_elem`
call, as found in the tcptracer-bpf library used by Scope, leads to a
kernel panic. As a result, Scope and the system crash during startup
when the tcptracer is initialized. The Scope bug report can be found
here:
https://github.com/weaveworks/scope/issues/3131
To avoid crashes and fall back gracefully to procfs (as Scope already does
for systems not supporting eBPF), update `isKernelSupported()` and
explicitly check for Ubuntu kernel versions with the problem.
Once the bug is fixed and an update published, the `abiNumber` check in
`isKernelSupported()` can and should be updated with an upper limit.
The Ubuntu bug report can be found here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1763454
It is unused and none of the adjacency mapping code in the renderer
takes any notice of it. Removing this shrinks the report size.
Edges were introduced in #838. At the time we had an experimental
packet sniffer under experimental/sniff/sniffer.go. That got removed
in #1646.
We can resurrect this if we ever decide to add metadata to edges.
Use Utsname from golang.org/x/sys/unix, whose members are byte arrays
instead of int8/uint8 arrays. This simplifies the string
conversions of these members, and the marshal.FromUtsname functions are
no longer needed.
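
To show why byte arrays make the conversion simple, here is a standalone
sketch. `utsField` is a local stand-in for a Utsname member so that the
example needs no external module, and `fieldToString` is a hypothetical
helper:

```go
package main

import (
	"bytes"
	"fmt"
)

// utsField mirrors the shape of a Utsname member: a fixed-size,
// NUL-terminated byte array (65 bytes on Linux).
type utsField [65]byte

// fieldToString returns the bytes up to the first NUL as a string.
// With byte members no per-element int8 conversion is needed.
func fieldToString(f utsField) string {
	if i := bytes.IndexByte(f[:], 0); i >= 0 {
		return string(f[:i])
	}
	return string(f[:])
}

func main() {
	var release utsField
	copy(release[:], "4.4.0-119-generic")
	fmt.Println(fieldToString(release))
}
```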
EbpfTracker can die when TCP events are received out of order. This
can happen with a buggy kernel or apparently in other cases, see:
https://github.com/weaveworks/scope/issues/2650
As a workaround, restart EbpfTracker when an event is received out of
order. This does not seem to happen often, but as a precaution,
EbpfTracker will not restart if the last failure was less than 5 minutes
ago.
This is not easy to test but I added instrumentation to trigger a
restart:
- Start Scope with:
$ sudo WEAVESCOPE_DOCKER_ARGS="-e SCOPE_DEBUG_BPF=1" ./scope launch
- Request a stop with:
$ echo stop | sudo tee /proc/$(pidof scope-probe)/root/var/run/scope/debug-bpf