778 Commits

Author SHA1 Message Date
Bryan Boreham
20ce708db9 Don't bother deduplicating IPs; they end up in a set 2018-06-04 16:41:24 +00:00
Bryan Boreham
c6c51f36f7 Limit network namespace code to compile on Linux only 2018-06-04 10:54:02 +00:00
Bryan Boreham
ade54ba84e probe: stop calling 'weave ps'
Now that we enter the container namespace to fetch IPs for every
container, there is no need to have 'weave ps' do it.

This does mean we lose Weave MAC addresses, but that is a rather
idiosyncratic feature anyway.
2018-06-02 22:22:08 +00:00
Bryan Boreham
ff5b2affe0 probe: fetch container IP addresses from inside its namespace
So that we can pick up addresses added via CNI or other mechanisms
that Docker is not aware of.
2018-06-02 21:49:30 +00:00
Michael Schubert
7bb1e38de3 ebpf: update check for known faulty Ubuntu kernels
With c75700fe04 we added code to detect
Ubuntu Xenial kernels with a regression in the eBPF subsystem in order
to gently fall back to procfs scanning on such systems (and not crash the
host system by running eBPF code).

With the latest kernel update for Ubuntu Xenial, the bug was fixed:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1763454

Therefore we can update the added check with an upper limit and make
sure that eBPF connection tracking is disabled only on kernels within
the range that has the bug.

xref: https://github.com/weaveworks/scope/issues/3131
2018-05-23 11:38:04 +02:00
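
A minimal sketch of the bounded check this commit describes, not the code from the commit itself. The helper name `isFaultyUbuntuKernel`, the `abiRange` type, and the upper bound of 127 are assumptions for illustration; the log only states that the regression appeared with ABI number 119 and was fixed by a later update.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// abiRange is a hypothetical description of the faulty ABI numbers [first, fixed).
type abiRange struct {
	first int // first ABI number with the regression (119 per the commit message)
	fixed int // first ABI number carrying the fix; placeholder, not stated in the log
}

// isFaultyUbuntuKernel reports whether a release string such as
// "4.4.0-119-generic" is a 4.4.0 kernel whose ABI number lies in the range.
func isFaultyUbuntuKernel(release string, r abiRange) (bool, error) {
	parts := strings.Split(release, "-")
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected release format %q", release)
	}
	if parts[0] != "4.4.0" {
		return false, nil // other kernel series are not affected by this check
	}
	abi, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("parsing ABI number in %q: %w", release, err)
	}
	return abi >= r.first && abi < r.fixed, nil
}

func main() {
	r := abiRange{first: 119, fixed: 127} // 127 is an assumed placeholder
	for _, rel := range []string{"4.4.0-116-generic", "4.4.0-119-generic", "4.4.0-127-generic"} {
		faulty, err := isFaultyUbuntuKernel(rel, r)
		fmt.Println(rel, faulty, err)
	}
}
```

A caller would fall back to procfs scanning whenever the function returns true, and keep eBPF tracking otherwise.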
Stefan Prodan
439b67880e Fix pause image detection for Kubernetes 1.10 2018-05-19 14:03:17 +03:00
Roland Schilter
f012c23ca1 Sentence cased text everywhere (#3166)
* Sentence cased text everywhere

Follows Weave Cloud's direction of sentence case on most things.

* More space between sorter caret and label

* Use full topology name for table header
2018-05-17 17:30:38 -07:00
Filip Barl
bfb20a8f40 Addressed @LiliC's feedback. 2018-05-17 11:43:54 +02:00
Filip Barl
183aaea950 Fixed the tests. 2018-05-17 11:09:31 +02:00
Filip Barl
4382deb39b Show image tag separate from image name in Node Details. 2018-05-17 11:09:31 +02:00
Michael Schubert
5d036c5ac4 ebpf: add tests for isKernelSupported() 2018-04-13 17:17:51 +02:00
Michael Schubert
c75700fe04 ebpf: check for known faulty Ubuntu kernel
The Ubuntu Xenial update to kernel 4.4.0-119.143 from 4.4.0-116.140 did
include a regression in the eBPF code. A basic `bpf_map_lookup_elem`
call as found in the tcptracer-bpf library used by Scope leads to a
kernel panic. As a result, Scope / the system crashes during startup
when the tcptracer is initialized. The Scope bug report can be found
here:

https://github.com/weaveworks/scope/issues/3131

To avoid crashes and gently fall back to procfs (as Scope already does
for systems not supporting eBPF), update `isKernelSupported()` and
explicitly check for Ubuntu Kernel versions with the problem.

Once the bug is fixed and an update published, the `abiNumber` check in
`isKernelSupported()` can and should be updated with an upper limit.

The Ubuntu bug report can be found here:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1763454
2018-04-13 17:17:51 +02:00
Matthias Radestock
805572d70e refactor: inline single-use constant 2018-04-02 13:50:22 +01:00
Matthias Radestock
e852b18e42 refactor: remove unused constant 2018-04-02 13:48:29 +01:00
Matthias Radestock
feae4f4fcf don't show Failed pods
these are not alive, and Scope generally only shows the living, not
the dead.
2018-04-02 11:03:25 +01:00
Matthias Radestock
27fb3571e1 refactor: remove StateDeleted from map keys
since it is a map value, not a key.
2018-04-02 11:03:01 +01:00
Matthias Radestock
076acdb319 send shortcut reports on all container state changes
This got broken in cec750049f.
2018-03-25 09:08:45 +01:00
Roland Schilter
3de06b5a09 Support controls in more k8s topologies (#3110)
* Refactor: func already has access to probeID

* Allow controls for more k8s topologies

Controls for nodes generally need to know about the probe that is in
control of them.

This PR appends the probe ID info to k8s topologies CronJob, DaemonSet,
Service, and StatefulSet. Therefore allowing plugins to append controls.

* Remove superfluous error check

* Add some tests to verify controls allowance
2018-03-16 12:39:57 -07:00
Matthias Radestock
adc46e84e8 Merge pull request #3094 from weaveworks/remove-getchildren
Remove unused process tree function GetChildren()
2018-02-26 13:43:29 +00:00
Bryan Boreham
262cea2797 More efficient docker Tagger
Augment existing node rather than creating a new one then merging it,
and avoid creating a set with one entry.
2018-02-26 12:43:00 +00:00
Bryan Boreham
6593720472 Merge pull request #3080 from weaveworks/stop-polling-k8s
Remove flag -probe.kubernetes.interval and stop re-syncing Kubernetes data
2018-02-26 10:43:43 +00:00
Bryan Boreham
95359a70d0 Merge pull request #3091 from weaveworks/omit-empty-networks
Exclude null entries for networks on container nodes in probe report
2018-02-26 10:43:13 +00:00
Bryan Boreham
c06429b92a Remove unused process tree function GetChildren() 2018-02-26 08:54:31 +00:00
Bryan Boreham
64570f1311 Add Kubernetes service type and ports 2018-02-24 14:58:13 +00:00
Bryan Boreham
3941424794 Don't add null entries to container nodes for networks
or the "none" network, which is a special case meaning none.
2018-02-23 18:09:18 +00:00
Bryan Boreham
1de6d9755a Remove -probe.kubernetes.interval flag entirely 2018-02-22 15:05:20 +00:00
Bryan Boreham
b5cdcb9a42 Move DNS name mapping from endpoint to report 2018-02-20 16:14:21 +00:00
Bryan Boreham
6674ff61e5 Fix incorrect comment 2018-02-20 16:14:20 +00:00
Bryan Boreham
b742846835 Optimise processTopology() (#3074)
* Add a benchmark for processTopology()

* Shortcut merging with an empty set

* Use more efficient apis to create process node
2018-02-19 10:13:58 +00:00
Bryan Boreham
f72ced3380 Add topology.ReplaceNode() for efficiency (#3073)
* Add topology.ReplaceNode() for efficiency

In some places AddNode() was called after adding to an existing node,
in which case the Merge() is just a waste of time.
2018-02-19 10:13:31 +00:00
Matthias Radestock
5b30b668ae refactor: don't return receiver in Topology.AddNode()
This had little use and was obscuring the mutating nature of
AddNode().
2018-02-19 05:10:04 +00:00
Brice Fernandes
e106568dda Create README.md 2018-02-15 10:36:49 +00:00
Roberto Bruggemann
37771a0607 Create reflectors asynchronously
Reflectors are created and run within the same function, asynchronously from the main thread.
Creating reflectors may require calls to the kubernetes api, which can return errors.
API errors are not handled in the main thread, but are handled asynchronously by retries.
2018-02-05 12:01:59 +00:00
Roberto Bruggemann
f4b55b3cf0 Fetch cronjobs from 'batch/v1beta1'
This applies if kubernetes' version is >= 1.8.
Otherwise fetch cronjobs from 'batch/v2alpha1'.
2018-01-30 17:04:32 +00:00
Roberto Bruggemann
710d665c41 Upgrade k8s.io/client-go to kubernetes-1.9.1
Upgraded from 99c19923, branch release-3.0.

This required fetching or upgrading the following:
* k8s.io/api to kubernetes-1.9.1
* k8s.io/apimachinery to kubernetes-1.9.1
* github.com/juju/ratelimit to 1.0.1
* github.com/spf13/pflag to 4c012f6d

Also, update Scope's imports/function calls to be compatible with the new client.
2018-01-30 10:14:42 +00:00
Roberto Bruggemann
2f9e2fc9ce Check if k8s resources are supported in runReflectorUntil
`isResourceSupported` checks whether a kubernetes resource is supported by the api server.
This ensures that, if the probe is unable to communicate with the api server, the call is retried until a true/false response.

If `isResourceSupported` returns false, `ListAndWatch` is not called and `runReflectorUntil` just exits.
2018-01-22 13:46:55 +00:00
Roberto Bruggemann
9198f6b38b logReadCloser: ensure loop terminates if channels are closed
Adding the !EOF check to the loop condition ensures not reading from closed channels.
2018-01-19 14:23:13 +00:00
Roberto Bruggemann
d1d370ce01 logReadCloser: ensure reader errors yield EOF
This change makes each underlying reader set its corresponding `eof` slot to true on termination.
This makes the overall logReadCloser converge to EOF when an underlying reader fails, thereby preventing spinning on read.

`bufio.Reader.ReadBytes` may not return io.EOF when `Close()` closes the underlying reader.
For instance, closing logReadCloser from the Scope App makes `bufio.Reader.ReadBytes` produce the following error: `http2: response body closed`.
2018-01-19 14:13:29 +00:00
Roberto Bruggemann
f42e22098e Merge pull request #3014 from weaveworks/stop-fetching-replicasets
Stop fetching ReplicaSets and ReplicationControllers
2018-01-15 10:33:30 +00:00
Roberto Bruggemann
50b182bff5 Rename CaptureResource -> CaptureDeployment
The function now only takes Deployments into account.
2018-01-11 17:01:03 +00:00
Roberto Bruggemann
5a2e214140 Merge pull request #3013 from weaveworks/multi-container-log
Reading pod logs returns all container logs
2018-01-09 11:33:32 +00:00
Roberto Bruggemann
00639d9476 logReadCloser: remove stopChannels
Replace them with sync.WaitGroup.
2018-01-08 18:11:44 +00:00
Roberto Bruggemann
b4e3f85e89 LogReadCloser interleaves 'by line' for each container
Also, prepend each line with '[ContainerName]'.
2018-01-05 14:40:52 +00:00
Roberto Bruggemann
cf6e0ffdc6 Stop fetching ReplicaSets and ReplicationControllers
They are not reported back to the scope app.
2018-01-04 10:54:26 +00:00
Roberto Bruggemann
0b86c65e66 Reading pod logs returns all container logs
This is achieved by issuing an http request for each container to kubernetes' API, which yields one Reader for the corresponding container.
`logReadCloser` then reads from the above readers in parallel as data is available, buffering when necessary, forwarding it to clients by implementing the io.ReadCloser interface.
2018-01-03 14:13:17 +00:00
Roberto Bruggemann
2c60d50f10 Remove unused WalkNodes function 2018-01-03 13:55:27 +00:00
Roberto Bruggemann
ccfcc61042 The probe reports namespaces 2018-01-03 13:55:27 +00:00
Matthias Radestock
5dad27cf7e permit setting probe.kubernetes.interval to 0
...which is useful if we want to disable periodic fetching of all
objects.

Previously the interval was also used to set the initial backoff of
the reconnect logic. A zero value there would result in _no_
backoff. So instead we now just use the default of 10s, which also
happens to be the default probe.kubernetes.interval, so there is
no change in behaviour for the stock settings.
2018-01-03 00:38:17 +00:00
Matthias Radestock
e2eef50cda eliminate (out-of-date) list of topologies in plugin code 2017-12-24 22:27:02 +00:00
Matthias Radestock
9419c3ef5c refactor(ish): reduce number of topology lists
Having 6 lists of topologies in the same file is a bit much:

1. consts for topology names
2. Report type definition
3. MakeReport() Report initialisation
4. Report.Topology(name) lookup
5. Report.TopologyMap() mapping of names to topology references
6. Report.WalkPairedTopologies() iterator over topology references

We get rid of 5 and 6 by introducing a topologyNames slice. So we
are down to 5.

We replace Report.TopologyMap() with a new function,
WalkNamedTopologies, that uses topologyNames. WalkPairedTopologies()
is updated to operate in a similar fashion. Likewise for
WalkTopologies() and Topologies() - these were previously calling
Walk[Paired]Topologies, but it is clearer to simply implement them
directly.
2017-12-24 22:26:57 +00:00
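
The idea of driving every walker from one name slice can be sketched as below. The `Topology` and `Report` types are reduced, hypothetical versions with three topologies, and the unexported `topology` lookup and the callback signatures are assumptions; only `topologyNames`, `WalkNamedTopologies`, and `WalkPairedTopologies` are names taken from the commit message.

```go
package main

import "fmt"

// Topology is a hypothetical stand-in for the real topology type.
type Topology struct {
	Nodes []string
}

// Topology names; this sketch uses only three.
const (
	Endpoint  = "endpoint"
	Process   = "process"
	Container = "container"
)

// topologyNames is the single ordered list that all walkers iterate over.
var topologyNames = []string{Endpoint, Process, Container}

// Report holds one field per topology.
type Report struct {
	Endpoint  Topology
	Process   Topology
	Container Topology
}

// topology maps a name to the matching field, or nil for an unknown name.
func (r *Report) topology(name string) *Topology {
	switch name {
	case Endpoint:
		return &r.Endpoint
	case Process:
		return &r.Process
	case Container:
		return &r.Container
	}
	return nil
}

// WalkNamedTopologies calls f for every topology, in topologyNames order.
func (r *Report) WalkNamedTopologies(f func(name string, t *Topology)) {
	for _, name := range topologyNames {
		f(name, r.topology(name))
	}
}

// WalkPairedTopologies calls f with the matching topologies of r and other.
func (r *Report) WalkPairedTopologies(other *Report, f func(a, b *Topology)) {
	for _, name := range topologyNames {
		f(r.topology(name), other.topology(name))
	}
}

func main() {
	r := Report{Process: Topology{Nodes: []string{"pid-1"}}}
	r.WalkNamedTopologies(func(name string, t *Topology) {
		fmt.Printf("%s: %d node(s)\n", name, len(t.Nodes))
	})
}
```

With this layout, adding a topology means touching the constant, the struct field, the lookup, and the slice, but none of the walkers.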