760 Commits

Author SHA1 Message Date
Matthias Radestock
adc46e84e8 Merge pull request #3094 from weaveworks/remove-getchildren
Remove unused process tree function GetChildren()
2018-02-26 13:43:29 +00:00
Bryan Boreham
262cea2797 More efficient docker Tagger
Augment existing node rather than creating a new one then merging it,
and avoid creating a set with one entry.
2018-02-26 12:43:00 +00:00
Bryan Boreham
6593720472 Merge pull request #3080 from weaveworks/stop-polling-k8s
Remove flag -probe.kubernetes.interval and stop re-syncing Kubernetes data
2018-02-26 10:43:43 +00:00
Bryan Boreham
95359a70d0 Merge pull request #3091 from weaveworks/omit-empty-networks
Exclude null entries for networks on container nodes in probe report
2018-02-26 10:43:13 +00:00
Bryan Boreham
c06429b92a Remove unused process tree function GetChildren() 2018-02-26 08:54:31 +00:00
Bryan Boreham
64570f1311 Add Kubernetes service type and ports 2018-02-24 14:58:13 +00:00
Bryan Boreham
3941424794 Don't add null entries to container nodes for networks
or the "none" network, which is a special case meaning none.
2018-02-23 18:09:18 +00:00
Bryan Boreham
1de6d9755a Remove -probe.kubernetes.interval flag entirely 2018-02-22 15:05:20 +00:00
Bryan Boreham
b5cdcb9a42 Move DNS name mapping from endpoint to report 2018-02-20 16:14:21 +00:00
Bryan Boreham
6674ff61e5 Fix incorrect comment 2018-02-20 16:14:20 +00:00
Bryan Boreham
b742846835 Optimise processTopology() (#3074)
* Add a benchmark for processTopology()

* Shortcut merging with an empty set

* Use more efficient apis to create process node
2018-02-19 10:13:58 +00:00
Bryan Boreham
f72ced3380 Add topology.ReplaceNode() for efficiency (#3073)
* Add topology.ReplaceNode() for efficiency

In some places AddNode() was called after adding to an existing node,
in which case the Merge() is just a waste of time.
2018-02-19 10:13:31 +00:00
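
The difference between merging and replacing can be sketched as follows. This is a minimal illustration, not Scope's actual code: the `Node` and `Topology` types, the `Metadata` field and the `Merge` semantics are simplified assumptions; only the names `AddNode()`, `ReplaceNode()` and `Merge()` come from the commit message.

```go
package main

import "fmt"

// Node is a hypothetical, simplified report node: an ID plus key/value metadata.
type Node struct {
	ID       string
	Metadata map[string]string
}

// Merge returns a new node containing the entries of both nodes.
// On a key conflict the value from other wins.
func (n Node) Merge(other Node) Node {
	merged := Node{ID: n.ID, Metadata: make(map[string]string, len(n.Metadata)+len(other.Metadata))}
	for k, v := range n.Metadata {
		merged.Metadata[k] = v
	}
	for k, v := range other.Metadata {
		merged.Metadata[k] = v
	}
	return merged
}

// Topology is a hypothetical, simplified collection of nodes keyed by ID.
type Topology struct {
	Nodes map[string]Node
}

// AddNode stores node, merging it with any node already stored under the same ID.
func (t *Topology) AddNode(node Node) {
	if existing, ok := t.Nodes[node.ID]; ok {
		node = existing.Merge(node)
	}
	t.Nodes[node.ID] = node
}

// ReplaceNode stores node unconditionally, skipping the merge.
func (t *Topology) ReplaceNode(node Node) {
	t.Nodes[node.ID] = node
}

func main() {
	topo := Topology{Nodes: map[string]Node{}}
	topo.AddNode(Node{ID: "proc;1", Metadata: map[string]string{"name": "init"}})

	// Augment the existing node, then put it back without a redundant merge.
	node := topo.Nodes["proc;1"]
	node.Metadata["threads"] = "4"
	topo.ReplaceNode(node)

	fmt.Println(len(topo.Nodes["proc;1"].Metadata)) // prints 2
}
```

When the caller has already built the final node from the existing one, the merge in `AddNode()` would only copy the same entries again, which is the waste the commit describes.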
Matthias Radestock
5b30b668ae refactor: don't return receiver in Topology.AddNode()
This had little use and was obscuring the mutating nature of
AddNode().
2018-02-19 05:10:04 +00:00
Brice Fernandes
e106568dda Create README.md 2018-02-15 10:36:49 +00:00
Roberto Bruggemann
37771a0607 Create reflectors asynchronously
Reflectors are created and run within the same function, asynchronously from the main thread.
Creating reflectors may require calls to the kubernetes api, which can return errors.
API errors are not handled in the main thread, but are handled asynchronously by retries.
2018-02-05 12:01:59 +00:00
Roberto Bruggemann
f4b55b3cf0 Fetch cronjobs from 'batch/v1beta1'
This applies if kubernetes' version is >= 1.8.
Otherwise fetch cronjobs from 'batch/v2alpha1'.
2018-01-30 17:04:32 +00:00
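
The version rule in this message could look roughly like the sketch below. The helper `cronJobAPIVersion` and its integer parameters are hypothetical; how Scope actually obtains and compares the server version is not shown in the log.

```go
package main

import "fmt"

// cronJobAPIVersion is a hypothetical helper that picks the API group/version
// for cronjobs from the Kubernetes server's major and minor version numbers.
func cronJobAPIVersion(major, minor int) string {
	if major > 1 || (major == 1 && minor >= 8) {
		return "batch/v1beta1"
	}
	return "batch/v2alpha1"
}

func main() {
	fmt.Println(cronJobAPIVersion(1, 9)) // prints batch/v1beta1
	fmt.Println(cronJobAPIVersion(1, 7)) // prints batch/v2alpha1
}
```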
Roberto Bruggemann
710d665c41 Upgrade k8s.io/client-go to kubernetes-1.9.1
Upgraded from 99c19923, branch release-3.0.

This required fetching or upgrading the following:
* k8s.io/api to kubernetes-1.9.1
* k8s.io/apimachinery to kubernetes-1.9.1
* github.com/juju/ratelimit to 1.0.1
* github.com/spf13/pflag to 4c012f6d

Also, update Scope's imports/function calls to be compatible with the new client.
2018-01-30 10:14:42 +00:00
Roberto Bruggemann
2f9e2fc9ce Check if k8s resources are supported in runReflectorUntil
`isResourceSupported` checks whether a kubernetes resource is supported by the api server.
This ensures that, if the probe is unable to communicate with the api server, the call is retried until a true/false response.

If `isResourceSupported` returns false, `ListAndWatch` is not called and `runReflectorUntil` just exits.
2018-01-22 13:46:55 +00:00
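
A retry loop of the kind described here might be sketched as follows. Everything in it is an assumption for illustration: `retryUntilAnswer`, the fixed backoff, the `stop` channel and `errUnreachable` are hypothetical names, and the real `isResourceSupported` and `runReflectorUntil` may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnreachable is a hypothetical error standing in for a failed API call.
var errUnreachable = errors.New("api server unreachable")

// retryUntilAnswer calls check until it returns without an error and then
// reports its answer. It gives up and returns false if stop is closed first.
func retryUntilAnswer(check func() (bool, error), backoff time.Duration, stop <-chan struct{}) bool {
	for {
		supported, err := check()
		if err == nil {
			return supported
		}
		select {
		case <-stop:
			return false
		case <-time.After(backoff):
		}
	}
}

func main() {
	attempts := 0
	// Hypothetical check that fails twice before the API server answers.
	check := func() (bool, error) {
		attempts++
		if attempts < 3 {
			return false, errUnreachable
		}
		return true, nil
	}
	supported := retryUntilAnswer(check, 10*time.Millisecond, make(chan struct{}))
	fmt.Println(supported, attempts) // prints true 3
}
```

Only a definite answer ends the loop, so a caller can skip `ListAndWatch` on `false` without mistaking a connection failure for an unsupported resource.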
Roberto Bruggemann
9198f6b38b logReadCloser: ensure loop terminates if channels are closed
Adding the !EOF check to the loop condition ensures not reading from closed channels.
2018-01-19 14:23:13 +00:00
Roberto Bruggemann
d1d370ce01 logReadCloser: ensure reader errors yield EOF
This change makes each underlying reader set its corresponding `eof` slot to true on termination.
This makes the overall logReadCloser converge to EOF when the underlying readers fail, which prevents spinning on read.

`bufio.Reader.ReadBytes` may not return io.EOF when `Close()` closes the underlying reader.
For instance, closing logReadCloser from the Scope App makes `bufio.Reader.ReadBytes` produce the following error: `http2: response body closed`.
2018-01-19 14:13:29 +00:00
Roberto Bruggemann
f42e22098e Merge pull request #3014 from weaveworks/stop-fetching-replicasets
Stop fetching ReplicaSets and ReplicationControllers
2018-01-15 10:33:30 +00:00
Roberto Bruggemann
50b182bff5 Rename CaptureResource -> CaptureDeployment
The function now only takes Deployments into account.
2018-01-11 17:01:03 +00:00
Roberto Bruggemann
5a2e214140 Merge pull request #3013 from weaveworks/multi-container-log
Reading pod logs returns all container logs
2018-01-09 11:33:32 +00:00
Roberto Bruggemann
00639d9476 logReadCloser: remove stopChannels
Replace them with sync.WaitGroup.
2018-01-08 18:11:44 +00:00
Roberto Bruggemann
b4e3f85e89 LogReadCloser interleaves 'by line' for each container
Also, prepend each line with '[ContainerName]'.
2018-01-05 14:40:52 +00:00
Roberto Bruggemann
cf6e0ffdc6 Stop fetching ReplicaSets and ReplicationControllers
They are not reported back to the scope app.
2018-01-04 10:54:26 +00:00
Roberto Bruggemann
0b86c65e66 Reading pod logs returns all container logs
This is achieved by issuing an http request for each container to kubernetes' API, which yields one Reader for the corresponding container.
`logReadCloser` then reads from these readers in parallel as data becomes available, buffering when necessary, and forwards it to clients by implementing the io.ReadCloser interface.
2018-01-03 14:13:17 +00:00
Roberto Bruggemann
2c60d50f10 Remove unused WalkNodes function 2018-01-03 13:55:27 +00:00
Roberto Bruggemann
ccfcc61042 The probe reports namespaces 2018-01-03 13:55:27 +00:00
Matthias Radestock
5dad27cf7e permit setting probe.kubernetes.interval to 0
...which is useful if we want to disable periodic fetching of all
objects.

Previously the interval was also used to set the initial backoff of
the reconnect logic. A zero value there would result in _no_
backoff. So instead we now just use the default of 10s, which also
happens to be the default probe.kubernetes.interval, so there is no
change in behaviour for the stock settings.
2018-01-03 00:38:17 +00:00
Matthias Radestock
e2eef50cda eliminate (out-of-date) list of topologies in plugin code 2017-12-24 22:27:02 +00:00
Matthias Radestock
9419c3ef5c refactor(ish): reduce number of topology lists
Having 6 lists of topologies in the same file is a bit much:

1. consts for topology names
2. Report type definition
3. MakeReport() Report initialisation
4. Report.Topology(name) lookup
5. Report.TopologyMap() mapping of names to topology references
6. Report.WalkPairedTopologies() iterator over topology references

We get rid of 5 and 6 by introducing a topologyNames slice. So we
are down to five lists.

We replace Report.TopologyMap() with a new function,
WalkNamedTopologies, that uses topologyNames. WalkPairedTopologies()
is updated to operate in a similar fashion. Likewise for
WalkTopologies() and Topologies() - these were previously calling
Walk[Paired]Topologies, but it is clearer to simply implement them
directly.
2017-12-24 22:26:57 +00:00
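
The idea of driving every walker from one `topologyNames` slice can be sketched like this. The `Report` and `Topology` types are cut down to three topologies for illustration, and the unexported `topology` lookup and the callback signatures are assumptions rather than Scope's real API.

```go
package main

import "fmt"

// Topology and Report are hypothetical, trimmed-down stand-ins.
type Topology struct {
	Nodes map[string]string
}

type Report struct {
	Endpoint  Topology
	Process   Topology
	Container Topology
}

// topologyNames is the single list that the walkers iterate over.
var topologyNames = []string{"endpoint", "process", "container"}

// topology maps a name to the matching field, or nil for an unknown name.
func (r *Report) topology(name string) *Topology {
	switch name {
	case "endpoint":
		return &r.Endpoint
	case "process":
		return &r.Process
	case "container":
		return &r.Container
	}
	return nil
}

// WalkNamedTopologies calls f for each topology together with its name.
func (r *Report) WalkNamedTopologies(f func(string, *Topology)) {
	for _, name := range topologyNames {
		f(name, r.topology(name))
	}
}

// WalkTopologies calls f for each topology, implemented directly on the
// names slice rather than by delegating to another walker.
func (r *Report) WalkTopologies(f func(*Topology)) {
	for _, name := range topologyNames {
		f(r.topology(name))
	}
}

func main() {
	r := Report{Process: Topology{Nodes: map[string]string{"1": "init"}}}
	r.WalkNamedTopologies(func(name string, t *Topology) {
		fmt.Println(name, len(t.Nodes))
	})
}
```

Adding a topology then means touching the names slice and the lookup, not a separate map and a separate iterator.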
Matthias Radestock
e93b69cf10 remove Node.Edges
It is unused and none of the adjacency mapping code in the renderer
takes any notice of it. Removing this shrinks the report size.

Edges were introduced in #838. At the time we had an experimental
packet sniffer under experimental/sniff/sniffer.go. That got removed
in #1646.

We can resurrect this if we ever decide to add meta data to edges.
2017-12-17 13:28:22 +00:00
Matthias Radestock
1f2247a8c4 move node metadata keys into report package
Both the probe and the app (for rendering) need to know about them.
2017-12-11 20:26:08 +00:00
Matthias Radestock
1865c46368 refactor: introduce a constant for "copy_of"
since it's shared between the probe and renderer
2017-12-09 10:45:59 +00:00
Roberto Bruggemann
1669ff8e28 Merge pull request #2957 from weaveworks/probe-no-replicasets
Stop reporting ReplicaSets
2017-12-05 14:06:59 +00:00
Roberto Bruggemann
b522443837 Stop reporting ReplicaSets
Also, add Deployment as Pod parent.
2017-12-04 16:19:49 +00:00
Matthias Radestock
914acf6e3d fix incorrect reporting of replicaset DesiredReplicas
ReplicaSetSpec.Replicas is an *int32, just like in DeploymentSpec,
so we deal with it in the same way.
2017-11-30 18:40:00 +00:00
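
A sketch of reading such a pointer field safely, assuming the fix amounts to reporting the pointed-to value rather than the pointer. The `ReplicaSetSpec` struct below is a hypothetical one-field mirror of the Kubernetes type, and `desiredReplicas` is an illustrative helper, not Scope's function.

```go
package main

import "fmt"

// ReplicaSetSpec is a hypothetical mirror of the one field discussed here.
type ReplicaSetSpec struct {
	Replicas *int32
}

// desiredReplicas returns the pointed-to count and true, or 0 and false
// when the field is unset.
func desiredReplicas(spec ReplicaSetSpec) (int32, bool) {
	if spec.Replicas == nil {
		return 0, false
	}
	return *spec.Replicas, true
}

func main() {
	three := int32(3)
	fmt.Println(desiredReplicas(ReplicaSetSpec{Replicas: &three})) // prints 3 true
	fmt.Println(desiredReplicas(ReplicaSetSpec{}))                 // prints 0 false
}
```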
Filip Barl
119bbab4fe Merge pull request #2915 from weaveworks/2875-humanize-durations
Humanize reported durations
2017-11-06 14:11:45 +01:00
Bryan Boreham
6aab6ced5a Merge pull request #2918 from weaveworks/ecs-crash-less
Don't de-reference pointers from AWS without checking
2017-11-04 20:33:49 +00:00
Filip Barl
f5bfa506d6 Verified the TODO comments and made durations be in seconds. 2017-11-03 10:43:41 +01:00
Filip Barl
320b9e240f Abstracted the report data types. 2017-11-03 10:43:41 +01:00
Filip Barl
6c0194b832 Show uptime durations in a more human format. 2017-11-03 10:43:41 +01:00
Tobias Klauser
89f3ce2e64 Simplify Utsname string conversion
Use Utsname from golang.org/x/sys/unix, which contains byte array
members instead of int8/uint8 arrays. This simplifies the string
conversions of these members, and the marshal.FromUtsname functions
are no longer needed.
2017-11-02 08:45:54 +01:00
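
With byte-array members, the conversion reduces to trimming at the first NUL byte, as in this sketch. To stay dependency-free it uses a hypothetical local `utsname` struct with a `[65]byte` field instead of importing golang.org/x/sys/unix, and `byteArrayToString` is an illustrative name.

```go
package main

import (
	"bytes"
	"fmt"
)

// utsname is a hypothetical stand-in with a byte-array member, shaped like
// the fields of unix.Utsname on Linux.
type utsname struct {
	Release [65]byte
}

// byteArrayToString converts a NUL-terminated byte array to a Go string.
func byteArrayToString(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		b = b[:i]
	}
	return string(b)
}

func main() {
	var u utsname
	copy(u.Release[:], "4.15.0")
	fmt.Println(byteArrayToString(u.Release[:])) // prints 4.15.0
}
```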
Bryan Boreham
a082c28919 Don't de-reference pointers from AWS without checking 2017-11-01 14:52:50 +00:00
Damien Lespiau
5990ad4947 docker: Close pipe when the docker API call fails
This hasn't been found in the wild but by code inspection. If we fail the
docker API call, the pipe is never closed. Close it before returning.
2017-10-16 23:30:46 +01:00
Mike Lang
1c6fbffc69 Fix test broken by #2854 2017-09-19 03:54:13 -07:00
Bruno Galindro da Costa
cd21bafa2e Adds ECS Cluster Region option 2017-09-18 20:14:44 -03:00
Alban Crequy
9c53653997 EbpfTracker: restart it when it dies
EbpfTracker can die when the tcp events are received out of order. This
can happen with a buggy kernel or apparently in other cases, see:
https://github.com/weaveworks/scope/issues/2650

As a workaround, restart EbpfTracker when an event is received out of
order. This does not seem to happen often, but as a precaution,
EbpfTracker will not restart if the last failure is less than 5 minutes
ago.

This is not easy to test but I added instrumentation to trigger a
restart:

- Start Scope with:
    $ sudo WEAVESCOPE_DOCKER_ARGS="-e SCOPE_DEBUG_BPF=1" ./scope launch

- Request a stop with:
    $ echo stop | sudo tee /proc/$(pidof scope-probe)/root/var/run/scope/debug-bpf
2017-08-17 16:39:27 +02:00
Matthias Radestock
7a23afde2c Merge pull request #2781 from weaveworks/2550-non-login-container-shell
run a normal (rather than login) shell in containers
2017-08-02 08:33:43 +01:00