Commit Graph

925 Commits

Author SHA1 Message Date
Bryan Boreham
574c76ac40 Merge pull request #3709 from weaveworks/report-endpoint-subset
Report a subset of connections from/to the same endpoint
2020-01-27 22:42:47 +00:00
Bryan Boreham
7dc7215a26 Refactor: improve readability based on review feedback 2020-01-23 15:04:51 +00:00
Bryan Boreham
53297eb07c Merge pull request #3743 from weaveworks/more-pause
kubernetes: detect more 'pause' containers
2020-01-23 14:43:06 +00:00
Bryan Boreham
880daa78ff Extend K8s tagger test to cover pause containers 2020-01-23 11:47:40 +00:00
Bryan Boreham
92b8a489e7 kubernetes: detect more 'pause' containers
Dockershim has added a label `io.kubernetes.docker.type` for at least
four years, where the pause container is of type `podsandbox`.  This
should be more reliable than trying to keep up with everyone's name
for the pause container.
2020-01-15 10:26:10 +00:00
Bryan Boreham
1dcdfab05a fixup: from review feedback
Fix a logic error in ECS scale-down button, bad copy/paste in
ActiveControls() and neaten the switch cases in container controls.

Co-Authored-By: Filip Barl <filip@weave.works>
2020-01-13 14:48:38 +00:00
Bryan Boreham
b7b245ed48 tests: connection subset testing
Utility functions to create fake sets of connections for testing, and
then exercising the subset filtering code to check that quantities
come out as expected.
2020-01-13 08:59:51 +00:00
Bryan Boreham
de3c34ddc6 performance(probe): thin out many connections between the same point
The app will only show one line, regardless of how many connections we
have, so reduce the number to save bandwidth and rendering time.

We filter by choosing a modulus, e.g. send every connection that is a
multiple of 3, or 9, and so on. We avoid multiples of 2 because port
numbers are often a multiple of 2 or 4 for bit-encoding reasons.
2020-01-13 08:53:47 +00:00
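A minimal sketch of the modulus filtering described in the commit above, not the code from the commit itself. The `conn` type, the `thinOut` name and the `maxKeep` parameter are assumptions made for illustration; only the idea of keeping multiples of 3, 9, and so on comes from the message.

```go
package main

import "fmt"

// conn is a hypothetical, simplified connection record; only the
// port is needed to illustrate the filter.
type conn struct {
	port uint16
}

// thinOut keeps roughly maxKeep connections by retaining only those
// whose port is a multiple of a modulus. The modulus grows in powers
// of 3 (1, 3, 9, ...), so it never shares a factor with 2.
func thinOut(conns []conn, maxKeep int) []conn {
	if maxKeep < 1 {
		maxKeep = 1
	}
	modulus := 1
	for len(conns)/modulus > maxKeep {
		modulus *= 3
	}
	kept := make([]conn, 0, len(conns)/modulus+1)
	for _, c := range conns {
		if int(c.port)%modulus == 0 {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	var conns []conn
	for p := 40000; p < 40090; p++ {
		conns = append(conns, conn{port: uint16(p)})
	}
	fmt.Println("kept:", len(thinOut(conns, 10)), "of", len(conns))
}
```

Because the filter depends only on the port, the same connection is either always kept or always dropped from one report to the next while the modulus stays the same.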
Bryan Boreham
fc46ea17ee refactor(probe/ebpf): track connections by four-tuple+namespace
The previous code tracked only by four-tuple, which meant that two
connections with the same address/port combinations in different namespaces
would clash and one would get dropped.

Also previously the tuple was duplicated between the map key and
value, so we remove it from the value.

We only add the namespace in the case that the local address is
loopback, which matches how the rest of Scope treats addresses.
2020-01-13 08:53:47 +00:00
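The keying scheme in the commit above could look something like the following sketch. The `fourTuple`, `connKey` and `makeKey` names and field layouts are hypothetical, not taken from the Scope source. The sketch assumes addresses are held in binary and the namespace ID is a 32-bit value, and it sets the namespace only when the local address is loopback.

```go
package main

import (
	"fmt"
	"net"
)

// fourTuple is a hypothetical binary four-tuple.
type fourTuple struct {
	localAddr, remoteAddr [16]byte
	localPort, remotePort uint16
}

// connKey is a hypothetical map key: four-tuple plus network namespace.
type connKey struct {
	tuple fourTuple
	netNS uint32 // left as zero unless the local address is loopback
}

// Example addresses used by main.
var (
	loopback = net.ParseIP("127.0.0.1")
	external = net.ParseIP("10.0.0.1")
)

func to16(ip net.IP) [16]byte {
	var a [16]byte
	copy(a[:], ip.To16())
	return a
}

// makeKey builds the map key, adding the namespace only for loopback.
func makeKey(local, remote net.IP, localPort, remotePort uint16, netNS uint32) connKey {
	k := connKey{tuple: fourTuple{
		localAddr:  to16(local),
		remoteAddr: to16(remote),
		localPort:  localPort,
		remotePort: remotePort,
	}}
	if local.IsLoopback() {
		k.netNS = netNS
	}
	return k
}

func main() {
	seen := map[connKey]int{} // value: hypothetical PID
	seen[makeKey(loopback, loopback, 5000, 80, 1)] = 101
	seen[makeKey(loopback, loopback, 5000, 80, 2)] = 202
	fmt.Println("tracked connections:", len(seen))
}
```

With a key of only the four-tuple, the second insert in `main` would overwrite the first; with the namespace included, both connections are tracked.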
Bryan Boreham
9758c81736 comment: add explanatory comment on handleFdInstall() 2020-01-13 08:53:47 +00:00
Bryan Boreham
d57a4df3b2 enhancement(probe): debug message for initial connection 2020-01-13 08:53:47 +00:00
Bryan Boreham
2767e2b319 improvement: only record connections that we have a PID for 2020-01-13 08:53:47 +00:00
Bryan Boreham
a6da810261 refactor(probe): move host/pid encoding into addConnection() function 2020-01-13 08:53:47 +00:00
Bryan Boreham
c88be40b19 performance: Update plugins to new-style controls data 2019-11-26 11:29:42 +00:00
Bryan Boreham
85d2f6309c performance: Send active controls as a single string per node
Instead of a whole extra data structure which is quite expensive to
marshal and unmarshal, just send the information in a string.  No
clever merging strategy is required - the states are all set in one
place per node type.
2019-11-26 11:29:42 +00:00
Bryan Boreham
5ebe9b4b18 Merge pull request #3720 from DarthSett/master
Adding the "user-agent" Header
2019-11-26 11:19:36 +00:00
Sumit Lalwani
1ce7707f25 Update pod status to terminating
Signed-off-by: Sumit Lalwani <sumit.lalwani97@gmail.com>
2019-11-25 11:31:48 +05:30
DarthSett
ccfd2f0427 Added test to check user-agent header 2019-10-28 19:34:22 +05:30
DarthSett
4eab46670e Update user-agent in probe/appclient/probe_config.go
Co-Authored-By: Filip Barl <filip.barl@gmail.com>
2019-10-25 19:31:17 +05:30
DarthSett
af31e30439 Add user-agent header 2019-10-25 12:33:35 +05:30
“DarthSett”
7adc70c5a5 add the user-agent header 2019-10-23 16:35:57 +05:30
rahul agrawal
0000173c05 add the user-agent header 2019-10-23 16:31:59 +05:30
Bryan Boreham
b9f10e9b73 refactor(probe/ebpf): make ebpf setup safer
It was possible for `t.ebpfTracker` to change underneath this code
while running on a background goroutine, so change it to take
`ebpfTracker` as a parameter.

While we're here, rename the functions to better match what they do.
2019-10-14 11:25:04 +00:00
Bryan Boreham
ae83c6545e fix(probe/ebpf): feed initial connections synchronously on restart
If we run `getInitialState()` async there is some chance we will see
another ebpf failure and call `useProcfs()` before `getInitialState()`
gets to the last line, whereupon it will crash on nil pointer.

Also it seems pointless to call `performEbpfTrack()` without waiting
for something to feed in, so I suspect this is what the original
author had in mind.

It will slow down this one `Report()` on machines with a lot of
processes or connections, but ebpfTracker restart is supposed to be a
rare event.
2019-10-14 08:22:25 +00:00
Bryan Boreham
b24917993e fix: report http error if /api call fails
Previously it would try to run the JSON decoder on a string like "404
not found" and report that failing.
2019-10-06 17:27:49 +00:00
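A sketch of the pattern this fix describes: check the HTTP status before handing the body to the JSON decoder. `decodeAPIResponse` and `fakeResponse` are hypothetical names for illustration; the actual function changed in the commit is not shown in this log.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// decodeAPIResponse reports a non-200 status as an HTTP error instead of
// letting the JSON decoder fail on a plain-text body.
func decodeAPIResponse(resp *http.Response, out interface{}) error {
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected HTTP status: %s", resp.Status)
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// fakeResponse builds an in-memory response so the example needs no network.
func fakeResponse(status int, body string) *http.Response {
	rec := httptest.NewRecorder()
	rec.WriteHeader(status)
	rec.WriteString(body)
	return rec.Result()
}

func main() {
	var v map[string]string
	err := decodeAPIResponse(fakeResponse(http.StatusNotFound, "404 not found"), &v)
	fmt.Println("error:", err)

	err = decodeAPIResponse(fakeResponse(http.StatusOK, `{"id":"abc"}`), &v)
	fmt.Println("decoded:", v, "error:", err)
}
```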
Bryan Boreham
bd9f88b985 Merge pull request #3696 from weaveworks/ebpf-non-strings
handle IP addresses in binary rather than strings
2019-10-04 14:45:15 +01:00
Bryan Boreham
23d8a418e1 performance: network namespace ID is a 32-bit quantity
This shrinks some data-structures slightly.

Citation: https://github.com/torvalds/linux/blob/6f0d349d922b/include/linux/ns_common.h#L10
2019-10-04 13:11:30 +00:00
Bryan Boreham
c6ce47f87d diagnostics: make fourTuple.String() human-readable 2019-10-04 13:09:50 +00:00
Bryan Boreham
cbbb2ff24c performance: in Docker reporter, reduce IP type conversions.
The code was converting IP addresses to strings and back again.
2019-09-25 20:15:34 +00:00
Bryan Boreham
2941850a75 performance: in connection tracker, hold IP addresses in binary rather than strings
This is more compact, and saves effort converting to and from the string format.
2019-09-25 20:15:05 +00:00
Bryan Boreham
8d9e337a75 chore: fix typos in debugging format strings 2019-09-25 20:08:29 +00:00
Satyam Zode
e7e9e97943 Merge pull request #3606 from weaveworks/update-gopacket
Update google/gopacket library
2019-09-23 16:11:08 +05:30
Bryan Boreham
17c1aaa131 chore(probe): for Kubernetes 1.16 move to 'v1' APIs
Scope will no longer work with Kubernetes 1.8 and below.

For CronJob there is no 'v1' as yet, but we can remove the alpha
version.
2019-09-21 15:52:34 +00:00
Bryan Boreham
9208e08bf3 Change dns snooper timeout to avoid spinning in pcap
The previous code seems to be relying on a 64-bit to 32-bit conversion
working in a certain way; when gopacket was changed to cast the value
explicitly it starts returning immediately from pcap.
2019-09-20 14:32:53 +00:00
Akash Srivastava
4b6b12d2c8 Merge pull request #3688 from weaveworks/fix-testregdelete-flake
fix(test-flake): poll for result in TestRegistryDelete() to avoid race
2019-09-19 16:39:50 +05:30
Bryan Boreham
49dfd98c94 fix(test-flake): poll for result in TestRegistryDelete() to avoid race
Remove the `runtime.Gosched()` that doesn't guarantee anything.
2019-09-18 21:32:44 +00:00
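For illustration, a polling helper of the kind this fix describes might look like the sketch below. The `poll` function and its parameters are assumptions; the test in the commit may be written differently. The point is that `runtime.Gosched()` only yields the processor and gives no guarantee that another goroutine has finished, while polling with a deadline waits for the observable result.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// poll calls cond repeatedly until it returns true or the timeout
// expires, sleeping for interval between attempts.
func poll(timeout, interval time.Duration, cond func() bool) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	var deleted int32
	go func() {
		// Stand-in for asynchronous work such as a registry delete.
		time.Sleep(20 * time.Millisecond)
		atomic.StoreInt32(&deleted, 1)
	}()
	ok := poll(time.Second, time.Millisecond, func() bool {
		return atomic.LoadInt32(&deleted) == 1
	})
	fmt.Println("condition met:", ok)
}
```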
Bryan Boreham
71a359e1d7 Merge pull request #3679 from weaveworks/probe-unsafe-merge
perf(probe): reduce copying of nodes
2019-09-18 15:56:06 +01:00
Bryan Boreham
b6d5594f9f perf(probe): publish delta reports to reduce data size
Similar to video compression which uses key-frames and differences
between them: every N publishes we send a full report, but in between
we only send what has changed.

Fairly simple approach in the probe - hold on to the last full report,
and for the deltas remove anything that would be merged in from the
full report.

On the receiving side in the app it already merges a set of reports
together to produce the final output for rendering, so provided N is
smaller than that set we don't need to do anything different.

Deltas don't need to represent nodes that have disappeared - an
earlier full report will have that node so it would be merged into the
final output anyway.
2019-09-18 08:00:28 +00:00
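The key-frame scheme described in the commit above can be sketched as follows. This is a much-simplified illustration: the `report` type (node ID mapped to an opaque string), `deltaPublisher` and `fullEvery` are hypothetical and do not reflect Scope's actual report structure.

```go
package main

import "fmt"

// report is a hypothetical stand-in for a probe report:
// node ID mapped to an opaque description of that node.
type report map[string]string

// deltaPublisher sends a full report every fullEvery publishes and
// only the differences from the last full report in between.
type deltaPublisher struct {
	fullEvery int
	count     int
	lastFull  report
}

// next returns what to send for the current report and whether it is full.
func (p *deltaPublisher) next(current report) (report, bool) {
	full := p.fullEvery <= 1 || p.count%p.fullEvery == 0
	p.count++
	if full {
		p.lastFull = current
		return current, true
	}
	// Drop anything the receiver would get by merging in the last full report.
	delta := report{}
	for id, node := range current {
		if prev, ok := p.lastFull[id]; !ok || prev != node {
			delta[id] = node
		}
	}
	return delta, false
}

func main() {
	p := &deltaPublisher{fullEvery: 3}
	out, full := p.next(report{"a": "up", "b": "up"})
	fmt.Println(full, len(out))
	out, full = p.next(report{"a": "up", "b": "down", "c": "up"})
	fmt.Println(full, len(out))
}
```

Nodes that have disappeared are simply absent from the delta, which matches the message's point that removals need no special representation.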
Bryan Boreham
eff5a1f9f7 Refactor: pull Publish() call up to publishLoop() 2019-09-18 08:00:28 +00:00
Bryan Boreham
a811afdba1 Merge pull request #3678 from weaveworks/nodes-omitempty
perf(probe): add 'omitempty' tag to Topology.Nodes
2019-09-17 16:25:52 +01:00
Akash Srivastava
0203757cf5 Merge pull request #3675 from weaveworks/reduce-probe-dependency
Stop render package depending on probe
2019-09-16 12:56:56 +05:30
Bryan Boreham
871751873b Stop render package depending on probe
This dependency makes it harder to see the structure of the program,
and sometimes complicates compilation.

Mostly just changing the source of strings that are already exported
from the report package.  A few new strings have to be moved there,
plus the function `IsPauseImageName()`.
2019-09-15 17:03:04 +00:00
Bryan Boreham
4c52889316 Add 'omitempty' tag to Topology.Nodes
So we save space writing out empty topologies.

Need to fix up `app_client_internal_test.go` to use Scope's
`test/reflect` package that understands empty==nil, so now it doesn't
need a previous workaround.

Remove a similar workaround in `probe_internal_test.go` that isn't
necessary since it's already using that package.
2019-09-15 15:50:08 +00:00
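The effect of the tag can be shown with a small sketch. It uses the standard `encoding/json` package purely for illustration, which is an assumption: the encoder Scope actually uses is not named in this log, and the `node`, `topologyPlain` and `topologyOmit` types are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// node is a hypothetical, minimal node type.
type node struct {
	ID string `json:"id"`
}

// topologyPlain always writes the nodes field.
type topologyPlain struct {
	Nodes map[string]node `json:"nodes"`
}

// topologyOmit skips the nodes field when the map is nil or empty.
type topologyOmit struct {
	Nodes map[string]node `json:"nodes,omitempty"`
}

// encode marshals v to a JSON string.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(topologyPlain{Nodes: map[string]node{}}))
	fmt.Println(encode(topologyOmit{Nodes: map[string]node{}}))
}
```

An empty map written with `omitempty` decodes back as a nil map, which is why the tests need a comparison that treats empty and nil as equal.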
Bryan Boreham
2f9c9913c4 perf(probe): reduce copying of nodes
Where we know we are merging several reports into the one, we can call
UnsafeMerge() and skip the copy that Merge() will do.
2019-09-15 15:40:28 +00:00
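A rough sketch of the difference between a copying merge and an in-place one. The `nodeSet` type and its lower-case methods are hypothetical; they only illustrate the trade-off and are not the `Merge()` and `UnsafeMerge()` implementations from the report package.

```go
package main

import "fmt"

// nodeSet is a hypothetical set of nodes: ID mapped to a timestamp.
type nodeSet map[string]int

// unsafeMerge merges other into s in place, keeping the newer entry.
// It is "unsafe" only in that it mutates the receiver.
func (s nodeSet) unsafeMerge(other nodeSet) {
	for id, ts := range other {
		if cur, ok := s[id]; !ok || ts > cur {
			s[id] = ts
		}
	}
}

// merge leaves both inputs untouched by copying s first.
func (s nodeSet) merge(other nodeSet) nodeSet {
	out := make(nodeSet, len(s)+len(other))
	for id, ts := range s {
		out[id] = ts
	}
	out.unsafeMerge(other)
	return out
}

func main() {
	reports := []nodeSet{{"a": 1}, {"a": 2, "b": 1}, {"c": 5}}
	// When folding many reports into one accumulator that nobody else
	// holds, the per-step copy made by merge is wasted work.
	acc := nodeSet{}
	for _, r := range reports {
		acc.unsafeMerge(r)
	}
	fmt.Println(len(acc), acc["a"])
}
```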
Bryan Boreham
48aad1a20d Remove unused string constants 2019-09-13 11:42:21 +00:00
Bryan Boreham
15467d7310 Move host-related names out of probe code
Reduce the dependency on low-level libraries
2019-09-13 11:41:09 +00:00
Bryan Boreham
5cba126c12 Merge pull request #3600 from weaveworks/expose-probe-metrics
Expose probe metrics to Prometheus
2019-08-20 14:35:06 +01:00
Bryan Boreham
eba9f31f3f fix(probe): restart conntrack handler periodically to clear out data
We observe a slow increase in connections reported, and are unable to
find the root cause, so clear down the data every six hours and start
from a clean sheet.
2019-08-13 16:30:56 +00:00
Bryan Boreham
6e715d2697 fix(probe): Loosen ebpf parameters to reduce restarts
Delay kernel events by up to 0.2ms, to reduce the chance the ebpf
reporter sends them out-of-order, and allow out-of-order events to
happen up to once a minute without giving up on the ebpf reporter.
2019-08-13 16:17:23 +00:00
Bryan Boreham
609d9a7506 Merge pull request #3363 from princerachit/format-error-log
refactor(logs): Add reporter name to error logs.
2019-07-31 14:51:30 +01:00