Bryan Boreham
bcdf2caa61
Make lint happy
...
Mostly new-style Go build directives
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2022-03-14 18:52:10 +00:00
Bryan Boreham
aa0aab4094
multitenant: only count real hosts for billing
...
PR #3822 made the kubernetes probe send host nodes, so we need to
exclude them from counting when billing.
2021-05-20 14:29:39 +00:00
Bryan Boreham
f4bc57b1fc
fix: copy report before modifying
...
We call `UnsafeRemovePartMergedNodes()` which modifies the data,
so all implementations of Report() must ensure they return a new object,
not one which is cached or shared across goroutines.
2021-05-18 09:33:29 +00:00
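The copy-before-modify rule in the commit above can be sketched as follows. This is a minimal illustration, not the project's code: `Report` here is a hypothetical, much-reduced stand-in, and `cachingReporter` and `Copy()` are assumed names. The `delete` call stands in for a mutating call such as `UnsafeRemovePartMergedNodes()`.

```go
package main

import (
	"fmt"
	"sync"
)

// Report is a hypothetical, much-reduced stand-in for a real report.
type Report struct {
	Nodes map[string]string
}

// Copy returns a deep copy, so the caller may mutate the result freely.
func (r Report) Copy() Report {
	nodes := make(map[string]string, len(r.Nodes))
	for id, v := range r.Nodes {
		nodes[id] = v
	}
	return Report{Nodes: nodes}
}

// cachingReporter (assumed name) holds one report shared by many goroutines.
type cachingReporter struct {
	mtx    sync.Mutex
	cached Report
}

// Report hands out a fresh copy, never the cached value itself.
func (c *cachingReporter) Report() Report {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	return c.cached.Copy()
}

func main() {
	c := &cachingReporter{cached: Report{Nodes: map[string]string{
		"a": "host",
		"b": "part-merged",
	}}}
	r := c.Report()
	delete(r.Nodes, "b") // stands in for a mutating call on the returned report
	fmt.Println(len(r.Nodes), len(c.Report().Nodes))
}
```

If `Report()` returned `c.cached` directly, the `delete` would also change what every later caller sees, because Go maps are reference types.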
Bryan Boreham
1b0979057c
multitenant: add metrics for reports received
...
Previously we were reporting data on reports stored
2021-05-14 11:13:10 +00:00
Bryan Boreham
2dcb319ef5
Refactor: move collector metrics where they are used
2021-05-11 16:16:08 +00:00
Bryan Boreham
5833259963
Register liveCollector metrics in init
...
Otherwise they don't get registered for the awsCollector
2021-05-11 16:10:16 +00:00
Bryan Boreham
b3a40b7453
Add multitenant-live collector
...
For when we want to collect reports in memory, but not save them to store.
Extract this functionality out of awsCollector to create new
liveCollector object.
2021-05-11 11:45:55 +00:00
Bryan Boreham
2639a1309c
Re-order imports to match convention
2021-05-11 11:15:54 +00:00
Bryan Boreham
33acfa1e59
Log part-merged nodes dropped to tracing
...
So we have a better idea of what happened if there is an issue.
2021-04-23 10:51:24 +00:00
Bryan Boreham
4dbf908cde
Remove partially merged nodes from deltas
...
Scope probes send full reports and deltas. If a node is eliminated
between two full reports, then the app might only have a delta of its
last state. Remove all such nodes before rendering.
2021-04-23 09:38:45 +00:00
Bryan Boreham
ced99f5008
multitenant: serialise report to buffer before sending
...
Seems to be faster
2021-04-18 19:25:13 +00:00
Bryan Boreham
5856f372db
multitenant: resolve collectors less frequently
...
DNS records don't change that fast
2021-04-18 19:25:13 +00:00
Bryan Boreham
9b62023266
Do REST calls to collectors in parallel
2021-04-18 19:25:13 +00:00
Bryan Boreham
99582ba835
Implement HasReports for live data from collectors
2021-04-18 19:25:13 +00:00
Bryan Boreham
055ca53241
refactor: extract fn to check whether collector or querier
2021-04-18 19:25:13 +00:00
Bryan Boreham
5032cca5c0
Multitenant mode: fetch live data from collectors
...
Collectors hold recent reports in memory.
When querier needs 'live' data, fetch it from collectors instead
of from the long-term store.
Send reports from collector to querier in msgpack; disable compression
on REST call, otherwise Go silently decompresses, which takes longer.
2021-04-18 19:25:13 +00:00
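The compression remark in the commit above refers to Go's `net/http` transport, which by default requests gzip and transparently decompresses the response. A minimal sketch of turning that off is shown below. The names `newCollectorClient` and `acceptEncodingSent` and the 10-second timeout are assumptions for illustration; only `http.Transport.DisableCompression` is the standard-library setting.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// newCollectorClient (assumed name) returns a client whose transport does not
// ask for gzip, so it never has to decompress the response body.
func newCollectorClient() *http.Client {
	return &http.Client{
		Timeout:   10 * time.Second, // arbitrary value for the sketch
		Transport: &http.Transport{DisableCompression: true},
	}
}

// acceptEncodingSent reports the Accept-Encoding header that a local test
// server receives from the given client.
func acceptEncodingSent(c *http.Client) (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get("Accept-Encoding")
	}))
	defer srv.Close()

	resp, err := c.Get(srv.URL)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	withDefault, err := acceptEncodingSent(&http.Client{})
	if err != nil {
		panic(err)
	}
	withoutGzip, err := acceptEncodingSent(newCollectorClient())
	if err != nil {
		panic(err)
	}
	fmt.Printf("default client: %q, collector client: %q\n", withDefault, withoutGzip)
}
```

The default client sends `Accept-Encoding: gzip` on its own, while the client built with `DisableCompression: true` sends no such header.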
Bryan Boreham
667daef81b
Refactor: extract function reportsFromStore()
...
To help clarify subsequent changes
2021-04-18 19:01:44 +00:00
Bryan Boreham
5d12b7ff65
Refactor: extract multitenant collection of 'live' reports
...
To help clarify subsequent changes
2021-04-18 19:01:44 +00:00
Bryan Boreham
b9c8cf6998
Add flag for querier to talk to collectors
2021-04-18 19:01:44 +00:00
Bryan Boreham
1eb57c2e40
Multitenant collector now always saves async
...
Removed support for saving all reports immediately
2021-04-18 18:59:12 +00:00
Bryan Boreham
f41b90a7d8
Clean up 'import' ordering
2021-04-04 13:47:27 +01:00
Bryan Boreham
2cf48f2bdd
Don't call Fatal() on background thread in test
...
It doesn't fail the test
2021-04-04 13:46:58 +01:00
Bryan Boreham
103ea2095f
Fix lint warnings in Go code
...
All cosmetic.
2020-12-30 18:30:34 +00:00
Bryan Boreham
18acfcefe1
Run go fmt on various files
...
Seems that go fmt has changed behaviour since these files were last
checked in. Changes are all cosmetic.
2020-12-30 18:30:34 +00:00
Bryan Boreham
e6faa2ba4b
Merge pull request #3796 from weaveworks/billing-spy-interval
...
billing: cope with spy-interval set longer than publish-interval
2020-06-11 14:12:37 +01:00
Bryan Boreham
5318498d9a
improvement: make command-line parsing more robust
2020-06-11 11:15:35 +00:00
Bryan Boreham
70240fc82d
billing: cope with spy-interval set longer than publish-interval
2020-06-11 11:15:35 +00:00
Bryan Boreham
5264b61951
improvement: stop rendering if Context is cancelled
...
Typically this means the http caller has closed the connection,
so no point responding to them.
Also check at the point we send a response back, and log to OpenTracing.
2020-06-11 11:13:38 +00:00
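The idea of abandoning work once the caller has gone away can be sketched with a loop that checks `ctx.Err()` before each step. This is an illustration under assumed names (`renderAll`, `renderUntilCancelled`), not the project's rendering code; the cancellation after a fixed number of items only simulates an HTTP client closing its connection.

```go
package main

import (
	"context"
	"fmt"
)

// renderAll processes items in order and gives up as soon as ctx is done.
// It returns how many items were rendered and the context error, if any.
func renderAll(ctx context.Context, items []string, render func(string)) (int, error) {
	for i, item := range items {
		if err := ctx.Err(); err != nil {
			return i, err
		}
		render(item)
	}
	return len(items), nil
}

// renderUntilCancelled cancels the context after `limit` items have been
// rendered, simulating a caller that hangs up part-way through.
func renderUntilCancelled(items []string, limit int) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	done := 0
	return renderAll(ctx, items, func(string) {
		done++
		if done == limit {
			cancel()
		}
	})
}

func main() {
	n, err := renderUntilCancelled([]string{"a", "b", "c", "d", "e"}, 2)
	fmt.Println(n, err)
}
```

In an HTTP handler the context would normally come from `r.Context()`, which the standard library cancels when the client's connection closes.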
Bryan Boreham
a20c51e94d
Higher limit on topology size for merged reports.
...
Where a report has been merged from several probes, give it a higher
limit before dropping topologies.
We will already have applied the limit on each single-probe report as
it came in, except for historical ones.
2020-05-25 13:12:47 +00:00
Bryan Boreham
fd65155cd6
Register the metric for dropped topologies
...
Missed earlier.
2020-05-25 13:12:27 +00:00
Bryan Boreham
b117b1a5ef
cosmetic: move misplaced import
2020-05-19 10:09:52 +00:00
Bryan Boreham
ad82fafde8
multitenant: scan container command-lines as well as process
2020-05-19 10:09:28 +00:00
Bryan Boreham
f83ad517d8
multitenant: extract command-line parsing function and add test
2020-05-19 10:07:23 +00:00
Bryan Boreham
323aa46d1c
fix (pipes): check websocket errors inside CopyToWebsocket()
...
Previously we were treating EOF on the reader as no-error, meaning
that operations like Kubernetes Describe would retry endlessly when
finished.
2020-05-06 10:04:40 +00:00
Bryan Boreham
fa4d1c4c2b
Upgrade reports before merging
...
In case they came from an older or an overloaded probe.
2020-04-16 19:27:40 +00:00
Bryan Boreham
b772fa83b3
Add a metric for topologies dropped because they are over limit
...
Need to modify DropTopologiesOver() to report what it dropped, and
plumb through the userid so the metric can show who has a problem.
2020-04-16 19:27:28 +00:00
Bryan Boreham
b1fc59819a
comment: clarify memcached error cases
2020-04-15 16:49:02 +00:00
Bryan Boreham
9a739fda46
Parallelise sending merged reports to store
...
Writes to DynamoDB and S3 can be done in parallel, which will reduce
the overall flush time.
2020-04-13 19:11:34 +00:00
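Running independent store writes concurrently, as the commit above describes, can be sketched with a `sync.WaitGroup`. The function `flushParallel` and the two writer closures are hypothetical; they stand in for the DynamoDB and S3 writes and make no real network calls.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errStore is a placeholder error used only by this sketch.
var errStore = errors.New("store unavailable")

// flushParallel runs every write concurrently, waits for all of them, and
// returns one of the errors encountered (nil if every write succeeded).
func flushParallel(writes ...func() error) error {
	var wg sync.WaitGroup
	errs := make(chan error, len(writes)) // buffered so no goroutine blocks
	for _, w := range writes {
		wg.Add(1)
		go func(w func() error) {
			defer wg.Done()
			if err := w(); err != nil {
				errs <- err
			}
		}(w)
	}
	wg.Wait()
	close(errs)
	return <-errs // yields nil when the closed channel is empty
}

func main() {
	writeIndex := func() error { return nil }     // stands in for an index write
	writeBlob := func() error { return errStore } // stands in for a blob write
	fmt.Println(flushParallel(writeIndex, writeBlob))
}
```

The total time is then roughly that of the slowest write rather than the sum of all writes.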
Bryan Boreham
2629d13780
Add a histogram for flush times
2020-04-13 19:11:34 +00:00
Bryan Boreham
ccf031b8a9
enhancement(multitenant): merge incoming reports in a time window
...
This means we store fewer, bigger, reports, which reduces cost of
storage and time to render when data is viewed.
2020-04-13 19:11:34 +00:00
Bryan Boreham
104b9cba50
refactor: Call Close() on collector
...
Doesn't do anything at present, but will be used later.
Change the signature on BillingEmitter.Close() to match. Note we didn't use the error returned.
2020-04-13 19:11:34 +00:00
Bryan Boreham
777ff07e19
refactor(multitenant): break report storage code out into sub-functions
...
So the main Add() function isn't so long.
2020-04-13 19:11:34 +00:00
Bryan Boreham
8c46367808
fix(multitenant): move use of rounding map inside lock
2020-04-13 16:16:20 +00:00
Bryan Boreham
3f11352435
enhancement(multitenant): Track rounding error in billing calculation
...
Billing takes an integer number of seconds, so keep track of the
amount lost to rounding when the publish interval is not an integer.
2020-04-10 19:04:50 +00:00
Bryan Boreham
c784acc20d
Revert change to use report timestamp
...
This reverts commit 6b72246fe6.
The app merges reports within a 15-second window of its own time, so
if one or more probes have a time that is several seconds different
they will get excluded from the window.
2020-03-28 13:58:34 +00:00
Bryan Boreham
6b72246fe6
fix (multitenant collector): Use consistent report timestamp
...
Previously the code called `time.Now()` in two different places so the
timestamps didn't match. Now we use the timestamp of the report itself.
Add the collector's local time to the report if it didn't have one.
2020-03-26 19:15:34 +00:00
Bryan Boreham
ba9ecdd9e2
Merge pull request #3752 from weaveworks/report-window
...
Set timestamp and window on each report
2020-03-11 21:12:02 +00:00
Bryan Boreham
53701aca1f
Cache the last-known report interval per user
...
Delta reports don't contain the string we are looking for, so remember
it from the last full report.
2020-03-06 18:03:51 +00:00
Bryan Boreham
a47cf0a2aa
Remove copying Merge() on Report
...
It was only used in a few places, and all of those were better off
using the Unsafe variant.
2020-03-06 15:03:43 +00:00
Bryan Boreham
329023b7c5
Improve calculation of usage in multitenant code
...
Use the duration supplied, if there is one.
It was looking for a process named "scope-probe", whereas the
executable is just named "scope".
2020-03-06 13:29:54 +00:00