* Pass Go context down to Renderers
This is useful for cancellation or tracing.
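A minimal sketch of what passing the context down can look like. The
Renderer interface, prefixRenderer, renderChain and demoRender below are
hypothetical, simplified stand-ins for illustration, not the real Scope
types:

```go
package main

import (
	"context"
	"errors"
)

// Renderer is a hypothetical, simplified stand-in for the real renderer
// interface: the context is passed as the first argument.
type Renderer interface {
	Render(ctx context.Context, nodes []string) ([]string, error)
}

// prefixRenderer is an illustrative Renderer that prefixes every node ID.
type prefixRenderer struct{ prefix string }

func (p prefixRenderer) Render(ctx context.Context, nodes []string) ([]string, error) {
	out := make([]string, 0, len(nodes))
	for _, n := range nodes {
		// Stop early if the caller cancelled the request.
		if err := ctx.Err(); err != nil {
			return nil, err
		}
		out = append(out, p.prefix+n)
	}
	return out, nil
}

// renderChain passes the same context down through each renderer in turn.
func renderChain(ctx context.Context, nodes []string, rs ...Renderer) ([]string, error) {
	for _, r := range rs {
		var err error
		if nodes, err = r.Render(ctx, nodes); err != nil {
			return nil, err
		}
	}
	return nodes, nil
}

// demoRender shows one successful call and one call with a cancelled context.
func demoRender() (rendered []string, cancelled bool) {
	ctx, cancel := context.WithCancel(context.Background())
	rendered, _ = renderChain(ctx, []string{"a", "b"},
		prefixRenderer{prefix: "x:"}, prefixRenderer{prefix: "y:"})
	cancel()
	_, err := renderChain(ctx, []string{"a"}, prefixRenderer{prefix: "x:"})
	return rendered, errors.Is(err, context.Canceled)
}
```

Because every renderer receives the caller's context, one cancellation
stops the whole chain, and a tracing span stored in the context would be
visible at every level.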
* Add tracing spans to app
Also log things like the number of nodes in Map and the total number of reports.
A lot of time could pass between recording the request count and the hit
count pertaining to a particular report fetching batch, which skewed
cache hit ratio calculations.
Fix that by deferring the request count recording to the end, which is
when we record the hit count.
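The idea can be sketched as follows. cacheStats and fetchBatch are
hypothetical names, and a plain map stands in for the real cache; the
point is only that both counts for a batch are recorded in one step at
the end:

```go
package main

import "sync"

// cacheStats is a hypothetical stand-in for the real request/hit counters.
type cacheStats struct {
	mu       sync.Mutex
	requests int
	hits     int
}

// record adds the counts of one completed batch under a single lock.
func (s *cacheStats) record(requests, hits int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.requests += requests
	s.hits += hits
}

// snapshot returns the current totals.
func (s *cacheStats) snapshot() (requests, hits int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.requests, s.hits
}

// fetchBatch looks up keys in cache and reports which ones are missing.
func fetchBatch(stats *cacheStats, cache map[string]string, keys []string) (map[string]string, []string) {
	found := make(map[string]string, len(keys))
	var missing []string
	hits := 0
	for _, k := range keys {
		if v, ok := cache[k]; ok {
			found[k] = v
			hits++
		} else {
			missing = append(missing, k)
		}
	}
	// Record requests and hits together, once the batch has completed, so
	// that the totals never include a batch's requests without its hits.
	stats.record(len(keys), hits)
	return found, missing
}
```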
Problem: Decoding a corrupt report grows the 'missing' list. Since we
are waiting for 'len(keys)-len(missing)' decoder goroutines, this
results in waiting for fewer goroutines than we should. The surplus
goroutines leak and we ignore their reports. And since the keys of the
ignored reports are not included in 'missing', we won't attempt to fetch
them from S3 either. Oops.
Fix: calculate the number of goroutines once, at the beginning.
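A simplified sketch of the fixed pattern. decode, result and decodeFound
are hypothetical names, with strings standing in for reports; the number
of goroutines to wait for is fixed before 'missing' can grow:

```go
package main

import "errors"

// result carries the outcome of decoding one cached report.
type result struct {
	key   string
	value string
	err   error
}

// decode is a hypothetical decoder; an empty payload counts as corrupt.
func decode(raw string) (string, error) {
	if raw == "" {
		return "", errors.New("corrupt report")
	}
	return "decoded:" + raw, nil
}

// decodeFound decodes all cached payloads concurrently and returns the
// decoded reports plus the keys that still have to be fetched elsewhere.
func decodeFound(found map[string]string, missing []string) (map[string]string, []string) {
	// Copy so that appending below never touches the caller's slice.
	stillMissing := append([]string(nil), missing...)

	// Count the goroutines once, at the beginning. Deriving this number
	// from the length of the missing list inside the loop would shrink it
	// every time a corrupt report is appended.
	pending := len(found)
	ch := make(chan result, pending)
	for key, raw := range found {
		go func(key, raw string) {
			value, err := decode(raw)
			ch <- result{key: key, value: value, err: err}
		}(key, raw)
	}

	reports := make(map[string]string, pending)
	for i := 0; i < pending; i++ {
		r := <-ch
		if r.err != nil {
			// Treat a corrupt report as missing so it gets refetched.
			stillMissing = append(stillMissing, r.key)
			continue
		}
		reports[r.key] = r.value
	}
	return reports, stillMissing
}
```

With the count fixed up front, every goroutine's result is received, so
none leak and every corrupt key ends up in the missing list.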
* Rework Scope metrics according to Prometheus conventions.
- counters should end with _total
- elaborated and added units to help strings
- recommended for cache hit/miss metrics: track only the total and the
hits, in separate metrics, since the most common query will be
"hits / total"
- track all times in seconds (base units), which has become the standard
recommendation
- other small changes
There could be more changes that would require more thinking (what
dimensions to use, summaries vs. histograms, etc.), but this is probably
enough controversial material already :)
* Use timeRequestStatus() in sqs_control_router.go.