* Rework Scope metrics according to Prometheus conventions.
- counters should end with _total
- elaborated and added units to help strings
- recommended for cache hit/miss metrics: track only the total and the
hits, in separate metrics, since the most common query will be
"hits / total"
- track all times in seconds (base units), which has become the standard
recommendation
- other small changes
There could be more changes that would require more thinking (what
dimensions to use, summaries vs. histograms, etc.), but this is probably
enough controversial material already :)
* Use timeRequestStatus() in sqs_control_router.go.
* Add in-process caching to DynamoDB collector
* Add metrics for DynamoDB consumed capacity and report size
* Log and return errors during report collection
* Increase compression to the max
* Put reports in S3 and just use DynamoDB as an index.
* Review feedback
Fix a few bugs in the Consul pipe router:
- Don't share a pointer
- Write nil to pipe when closing a bridge connection to ensure the connection shuts down.
- Ensure we shut down bridge connections correctly
This is because the key is of the form "<userid>-<hour bucket>", but as I was testing without a userid, I didn't notice that "-<hour bucket>" was a valid number.
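The pitfall can be sketched as follows, assuming the key is built by plain
string concatenation. reportKey and looksNumeric are hypothetical helpers
written for illustration, not functions from the collector.

```go
package main

import (
	"fmt"
	"strconv"
)

// reportKey builds a key of the form "<userid>-<hour bucket>".
// Hypothetical helper; the real key construction may differ.
func reportKey(userID string, hourBucket int64) string {
	return userID + "-" + strconv.FormatInt(hourBucket, 10)
}

// looksNumeric reports whether the whole key parses as a base-10 integer.
func looksNumeric(key string) bool {
	_, err := strconv.ParseInt(key, 10, 64)
	return err == nil
}

func main() {
	// With an empty userid the key is "-405000", which is a valid
	// negative number, so the problem stays hidden.
	fmt.Println(looksNumeric(reportKey("", 405000))) // true

	// With a real userid the key is "user1-405000", which is not a number.
	fmt.Println(looksNumeric(reportKey("user1", 405000))) // false
}
```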
Add DynamoDB-based collector
- Store compressed reports in DynamoDB
Add SQS-based control router.
- Uses a queue per probe and a queue per UI for control requests & responses.
Add Consul-based, horizontally-scalable, multi-tenant pipe router.
- Uses Consul to coordinate each end of pipe connections across replicas of a pipe service.