This is important for two reasons:
* It prevents nasty false-equality bugs when two different services from different ECS clusters
are present in the same report
* It allows us to retrieve the cluster and service name - all the info we need to look up the service -
using only the node ID. This matters, for example, when trying to handle a control request.
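The steps above can be sketched roughly as follows. The separator and helper names here are illustrative, not Scope's actual ID scheme: the point is that encoding both cluster and service into the ID makes IDs from different clusters unequal, and makes the ID alone sufficient for lookup.

```go
package main

import (
	"fmt"
	"strings"
)

// makeServiceNodeID and parseServiceNodeID are hypothetical helpers
// demonstrating the idea; the real ID format may differ.
func makeServiceNodeID(cluster, service string) string {
	return fmt.Sprintf("%s;%s", cluster, service)
}

func parseServiceNodeID(id string) (cluster, service string, ok bool) {
	parts := strings.SplitN(id, ";", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	// Same service name in two clusters yields two distinct IDs.
	a := makeServiceNodeID("cluster-1", "frontend")
	b := makeServiceNodeID("cluster-2", "frontend")
	fmt.Println(a != b)

	// The ID alone is enough to recover cluster and service.
	c, s, ok := parseServiceNodeID(a)
	fmt.Println(c, s, ok)
}
```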
With net.netfilter.nf_conntrack_acct = 1, conntrack adds the following
fields in the output: packets=3 bytes=164
And with SELinux (e.g. Fedora), conntrack adds: secctx=...
The parsing with fmt.Sscanf introduced in #2095 was unfortunately
rejecting lines with those fields. This patch fixes that by switching
decodeFlowKeyValues() to more flexible parsing with FieldsFunc and SplitN.
Fixes #2117
Regression from #2095
The header checking code was unsafe because:
1. It was accessing the byteslice at [2] without ensuring a length >= 3
2. It was assuming that the indentation of the 'sl' header is always 2 spaces. That seems to hold in recent kernels (8f18e4d03e/net/ipv4/tcp_ipv4.c (L2304) and 8f18e4d03e/net/ipv6/tcp_ipv6.c (L1831)), but it's more robust to simply trim the byteslice.
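A minimal sketch of the safer check (function name is illustrative): trimming leading spaces and comparing a prefix avoids both problems at once, since it never indexes a fixed offset and makes no assumption about the exact indentation.

```go
package main

import (
	"bytes"
	"fmt"
)

// isTCPHeaderLine (hypothetical name) reports whether a line from
// /proc/net/tcp{,6} is the "sl ..." header row. Unlike indexing line[2],
// this cannot panic on short input, and it works for any indentation.
func isTCPHeaderLine(line []byte) bool {
	return bytes.HasPrefix(bytes.TrimLeft(line, " "), []byte("sl"))
}

func main() {
	fmt.Println(isTCPHeaderLine([]byte("  sl  local_address rem_address")))
	fmt.Println(isTCPHeaderLine([]byte("sl  local_address"))) // no-indent case
	fmt.Println(isTCPHeaderLine([]byte("")))                  // short input: no panic
}
```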
- describeServices wasn't describing the partial page left over at the end,
which would cause incorrect results
- the shim between listServices and describeServices was closing the channel every iteration,
which would cause panic for write to closed channel
- client was not being saved when created, so it was recreated on every call
- we were describeTasks'ing even if we had no tasks to describe
Due to AWS API rate limits, we need to minimize API calls as much as possible.
Our stated objectives:
* for all displayed tasks and services to have up-to-date metadata
* for all tasks to be mapped to services where possible
My approach here:
* Tasks only contain immutable fields (that we care about). We cache tasks forever.
We only DescribeTasks the first time we see a new task.
* We attempt to match tasks to services with what info we have. Any "referenced" services,
i.e. services with at least one matching task, need to be updated to refresh changing data.
* In the event that a task doesn't match any of the (updated) services, i.e. an entirely new service
needs to be found, we do a full list and detail of all services (we don't re-detail ones we just refreshed).
* To avoid unbounded memory usage, we evict tasks and services from the cache after 1 minute without use.
This should be long enough for things like temporary failures to be glossed over.
This gives us exactly one call per task, and one call per referenced service per report,
which is unavoidable to maintain fresh data. Expensive "describe all" service queries are kept
to only when newly-referenced services appear, which should be rare.
We could make a few very minor improvements here, such as trying to refresh unreferenced but known
services before doing a list query, or getting details one by one when "describing all" and stopping
when all matches have been found, but I believe these would produce very minor, if any, gains in
number of calls while having an unjustifiable effect on latency, since we wouldn't be able to issue
requests as concurrently.
Speaking of which, this change has a minor performance impact.
Even though we're now making fewer calls, we can't make them as concurrently.
Old code:
concurrently:
describe tasks (1 call)
sequentially:
list services (1 call)
describe services (N calls concurrently)
Assuming full concurrency, total latency: 2 end-to-end calls
New code (worst case):
sequentially:
describe tasks (1 call)
describe services (N calls concurrently)
list services (1 call)
describe services (N calls concurrently)
Assuming full concurrency, total latency: 4 end-to-end calls
In practical terms, I don't expect this to matter.