troubleshoot

mirror of https://github.com/replicatedhq/troubleshoot.git synced 2026-04-15 07:16:34 +00:00

Author	SHA1	Message	Date
Evans Mungai	e6aff48f1b	feat: Prompt for privileged user if host collectors present in spec (#1513 ) * feat: Prompt for privileged user if host collectors present * Prompt preflight checks that have host collectors * Show cursor before prompting	2024-03-28 11:51:19 +00:00
Evans Mungai	a86b5c4441	chore(preflight): Better error message when not results found (#1397 ) code(preflight): Better error message when not results found	2023-12-20 16:10:47 +13:00
Evans Mungai	15a4802cd2	feat: Add dry run flag to print support bundle specs to std out (#1337 ) * Add dry-run flag * No traces on dry run * More refactoring * More updates to support bundle binary * More refactoring changes * Different approach of loading specs from URIs * Self review * More changes after review and testing * fix how we parse oci image uri * Remove unnecessary comment * Add missing file * Fix failing tests * Better error check for no collectors * Add default collectors when parsing support bundle specs * Add missed test fixture * Download specs with correct headers * Fix typo	2023-10-10 18:43:32 +01:00
Evans Mungai	b9f4fc4390	feat: Dry run flag to print preflight specs to std out (#1240 )	2023-09-12 14:42:10 +01:00
Evans Mungai	ff03bfa9cd	chore: make spec loaders internal APIs (#1313 ) * chore: make specs an internal package * Some minor improvements * Use LoadClusterSpecs in support bundle implementation * Remove change accidentally committed * Use LoadFromCLIArgs in preflight CLI implementation * Update comment * Fix edge case where the label selector is an empty string * Fix failing test	2023-08-30 14:02:30 +01:00
Pavan Sokke Nagaraj	39314ef200	chore: export error ErrInsufficientPermissionsToRun and func ShowTextResultsStructured (#1297 ) * chore: move ErrInsufficientPermissions to collect * chore: export func ShowTextResultsStructured * chore: rename to ErrInsufficientPermissionsToRun	2023-08-08 14:17:15 -04:00
Dan Jones	8237ac991c	fix: fixes #1270 (#1271 ) * fix: fixes #1270	2023-07-20 13:08:31 +12:00
Dexter Yan	784918e7ee	feat(preflight): adding warning message when validating the content of preflight and host preflight spec (#1250 )	2023-07-06 16:10:16 +12:00
Dexter Yan	f0efbf658a	fix(message): solve the terminal UI issue of truncating the message if it is long (#1242 )	2023-06-28 11:06:15 +12:00
Xav Paice	2c6b1869e2	feat: add test for HostPreflight spec read (#1227 )	2023-06-22 14:56:19 +01:00
Evans Mungai	401dfe2c57	feat: add loader APIs to load specs from raw troubleshoot spec (#1202 ) * feat: add loader APIs to load specs from a list of yaml docs The change introduces a loader package that will contain loader public APIs. The aim of these APIs will be to, given any source of troubleshoot specs, the loaders will fetch the specs and parse out all troubleshoot objects that can be extracted. * Some refactoring * Some more changes * More changes caught when testing vendor portal * Add tests and rename Troubleshoot kinds struct * Additional test * Handle ConfigMap and Secrets with multiple specs in them * Fix failing test * Revert multidoc split implementation * Fix merge conflict * Change LoadFromXXX functions to a single LoadSpecs function	2023-06-06 16:48:29 -04:00
Xav Paice	a3b7975690	Update the preflight secret label to troubleshoot.sh/kind (#1204 ) Partial-fix: #1070 Changes the default label for preflights to troubleshoot.sh/kind: preflight	2023-06-06 07:19:49 +12:00
Nathan Sullivan	6de79afc35	Search stdin for secrets with preflight specs (#1153 ) * we can now read preflight specs out of secrets, either from stdin or file input * moved spec read logic out into its own function so it can be unit tested easier * added more comprehensive unit testing on the different ways we can read in specs	2023-05-16 11:44:54 +10:00
Nathan Sullivan	3548b46cfc	support multiple exit codes based on what went wrong/right (#1135 ) 0 = all passed, 3 = at least one failure, 4 = no failures but at least 1 warn 1 as a catch all (generic errors), 2 for invalid input/specs etc ref https://github.com/replicatedhq/troubleshoot/issues/1131 docs https://github.com/replicatedhq/troubleshoot.sh/pull/489	2023-05-10 09:33:13 +10:00
danj-replicated	f692635054	Add stdin and multidoc support to preflight. (#1114 ) * add - url keyword for stdin * add basic multidoc support * filter on preflight kind * add e2e test for stdin	2023-04-14 11:21:45 +12:00
Evans Mungai	546ffde14b	feat: use klog as the default logging library (#1008 )	2023-02-24 18:24:51 +00:00
Evans Mungai	100f9a13b6	feat: Record summary of execution times of support bundle operations (collect/redact/analyse) (#935 ) When running a support bundle, we want to know how long each operation (collect, redact, analyze) takes. This commit adds a new trace exporter that records the start and end times of each operation, and then prints a summary of the execution. The summary is also stored in the support bundle. Related to #923	2023-02-07 09:50:21 +00:00
Diamon Wiggins	4fca6aff98	Deduplication for In-Cluster Collectors (#972 ) * adding dedup for in cluster collectors * add tests * return collector as is whenever marshalling to json fails --------- Co-authored-by: Evans Mungai <evans@replicated.com>	2023-02-01 14:14:43 -05:00
yunju.lly	0f6e6335fb	fix: address runtime error of nil pointer when concatenating preflight specs (#998 ) fix: address runtime error of nil pointer when concatenating preflight spec with hostpreflight spec in preflight run.go	2023-02-01 12:36:15 +00:00
Nathan Sullivan	827c49ca00	adding test coverage for preflight.RunPreflights() (#949 ) * adding test coverage for preflight.RunPreflights() TDD to work on https://github.com/replicatedhq/troubleshoot/issues/906 and verify the fix is successful * go.mod/go.sum: removing gnomock stuff since it's not in use (yet) * Makefile: try running the preflight integration test with the e2e tests, since there's a K3s instance in place already * Makefile add a dedicated test-integration task, which runs as it's own github action job * Makefile: exclude a few things from test-integration that break the github action job * WIP on preflight tests, addressing some of @banjoh's feedback, more to go though (specifically changing over to using assert) * preflight tests: use the testify libraries, restructure code to be formatted more like other tests in this project	2023-01-13 08:22:57 +10:00
Nathan Sullivan	de0371053a	preflight: ensure --output produces an output file of the desired format (#951 )	2023-01-13 07:55:47 +10:00
Nathan Sullivan	87c153cc8c	preflight: add yaml output format (#940 ) * preflight: add yaml output format ref https://github.com/replicatedhq/troubleshoot/issues/905	2023-01-04 14:27:00 +13:00
Nathan Sullivan	d73d5c6a3a	preflight: fix segfault when collector's are not defined in YAML (#939 ) * preflight: fix segfault when collector's are not defined in YAML * fix bug with kind: Preflight specs with uploadResultsTo, wrong variable being used :) ref https://github.com/replicatedhq/troubleshoot/pull/894 Co-authored-by: Evans Mungai <evans@replicated.com>	2023-01-03 14:01:49 -04:00
Evans Mungai	ebeed77287	chore: Upgrade gopsutil to v3 (#927 ) * Add host collector tests related to gopsutil upgrade * Upgrade gopsutil to v3	2022-12-24 13:42:13 +13:00
Craig O'Donnell	bc6528908f	fix: collect rbac permissions error (#928 )	2022-12-24 09:13:38 +13:00
Dexter Yan	be26462c19	feat(cluster_resources): increase default client burst and qps (#920 ) * feat(collect): add client burst and qps	2022-12-22 09:49:42 +13:00
Diamon Wiggins	f2be6f5829	Allow Preflight CLI to consume multiple specs as input (#894 ) To keep both the Support Bundle and Preflight CLIs similar, this PR adds the ability for the Preflight binary to allow multiple specs be provided as CLI args and for them all to be run.	2022-12-14 14:50:01 -04:00
Diamon Wiggins	a4c4b24056	Deduplication for Cluster Resources Collector (#832 ) * add dedup for cluster resources collector * restructure both collect.go in both pkg/supportbundle and pkg/preflight to be more similar for eventual refactor	2022-12-07 15:10:31 -04:00
Dexter Yan	7e3a59cfc0	feat(analyze): add ExcludeFiles field to textAnazlye (#867 ) * feat(analyze): add ExcludeFiles field to textAnazlye * feat(analyze): fix test for getFiles * feat(analyze): change function name to excludeFilePaths * feat(analyze): fix preflight test fail * feat(analyze): add tests for excludeFiles * feat(schemas): run make schemas * feat(analyze): use getChildCollectedFileContents function prototype * feat(analyze): reduce time complexity * feat(longhorn): add getFileContents as getCollectedFileContents	2022-11-28 10:45:10 +13:00
Dexter Yan	78bcafe489	fix(flag): fix wrong output filename (#834 ) * fix(flag): fix wrong output filename * fix(flag): add reset flag function * fix(flag): add output flag test cases * fix(flag): move resetFlags function into private go test * fix(flag): restructure flag tests with testify * fix(flag): remove resetFlags function * fix(flag): remove duplicated test and rewrite test names	2022-11-17 14:38:01 +13:00
Xav Paice	3513eeca19	Ensure clusterResources is added prior to other collectors (#768 ) This change ensures that the clusterResources collector runs prior to any others in order to not collect info on pods that collectors run during collection. Additionally centralizes functions that are common to all collection to make future maintenance simpler. Fixes: #767	2022-11-01 12:16:01 +13:00
Diamon Wiggins	bcaaa9e59a	Fix Preflight CheckRBAC (#776 ) * return collect result instead of nil	2022-10-13 12:54:40 +13:00
stefanrepl	9c986a74a6	make runPreflight and preflight cli flags public (#769 )	2022-10-10 16:34:54 -06:00
Diamon Wiggins	c7b84ad1e5	Refactor in-clusters collectors to use struct per collector (#670 ) refactor in-clusters collectors to use struct per collector	2022-10-03 13:53:05 -04:00
Xav Paice	f06201e050	Small typo fix in collect.go	2022-08-02 14:36:34 +12:00
Edgar Lanting	1e2e7e9aee	Update analyze.go - fix typo Fixed a typo in the comments: `analysze` -> `analyze`	2022-06-15 16:41:05 +02:00
Kira Boyle	5e7bd06fcb	do not return that a strict analyzer is present in an application if a strict analyzer is excluded	2022-06-14 10:23:42 -07:00
diamonwiggins	17fe3db79f	adding host collectors to support bundles	2022-05-11 22:50:03 +00:00
Ethan Mosbaugh	2c9a37a4f1	BoolOrString pollutes marshalling, does not respect omitempty (#566 ) * BoolOrString pollutes marshalling, does not respect omitempty * fix panic	2022-05-05 16:10:05 -07:00
Pavan Sokke Nagaraj	77a2475bd2	fix: return true when one of analyzers strict field is true (#552 ) * fix: return true when one of analyzers is true * fix: return true when one of analyzers is true * update: add unit test * update: log errors while processing analyzers * fix: return parse error instead of logging * fix: update err message * fix: evaluate strict check err separately	2022-03-23 20:24:21 -04:00
Pavan Sokke Nagaraj	7bfb54360c	map strict flag to analyze result (#551 ) * add util functions * rename func HasStrictAnalyzersFailed * map strict flag to analyze err result	2022-03-22 20:47:50 -04:00
Pavan Sokke Nagaraj	942234da80	Add `strict` flag to Analyzers and ResultAnalyzers (#539 ) * add strict flag to Analyzer/AnalyzerMeta and regenerate schemas and controller-gen code * map analyzer strict to result * Update stdout for human and json format * fix review comment * update interactive result * update interactive results * Update types.go * Update upload_results.go * print strict when only true	2022-02-23 15:07:51 -05:00
Jeff Golden	a818417e8c	include atomic collector statuses (#534 )	2022-02-10 11:27:13 -06:00
Salah Aldeen Al Saleh	7425f583fc	Don't include any default host collectors (#524 )	2022-01-10 16:19:56 -08:00
Simon Croome	977fc438ea	Remote host collectors (#392 )

* Add collect command and remote host collectors

Adds the ability to run a host collector on a set of remote k8s nodes. Target nodes can be filtered using the --selector flag, with the same syntax as kubectl. Existing flags for --collector-image, --collector-pullpolicy and --request-timeout are used. To run on a specified node, --selector="kubernetes.io/hostname=kind-worker2" could be used.

The collect command is used by the remote collector to output the results using a "raw" format, which uses the filename as the key, and the value the output as a escaped json string. When run manually it defaults to fully decoded json. The existing block devices, ipv4interfaces and services host collectors don't decode properly - the fix is to convert their slice output to a map (fix not included as unsure what depends on the existing format).

The collect command is also useful for troubleshooting preflight issues. Examples are included to show remote collector usage.

```
bin/collect --collector-image=croomes/troubleshoot:latest examples/collect/remote/memory.yaml --namespace test
{
  "kind-control-plane": {
    "system/memory.json": {
      "total": 1304207360
    }
  },
  "kind-worker": {
    "system/memory.json": {
      "total": 1695780864
    }
  },
  "kind-worker2": {
    "system/memory.json": {
      "total": 1726353408
    }
  }
}
```

The preflight command has been updated to run remote collectors. To run a host collector remotely it must be specified in the spec as a `remoteCollector`:

```
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: memory
spec:
  remoteCollectors:
    - memory:
        collectorName: memory
  analyzers:
    - memory:
        outcomes:
          - fail:
              when: "< 8Gi"
              message: At least 8Gi of memory is required
          - warn:
              when: "< 32Gi"
              message: At least 32Gi of memory is recommended
          - pass:
              message: The system has as sufficient memory
```

Results for each node are analyzed separately, with the node name appended to the title:

```
bin/preflight --interactive=false --collector-image=croomes/troubleshoot:latest examples/preflight/remote/memory.yaml --format=json
{memory running 0 1}
{memory completed 1 1}
{
  "fail": [
    {
      "title": "Amount of Memory (kind-worker2)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-worker)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-control-plane)",
      "message": "At least 8Gi of memory is required"
    }
  ]
}
```

Also added a host collector to allow preflight checks of required kernel modules, which is the main driver for this change.	2021-10-06 09:03:53 -05:00
John Murphy	e0f6cab5b3	Fix removes control characters from non interactive preflight runs (#394 )	2021-07-23 09:46:36 -05:00
emosbaugh	8dcfa9886d	Copy from host collector (#391 ) * Copy from host collector * namespace improvements * better support for multiple nodes	2021-07-22 12:25:59 -07:00
emosbaugh	39350b5722	ConfigMap collector and secrets can be collected by selectors (#384 ) * ConfigMap collector and secrets can be collected by selectors * follow docs * Pass context and kubernetes client to collectors * collect tests * analyze tests * fix tests * improvements	2021-07-08 16:30:26 -07:00
Ethan Mosbaugh	9357d5ac96	Include result if not nil regardless of error	2021-04-28 02:58:59 +00:00
divolgin	62afc87af8	Add progress percentage	2021-03-18 22:29:27 +00:00
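
The exit codes listed in commit 3548b46cfc (#1135) above can be read as a small decision rule: any failure wins over a warning, and a warning wins over a clean pass. The Go sketch below is illustrative only and is not taken from the troubleshoot codebase. The `Outcome` type, the constant names, and the `exitCodeFor` function are hypothetical; only the numeric values (0, 1, 2, 3, 4) and their meanings come from the commit message. Treating an empty result list as "all passed" is also an assumption of this sketch.

```go
package main

import "fmt"

// Outcome is a hypothetical summary of a single preflight analyzer result.
type Outcome int

const (
	Pass Outcome = iota
	Warn
	Fail
)

// Exit codes as described in the commit message for #1135.
// The constant names are made up for this sketch.
const (
	ExitAllPassed    = 0 // all checks passed
	ExitGenericError = 1 // catch-all for generic errors
	ExitInvalidInput = 2 // invalid input or specs
	ExitFailure      = 3 // at least one failure
	ExitWarnOnly     = 4 // no failures, but at least one warning
)

// exitCodeFor maps a list of outcomes to an exit code.
// A single Fail returns ExitFailure immediately; otherwise any
// Warn yields ExitWarnOnly, and everything else is ExitAllPassed.
func exitCodeFor(outcomes []Outcome) int {
	code := ExitAllPassed
	for _, o := range outcomes {
		switch o {
		case Fail:
			return ExitFailure
		case Warn:
			code = ExitWarnOnly
		}
	}
	return code
}

func main() {
	fmt.Println(exitCodeFor([]Outcome{Pass, Warn, Fail})) // prints 3
	fmt.Println(exitCodeFor([]Outcome{Pass, Warn}))       // prints 4
	fmt.Println(exitCodeFor([]Outcome{Pass, Pass}))       // prints 0
}
```

Codes 1 and 2 are declared for completeness but not returned here, since they describe errors raised before any analyzer outcome exists (generic errors and invalid specs).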


68 Commits