Commit Graph

68 Commits

Author SHA1 Message Date
Evans Mungai
e6aff48f1b feat: Prompt for privileged user if host collectors present in spec (#1513)
* feat: Prompt for privileged user if host collectors present

* Prompt preflight checks that have host collectors

* Show cursor before prompting
2024-03-28 11:51:19 +00:00
Evans Mungai
a86b5c4441 chore(preflight): Better error message when not results found (#1397)
code(preflight): Better error message when not results found
2023-12-20 16:10:47 +13:00
Evans Mungai
15a4802cd2 feat: Add dry run flag to print support bundle specs to std out (#1337)
* Add dry-run flag

* No traces on dry run

* More refactoring

* More updates to support bundle binary

* More refactoring changes

* Different approach of loading specs from URIs

* Self review

* More changes after review and testing

* fix how we parse oci image uri

* Remove unnecessary comment

* Add missing file

* Fix failing tests

* Better error check for no collectors

* Add default collectors when parsing support bundle specs

* Add missed test fixture

* Download specs with correct headers

* Fix typo
2023-10-10 18:43:32 +01:00
Evans Mungai
b9f4fc4390 feat: Dry run flag to print preflight specs to std out (#1240) 2023-09-12 14:42:10 +01:00
Evans Mungai
ff03bfa9cd chore: make spec loaders internal APIs (#1313)
* chore: make specs an internal package

* Some minor improvements

* Use LoadClusterSpecs in support bundle implementation

* Remove change accidentally committed

* Use LoadFromCLIArgs in preflight CLI implementation

* Update comment

* Fix edge case where the label selector is an empty string

* Fix failing test
2023-08-30 14:02:30 +01:00
Pavan Sokke Nagaraj
39314ef200 chore: export error ErrInsufficientPermissionsToRun and func ShowTextResultsStructured (#1297)
* chore: move ErrInsufficientPermissions to collect

* chore: export func ShowTextResultsStructured

* chore: rename to ErrInsufficientPermissionsToRun
2023-08-08 14:17:15 -04:00
Dan Jones
8237ac991c fix: fixes #1270 (#1271)
* fix: fixes #1270
2023-07-20 13:08:31 +12:00
Dexter Yan
784918e7ee feat(preflight): adding warning message when validating the content of preflight and host preflight spec (#1250) 2023-07-06 16:10:16 +12:00
Dexter Yan
f0efbf658a fix(message): solve the terminal UI issue of truncating the message if it is long (#1242) 2023-06-28 11:06:15 +12:00
Xav Paice
2c6b1869e2 feat: add test for HostPreflight spec read (#1227) 2023-06-22 14:56:19 +01:00
Evans Mungai
401dfe2c57 feat: add loader APIs to load specs from raw troubleshoot spec (#1202)
* feat: add loader APIs to load specs from a list of yaml docs

The change introduces a loader package that will contain the public
loader APIs. Given any source of troubleshoot specs, these loaders
fetch the specs and parse out all troubleshoot objects that can be
extracted.

* Some refactoring

* Some more changes

* More changes caught when testing vendor portal

* Add tests and rename Troubleshoot kinds struct

* Additional test

* Handle ConfigMap and Secrets with multiple specs in them

* Fix failing test

* Revert multidoc split implementation

* Fix merge conflict

* Change LoadFromXXX functions to a single LoadSpecs function
2023-06-06 16:48:29 -04:00
Xav Paice
a3b7975690 Update the preflight secret label to troubleshoot.sh/kind (#1204)
Partial-fix: #1070

Changes the default label for preflights to troubleshoot.sh/kind: preflight
2023-06-06 07:19:49 +12:00
Nathan Sullivan
6de79afc35 Search stdin for secrets with preflight specs (#1153)
* we can now read preflight specs out of secrets, either from stdin or file input

* moved spec read logic out into its own function so it can be unit
tested more easily

* added more comprehensive unit testing on the different ways we can read in specs
2023-05-16 11:44:54 +10:00
Nathan Sullivan
3548b46cfc support multiple exit codes based on what went wrong/right (#1135)
0 = all passed, 3 = at least one failure, 4 = no failures but at least 1 warn

1 as a catch all (generic errors), 2 for invalid input/specs etc

ref https://github.com/replicatedhq/troubleshoot/issues/1131

docs https://github.com/replicatedhq/troubleshoot.sh/pull/489
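
A minimal sketch of how the exit codes above could be derived from analysis results. The `Result` type and `exitCode` function are hypothetical names chosen for illustration, not the project's actual API. Codes 1 and 2 are assumed to be decided earlier (on generic errors or invalid input), so they are not produced here.

```go
package main

import "fmt"

// Result is a hypothetical, simplified analyzer outcome.
type Result struct {
	Title  string
	IsFail bool
	IsWarn bool
}

// exitCode maps results to the codes listed in the commit message:
// 0 = all passed, 3 = at least one failure, 4 = no failures but at least one warn.
func exitCode(results []Result) int {
	warned := false
	for _, r := range results {
		if r.IsFail {
			return 3 // a single failure decides the outcome
		}
		if r.IsWarn {
			warned = true
		}
	}
	if warned {
		return 4
	}
	return 0
}

func main() {
	results := []Result{{Title: "memory", IsWarn: true}, {Title: "disk"}}
	fmt.Println("exit code:", exitCode(results))
}
```

A failure takes precedence over any number of warnings, which matches the "no failures but at least 1 warn" wording for code 4.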
2023-05-10 09:33:13 +10:00
danj-replicated
f692635054 Add stdin and multidoc support to preflight. (#1114)
* add - url keyword for stdin
* add basic multidoc support
* filter on preflight kind
* add e2e test for stdin
2023-04-14 11:21:45 +12:00
Evans Mungai
546ffde14b feat: use klog as the default logging library (#1008) 2023-02-24 18:24:51 +00:00
Evans Mungai
100f9a13b6 feat: Record summary of execution times of support bundle operations (collect/redact/analyse) (#935)
When running a support bundle, we want to know how long each operation
(collect, redact, analyze) takes. This commit adds a new trace exporter
that records the start and end times of each operation, and then prints
a summary of the execution. The summary is also stored in the support
bundle.

Related to #923
2023-02-07 09:50:21 +00:00
Diamon Wiggins
4fca6aff98 Deduplication for In-Cluster Collectors (#972)
* adding dedup for in cluster collectors

* add tests

* return collector as is whenever marshalling to json fails
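
One way to read the bullets above is that each collector is marshalled to JSON and the result is used as a deduplication key, with a collector kept as is when marshalling fails. The sketch below follows that reading; the `Collector` type and `dedup` function are hypothetical and not taken from the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Collector is a hypothetical stand-in for a collector spec.
type Collector struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace,omitempty"`
}

// dedup drops collectors whose JSON encoding has already been seen.
// A collector that cannot be marshalled is kept as is, since no key
// can be computed for it.
func dedup(collectors []any) []any {
	seen := map[string]bool{}
	var out []any
	for _, c := range collectors {
		b, err := json.Marshal(c)
		if err != nil {
			out = append(out, c)
			continue
		}
		key := string(b)
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, c)
	}
	return out
}

func main() {
	in := []any{
		Collector{Name: "logs", Namespace: "default"},
		Collector{Name: "logs", Namespace: "default"},
		Collector{Name: "clusterInfo"},
	}
	fmt.Println(len(dedup(in)), "unique collectors")
}
```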

---------

Co-authored-by: Evans Mungai <evans@replicated.com>
2023-02-01 14:14:43 -05:00
yunju.lly
0f6e6335fb fix: address runtime error of nil pointer when concatenating preflight specs (#998)
fix: address runtime error of nil pointer when concatenating preflight spec with hostpreflight spec in preflight run.go
2023-02-01 12:36:15 +00:00
Nathan Sullivan
827c49ca00 adding test coverage for preflight.RunPreflights() (#949)
* adding test coverage for preflight.RunPreflights()

TDD to work on https://github.com/replicatedhq/troubleshoot/issues/906
and verify the fix is successful

* go.mod/go.sum: removing gnomock stuff since it's not in use (yet)

* Makefile: try running the preflight integration test with the e2e tests,
since there's a K3s instance in place already

* Makefile: add a dedicated test-integration task, which runs as its own
github action job

* Makefile: exclude a few things from test-integration that break the
github action job

* WIP on preflight tests, addressing some of @banjoh's feedback, more to
go though (specifically changing over to using assert)

* preflight tests: use the testify libraries, restructure code to be
formatted more like other tests in this project
2023-01-13 08:22:57 +10:00
Nathan Sullivan
de0371053a preflight: ensure --output produces an output file of the desired format (#951) 2023-01-13 07:55:47 +10:00
Nathan Sullivan
87c153cc8c preflight: add yaml output format (#940)
* preflight: add yaml output format

ref https://github.com/replicatedhq/troubleshoot/issues/905
2023-01-04 14:27:00 +13:00
Nathan Sullivan
d73d5c6a3a preflight: fix segfault when collector's are not defined in YAML (#939)
* preflight: fix segfault when collector's are not defined in YAML

* fix bug with kind: Preflight specs with uploadResultsTo, wrong variable being used :)

ref https://github.com/replicatedhq/troubleshoot/pull/894

Co-authored-by: Evans Mungai <evans@replicated.com>
2023-01-03 14:01:49 -04:00
Evans Mungai
ebeed77287 chore: Upgrade gopsutil to v3 (#927)
* Add host collector tests related to gopsutil upgrade

* Upgrade gopsutil to v3
2022-12-24 13:42:13 +13:00
Craig O'Donnell
bc6528908f fix: collect rbac permissions error (#928) 2022-12-24 09:13:38 +13:00
Dexter Yan
be26462c19 feat(cluster_resources): increase default client burst and qps (#920)
* feat(collect): add client burst and qps
2022-12-22 09:49:42 +13:00
Diamon Wiggins
f2be6f5829 Allow Preflight CLI to consume multiple specs as input (#894)
To keep the Support Bundle and Preflight CLIs similar, this PR lets the Preflight binary accept multiple specs as CLI args and run all of them.
2022-12-14 14:50:01 -04:00
Diamon Wiggins
a4c4b24056 Deduplication for Cluster Resources Collector (#832)
* add dedup for cluster resources collector
* restructure both collect.go in both pkg/supportbundle and pkg/preflight to be more similar for eventual refactor
2022-12-07 15:10:31 -04:00
Dexter Yan
7e3a59cfc0 feat(analyze): add ExcludeFiles field to textAnazlye (#867)
* feat(analyze): add ExcludeFiles field to textAnazlye

* feat(analyze): fix test for getFiles

* feat(analyze): change function name to  excludeFilePaths

* feat(analyze): fix preflight test fail

* feat(analyze): add tests for excludeFiles

* feat(schemas): run make schemas

* feat(analyze): use getChildCollectedFileContents function prototype

* feat(analyze): reduce time complexity

* feat(longhorn): add getFileContents as getCollectedFileContents
2022-11-28 10:45:10 +13:00
Dexter Yan
78bcafe489 fix(flag): fix wrong output filename (#834)
* fix(flag): fix wrong output filename

* fix(flag): add reset flag function

* fix(flag): add output flag test cases

* fix(flag): move resetFlags function into private go test

* fix(flag): restructure flag tests with testify

* fix(flag): remove resetFlags function

* fix(flag): remove duplicated test and rewrite test names
2022-11-17 14:38:01 +13:00
Xav Paice
3513eeca19 Ensure clusterResources is added prior to other collectors (#768)
This change ensures that the clusterResources collector runs prior to any others
in order to not collect info on pods that collectors run during collection.

Additionally centralizes functions that are common to all collection to make future
maintenance simpler.

Fixes: #767
2022-11-01 12:16:01 +13:00
Diamon Wiggins
bcaaa9e59a Fix Preflight CheckRBAC (#776)
* return collect result instead of nil
2022-10-13 12:54:40 +13:00
stefanrepl
9c986a74a6 make runPreflight and preflight cli flags public (#769) 2022-10-10 16:34:54 -06:00
Diamon Wiggins
c7b84ad1e5 Refactor in-clusters collectors to use struct per collector (#670)
refactor in-clusters collectors to use struct per collector
2022-10-03 13:53:05 -04:00
Xav Paice
f06201e050 Small typo fix in collect.go 2022-08-02 14:36:34 +12:00
Edgar Lanting
1e2e7e9aee Update analyze.go - fix typo
Fixed a typo in the comments: `analysze` -> `analyze`
2022-06-15 16:41:05 +02:00
Kira Boyle
5e7bd06fcb do not return that a strict analyzer is present in an application if a strict analyzer is excluded 2022-06-14 10:23:42 -07:00
diamonwiggins
17fe3db79f adding host collectors to support bundles 2022-05-11 22:50:03 +00:00
Ethan Mosbaugh
2c9a37a4f1 BoolOrString pollutes marshalling, does not respect omitempty (#566)
* BoolOrString pollutes marshalling, does not respect omitempty

* fix panic
2022-05-05 16:10:05 -07:00
Pavan Sokke Nagaraj
77a2475bd2 fix: return true when one of analyzers strict field is true (#552)
* fix: return true when one of analyzers is true

* fix: return true when one of analyzers is true

* update: add unit test

* update: log errors while processing analyzers

* fix: return parse error instead of logging

* fix: update err message

* fix: evaluate strict check err separately
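
A rough sketch of the check this commit describes: report true as soon as one analyzer has its strict field set. The `Analyzer` type and `hasStrictAnalyzer` function are hypothetical simplifications. The `Exclude` handling reflects a separate, later commit in this log (5e7bd06fcb) and is included here as an assumption about how the two rules combine.

```go
package main

import "fmt"

// Analyzer is a hypothetical, simplified analyzer definition.
type Analyzer struct {
	Name    string
	Strict  bool
	Exclude bool
}

// hasStrictAnalyzer returns true when at least one analyzer is strict
// and has not been excluded.
func hasStrictAnalyzer(analyzers []Analyzer) bool {
	for _, a := range analyzers {
		if a.Strict && !a.Exclude {
			return true
		}
	}
	return false
}

func main() {
	analyzers := []Analyzer{
		{Name: "memory"},
		{Name: "kernel-modules", Strict: true},
	}
	fmt.Println("strict analyzer present:", hasStrictAnalyzer(analyzers))
}
```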
2022-03-23 20:24:21 -04:00
Pavan Sokke Nagaraj
7bfb54360c map strict flag to analyze result (#551)
* add util functions

* rename func HasStrictAnalyzersFailed

* map strict flag to analyze err result
2022-03-22 20:47:50 -04:00
Pavan Sokke Nagaraj
942234da80 Add strict flag to Analyzers and ResultAnalyzers (#539)
* add strict flag to Analyzer/AnalyzerMeta

and regenerate schemas and controller-gen code

* map analyzer strict to result

* Update stdout for human and json format

* fix review comment

* update interactive result

* update interactive results

* Update types.go

* Update upload_results.go

* print strict when only true
2022-02-23 15:07:51 -05:00
Jeff Golden
a818417e8c include atomic collector statuses (#534) 2022-02-10 11:27:13 -06:00
Salah Aldeen Al Saleh
7425f583fc Don't include any default host collectors (#524) 2022-01-10 16:19:56 -08:00
Simon Croome
977fc438ea Remote host collectors (#392)
* Add collect command and remote host collectors

Adds the ability to run a host collector on a set of remote k8s nodes.
Target nodes can be filtered using the --selector flag, with the same
syntax as kubectl.  Existing flags for --collector-image,
--collector-pullpolicy and --request-timeout are used.  To run on a
specified node, --selector="kubernetes.io/hostname=kind-worker2" could
be used.

The collect command is used by the remote collector to output the
results using a "raw" format, which uses the filename as the key and
the output, as an escaped JSON string, as the value.  When run manually it
defaults to fully decoded json. The existing block devices,
ipv4interfaces and services host collectors don't decode properly - the
fix is to convert their slice output to a map (fix not included as
unsure what depends on the existing format).
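
To make the "raw" format concrete, here is a minimal sketch of decoding it, assuming the raw output is a JSON object whose keys are filenames and whose values are JSON documents encoded as strings. The `decodeRaw` function is hypothetical and not part of the collect command.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeRaw turns {"file": "<escaped json>"} into {"file": <decoded json>}.
func decodeRaw(raw []byte) (map[string]any, error) {
	var files map[string]string
	if err := json.Unmarshal(raw, &files); err != nil {
		return nil, err
	}
	out := make(map[string]any, len(files))
	for name, encoded := range files {
		var v any
		if err := json.Unmarshal([]byte(encoded), &v); err != nil {
			return nil, fmt.Errorf("%s: %w", name, err)
		}
		out[name] = v
	}
	return out, nil
}

func main() {
	raw := []byte(`{"system/memory.json":"{\"total\":1304207360}"}`)
	decoded, err := decodeRaw(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	pretty, _ := json.MarshalIndent(decoded, "", "  ")
	fmt.Println(string(pretty))
}
```

Decoding into a map requires each value to be valid JSON on its own, which is consistent with the note above that collectors emitting slices do not fit a map-shaped output without conversion.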

The collect command is also useful for troubleshooting preflight issues.

Examples are included to show remote collector usage.

```
bin/collect --collector-image=croomes/troubleshoot:latest  examples/collect/remote/memory.yaml --namespace test
{
  "kind-control-plane": {
    "system/memory.json": {
      "total": 1304207360
    }
  },
  "kind-worker": {
    "system/memory.json": {
      "total": 1695780864
    }
  },
  "kind-worker2": {
    "system/memory.json": {
      "total": 1726353408
    }
  }
}
```

The preflight command has been updated to run remote collectors.  To run
a host collector remotely it must be specified in the spec as a
`remoteCollector`:

```
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: memory
spec:
  remoteCollectors:
    - memory:
        collectorName: memory
  analyzers:
    - memory:
        outcomes:
          - fail:
              when: "< 8Gi"
              message: At least 8Gi of memory is required
          - warn:
              when: "< 32Gi"
              message: At least 32Gi of memory is recommended
          - pass:
              message: The system has sufficient memory
```

Results for each node are analyzed separately, with the node name
appended to the title:

```
bin/preflight --interactive=false --collector-image=croomes/troubleshoot:latest examples/preflight/remote/memory.yaml --format=json
{memory running 0 1}
{memory completed 1 1}
{
  "fail": [
    {
      "title": "Amount of Memory (kind-worker2)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-worker)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-control-plane)",
      "message": "At least 8Gi of memory is required"
    }
  ]
}
```

Also added a host collector to allow preflight checks of required kernel
modules, which is the main driver for this change.
2021-10-06 09:03:53 -05:00
John Murphy
e0f6cab5b3 Fix removes control characters from non interactive preflight runs (#394) 2021-07-23 09:46:36 -05:00
emosbaugh
8dcfa9886d Copy from host collector (#391)
* Copy from host collector

* namespace improvements

* better support for multiple nodes
2021-07-22 12:25:59 -07:00
emosbaugh
39350b5722 ConfigMap collector and secrets can be collected by selectors (#384)
* ConfigMap collector and secrets can be collected by selectors

* follow docs

* Pass context and kubernetes client to collectors

* collect tests

* analyze tests

* fix tests

* improvements
2021-07-08 16:30:26 -07:00
Ethan Mosbaugh
9357d5ac96 Include result if not nil regardless of error 2021-04-28 02:58:59 +00:00
divolgin
62afc87af8 Add progress percentage 2021-03-18 22:29:27 +00:00