93 Commits

Author SHA1 Message Date
Evans Mungai
aea4f7c87c feat: Optionally save preflight bundles to disk (#1612)
* feat: Optionally save preflight bundles to disk

Signed-off-by: Evans Mungai <evans@replicated.com>

* Add e2e test of saving preflight bundle

Signed-off-by: Evans Mungai <evans@replicated.com>

* Update cli docs

Signed-off-by: Evans Mungai <evans@replicated.com>

* Expose GetVersionFile function publicly

Signed-off-by: Evans Mungai <evans@replicated.com>

* Store analysis.json file in preflight bundle

Signed-off-by: Evans Mungai <evans@replicated.com>

* Run go fmt when running lint fixers

Signed-off-by: Evans Mungai <evans@replicated.com>

* Always generate a preflight bundle in CLI

Signed-off-by: Evans Mungai <evans@replicated.com>

* Print saving bundle message to stderr

Signed-off-by: Evans Mungai <evans@replicated.com>

* Revert changes in docs directory

Signed-off-by: Evans Mungai <evans@replicated.com>

* Use NewResult constructor

Signed-off-by: Evans Mungai <evans@replicated.com>

* Log always when preflight bundle is saved to disk

Signed-off-by: Evans Mungai <evans@replicated.com>

---------

Signed-off-by: Evans Mungai <evans@replicated.com>
2024-09-16 23:36:52 +01:00
Gerard Nguyen
04e656a0a5 fix: [sc-106256] Add missing uri field to troubleshoot.sh types (#1578)
* new no-uri flag for preflight
* implement load additional spec from URIs
2024-07-19 08:23:55 +10:00
Evans Mungai
15a4802cd2 feat: Add dry run flag to print support bundle specs to std out (#1337)
* Add dry-run flag

* No traces on dry run

* More refactoring

* More updates to support bundle binary

* More refactoring changes

* Different approach of loading specs from URIs

* Self review

* More changes after review and testing

* fix how we parse oci image uri

* Remove unnecessary comment

* Add missing file

* Fix failing tests

* Better error check for no collectors

* Add default collectors when parsing support bundle specs

* Add missed test fixture

* Download specs with correct headers

* Fix typo
2023-10-10 18:43:32 +01:00
Evans Mungai
b9f4fc4390 feat: Dry run flag to print preflight specs to std out (#1240) 2023-09-12 14:42:10 +01:00
Nathan Sullivan
3548b46cfc support multiple exit codes based on what went wrong/right (#1135)
0 = all passed, 3 = at least one failure, 4 = no failures but at least 1 warn

1 as a catch all (generic errors), 2 for invalid input/specs etc

ref https://github.com/replicatedhq/troubleshoot/issues/1131

docs https://github.com/replicatedhq/troubleshoot.sh/pull/489
2023-05-10 09:33:13 +10:00
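
The exit code scheme described in the commit above can be sketched in Go as follows. This is a minimal illustration, not the code from the troubleshoot repository: the `outcome` type, the constant names, and `exitCodeFor` are hypothetical, and only the numeric values come from the commit message.

```go
package main

import "fmt"

// Exit code values taken from the commit message; the names are hypothetical.
const (
	exitAllPassed    = 0 // every check passed
	exitGenericError = 1 // catch-all for generic errors
	exitInvalidInput = 2 // invalid input, specs and similar
	exitHasFailures  = 3 // at least one check failed
	exitHasWarnings  = 4 // no failures, but at least one warning
)

// outcome is a hypothetical stand-in for a single analyzer result.
type outcome struct {
	isFail bool
	isWarn bool
}

// exitCodeFor maps analyzer outcomes to an exit code.
// Any failure takes precedence over warnings.
func exitCodeFor(outcomes []outcome) int {
	sawWarn := false
	for _, o := range outcomes {
		if o.isFail {
			return exitHasFailures
		}
		if o.isWarn {
			sawWarn = true
		}
	}
	if sawWarn {
		return exitHasWarnings
	}
	return exitAllPassed
}

func main() {
	// One warning and one pass: no failures, so the warning code applies.
	results := []outcome{{isWarn: true}, {}}
	fmt.Println("exit code:", exitCodeFor(results))
}
```

Codes 1 and 2 are not derived from analyzer outcomes in this sketch; they would be returned from the error paths that handle generic errors and invalid specs.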
danj-replicated
285631446e Add ability to fetch preflights from OCI registry to standard out (#1117)
* add oci-fetch command
2023-04-14 11:25:42 +12:00
Evans Mungai
546ffde14b feat: use klog as the default logging library (#1008) 2023-02-24 18:24:51 +00:00
Evans Mungai
100f9a13b6 feat: Record summary of execution times of support bundle operations (collect/redact/analyse) (#935)
When running a support bundle, we want to know how long each operation
(collect, redact, analyze) takes. This commit adds a new trace exporter
that records the start and end times of each operation, and then prints
a summary of the execution. The summary is also stored in the support
bundle.

Related to #923
2023-02-07 09:50:21 +00:00
Evans Mungai
a31e2ff1b9 chore: Add CLI flags to enable CPU & memory profiling (#926)
Allow collecting CPU and memory diagnostics when running troubleshoot CLI applications using the --memprofile and --cpuprofile flags. These flags accept file paths specifying where to store the collected runtime data.
2023-01-04 11:56:07 +00:00
Diamon Wiggins
f2be6f5829 Allow Preflight CLI to consume multiple specs as input (#894)
To keep the Support Bundle and Preflight CLIs similar, this PR allows multiple specs to be provided to the Preflight binary as CLI args, and runs all of them.
2022-12-14 14:50:01 -04:00
stefanrepl
9c986a74a6 make runPreflight and preflight cli flags public (#769) 2022-10-10 16:34:54 -06:00
Diamon Wiggins
c7b84ad1e5 Refactor in-clusters collectors to use struct per collector (#670)
refactor in-clusters collectors to use struct per collector
2022-10-03 13:53:05 -04:00
Ethan Mosbaugh
7c74f8b755 Disable client-go logging 2022-06-10 15:39:23 +00:00
Marc Campbell
5da84663b1 Support for oci:// retrieval of specs 2022-06-03 12:40:15 -07:00
Pavan Sokke Nagaraj
942234da80 Add strict flag to Analyzers and ResultAnalyzers (#539)
* add strict flag to Analyzer/AnalyzerMeta

and regenerate schemas and controller-gen code

* map analyzer strict to result

* Update stdout for human and json format

* fix review comment

* update interactive result

* update interactive results

* Update types.go

* Update upload_results.go

* print strict when only true
2022-02-23 15:07:51 -05:00
Craig O'Donnell
0a2ed01a46 improvement: added --output flag for preflight and support bundle (#538)
* improvement: added --output flag for preflight and support bundle

* improvement: added datetime to preflight default file name
2022-02-17 17:28:28 -05:00
Simon Croome
977fc438ea Remote host collectors (#392)
* Add collect command and remote host collectors

Adds the ability to run a host collector on a set of remote k8s nodes.
Target nodes can be filtered using the --selector flag, with the same
syntax as kubectl.  Existing flags for --collector-image,
--collector-pullpolicy and --request-timeout are used.  To run on a
specified node, --selector="kubernetes.io/hostname=kind-worker2" could
be used.

The collect command is used by the remote collector to output the
results using a "raw" format, which uses the filename as the key and
the output, as an escaped JSON string, as the value.  When run manually it
defaults to fully decoded JSON. The existing block devices,
ipv4interfaces and services host collectors don't decode properly - the
fix is to convert their slice output to a map (fix not included as
unsure what depends on the existing format).
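
As an illustration of the "raw" format described above, the following Go
sketch decodes a filename-to-escaped-JSON-string map into fully decoded
values. The exact wire layout is an assumption based on this description,
and `decodeRaw` is a hypothetical helper, not a function from the
troubleshoot codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeRaw converts raw output (filename -> escaped JSON string)
// into filename -> decoded JSON value.
func decodeRaw(raw []byte) (map[string]any, error) {
	// First pass: the envelope maps each filename to a JSON string.
	var files map[string]string
	if err := json.Unmarshal(raw, &files); err != nil {
		return nil, fmt.Errorf("decode raw envelope: %w", err)
	}

	// Second pass: each string is itself a JSON document.
	decoded := make(map[string]any, len(files))
	for name, escaped := range files {
		var value any
		if err := json.Unmarshal([]byte(escaped), &value); err != nil {
			return nil, fmt.Errorf("decode %s: %w", name, err)
		}
		decoded[name] = value
	}
	return decoded, nil
}

func main() {
	// Assumed example of raw output for a single file.
	raw := []byte(`{"system/memory.json": "{\"total\": 1304207360}"}`)

	out, err := decodeRaw(raw)
	if err != nil {
		panic(err)
	}
	pretty, err := json.MarshalIndent(out, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pretty))
}
```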

The collect command is also useful for troubleshooting preflight issues.

Examples are included to show remote collector usage.

```
bin/collect --collector-image=croomes/troubleshoot:latest  examples/collect/remote/memory.yaml --namespace test
{
  "kind-control-plane": {
    "system/memory.json": {
      "total": 1304207360
    }
  },
  "kind-worker": {
    "system/memory.json": {
      "total": 1695780864
    }
  },
  "kind-worker2": {
    "system/memory.json": {
      "total": 1726353408
    }
  }
}
```

The preflight command has been updated to run remote collectors.  To run
a host collector remotely it must be specified in the spec as a
`remoteCollector`:

```
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: memory
spec:
  remoteCollectors:
    - memory:
        collectorName: memory
  analyzers:
    - memory:
        outcomes:
          - fail:
              when: "< 8Gi"
              message: At least 8Gi of memory is required
          - warn:
              when: "< 32Gi"
              message: At least 32Gi of memory is recommended
          - pass:
              message: The system has sufficient memory
```

Results for each node are analyzed separately, with the node name
appended to the title:

```
bin/preflight --interactive=false --collector-image=croomes/troubleshoot:latest examples/preflight/remote/memory.yaml --format=json
{memory running 0 1}
{memory completed 1 1}
{
  "fail": [
    {
      "title": "Amount of Memory (kind-worker2)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-worker)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-control-plane)",
      "message": "At least 8Gi of memory is required"
    }
  ]
}
```

Also added a host collector to allow preflight checks of required kernel
modules, which is the main driver for this change.
2021-10-06 09:03:53 -05:00
Dan Stough
0478a7a60f fix: cluster-res collector fixed to one namespace 2021-09-03 19:23:44 +00:00
Jalaja Ganapathy
e23fb2ce59 run support-bundle and preflight checks even with restricted access (#404) 2021-08-13 07:52:49 -07:00
John Murphy
e0f6cab5b3 Fix removes control characters from non interactive preflight runs (#394) 2021-07-23 09:46:36 -05:00
emosbaugh
8dcfa9886d Copy from host collector (#391)
* Copy from host collector

* namespace improvements

* better support for multiple nodes
2021-07-22 12:25:59 -07:00
emosbaugh
d7b6aa2758 Log progress when interactive=false (#382)
* Log progress when interactive=false

* safe print statement
2021-07-08 13:57:35 -07:00
divolgin
4047977b35 Make cursors visible on CTRL+C 2021-07-01 23:08:05 +00:00
divolgin
5f2525b663 Report back some basic progress 2021-03-18 18:56:27 +00:00
Dex
0a19d35073 hide spinner if interactive false (#328)
* hide preflight spinner if interactive is false

Co-authored-by: Salah Aldeen Al Saleh <salahalsaleh1993@gmail.com>
2021-03-09 09:42:38 -08:00
Ethan Mosbaugh
4b78c430ca Host preflight ux improvements 2021-03-02 17:27:01 +00:00
Andrew Reed
fe4db40b43 Move host preflights examples into separate directory
Add all supported analyzers to host preflight sample.
Don't log transient errors waiting for TCP connection.
Begin human stdout results on new line after spinner.
2021-02-15 22:46:12 +00:00
Andrew Reed
10a34c2e58 Host preflight (#311)
* Add HostPreflight v1beta2

* Work on TCP Load Balancer

* Host disk usage collector and analyzer

* Host memory analyzer

* TCP port status

* TCP load balancer

* Review changes

Co-authored-by: Marc Campbell <marc.e.campbell@gmail.com>
2021-02-08 16:09:01 -05:00
Matias Manavella
f1868d0ba8 return messages modified 2020-10-22 14:42:05 -03:00
Matias Manavella
3cbdb41c8f return error when both flags are used 2020-10-22 14:35:08 -03:00
Matias Manavella
d2f6594c0c return error when both flags are used 2020-10-22 14:31:34 -03:00
Matias Manavella
56408fab01 return error when both flags are used 2020-10-22 14:28:35 -03:00
Matias Manavella
f0d9418e21 parseTimeFlag function and error handling added 2020-10-22 12:26:29 -03:00
Matias Manavella
1ba3840ad9 parseTimeFlag function and error handling added 2020-10-22 11:37:52 -03:00
Matias Manavella
a3d667298e Update cmd/preflight/cli/root.go
Co-authored-by: Mark Pundsack <markpundsack@users.noreply.github.com>
2020-10-22 10:13:45 -03:00
Matias Manavella
5cf4ae2157 Update cmd/preflight/cli/root.go
Co-authored-by: Mark Pundsack <markpundsack@users.noreply.github.com>
2020-10-22 10:13:32 -03:00
Matias Manavella
4e047b2372 nil and error handling 2020-10-21 14:47:25 -03:00
Matias Manavella
2436a0c163 Update cmd/preflight/cli/root.go
Co-authored-by: Salah Aldeen Al Saleh <salahalsaleh1993@gmail.com>
2020-10-21 13:59:24 -03:00
Matias Manavella
7186b75f7e --since flag added 2020-10-21 09:51:52 -03:00
Matias Manavella
e16eabd531 added flag --since-time 2020-10-19 16:53:13 -03:00
divolgin
6e86cdc803 Allow preflight spec to be loaded from a secret 2020-10-01 01:37:37 +00:00
Bryant Hagadorn
3f48574071 Add all auth providers (#266)
* Update main.go

* Adding generic auth for OIDC
2020-09-15 15:34:10 -07:00
divolgin
a0ce85ae1e Adding troubleshoot.sh/v1beta2 2020-09-01 19:57:11 +00:00
GraysonNull
4e08ce4d93 remove commented code blocks, move appName func to shared package 2020-08-13 16:38:51 +00:00
GraysonNull
e8dc5ae1b8 use table widget for bundle analysis so scrolling works 2020-08-13 16:02:29 +00:00
GraysonNull
0e667e9685 scrolling on analyzers table 2020-08-11 15:16:23 +00:00
Marc Campbell
65f957db81 Refactor to support K8s 1.18 2020-06-12 09:28:49 -07:00
Andrew Lavery
9692d5a457 attempt to read the file at the provided path before trying url 2020-03-25 17:27:16 -04:00
Marc Campbell
bd71222715 Removing more unused code 2020-03-23 10:01:30 -07:00
Marc Campbell
983aaaacea Defining types in a package 2020-03-10 17:32:16 +00:00