83 Commits

Author SHA1 Message Date
Evans Mungai
a31e2ff1b9 chore: Add CLI flags to enable CPU & memory profiling (#926)
Allow collecting CPU and memory diagnostics when running troubleshoot CLI applications using the --memprofile and --cpuprofile flags. These flags accept file paths specifying where to store the collected runtime data.
2023-01-04 11:56:07 +00:00
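For readers unfamiliar with how flags like the `--cpuprofile` and `--memprofile` options from a31e2ff1b9 are usually wired up in a Go CLI, the following is a minimal sketch built on the standard `runtime/pprof` package. It is not the code from that commit: the helper names `startCPUProfile` and `writeHeapProfile` are hypothetical, and the flags are registered with the standard `flag` package purely for illustration.

```go
package main

import (
	"flag"
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

// startCPUProfile (hypothetical helper) starts CPU profiling into the file at
// path and returns a function that stops profiling and closes the file.
// An empty path disables profiling and returns a no-op stop function.
func startCPUProfile(path string) (func(), error) {
	if path == "" {
		return func() {}, nil
	}
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		f.Close()
		return nil, err
	}
	return func() {
		pprof.StopCPUProfile()
		f.Close()
	}, nil
}

// writeHeapProfile (hypothetical helper) writes a heap profile to path.
// An empty path is a no-op.
func writeHeapProfile(path string) error {
	if path == "" {
		return nil
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	runtime.GC() // refresh allocation statistics before the snapshot
	return pprof.WriteHeapProfile(f)
}

func main() {
	// Flag names mirror the commit message; everything else is assumed.
	cpuProfile := flag.String("cpuprofile", "", "file path for the CPU profile")
	memProfile := flag.String("memprofile", "", "file path for the memory profile")
	flag.Parse()

	stop, err := startCPUProfile(*cpuProfile)
	if err != nil {
		log.Fatal(err)
	}

	// ... the actual command would run here ...

	stop() // stop explicitly so the profile is flushed before any fatal exit
	if err := writeHeapProfile(*memProfile); err != nil {
		log.Fatal(err)
	}
}
```

Files produced this way can be inspected with `go tool pprof`.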
Diamon Wiggins
f2be6f5829 Allow Preflight CLI to consume multiple specs as input (#894)
To keep the Support Bundle and Preflight CLIs similar, this PR adds the ability for the Preflight binary to accept multiple specs as CLI args and run all of them.
2022-12-14 14:50:01 -04:00
stefanrepl
9c986a74a6 make runPreflight and preflight cli flags public (#769) 2022-10-10 16:34:54 -06:00
Diamon Wiggins
c7b84ad1e5 Refactor in-clusters collectors to use struct per collector (#670)
refactor in-clusters collectors to use struct per collector
2022-10-03 13:53:05 -04:00
Ethan Mosbaugh
7c74f8b755 Disable client-go logging 2022-06-10 15:39:23 +00:00
Marc Campbell
5da84663b1 Support for oci:// retrieval of specs 2022-06-03 12:40:15 -07:00
Pavan Sokke Nagaraj
942234da80 Add strict flag to Analyzers and ResultAnalyzers (#539)
* add strict flag to Analyzer/AnalyzerMeta

and regenerate schemas and controller-gen code

* map analyzer strict to result

* Update stdout for human and json format

* fix review comment

* update interactive result

* update interactive results

* Update types.go

* Update upload_results.go

* print strict when only true
2022-02-23 15:07:51 -05:00
Craig O'Donnell
0a2ed01a46 improvement: added --output flag for preflight and support bundle (#538)
* improvement: added --output flag for preflight and support bundle

* improvement: added datetime to preflight default file name
2022-02-17 17:28:28 -05:00
Simon Croome
977fc438ea Remote host collectors (#392)
* Add collect command and remote host collectors

Adds the ability to run a host collector on a set of remote k8s nodes.
Target nodes can be filtered using the --selector flag, with the same
syntax as kubectl.  Existing flags for --collector-image,
--collector-pullpolicy and --request-timeout are used.  To run on a
specified node, --selector="kubernetes.io/hostname=kind-worker2" could
be used.

The collect command is used by the remote collector to output the
results using a "raw" format, which uses the filename as the key and
the output, as an escaped JSON string, as the value.  When run manually it
defaults to fully decoded JSON. The existing block devices,
ipv4interfaces and services host collectors don't decode properly - the
fix is to convert their slice output to a map (fix not included, as it is
unclear what depends on the existing format).

The collect command is also useful for troubleshooting preflight issues.

Examples are included to show remote collector usage.

```
bin/collect --collector-image=croomes/troubleshoot:latest  examples/collect/remote/memory.yaml --namespace test
{
  "kind-control-plane": {
    "system/memory.json": {
      "total": 1304207360
    }
  },
  "kind-worker": {
    "system/memory.json": {
      "total": 1695780864
    }
  },
  "kind-worker2": {
    "system/memory.json": {
      "total": 1726353408
    }
  }
}
```
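To illustrate the difference between the "raw" format and the fully decoded JSON shown above, here is a minimal sketch that assumes the raw form is a JSON object mapping each file name to an escaped JSON string. The function name `decodeRaw` and the sample input are hypothetical; this is not the project's implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeRaw (hypothetical helper) converts the assumed "raw" shape, a map of
// file name to JSON document encoded as a string, into a map of file name to
// decoded JSON value.
func decodeRaw(raw []byte) (map[string]interface{}, error) {
	var files map[string]string
	if err := json.Unmarshal(raw, &files); err != nil {
		return nil, fmt.Errorf("parse raw output: %w", err)
	}
	decoded := make(map[string]interface{}, len(files))
	for name, body := range files {
		var value interface{}
		if err := json.Unmarshal([]byte(body), &value); err != nil {
			return nil, fmt.Errorf("decode %s: %w", name, err)
		}
		decoded[name] = value
	}
	return decoded, nil
}

func main() {
	// Assumed raw output for a single node: the value is an escaped JSON string.
	raw := []byte(`{"system/memory.json": "{\"total\": 1304207360}"}`)

	decoded, err := decodeRaw(raw)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(decoded, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Decoding into `interface{}` accepts both objects and arrays, so this sketch does not reproduce the slice-versus-map limitation the commit message mentions for some collectors.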

The preflight command has been updated to run remote collectors.  To run
a host collector remotely it must be specified in the spec as a
`remoteCollector`:

```
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: memory
spec:
  remoteCollectors:
    - memory:
        collectorName: memory
  analyzers:
    - memory:
        outcomes:
          - fail:
              when: "< 8Gi"
              message: At least 8Gi of memory is required
          - warn:
              when: "< 32Gi"
              message: At least 32Gi of memory is recommended
          - pass:
              message: The system has sufficient memory
```

Results for each node are analyzed separately, with the node name
appended to the title:

```
bin/preflight --interactive=false --collector-image=croomes/troubleshoot:latest examples/preflight/remote/memory.yaml --format=json
{memory running 0 1}
{memory completed 1 1}
{
  "fail": [
    {
      "title": "Amount of Memory (kind-worker2)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-worker)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-control-plane)",
      "message": "At least 8Gi of memory is required"
    }
  ]
}
```

Also added a host collector to allow preflight checks of required kernel
modules, which is the main driver for this change.
2021-10-06 09:03:53 -05:00
Dan Stough
0478a7a60f fix: cluster-res collector fixed to one namespace 2021-09-03 19:23:44 +00:00
Jalaja Ganapathy
e23fb2ce59 run support-bundle and preflight checks even with restricted access (#404) 2021-08-13 07:52:49 -07:00
John Murphy
e0f6cab5b3 Fix: remove control characters from non-interactive preflight runs (#394) 2021-07-23 09:46:36 -05:00
emosbaugh
8dcfa9886d Copy from host collector (#391)
* Copy from host collector

* namespace improvements

* better support for multiple nodes
2021-07-22 12:25:59 -07:00
emosbaugh
d7b6aa2758 Log progress when interactive=false (#382)
* Log progress when interactive=false

* safe print statement
2021-07-08 13:57:35 -07:00
divolgin
4047977b35 Make cursors visible on CTRL+C 2021-07-01 23:08:05 +00:00
divolgin
5f2525b663 Report back some basic progress 2021-03-18 18:56:27 +00:00
Dex
0a19d35073 hide spinner if interactive false (#328)
* hide preflight spinner if interactive is false

Co-authored-by: Salah Aldeen Al Saleh <salahalsaleh1993@gmail.com>
2021-03-09 09:42:38 -08:00
Ethan Mosbaugh
4b78c430ca Host preflight ux improvements 2021-03-02 17:27:01 +00:00
Andrew Reed
fe4db40b43 Move host preflights examples into separate directory
Add all supported analyzers to host preflight sample.
Don't log transient errors waiting for TCP connection.
Begin human stdout results on new line after spinner.
2021-02-15 22:46:12 +00:00
Andrew Reed
10a34c2e58 Host preflight (#311)
* Add HostPreflight v1beta2

* Work on TCP Load Balancer

* Host disk usage collector and analyzer

* Host memory analyzer

* TCP port status

* TCP load balancer

* Review changes

Co-authored-by: Marc Campbell <marc.e.campbell@gmail.com>
2021-02-08 16:09:01 -05:00
Matias Manavella
f1868d0ba8 return messages modified 2020-10-22 14:42:05 -03:00
Matias Manavella
3cbdb41c8f return error when both flags are used 2020-10-22 14:35:08 -03:00
Matias Manavella
d2f6594c0c return error when both flags are used 2020-10-22 14:31:34 -03:00
Matias Manavella
56408fab01 return error when both flags are used 2020-10-22 14:28:35 -03:00
Matias Manavella
f0d9418e21 parseTimeFlag function and error handling added 2020-10-22 12:26:29 -03:00
Matias Manavella
1ba3840ad9 parseTimeFlag function and error handling added 2020-10-22 11:37:52 -03:00
Matias Manavella
a3d667298e Update cmd/preflight/cli/root.go
Co-authored-by: Mark Pundsack <markpundsack@users.noreply.github.com>
2020-10-22 10:13:45 -03:00
Matias Manavella
5cf4ae2157 Update cmd/preflight/cli/root.go
Co-authored-by: Mark Pundsack <markpundsack@users.noreply.github.com>
2020-10-22 10:13:32 -03:00
Matias Manavella
4e047b2372 nil and error handling 2020-10-21 14:47:25 -03:00
Matias Manavella
2436a0c163 Update cmd/preflight/cli/root.go
Co-authored-by: Salah Aldeen Al Saleh <salahalsaleh1993@gmail.com>
2020-10-21 13:59:24 -03:00
Matias Manavella
7186b75f7e --since flag added 2020-10-21 09:51:52 -03:00
Matias Manavella
e16eabd531 added flag --since-time 2020-10-19 16:53:13 -03:00
divolgin
6e86cdc803 Allow preflight spec to be loaded from a secret 2020-10-01 01:37:37 +00:00
divolgin
a0ce85ae1e Adding troubleshoot.sh/v1beta2 2020-09-01 19:57:11 +00:00
GraysonNull
4e08ce4d93 remove commented code blocks, move appName func to shared package 2020-08-13 16:38:51 +00:00
GraysonNull
e8dc5ae1b8 use table widget for bundle analysis so scrolling works 2020-08-13 16:02:29 +00:00
GraysonNull
0e667e9685 scrolling on analyzers table 2020-08-11 15:16:23 +00:00
Marc Campbell
65f957db81 Refactor to support K8s 1.18 2020-06-12 09:28:49 -07:00
Andrew Lavery
9692d5a457 attempt to read the file at the provided path before trying url 2020-03-25 17:27:16 -04:00
Marc Campbell
bd71222715 Removing more unused code 2020-03-23 10:01:30 -07:00
Marc Campbell
983aaaacea Defining types in a package 2020-03-10 17:32:16 +00:00
Marc Campbell
e74101070d Refactor 2020-03-10 01:07:57 +00:00
Marc Campbell
1dda832933 Export runCollectors 2020-03-10 00:04:23 +00:00
Marc Campbell
6c212e24f0 Better marshaling 2020-03-06 01:15:49 +00:00
Marc Campbell
d81cff0d67 BoolString can only unmarshal as json 2020-03-06 00:33:39 +00:00
divolgin
052c932e05 Treat analyzer errors as preflight check fails 2020-01-09 22:48:54 +00:00
Andrew Lavery
0a04256d62 include ServerGroupsAndResources in troubleshoot cluster-resources
also run cluster-resources collector even when some rbac permissions are missing

print progress message when skipping a collector due to rbac issues

include a sample troubleshoot yaml spec in the repo
2020-01-08 15:10:32 -08:00
Andrew Lavery
55f2ed44bf Check RBAC before running collectors 2019-12-31 21:32:42 +00:00
divolgin
89250b0a87 include errors in messages on failures 2019-12-26 16:32:17 +00:00
divolgin
8e1cb615a5 Don't print usage on error and no double-logging 2019-12-24 22:04:43 +00:00