Commit Graph

823 Commits

Author SHA1 Message Date
divolgin
9cbdb70d16 Merge pull request #457 from replicatedhq/divolgin/pod-logs
Include pod logs for pods that are failing
2021-10-25 17:18:32 -07:00
divolgin
20f1b60f11 Include pod logs for pods that are failing 2021-10-26 00:01:26 +00:00
divolgin
fafac18d29 Merge pull request #455 from replicatedhq/divolgin/ceph
Fix ceph collector
2021-10-22 16:38:55 -07:00
divolgin
072d2d7a36 Fix ceph collector 2021-10-22 23:01:13 +00:00
Andrew Reed
7b36e6a1f8 Copy in longhorn client (#454) 2021-10-22 15:24:07 -05:00
Rishabh Bohra
cf03503216 feat: Collect custom resources (#447)
* feat: Collect custom resources
Co-authored-by: Martin Hrabovcin <mhrabovcin@users.noreply.github.com>

Co-authored-by: Andrew Reed <andrew@replicated.com>
2021-10-21 16:49:59 -05:00
Dimitri Koshkin
111396eb39 fix: pass redact flag when running support-bundle (#406)
Co-authored-by: Salah Aldeen Al Saleh <sg.alsaleh@gmail.com>
2021-10-20 10:47:27 -07:00
Jalaja Ganapathy
372454651e collector/analyzer for host operating system (#443)
* collector/analyzer for host operating system

* address cr comments

* cleanup

* fix invoking the analyzer
code cleanup

* fix cr comments

* add corner case unit-test

* fix kernel version parsing

* address review comments

* add default case

* parse using regex

* added more testcases and fixed the bug found in cr

* few small things
v0.16.0
2021-10-12 14:42:23 -07:00
divolgin
5dece3eb75 Merge pull request #451 from replicatedhq/divolgin/panic
Check nil pointers
2021-10-12 10:12:11 -07:00
divolgin
e095a7838f Check nil pointers 2021-10-12 16:10:02 +00:00
Vera Harless
08953d46d1 fix: add collect to goreleaser (#450) v0.15.0 2021-10-08 15:44:55 -04:00
Andrew Lavery
bc197761ea Merge pull request #434 from croomes/handle-api-deprecations
Handle k8s api deprecations
2021-10-08 06:52:31 -07:00
Simon Croome
dc8b38d249 Handle k8s api deprecations 2021-10-07 18:55:51 +01:00
Vera Harless
73609c4fef feat: add more detail to the ceph analyzer output (#445) 2021-10-06 11:22:56 -04:00
Simon Croome
977fc438ea Remote host collectors (#392)
* Add collect command and remote host collectors

Adds the ability to run a host collector on a set of remote k8s nodes.
Target nodes can be filtered using the --selector flag, with the same
syntax as kubectl.  Existing flags for --collector-image,
--collector-pullpolicy and --request-timeout are used.  To run on a
specified node, --selector="kubernetes.io/hostname=kind-worker2" could
be used.

The collect command is used by the remote collector to output the
results using a "raw" format, which uses the filename as the key and
the output, encoded as an escaped JSON string, as the value.  When run
manually it defaults to fully decoded JSON. The existing block devices,
ipv4interfaces and services host collectors don't decode properly; the
fix is to convert their slice output to a map (fix not included, as it
is unclear what depends on the existing format).

The collect command is also useful for troubleshooting preflight issues.

Examples are included to show remote collector usage.

```
bin/collect --collector-image=croomes/troubleshoot:latest  examples/collect/remote/memory.yaml --namespace test
{
  "kind-control-plane": {
    "system/memory.json": {
      "total": 1304207360
    }
  },
  "kind-worker": {
    "system/memory.json": {
      "total": 1695780864
    }
  },
  "kind-worker2": {
    "system/memory.json": {
      "total": 1726353408
    }
  }
}
```
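
The "raw" format described above (filename as key, escaped JSON string as value) can be sketched as follows. This is only an illustration in Python, not the project's Go implementation; the function name `decode_raw` and the sample payload are assumptions made for the example.

```python
import json


def decode_raw(raw):
    """Decode a {filename: escaped-JSON-string} mapping into {filename: value}.

    Hypothetical helper: the name and the fallback behaviour are assumptions,
    not taken from the troubleshoot code base.
    """
    decoded = {}
    for filename, payload in raw.items():
        try:
            decoded[filename] = json.loads(payload)
        except (TypeError, json.JSONDecodeError):
            # Keep anything that is not a valid JSON string untouched.
            decoded[filename] = payload
    return decoded


# A "raw" result for one node: the value is a JSON document stored as a string.
raw_output = {"system/memory.json": "{\"total\": 1726353408}"}
decoded = decode_raw(raw_output)
print(json.dumps(decoded, indent=2))
```

Decoding each value separately means a single collector that emits something other than JSON does not prevent the remaining files from being read.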

The preflight command has been updated to run remote collectors.  To run
a host collector remotely it must be specified in the spec as a
`remoteCollector`:

```
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: memory
spec:
  remoteCollectors:
    - memory:
        collectorName: memory
  analyzers:
    - memory:
        outcomes:
          - fail:
              when: "< 8Gi"
              message: At least 8Gi of memory is required
          - warn:
              when: "< 32Gi"
              message: At least 32Gi of memory is recommended
          - pass:
              message: The system has sufficient memory
```
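
One way to read the `outcomes` list in the spec above is "the first outcome whose `when` condition holds wins, and an outcome without `when` always matches". The sketch below implements that reading in Python; the first-match rule, the helper names (`parse_quantity`, `matches`, `evaluate`) and the supported operators are assumptions for illustration and are not taken from the analyzer's source.

```python
import operator
import re

# Binary unit suffixes as used in the spec ("8Gi", "32Gi").
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
OPERATORS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
             ">=": operator.ge, "==": operator.eq}


def parse_quantity(text):
    """Convert a string such as '8Gi' or '1024' into a number of bytes."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi|Ti)?", text.strip())
    if match is None:
        raise ValueError(f"unsupported quantity: {text!r}")
    return int(match.group(1)) * UNITS.get(match.group(2), 1)


def matches(when, total_bytes):
    """Evaluate a condition such as '< 8Gi' against a byte count."""
    op, quantity = when.split(None, 1)
    return OPERATORS[op](total_bytes, parse_quantity(quantity))


def evaluate(outcomes, total_bytes):
    """Return (kind, message) for the first matching outcome, or None."""
    for outcome in outcomes:
        (kind, spec), = outcome.items()
        when = spec.get("when")
        if when is None or matches(when, total_bytes):
            return kind, spec["message"]
    return None


OUTCOMES = [
    {"fail": {"when": "< 8Gi", "message": "At least 8Gi of memory is required"}},
    {"warn": {"when": "< 32Gi", "message": "At least 32Gi of memory is recommended"}},
    {"pass": {"message": "The system has sufficient memory"}},
]

# 1726353408 bytes is the total reported for kind-worker2 in the collect example.
print(evaluate(OUTCOMES, 1726353408))
```

Under this reading the order of the outcomes matters: placing `warn` before `fail` would report a warning for a node that has less than 8Gi.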

Results for each node are analyzed separately, with the node name
appended to the title:

```
bin/preflight --interactive=false --collector-image=croomes/troubleshoot:latest examples/preflight/remote/memory.yaml --format=json
{memory running 0 1}
{memory completed 1 1}
{
  "fail": [
    {
      "title": "Amount of Memory (kind-worker2)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-worker)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-control-plane)",
      "message": "At least 8Gi of memory is required"
    }
  ]
}
```
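
The per-node behaviour described above (each node analyzed on its own, node name appended to the title) could look roughly like the following Python sketch. The function `analyze_per_node`, the `check` callback and the grouping by outcome are assumptions made for the example; the ordering of entries in the real tool's output may differ.

```python
def analyze_per_node(title, collected, check):
    """Run `check` on every node's data and group the results by outcome.

    `collected` maps node name -> collected data; `check` returns a
    (outcome, message) pair. All names here are hypothetical.
    """
    grouped = {}
    for node, data in collected.items():
        outcome, message = check(data)
        grouped.setdefault(outcome, []).append(
            {"title": f"{title} ({node})", "message": message}
        )
    return grouped


def memory_check(data):
    # Mirrors only the `fail` outcome of the example spec: less than 8Gi fails.
    if data["total"] < 8 * 2**30:
        return "fail", "At least 8Gi of memory is required"
    return "pass", "The system has sufficient memory"


# Totals taken from the collect example output earlier in this commit message.
collected = {
    "kind-control-plane": {"total": 1304207360},
    "kind-worker": {"total": 1695780864},
    "kind-worker2": {"total": 1726353408},
}
results = analyze_per_node("Amount of Memory", collected, memory_check)
print(results)
```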

Also added a host collector to allow preflight checks of required kernel
modules, which is the main driver for this change.
2021-10-06 09:03:53 -05:00
Andrew Reed
4d52760d35 Collector and analyzer for sysctl parameters (#441)
Collector and analyzer for sysctl parameters
v0.14.0
2021-10-01 13:43:26 -05:00
divolgin
6e34aa615e Merge pull request #442 from replicatedhq/divolgin/closer
Allow memory writers
v0.13.17
2021-09-30 11:42:14 -07:00
divolgin
ca51e92878 Allow memory writers 2021-09-30 18:25:52 +00:00
divolgin
06750d478e Merge pull request #439 from replicatedhq/divolgin/nil
Don't panic when no data is collected
v0.13.16
2021-09-29 15:03:50 -07:00
divolgin
6d0a57b16e Don't panic when no data is collected 2021-09-29 21:25:28 +00:00
Jalaja Ganapathy
8a29442a2a Remove ID from host preflight spec (#438) v0.13.15 2021-09-29 09:49:54 -07:00
divolgin
299497c0c0 Merge pull request #429 from danbudris/copyFromHostForCpNodes
add toleration to copy-from-host daemonset to allow collection from CP nodes
2021-09-29 09:01:14 -07:00
divolgin
050f5939c6 Merge pull request #437 from replicatedhq/divolgin/memory
Save collector data to disk directly
2021-09-29 08:12:05 -07:00
divolgin
0e8bedc281 Save collector data to disk directly 2021-09-29 00:15:02 +00:00
Dan Stough
bb0515830d Merge pull request #436 from replicatedhq/dans-fix-sbom-perms
chore(ci): fix sbom assets for krew
2021-09-28 15:09:47 -04:00
Dan Stough
b903f1f1c4 chore(ci): fix sbom asset perms 2021-09-28 16:37:53 +00:00
Jalaja Ganapathy
f26c9b4136 fix README syntax (#433) v0.13.14 2021-09-24 17:35:36 -07:00
Jalaja Ganapathy
eb795c98b6 fix serializer for unique id (#432) v0.13.13 2021-09-24 14:20:37 -07:00
Jalaja Ganapathy
a0b3b3f7dc add an unique id to each host preflights (#431)
* add an unique id to each host preflights

* auto generated files

* updated schemas for the new field id

* keeping it consistent with the rest of the spec
2021-09-24 13:29:14 -07:00
danbudris
67987a4432 add toleration to allow copy-from-host daemonset to run on CP nodes 2021-09-23 17:53:57 -04:00
Salah Aldeen Al Saleh
1bdd3db8c5 update schemas (#428)
* update schemas

* update controller-gen
v0.13.12
2021-09-23 11:03:19 -07:00
John Murphy
a2b5edb551 added missing cosign.key (#427)
SBOM generation was failing because it missed a step to generate the private key needed for SBOM signing from Github secret.
v0.13.11
2021-09-23 10:46:30 -05:00
Salah Aldeen Al Saleh
880c7dc3ea ability to specify a list of namespaces for the cluster resources collector (#424)
* ability to specify a list of namespaces for the cluster resources collector
2021-09-23 08:02:05 -07:00
divolgin
922f7c8b23 Merge pull request #425 from replicatedhq/divolgin/results
Analyzers should not return multiple results
2021-09-22 16:13:54 -07:00
divolgin
afa08e5362 Analyzers should not return multiple results 2021-09-22 22:50:38 +00:00
Dan Stough
614aed52c9 Merge pull request #422 from replicatedhq/dans/fix-clean-noninteractive-output
fix(support-bundle): no client-go warnings or control chars if noninteractive.
2021-09-22 13:38:43 -04:00
Dan Stough
72a50ee3f2 fix(support-bundle): no client-go warnings or control chars if noninteractive 2021-09-22 15:59:35 +00:00
Salah Aldeen Al Saleh
0c7fede7b6 check for nil analyzers (#421) 2021-09-21 12:12:10 -07:00
John Murphy
639bf7a832 Add signed SBOM to troubleshoot (#414)
This change will generate a signed software bill of materials and add it to the repository release archives when the project is released.
2021-09-21 13:55:41 -05:00
John Murphy
48287097d8 added email alias to code of conduct (#420) 2021-09-21 13:52:00 -05:00
divolgin
cb5ddf752f Merge pull request #419 from danbudris/machineReadableNonInteractiveOutput
make non-interactive `support-bundle` output more machine readable
2021-09-21 09:21:06 -07:00
danbudris
52e1a04f57 Merge branch 'machineReadableNonInteractiveOutput' of https://github.com/danbudris/troubleshoot into machineReadableNonInteractiveOutput 2021-09-17 11:21:34 -04:00
danbudris
5b4b548aa0 if interactive, only return the print archivePath to stdout; if non-interactive, print whole analysis as json 2021-09-17 11:20:39 -04:00
Daniel Budris
f2a232d174 use analyzerResults not analysis for key 2021-09-17 11:05:34 -04:00
danbudris
f4e675dae0 add json tags to output struct for easier unmarshalling 2021-09-17 10:57:52 -04:00
danbudris
867df407ea convert output bytearray to string before printing 2021-09-17 10:50:22 -04:00
danbudris
e0fb748498 move non-interactive output to discrete struct with marshalling methods; don't show output for non-interactive; format everything in JSON 2021-09-17 10:38:38 -04:00
danbudris
463783d2fa resolve merge conflicts 2021-09-15 21:25:15 -04:00
danbudris
2ce78ac33a Merge branch 'master' of https://github.com/replicatedhq/troubleshoot into machineReadableNonInteractiveOutput 2021-09-15 21:19:01 -04:00
danbudris
4cf0f5881d make non-interactive support-bundle output more machine readable
when using the `interactive=false` flag of `support-bundle`, the spinner would still spin and the archive path and analysis output were kind of smooshed together with the logs.

now, if `interactive=false`, only print each received collector callback message once, and don't spin

also, add a key to the archivePath and analyzerOutput that are returned, for easier programmatic parsing
2021-09-15 20:58:09 -04:00
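
The non-interactive output described in commit 4cf0f5881d and its follow-ups (print only the archive path when interactive, print the whole analysis as keyed JSON otherwise) can be sketched as below. The key names `archivePath` and `analyzerResults` come from the commit messages; the function `render_output`, its parameters and the sample values are assumptions for illustration, not the project's Go code.

```python
import json


def render_output(archive_path, analyzer_results, interactive):
    """Return the text a run would print to stdout.

    Hypothetical sketch: interactive runs print just the archive path,
    non-interactive runs print one JSON document with explicit keys.
    """
    if interactive:
        return archive_path
    return json.dumps(
        {"archivePath": archive_path, "analyzerResults": analyzer_results},
        indent=2,
    )


# Sample values are made up for the example.
text = render_output("support-bundle.tar.gz", [], interactive=False)
print(text)

# A consumer can parse the output without scraping log lines.
parsed = json.loads(text)
print(parsed["archivePath"])
```

Emitting a single JSON document with named keys lets scripts pick out the archive path and the analysis results independently of any surrounding log formatting.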