419 Commits

Author SHA1 Message Date
Pavan Sokke Nagaraj
7bfb54360c map strict flag to analyze result (#551)
* add util functions

* rename func HasStrictAnalyzersFailed

* map strict flag to analyze err result
2022-03-22 20:47:50 -04:00
diamonwiggins
2b774e16d7 adding serviceaccountname parameter to run collector 2022-03-03 06:12:42 +00:00
Salah Al Saleh
9d41d4a7be Fix getting pod details for new support bundle formats (#543) 2022-03-01 15:55:47 -08:00
Pavan Sokke Nagaraj
e248ab0f97 Fix strict flag mapping (#542)
* add func BoolOrDefaultFalse and Bool

* use strict.BoolOrDefaultFalse

* Update pkg/multitype/boolstring.go

Co-authored-by: Andrew Lavery <laverya@umich.edu>

* Update pkg/multitype/boolstring_test.go

Co-authored-by: Andrew Lavery <laverya@umich.edu>

* Update pkg/multitype/boolstring_test.go

Co-authored-by: Andrew Lavery <laverya@umich.edu>

* Update boolstring_test.go

* remove duplicate test

* Update pkg/multitype/boolstring_test.go

Co-authored-by: garcialuis <garcialuisdev@gmail.com>

Co-authored-by: Andrew Lavery <laverya@umich.edu>
Co-authored-by: garcialuis <garcialuisdev@gmail.com>
2022-02-24 13:31:51 -05:00
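A minimal sketch of what a `BoolOrDefaultFalse` helper like the one named in e248ab0f97 might look like. The struct layout and the nil handling below are assumptions made for illustration; only the names `Bool` and `BoolOrDefaultFalse` come from the commit message, and the real code in pkg/multitype/boolstring.go may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// BoolOrString is a hypothetical stand-in for a spec field that may be
// written either as a boolean (true) or as a string ("true").
type BoolOrString struct {
	IsBool  bool // true when the value was given as a boolean
	BoolVal bool
	StrVal  string
}

// Bool returns the boolean value, or an error when a string value cannot
// be parsed. Treating a nil receiver as false is an assumption.
func (b *BoolOrString) Bool() (bool, error) {
	if b == nil {
		return false, nil
	}
	if b.IsBool {
		return b.BoolVal, nil
	}
	return strconv.ParseBool(b.StrVal)
}

// BoolOrDefaultFalse returns false whenever the value is unset or
// cannot be parsed as a boolean.
func (b *BoolOrString) BoolOrDefaultFalse() bool {
	v, err := b.Bool()
	if err != nil {
		return false
	}
	return v
}

func main() {
	var unset *BoolOrString
	fmt.Println(unset.BoolOrDefaultFalse())
	fmt.Println((&BoolOrString{StrVal: "true"}).BoolOrDefaultFalse())
	fmt.Println((&BoolOrString{StrVal: "maybe"}).BoolOrDefaultFalse())
}
```

Defaulting to false means a malformed `strict` value never makes an analyzer strict by accident.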
Pavan Sokke Nagaraj
942234da80 Add strict flag to Analyzers and ResultAnalyzers (#539)
* add strict flag to Analyzer/AnalyzerMeta

and regenerate schemas and controller-gen code

* map analyzer strict to result

* Update stdout for human and json format

* fix review comment

* update interactive result

* update interactive results

* Update types.go

* Update upload_results.go

* print strict when only true
2022-02-23 15:07:51 -05:00
Craig O'Donnell
0a2ed01a46 improvement: added --output flag for preflight and support bundle (#538)
* improvement: added --output flag for preflight and support bundle

* improvement: added datetime to preflight default file name
2022-02-17 17:28:28 -05:00
Jeff Golden
a818417e8c include atomic collector statuses (#534) 2022-02-10 11:27:13 -06:00
divolgin
3351c289ab Add GVK to k8s objects in cluster-resources files 2022-02-04 01:31:07 +00:00
Salah Aldeen Al Saleh
6f0cf6550d find total memory instead of available (#525) 2022-01-11 17:02:44 -08:00
Salah Aldeen Al Saleh
7425f583fc Don't include any default host collectors (#524) 2022-01-10 16:19:56 -08:00
Andrew Lavery
8fc7d12e19 mark a number of fields as not being required
namespace/namespaces in resource status analyzers, and the OS list in host package collectors
2022-01-06 23:54:19 +01:00
divolgin
007edd1181 Allow specifying namespaces when analyzing cluster resources 2021-12-17 21:47:06 +00:00
divolgin
3cedbe16a7 Organize test files by type and namespace 2021-12-17 19:23:54 +00:00
Salah Aldeen Al Saleh
4c72573936 os minor should default to 0 (#513) 2021-12-10 13:17:36 -08:00
Salah Aldeen Al Saleh
d1f341b8ed host system packages collector/analyzer (#506)
* host system packages collector/analyzer
2021-12-10 12:05:21 -08:00
Ethan Mosbaugh
177f2da16d Update github.com/containers/image/v5 2021-11-30 23:37:25 +00:00
Ethan Mosbaugh
59d50e7679 Fix go mod 2021-11-30 21:26:24 +00:00
Ethan Mosbaugh
fba0f97225 found not ound 2021-11-30 20:12:29 +00:00
Ethan Mosbaugh
4d0eaf471f crd not storageClass 2021-11-30 20:12:09 +00:00
Salah Aldeen Al Saleh
c7c21e88fb fix custom resources redaction file path (#480)
* fix custom resources redaction file path
2021-11-02 12:36:15 -07:00
divolgin
739ee666af Allow text analyzer to not generate an error if no files match 2021-10-29 17:52:59 +00:00
divolgin
742ddc8c06 Ensure outcomes are optional in every case 2021-10-29 00:23:32 +00:00
divolgin
7cb6d90a39 replicaset analyzer supports label selectors 2021-10-28 22:06:15 +00:00
Sean Rester
5d9f14fde5 Merge pull request #474 from replicatedhq/add-node-status-check
38798: Adding node status check
2021-10-28 17:52:18 -04:00
Salah Aldeen Al Saleh
14463642b0 a function to get pod details from the support bundle (#476)
* a function to get pod details from the support bundle
2021-10-28 14:06:02 -07:00
Salah Aldeen Al Saleh
45dd980012 update cluster pod analyzers comment (#475) 2021-10-28 10:31:59 -07:00
Salah Aldeen Al Saleh
e100e7c478 get container logs for unhealthy pods (#469)
* get container logs for unhealthy pods

Co-authored-by: divolgin <dmitriy@replicated.com>
Co-authored-by: divolgin <divolgin@users.noreply.github.com>
2021-10-28 09:21:14 -07:00
Sean Rester
1345b200aa 38798: Adding node status check 2021-10-28 11:16:26 -04:00
divolgin
db3d27d38f Fix windows build 2021-10-27 21:29:11 +00:00
divolgin
e7daba9d0c Merge pull request #470 from replicatedhq/divolgin/analyzers
Replicaset collector and analyzer
2021-10-27 13:51:42 -07:00
divolgin
ada35eb31c Replicaset collector and analyzer 2021-10-27 20:24:14 +00:00
Salah Aldeen Al Saleh
f2374cf113 add involved object to clusterPodStatuses analyzer result (#459)
* cluster pod statuses analyzer involved object
2021-10-27 12:18:49 -07:00
Salah Aldeen Al Saleh
5a8561a31f include logs for init containers as well (#467) 2021-10-27 10:20:55 -07:00
divolgin
1cdfd96768 Jobs status analyzer 2021-10-26 23:41:02 +00:00
divolgin
f108c3ca57 Analyze all deployments in all namespaces 2021-10-26 21:36:27 +00:00
divolgin
491376b772 Merge pull request #462 from replicatedhq/divolgin/analyzers
Ability to analyze all statefulsets
2021-10-26 14:08:32 -07:00
divolgin
34724e7932 Ability to analyze all statefulsets 2021-10-26 20:51:45 +00:00
deepsource-autofix[bot]
3e60fcda8e Fix check for empty string 2021-10-26 19:24:08 +00:00
Salah Aldeen Al Saleh
26402a7b04 cluster pod statuses analyzer improvements (#458)
* add pod status reason to cluster pod statuses analyzer
2021-10-26 08:42:40 -07:00
Salah Aldeen Al Saleh
3d1d53ee9d ClusterPodStatuses analyzer (#456)
* ClusterPodStatuses analyzer

Co-authored-by: divolgin <dmitriy@replicated.com>
2021-10-25 17:44:59 -07:00
divolgin
20f1b60f11 Include pod logs for pods that are failing 2021-10-26 00:01:26 +00:00
divolgin
072d2d7a36 Fix ceph collector 2021-10-22 23:01:13 +00:00
Andrew Reed
7b36e6a1f8 Copy in longhorn client (#454) 2021-10-22 15:24:07 -05:00
Rishabh Bohra
cf03503216 feat: Collect custom resources (#447)
* feat: Collect custom resources
Co-authored-by: Martin Hrabovcin <mhrabovcin@users.noreply.github.com>

Co-authored-by: Andrew Reed <andrew@replicated.com>
2021-10-21 16:49:59 -05:00
Dimitri Koshkin
111396eb39 fix: pass redact flag when running support-bundle (#406)
Co-authored-by: Salah Aldeen Al Saleh <sg.alsaleh@gmail.com>
2021-10-20 10:47:27 -07:00
Jalaja Ganapathy
372454651e collector/analyzer for host operating system (#443)
* collector/analyzer for host operating system

* address cr comments

* cleanup

* fix invoking the analyzer
code cleanup

* fix cr comments

* add corner case unit-test

* fix kernel version parsing

* address review comments

* add default case

* parse using regex

* added more testcases and fixed the bug found in cr

* few small things
2021-10-12 14:42:23 -07:00
divolgin
e095a7838f Check nil pointers 2021-10-12 16:10:02 +00:00
Simon Croome
dc8b38d249 Handle k8s api deprecations 2021-10-07 18:55:51 +01:00
Vera Harless
73609c4fef feat: add more detail to the ceph analyzer output (#445) 2021-10-06 11:22:56 -04:00
Simon Croome
977fc438ea Remote host collectors (#392)
* Add collect command and remote host collectors

Adds the ability to run a host collector on a set of remote k8s nodes.
Target nodes can be filtered using the --selector flag, with the same
syntax as kubectl.  Existing flags for --collector-image,
--collector-pullpolicy and --request-timeout are used.  To run on a
specified node, --selector="kubernetes.io/hostname=kind-worker2" could
be used.

The collect command is used by the remote collector to output the
results in a "raw" format, which uses the filename as the key and the
collector output, encoded as an escaped JSON string, as the value.  When
run manually it defaults to fully decoded JSON.  The existing block
devices, ipv4interfaces and services host collectors don't decode
properly; the fix is to convert their slice output to a map (not
included here, as it is unclear what depends on the existing format).
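
As a rough illustration of the "raw" format, the sketch below decodes a map of filename to escaped JSON string into fully decoded values. The function name `decodeRaw` and the sample input are hypothetical and not taken from the troubleshoot code base.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeRaw turns raw collector output (filename -> escaped JSON string)
// into filename -> decoded JSON value. Hypothetical helper for illustration.
func decodeRaw(raw map[string]string) (map[string]interface{}, error) {
	decoded := make(map[string]interface{}, len(raw))
	for filename, escaped := range raw {
		var value interface{}
		if err := json.Unmarshal([]byte(escaped), &value); err != nil {
			return nil, fmt.Errorf("decode %s: %w", filename, err)
		}
		decoded[filename] = value
	}
	return decoded, nil
}

func main() {
	// Assumed sample of what one node's raw result could contain.
	raw := map[string]string{
		"system/memory.json": `{"total": 1304207360}`,
	}
	decoded, err := decodeRaw(raw)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(decoded, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```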

The collect command is also useful for troubleshooting preflight issues.

Examples are included to show remote collector usage.

```
bin/collect --collector-image=croomes/troubleshoot:latest  examples/collect/remote/memory.yaml --namespace test
{
  "kind-control-plane": {
    "system/memory.json": {
      "total": 1304207360
    }
  },
  "kind-worker": {
    "system/memory.json": {
      "total": 1695780864
    }
  },
  "kind-worker2": {
    "system/memory.json": {
      "total": 1726353408
    }
  }
}
```

The preflight command has been updated to run remote collectors.  To run
a host collector remotely it must be specified in the spec as a
`remoteCollector`:

```
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: memory
spec:
  remoteCollectors:
    - memory:
        collectorName: memory
  analyzers:
    - memory:
        outcomes:
          - fail:
              when: "< 8Gi"
              message: At least 8Gi of memory is required
          - warn:
              when: "< 32Gi"
              message: At least 32Gi of memory is recommended
          - pass:
              message: The system has sufficient memory
```

Results for each node are analyzed separately, with the node name
appended to the title:

```
bin/preflight --interactive=false --collector-image=croomes/troubleshoot:latest examples/preflight/remote/memory.yaml --format=json
{memory running 0 1}
{memory completed 1 1}
{
  "fail": [
    {
      "title": "Amount of Memory (kind-worker2)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-worker)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-control-plane)",
      "message": "At least 8Gi of memory is required"
    }
  ]
}
```
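
The per-node behaviour can be sketched as follows: each node's memory total is checked against the thresholds from the example spec, and the node name is appended to the analyzer title. All names here (`analyzeMemory`, `analyzeNodes`, `result`) are hypothetical, and the thresholds are hard-coded rather than parsed from the spec, so this is only an approximation of what the preflight command does.

```go
package main

import (
	"fmt"
	"sort"
)

const gi = 1 << 30 // bytes in one Gi

// result is a hypothetical, simplified analyzer result.
type result struct {
	Outcome string
	Title   string
	Message string
}

// analyzeMemory applies the thresholds from the example spec to one node.
func analyzeMemory(totalBytes uint64) (outcome, message string) {
	switch {
	case totalBytes < 8*gi:
		return "fail", "At least 8Gi of memory is required"
	case totalBytes < 32*gi:
		return "warn", "At least 32Gi of memory is recommended"
	default:
		return "pass", "The system has sufficient memory"
	}
}

// analyzeNodes analyzes every node separately and appends the node name
// to the title. Nodes are sorted only to make the output deterministic.
func analyzeNodes(title string, totals map[string]uint64) []result {
	nodes := make([]string, 0, len(totals))
	for node := range totals {
		nodes = append(nodes, node)
	}
	sort.Strings(nodes)

	results := make([]result, 0, len(nodes))
	for _, node := range nodes {
		outcome, message := analyzeMemory(totals[node])
		results = append(results, result{
			Outcome: outcome,
			Title:   fmt.Sprintf("%s (%s)", title, node),
			Message: message,
		})
	}
	return results
}

func main() {
	totals := map[string]uint64{
		"kind-control-plane": 1304207360,
		"kind-worker":        1695780864,
		"kind-worker2":       1726353408,
	}
	for _, r := range analyzeNodes("Amount of Memory", totals) {
		fmt.Printf("%s: %s: %s\n", r.Outcome, r.Title, r.Message)
	}
}
```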

Also added a host collector to allow preflight checks of required kernel
modules, which is the main driver for this change.
2021-10-06 09:03:53 -05:00