Commit Graph

580 Commits

Author SHA1 Message Date
Diamon Wiggins
2fcdc77cd3 Standardize Cluster Resources Collector File Paths (#971)
* using const for cluster resources k8s objects to standardize directories and files
2023-01-25 13:34:15 -05:00
Ethan Mosbaugh
ad1a56251f feat(hostpreflights): udp port status (#981)
* feat(hostpreflights): udp port status

* fix(hostpreflights): tcpPortStatus -> udpPortStatus
2023-01-24 16:38:54 -05:00
Evans Mungai
c8d9864235 Upgrade dependencies (#959)
* Upgrade docker distribution module

* Upgrade github.com/blang/semver dependency

* Upgrade github.com/godbus/dbus dependency
2023-01-13 10:32:39 +00:00
Nathan Sullivan
827c49ca00 adding test coverage for preflight.RunPreflights() (#949)
* adding test coverage for preflight.RunPreflights()

TDD to work on https://github.com/replicatedhq/troubleshoot/issues/906
and verify the fix is successful

* go.mod/go.sum: removing gnomock stuff since it's not in use (yet)

* Makefile: try running the preflight integration test with the e2e tests,
since there's a K3s instance in place already

* Makefile: add a dedicated test-integration task, which runs as its own
github action job

* Makefile: exclude a few things from test-integration that break the
github action job

* WIP on preflight tests, addressing some of @banjoh's feedback, more to
go though (specifically changing over to using assert)

* preflight tests: use the testify libraries, restructure code to be
formatted more like other tests in this project
2023-01-13 08:22:57 +10:00
Nathan Sullivan
de0371053a preflight: ensure --output produces an output file of the desired format (#951) 2023-01-13 07:55:47 +10:00
Dexter Yan
962e2c7d7e feat(support-bundle): optimize the error log of ceph and longhorn when kURL add-on were not enabled (#943) 2023-01-10 09:37:42 +13:00
Evans Mungai
70af0ff3d0 fix: Collect logs from all pods specified in logs collector spec (#952)
Fixes: #945
2023-01-09 12:04:01 -04:00
Edgar Lanting
199efca2ea Added note in to clarify purpose of MaxBytes in cluster_resources.go (#946) 2023-01-09 10:00:12 +13:00
Edgar Lanting
a442ac8fc4 BREAKING: (Feature) maximum pod log size limit
Introduces a new option to limit the size of a pod log when it is added to the bundle. This ensures the support bundle does not grow to an unacceptable size or contain information that is too old.

The maximum size of a pod log in a bundle is set by default to 5MB, and can be changed if we decide upon the need.

BREAKING CHANGE: any logs that are collected by the logs collector are now limited by default to 5MB unless a different size limit is specified. Folks expecting log files larger than that to be collected without truncation will need to adjust their support bundle specs.

Fixes: #878
2023-01-05 16:23:43 +13:00
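The size limit described in a442ac8fc4 can be illustrated with a minimal Go sketch. It assumes that truncation keeps the most recent bytes of a log and that 5MB means 5 * 1024 * 1024 bytes; neither detail is confirmed by the commit message. `tailBytes` and `defaultMaxBytes` are hypothetical names, not identifiers from the project.

```go
package main

import "fmt"

// defaultMaxBytes mirrors the 5MB default from the commit message.
// Assumption: 5MB is interpreted as 5 * 1024 * 1024 bytes.
const defaultMaxBytes = 5 * 1024 * 1024

// tailBytes returns at most maxBytes from the end of data, so the newest
// log lines survive. A non-positive maxBytes falls back to the default.
// Hypothetical helper for illustration only.
func tailBytes(data []byte, maxBytes int) []byte {
	if maxBytes <= 0 {
		maxBytes = defaultMaxBytes
	}
	if len(data) <= maxBytes {
		return data
	}
	return data[len(data)-maxBytes:]
}

func main() {
	log := []byte("old line\nnew line\n")
	// Keep only the last 9 bytes of the log.
	fmt.Printf("%q\n", tailBytes(log, 9)) // prints "new line\n"
}
```

Keeping the tail rather than the head is a design choice in this sketch: recent entries are usually the most useful when troubleshooting.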
Nathan Sullivan
87c153cc8c preflight: add yaml output format (#940)
* preflight: add yaml output format

ref https://github.com/replicatedhq/troubleshoot/issues/905
2023-01-04 14:27:00 +13:00
Evans Mungai
a523551da9 feat(redactors): Run redactors on an existing support bundle (#887)
* feat(redactors): Run redactors on an existing support bundle

Add redact subcommand to support-bundle to allow running redactors on an
existing bundle to create a new redacted bundle.

The command will be launched like so

support-bundle redact <redactor urls> --bundle support-bundle.tar.gz

Fixes: #705
2023-01-03 18:05:15 +00:00
Nathan Sullivan
d73d5c6a3a preflight: fix segfault when collector's are not defined in YAML (#939)
* preflight: fix segfault when collector's are not defined in YAML

* fix bug with kind: Preflight specs with uploadResultsTo, wrong variable being used :)

ref https://github.com/replicatedhq/troubleshoot/pull/894

Co-authored-by: Evans Mungai <evans@replicated.com>
2023-01-03 14:01:49 -04:00
ada mancini
0f2892c316 add cpuArchitecture filter to nodeResources collector (#930)
* filter on cpu architecture

* filter by cpu architecture

* fail if we don't have a label match too

* add tests for cpu arch filter

* update for make schemas
2022-12-29 12:17:11 -05:00
Xav Paice
df43c9fc21 If no pods are selected for log collector, do not wait for timeout (#932)
Fixes: #931
2022-12-29 11:17:54 +13:00
Evans Mungai
ebeed77287 chore: Upgrade gopsutil to v3 (#927)
* Add host collector tests related to gopsutil upgrade

* Upgrade gopsutil to v3
2022-12-24 13:42:13 +13:00
Craig O'Donnell
bc6528908f fix: collect rbac permissions error (#928) 2022-12-24 09:13:38 +13:00
ada mancini
e2053a00a2 rename the troubleshoot label to "support-bundle" (#918) 2022-12-22 19:00:05 -03:00
David Rohnow
421cccf919 Add timeout context log collector (#914) 2022-12-22 10:23:37 +13:00
Dexter Yan
be26462c19 feat(cluster_resources): increase default client burst and qps (#920)
* feat(collect): add client burst and qps
2022-12-22 09:49:42 +13:00
danj-replicated
e48fa36eaf Add generic kubernetes resource analyzer (#780)
* First draft of a generic cluster-resource analyzer

* Add more resource mappings

* Support some cluster-scoped resources

the structure of this could probably be a bit tidier, but this now
allows us to target non-namespaced resources simply by not specifying
the namespace in the analyzer.

* General tidy up

* pull resource selection into its own function

* remove pointless pointer to string

* Export findResource function

This lets other analyzers use it.

* Add tests for cluster resources analyzer

* Update schemas

* Address some of @banjoh's comments

* rework resource selection

thanks @banjoh

* Replace FindFiles with GetFile

Since we already know where we're looking for files,
it doesn't make sense to have to loop over a single item slice.

* Use assert instead of require

* format

* Change default behaviour for no namespace

Now not providing a namespace causes us to default to "default", with an
explicit bool to toggle cluster-scoped resource checking.

This should feel somewhat more intuitive when writing analyzers that use
this function

* Generate schemas

* Value → expectedValue
2022-12-19 11:31:43 -04:00
Diamon Wiggins
f2be6f5829 Allow Preflight CLI to consume multiple specs as input (#894)
To keep the Support Bundle and Preflight CLIs similar, this PR adds the ability for the Preflight binary to accept multiple specs as CLI args and run them all.
2022-12-14 14:50:01 -04:00
Evans Mungai
cd1511a8fb fix(collectors): store unhealthy pod logs correctly (#909)
The symlinking logs feature led to a regression where symlinks of
unhealthy pods were overwriting logs in the support bundle. This
fix allows the cluster resources collector to instruct the logs
collector not to symlink logs, which in turn ensures logs are not
overwritten.

Fixes: #908
2022-12-14 14:47:20 -04:00
Diamon Wiggins
a4c4b24056 Deduplication for Cluster Resources Collector (#832)
* add dedup for cluster resources collector
* restructure collect.go in both pkg/supportbundle and pkg/preflight to be more similar for eventual refactor
2022-12-07 15:10:31 -04:00
Evans Mungai
8c31e61367 fix(collectors): Fix logs collection in longhorn collector (#886)
* fix(collectors): Fix logs collection in longhorn collector

* Small typo

* Run go fmt on added changes
2022-12-02 12:51:07 -05:00
Evans Mungai
2a61a8686a feat(collectors): Add TLS parameters to the postgres collector (#875)
For a postgres collector spec targeting a server configured to accept
(m)TLS connections we need to pass in the necessary parameters in order
to successfully connect to the server. Both preflight and support bundle
specs use this collector.

This change allows us to pass in the necessary TLS parameters via inlined
TLS configuration or via a secret reference.

Fixes #747
2022-11-30 15:52:08 +13:00
Xav Paice
c85bf9a9a6 BREAKING: remove IP address redaction (#734)
This change removes the IPv4 address redaction which previously ran by default on all
support bundle collections.

Folks that want to redact IPv4 addresses will need to add that redactor manually to their redactor specs.
2022-11-30 08:42:42 +13:00
Evans Mungai
bfb77ad601 feat(collectors): Add TLS parameters to the redis collector (#870)
feat(collectors): Add mTLS parameters to the redis collector

For a redis collector spec targeting a redis server configured to accept
(m)TLS connections we need to pass in the necessary TLS parameters in order
to successfully connect to the server. Both preflight and support bundle
specs use this collector.

This change allows us to pass in the necessary TLS parameters via inlined
TLS configuration or via a secret reference.

Fixes #746
2022-11-29 17:47:52 +00:00
Chuck D'Antonio
c4c66633e5 Includes virtual memory parameters in Sysctl (#874)
TL;DR
-----

Updates Sysctl collector and analyzer for virtual memory parameters

Details
-------

Adds support for virtual memory parameters to the Sysctl collector and
analyzers. I uncovered this writing a pre-flight for a Helm chart that
includes ECK as a subchart. Since ECK requires a specific minimum value
for `vm.max_map_count` I wanted to use the Sysctl analyzer to check for
the expected value, but wasn't able to because of the limited values it
collected. I also learned that Sonarqube expects the same parameter to
be increased, so it seemed like a general enough requirement to add it
in.

The code updates the collector to collect values under `/proc/sys/vm`
and adds tests to the analyzer based on the ECK requirements. Making
the tests pass required adding operators to the when expression, since
the existing code only allowed for `=`, `==`, and `===`. The when
expression now supports `>`, `<`, `>=`, and `<=`.

All tests pass.
2022-11-29 12:32:59 -05:00
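The extended `when` operators described in c4c66633e5 can be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: `compareWhen` is a hypothetical helper, the condition is assumed to have the form `<operator> <integer>`, and the values 65530 and 262144 are example numbers only.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// compareWhen evaluates a condition such as ">= 262144" against an
// observed integer value. Hypothetical helper for illustration only.
func compareWhen(actual int64, when string) (bool, error) {
	fields := strings.Fields(when)
	if len(fields) != 2 {
		return false, fmt.Errorf("expected \"<operator> <value>\", got %q", when)
	}
	expected, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		return false, fmt.Errorf("parse expected value: %w", err)
	}
	switch fields[0] {
	case "=", "==", "===": // the equality operators that already existed
		return actual == expected, nil
	case ">":
		return actual > expected, nil
	case "<":
		return actual < expected, nil
	case ">=":
		return actual >= expected, nil
	case "<=":
		return actual <= expected, nil
	default:
		return false, fmt.Errorf("unsupported operator %q", fields[0])
	}
}

func main() {
	// Example: an observed vm.max_map_count of 65530 checked against a minimum.
	ok, err := compareWhen(65530, ">= 262144")
	fmt.Println(ok, err) // prints "false <nil>"
}
```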
Evans Mungai
b693e6650d chore: Add GH workflow to ensure schemas are generated in a PR (#872)
* Makefile target to check generated schemas

* Add missing yaml tag to Logs struct

* chore: Add GH workflow to ensure schemas are generated in a PR
2022-11-28 10:58:09 +13:00
Dexter Yan
7e3a59cfc0 feat(analyze): add ExcludeFiles field to textAnazlye (#867)
* feat(analyze): add ExcludeFiles field to textAnazlye

* feat(analyze): fix test for getFiles

* feat(analyze): change function name to excludeFilePaths

* feat(analyze): fix preflight test fail

* feat(analyze): add tests for excludeFiles

* feat(schemas): run make schemas

* feat(analyze): use getChildCollectedFileContents function prototype

* feat(analyze): reduce time complexity

* feat(longhorn): add getFileContents as getCollectedFileContents
2022-11-28 10:45:10 +13:00
Evans Mungai
fbbcf87405 feat(collectors): Store all pod logs in cluster-resources directory (#821)
* feat(collectors): Store all pod logs in cluster-resources directory

All pod logs collected by the logs collector will now be stored in
/cluster-resources/pods/logs/[namespace]/[pod]/[container].log. This
will provide consistency and allow sbctl to find the logs when we run
`kubectl logs <pod>`. To allow backwards compatibility, symlinks of the
log files will be created in the current expected locations.

Closes: #744
2022-11-22 07:10:34 +13:00
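The log layout described in fbbcf87405 can be sketched in Go as a small path builder. `podLogPath` is a hypothetical name used for illustration, the path is built relative to the bundle root, and the backwards-compatibility symlinks mentioned in the commit are not shown.

```go
package main

import (
	"fmt"
	"path"
)

// podLogPath builds the bundle-relative location of a container log,
// following the layout cluster-resources/pods/logs/[namespace]/[pod]/[container].log.
// Hypothetical helper for illustration only.
func podLogPath(namespace, pod, container string) string {
	return path.Join("cluster-resources", "pods", "logs", namespace, pod, container+".log")
}

func main() {
	fmt.Println(podLogPath("default", "nginx-abc", "nginx"))
	// prints "cluster-resources/pods/logs/default/nginx-abc/nginx.log"
}
```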
Dexter Yan
78bcafe489 fix(flag): fix wrong output filename (#834)
* fix(flag): fix wrong output filename

* fix(flag): add reset flag function

* fix(flag): add output flag test cases

* fix(flag): move resetFlags function into private go test

* fix(flag): restructure flag tests with testify

* fix(flag): remove resetFlags function

* fix(flag): remove duplicated test and rewrite test names
2022-11-17 14:38:01 +13:00
Diamon Wiggins
c34d80c300 Discover Redactors in Cluster (#827)
Adds the ability to search for support bundle specs and redactors, in both configmaps and secrets
2022-11-10 17:36:51 +13:00
Xav Paice
80cca8a487 Add omitempty to StorageClassName in schema (#814)
* Add omitempty to StorageClassName in schema

Allow a StorageClass spec to omit storageClassName.

Fixes: #813
2022-11-09 11:34:37 +13:00
Xav Paice
3513eeca19 Ensure clusterResources is added prior to other collectors (#768)
This change ensures that the clusterResources collector runs prior to any others
in order to not collect info on pods that collectors run during collection.

Additionally centralizes functions that are common to all collection to make future
maintenance simpler.

Fixes: #767
2022-11-01 12:16:01 +13:00
Edgar Lanting
34817b67d0 Update cluster_resources.go (#804)
Due to deprecation of the APIs at `policy/v1beta1` for `PodDisruptionBudgets` and `batch/v1beta1` for `CronJobs`, updated cluster_resources.go to accommodate using either apiVersion v1 or v1beta1
2022-10-28 14:56:57 +13:00
Diamon Wiggins
e2ac7bf715 fix ceph title (#799) 2022-10-24 12:35:35 -05:00
Diamon Wiggins
3d4bd4b601 trim whitespace from collected contents (#796) 2022-10-21 16:41:07 +13:00
Ahmed Mousa
764f0ac8b6 'added collection of roles, cluster roles and their respective bindings' (#779)
Co-authored-by: Edgar Lanting <edgarlanting@users.noreply.github.com>
2022-10-17 11:02:07 -05:00
Diamon Wiggins
04c7a18da3 Fix Progress Callback for Support Bundle Collection (#781)
fix progress callback for support bundle and revert collector title changes
2022-10-14 12:29:59 -04:00
Diamon Wiggins
bcaaa9e59a Fix Preflight CheckRBAC (#776)
* return collect result instead of nil
2022-10-13 12:54:40 +13:00
Chuck D'Antonio
2298ec3030 Supports the Kubernetes distribution analyzer identifying VMware Tanzu (#766)
Adds a check to the Kubernetes distribution analyzer to identify VMware Tanzu using the same approach as identifying OpenShift.
2022-10-13 12:52:22 +13:00
Diamon Wiggins
48beb303be export context field from collector structs (#771) 2022-10-11 14:54:54 -04:00
stefanrepl
9c986a74a6 make runPreflight and preflight cli flags public (#769) 2022-10-10 16:34:54 -06:00
ada mancini
eb40b9422f implement uri: field (#702) 2022-10-05 15:35:55 +13:00
danj-replicated
e80235f0a8 Collect resourcequotas (#729)
Signed-off-by: Dan Jones <danj@replicated.com>
2022-10-05 12:58:54 +13:00
Diamon Wiggins
c7b84ad1e5 Refactor in-clusters collectors to use struct per collector (#670)
refactor in-clusters collectors to use struct per collector
2022-10-03 13:53:05 -04:00
Diamon Wiggins
7eecf6c526 improving error handling 2022-09-14 10:58:08 -04:00
Diamon Wiggins
ec6ec59303 fixing tests 2022-09-13 23:27:49 -04:00
Diamon Wiggins
e53871b4dc adding tests 2022-09-13 23:00:57 -04:00