Benjamin Yang 74811c669b Auto-collectors: foundational discovery, image metadata, CLI integrat… (#1845)
* Auto-collectors: foundational discovery, image metadata, CLI integration; reset PRD markers

* Address PR review feedback

- Implement missing namespace exclude patterns functionality
- Fix image facts collector to use empty Data field instead of static string
- Correct APIVersion to use troubleshoot.sh/v1beta2 consistently

* Fix bug bot issues: API parsing, EOF error, and API group corrections

- Fix RBAC API parsing errors in rbac_checker.go (getAPIGroup/getAPIVersion functions)
- Fix FakeReader EOF error to use standard io.EOF instead of custom error
- Fix incorrect API group from troubleshoot.sh to troubleshoot.replicated.com in run.go

These changes address the issues identified by the bug bot and ensure proper
interface compliance and consistent API group usage.

* Fix multiple bug bot issues

- Fix RBAC API parsing errors in rbac_checker.go (getAPIGroup/getAPIVersion functions)
- Fix FakeReader EOF error to use standard io.EOF instead of custom error
- Fix incorrect API group from troubleshoot.sh to troubleshoot.replicated.com in run.go
- Fix image facts collector Data field to contain structured JSON instead of static strings

These changes address all issues identified by the bug bot and ensure proper
interface compliance, consistent API usage, and meaningful data fields.

* Update auto_discovery.go

* Fix TODO comments in Auto-collector section

Fixed 3 of 4 TODOs as requested in PR review:

1. pkg/collect/images/registry_client.go (line 46):
   - Implement custom CA certificate loading
   - Add x509 import and certificate parsing logic
   - Enables image collection from private registries with custom CAs

2. cmd/troubleshoot/cli/diff.go (line 209):
   - Implement bundle file count functionality
   - Add tar/gzip imports and getFileCountFromBundle() function
   - Properly counts files in support bundle archives (.gz/.tgz)

3. cmd/troubleshoot/cli/run.go (line 338):
   - Replace TODO with clarifying comment about RemoteCollectors usage
   - Confirmed RemoteCollectors are still actively used in preflights

The 4th TODO (diff.go line 196) is left as-is since it's explicitly marked
as Phase 4 future work (Support Bundle Differencing implementation).

Addresses PR review feedback about unimplemented TODO comments.

---------

Co-authored-by: Benjamin Yang <benjaminyang@Benjamins-MacBook-Pro.local>
2025-09-12 13:44:26 -06:00
2021-11-12 19:03:35 +00:00
2025-07-28 16:59:09 +01:00
2025-09-12 12:30:57 -06:00
2020-01-29 13:49:02 -08:00
2019-07-05 22:38:40 +00:00
2019-07-19 00:55:32 +00:00
2019-07-05 22:38:40 +00:00

Replicated Troubleshoot

Replicated Troubleshoot is a framework for collecting, redacting, and analyzing highly customizable diagnostic information about a Kubernetes cluster. Troubleshoot specs are created by 3rd-party application developers/maintainers and run by cluster operators in the initial and ongoing operation of those applications.

Troubleshoot provides two CLI tools as kubectl plugins (using Krew): kubectl preflight and kubectl support-bundle. Preflight provides pre-installation cluster conformance testing and validation (preflight checks) and support-bundle provides post-installation troubleshooting and diagnostics (support bundles).

To know more about troubleshoot, please visit: https://troubleshoot.sh/

Preflight Checks

Preflight checks are an easy-to-run set of conformance tests that can be written to verify that specific requirements in a cluster are met.

To run a sample preflight check from a sample application, install the preflight kubectl plugin:

curl https://krew.sh/preflight | bash

and run, where https://preflight.replicated.com provides an example preflight spec:

kubectl preflight https://preflight.replicated.com

NOTE this is an example. Do not use to validate real scenarios.

For more details on creating the custom resource files that drive preflight checks, visit creating preflight checks.

Support Bundle

A support bundle is an archive that's created in-cluster, by collecting logs and cluster information, and executing specified commands (including redaction of sensitive information). After creating a support bundle, the cluster operator will normally deliver it to the 3rd-party application vendor for analysis and disconnected debugging. Another Replicated project, KOTS, provides k8s apps an in-cluster UI for processing support bundles and viewing analyzers (as well as support bundle collection).

To collect a sample support bundle, install the troubleshoot kubectl plugin:

curl https://krew.sh/support-bundle | bash

and run, where https://support-bundle.replicated.com provides an example support bundle spec:

kubectl support-bundle https://support-bundle.replicated.com

NOTE this is an example. Do not use to validate real scenarios.

For more details on creating the custom resource files that drive support-bundle collection, visit creating collectors and creating analyzers.

And see our other tool sbctl that makes it easier to interact with support bundles using kubectl commands you already know

Community

For questions about using Troubleshoot, how to contribute and engaging with the project in any other way, please refer to the following resources and channels.

Software Bill of Materials

A signed SBOM that includes Troubleshoot dependencies is included in each release.

  • troubleshoot-sbom.tgz contains a software bill of materials for Troubleshoot.
  • troubleshoot-sbom.tgz.sig is the digital signature for troubleshoot-sbom.tgz
  • key.pub is the public key from the key pair used to sign troubleshoot-sbom.tgz

The following example illustrates using cosign to verify that troubleshoot-sbom.tgz has not been tampered with.

$ cosign verify-blob --key key.pub --signature troubleshoot-sbom.tgz.sig troubleshoot-sbom.tgz
Verified OK

If you were to get an error similar to the one below, it means you are verifying an SBOM signed using cosign v1 using a newer v2 of the binary. This version introduced breaking changes which require an additional flag --insecure-ignore-tlog=true to successfully verify SBOMs like so.

$ cosign verify-blob --key key.pub --signature troubleshoot-sbom.tgz.sig troubleshoot-sbom.tgz --insecure-ignore-tlog=true
WARNING: Skipping tlog verification is an insecure practice that lacks of transparency and auditability verification for the blob.
Verified OK
Description
Preflight Checks and Support Bundles Framework for Kubernetes Applications
Readme Apache-2.0 36 MiB
Languages
Go 98.1%
Python 0.9%
Shell 0.8%
Makefile 0.2%