Commit Graph

826 Commits

Author SHA1 Message Date
Andrew Lavery
ad7d52f7e5 add a collector that checks s3 access (#2007)
* add a collector that checks s3 access

* testing and analyzer

* analyzer test

* fmt
2026-04-09 11:12:34 -07:00
Evans Mungai
670a510a2d feat(analyze): optional additionalDeviceTypes parameter for blockDevices (#2002)
feat(analyze): optional additionalDeviceTypes for blockDevices; refactor match config and tests

Allow preflights to count extra lsblk TYPE values (e.g. loop, lvm) by listing them in
blockDevices.additionalDeviceTypes on BlockDevicesAnalyze. Types in this list are
eligible whether or not includeUnmountedPartitions is set; disk and optional
partitions behave as before.

Refactor matching to use blockDevicesMatchConfig and document eligibility on that
type. Add host_block_devices_match_test.go for type-rule tables and preflight-style
integration cases; keep classic scenarios in host_block_devices_test.go with a
shared analyzeHostBlockDevicesOutput helper.
Regenerate CRDs and deepcopy for the new API field.

Signed-off-by: Evans Mungai <evans@replicated.com>
2026-03-31 20:29:51 +01:00
Andrew Lavery
e8bf6435e4 add a '--metadata' flag to support-bundle (#1993)
* add a '--metadata' flag to support-bundle

* test the metadata flag e2e
2026-03-12 13:30:28 -04:00
Andrew Lavery
94db56d668 add a dedicated support bundle metadata collector (#1992)
* add support bundle metadata collector

* add e2e test for the new collector

* make fmt

* properly include v1beta3

* remove the ability to specify an arbitrary secret
2026-03-12 12:38:45 -04:00
Ethan Mosbaugh
596a1f21a6 fix: add back collect binray, release docker image (#1991) 2026-03-11 10:51:19 -07:00
Martin Wunderlich
cfe3849bff Issue 1980: timeout for supportbundle collect too short (#1986)
* Issue 1980 - Timeout for supportbundle collect too short

- leave default timeout at 30 seconds
- but: make configurable with SupportBundleOpts
- add timeout parameter to CLI flags
- add unit tests

* Issue 1980 - Timeout for supportbundle collect too short

- fix formatting
2026-03-10 16:52:34 -07:00
ada mancini
9030fff9d0 Add IngressClass analyzer (#1981)
* Add CLUSTER_RESOURCES_INGRESS_CLASS constant

* Collect IngressClass resources in cluster resources

* Add IngressClass analyzer API type

* Regenerate deepcopy for IngressClass type

* Update client-gen output from make generate

* Add IngressClass analyzer tests

* Implement IngressClass analyzer

* Register IngressClass analyzer in dispatcher

* Restore v1beta3 import in clientset scheme registration

The v1beta3 import was accidentally removed during client-gen
regeneration, causing a compile error since the SchemeBuilder
still references troubleshootv1beta3.AddToScheme.
2026-02-27 13:01:36 -05:00
ada mancini
73017ec48e feat: collect CertificateSigningRequests in clusterResources collector (#1964)
* Add .worktrees to .gitignore

Prevent worktree directories from being tracked in the repository.

* feat: collect CertificateSigningRequests in clusterResources collector

Add support for collecting CertificateSigningRequests (CSRs) from the
certificates.k8s.io/v1 API in the clusterResources collector.

Changes:
- Added certificateSigningRequests() helper function in cluster_resources.go
  following the existing pattern for other cluster-scoped resources
- Integrated CSR collection into the Collect() method between
  volumeAttachments and configMaps
- Added CLUSTER_RESOURCES_CERTIFICATE_SIGNING_REQUESTS constant
- Implemented fail-safe error handling for permission denied scenarios
  (e.g., managed clusters like EKS that may deny CSR access)

Testing:
- Added Test_CertificateSigningRequests() with table-driven tests for
  single and multiple CSR collection scenarios
- Added Test_CertificateSigningRequests_PermissionDenied() to verify
  fail-safe behavior when API access is forbidden
- All existing tests pass with no regressions

CSRs are saved to: cluster-resources/certificatesigningrequests.json
Errors are saved to: cluster-resources/certificatesigningrequests-errors.json

* style: run make fmt to align constant declarations

Formatting changes only - realigned constant declarations for
consistent spacing.

* fix: add .worktrees as separate line in .gitignore

The /support-bundle directory should remain ignored (for built
binaries), and /.worktrees/ should be added as a separate line.
2026-01-21 14:18:53 -05:00
Andrew Lavery
a50bd612e8 use oras.land/oras-go/v2 (#1957) 2026-01-14 14:36:04 -06:00
Adam Wolfe Gordon
985416f20c Copy TaintExists to pkg/k8sutil and stop importing k8s.io/kubernetes (#1952)
Importing k8s.io/kubernetes causes any go modules that depend on this one to
have some issues. For example, the following happens in a module that depends on
troubleshoot:

```shell
$ go list -modfile=./go.mod -m -json -mod=mod all
go: k8s.io/cloud-provider@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/cluster-bootstrap@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/controller-manager@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/cri-client@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/csi-translation-lib@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/dynamic-resource-allocation@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/endpointslice@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/externaljwt@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/kube-controller-manager@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/kube-proxy@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/kube-scheduler@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/mount-utils@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/pod-security-admission@v0.0.0: invalid version: unknown revision v0.0.0
go: k8s.io/sample-apiserver@v0.0.0: invalid version: unknown revision v0.0.0
```

The only thing being used from k8s.io/kubernetes is a simple utility function,
`TaintExists`. Copy it into pkg/k8sutil to eliminate the need for the import.

Signed-off-by: Adam Wolfe Gordon <awg@upbound.io>
Co-authored-by: Andrew Lavery <laverya@umich.edu>
2026-01-14 14:40:33 -05:00
Andrew Lavery
128f9311fe move to go.podman.io dependencies (#1956)
* move to go.podman.io dependencies

* go fmt
2026-01-09 10:40:47 -08:00
Benjamin Yang
a9d2180dd6 102 redactor newline corruption clean (#1947)
* fix: prevent redactors from corrupting binary files (#102)

Redactors were adding newlines to files without them, corrupting binary
files during support bundle collection (51 bytes → 53 bytes).

Created LineReader to track original newline state and only restore
newlines when they were present in the original file.

- Added pkg/redact/line_reader.go
- Refactored single_line.go, multi_line.go, literal.go
- Added 48 tests, all passing
- Verified: binary files now preserved byte-for-byte

Fixes #102


* fix: handle empty lines correctly in MultiLineRedactor

- Check line1 == nil instead of len(line1) == 0 for empty file detection
- Fixes edge case where file containing only '\n' would be dropped
- Addresses bugbot finding about empty line handling


* fix: handle empty lines correctly in MultiLineRedactor

- Check line1 != nil instead of len(line1) > 0 in both locations
- Fixes edge case where empty trailing lines would be dropped
- Fix test isolation in literal_test.go (move ResetRedactionList to parent)
- Addresses bugbot findings about empty line handling

* fmt

* chore: update regression baselines from run 20107431959

* adding defense

* fix: propagate non-EOF errors in all early return paths

Ensure non-EOF errors (like buffer overflow) are properly propagated
to caller in both pre-loop early returns. Addresses bugbot finding.

* fix: use unique test names to prevent redaction list pollution

Use t.Name() instead of hardcoded 'test' to ensure each test
has unique redactor name, preventing parallel test interference

---------

Co-authored-by: hedge-sparrow <sparrow@spooky.academy>
2025-12-10 16:55:54 -06:00
ada mancini
cf816f8e26 fix(discovery): handle partial results from ServerGroupsAndResources (#1944) 2025-12-10 10:33:37 -05:00
Ethan Mosbaugh
9343b43e77 fix(collect): cluster resource errors json file has wrong name (#1936)
* fix(ci): regression test updates binary to latest release
* fix cluster resources collector
2025-11-28 10:17:03 +13:00
Xav Paice
e45e2cadd3 Fix collector ordering: preserve order when grouping by type (#1935)
- Fix issue where EnsureClusterResourcesFirst ordering was lost when
  collectors were grouped by type into a map (Go maps have random
  iteration order)
- Preserve collector type order by tracking collectorTypeOrder slice
  as collectors are added to the map
- Apply fix to both pkg/preflight/collect.go and
  pkg/supportbundle/collect.go
- Add comprehensive tests to verify clusterResources runs first and
  relative order of other collectors is preserved
- Enhance EnsureClusterResourcesFirst tests with additional edge cases
2025-11-26 15:34:17 +13:00
replicated-ci
73ac499d3e Bump Go from 1.24.6 to 1.25.4 (#1930)
* Bump Go to version from 1.24.6 to 1.25.4

* fix: use net.JoinHostPort for IPv6 compatibility

Fix IPv6 address formatting in namespace-pinger.go by replacing
fmt.Sprintf with net.JoinHostPort, which correctly handles both
IPv4 and IPv6 addresses.

Changes:
- PingTCP: Use net.JoinHostPort for client connections
- startTCPEchoServer: Use net.JoinHostPort for server listener

This fixes go vet errors introduced by Go 1.25's stricter checks:
  address format "%s:%d" does not work with IPv6

IPv4 example: 192.168.1.1:8080
IPv6 example: [::1]:8080 (brackets added automatically)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Mullen <nwmullen@gmail.com>
2025-11-21 11:54:22 -06:00
Benjamin Yang
cf2db49f86 applied native sidecar fix (#1914) 2025-11-04 11:30:42 -06:00
Noah Campbell
8197ddecfe added --values and --set flags to lint command (#1907)
* added --values and --set flags to lint command

* Update lint_test.go
2025-10-23 13:20:21 -05:00
Noah Campbell
2cebe3d8f6 Support bundle upload functionality works for apps installed via Helm (#1904)
* Gets licenseid and app slug from cluster secrets

* Update upload.go

* Update cluster_resources.go
2025-10-15 13:12:36 -05:00
Noah Campbell
6ffc83dc43 Updated linter (#1903)
* moved linter to new branch

* reads each yaml file separately when given multiple

* split monolith lint file into more reasonably sized files

* github action linter fix

* lint error codes follow the rest of the codebase's standard
2025-10-14 16:25:50 -05:00
Benjamin Yang
21dc4e9b09 Fix ollama windows installer (#1894)
* Fix Windows filename issue in scheduled support bundles

* Fix: Close temp file before executing Ollama installer on Windows

Windows requires files to be closed before they can be executed. This fix
ensures the temporary installer file is properly closed before attempting
to run it, preventing file access errors on Windows systems.
2025-10-14 10:51:52 -05:00
Noah Campbell
5aa088b3b6 Revert unintended commits on main 2025-10-13 15:23:31 -05:00
Noah Campbell
3f5ab9c721 doesnt harcode apiVersion line when looking and figures out which apiVersion to give if none is there 2025-10-13 15:18:43 -05:00
Noah Campbell
0316bb2e12 improved --fix capabilities 2025-10-13 15:18:33 -05:00
Noah Campbell
a5f4afb488 added lint subcommand 2025-10-13 15:18:04 -05:00
Noah Campbell
b7f499c737 Arbitrary secret key refs and templating in collectors (#1895)
* Uses secrets from cluster

* updated gitignore to stop ignoring needed files

* Delete specs.go.bak

* make fmt

* added preflight to generic loader

* Tells user to run in cluster if using secretKeyRef

* Update loader.go

* Update loader.go
2025-10-13 12:19:37 -05:00
Benjamin Yang
df40c661a2 Fix windows cronjob (#1891)
* Fix Windows filename issue in scheduled support bundles

* fix bugbot
2025-10-10 10:24:48 -05:00
Benjamin Yang
6c5c310eb3 Fix ollama clean (#1885)
* fixing .json format

* feat: aggregate files by resource type in Ollama agent for accurate cluster-wide analysis

- Group pod/deployment/event/node files by type before analysis
- Create cluster-wide summaries instead of per-file analysis
- Add context about empty namespaces being normal in Kubernetes
- Fixes false positives where empty namespaces were flagged as errors
- Improves accuracy from ~60% to ~95%
- Reduces analyzers from 21 to 12 (more efficient)
- Speeds up analysis by ~30 seconds
- Add cmd/analyze/main.go for building standalone analyze binary

* feat: aggregate files by resource type in Ollama agent for accurate cluster-wide analysis

- Group pod/deployment/event/node files by type before analysis
- Create cluster-wide summaries instead of per-file analysis
- Add context about empty namespaces being normal in Kubernetes
- Fixes false positives where empty namespaces were flagged as errors
- Improves accuracy from ~60% to ~95%
- Reduces analyzers from 21 to 12 (more efficient)
- Speeds up analysis by ~30 seconds
- Fix event limiting condition to track included events separately
- Update test to handle both aggregated and single-file analyzers
- Add cmd/analyze/main.go for building standalone analyze binary

* fixing error

* fixing bugbot

* fix bugbot errors

* fix bugbot errors

* bugbot errors

* fixing more bugbot errors

* fix: initialize namespace stats only after validating resource type

- Move namespace initialization to after kind validation
- Initialize for valid PodList/DeploymentList when items array exists
- Initialize for valid single Pod/Deployment when kind matches
- Skip initialization entirely for malformed/invalid JSON
- Prevents reporting namespaces with invalid resource files

* refactor: use if-else structure for clearer control flow

- Restructure pod/deployment aggregation to use explicit if-else
- Makes it clear that lists are processed in if block, singles in else
- Functionally identical but clearer for static analysis
- Resolves bugbot false positives about unreachable code
2025-10-08 16:57:00 -05:00
Marc Campbell
35759c47af V1beta3 (#1873)
* Change workflow branch from 'main' to 'v1beta3'

* Auto updater (#1849)

* added auto updater

* updated docs

* commit to trigger actions

* Auto-collectors: foundational discovery, image metadata, CLI integrat… (#1845)

* Auto-collectors: foundational discovery, image metadata, CLI integration; reset PRD markers

* Address PR review feedback

- Implement missing namespace exclude patterns functionality
- Fix image facts collector to use empty Data field instead of static string
- Correct APIVersion to use troubleshoot.sh/v1beta2 consistently

* Fix bug bot issues: API parsing, EOF error, and API group corrections

- Fix RBAC API parsing errors in rbac_checker.go (getAPIGroup/getAPIVersion functions)
- Fix FakeReader EOF error to use standard io.EOF instead of custom error
- Fix incorrect API group from troubleshoot.sh to troubleshoot.replicated.com in run.go

These changes address the issues identified by the bug bot and ensure proper
interface compliance and consistent API group usage.

* Fix multiple bug bot issues

- Fix RBAC API parsing errors in rbac_checker.go (getAPIGroup/getAPIVersion functions)
- Fix FakeReader EOF error to use standard io.EOF instead of custom error
- Fix incorrect API group from troubleshoot.sh to troubleshoot.replicated.com in run.go
- Fix image facts collector Data field to contain structured JSON instead of static strings

These changes address all issues identified by the bug bot and ensure proper
interface compliance, consistent API usage, and meaningful data fields.

* Update auto_discovery.go

* Fix TODO comments in Auto-collector section

Fixed 3 of 4 TODOs as requested in PR review:

1. pkg/collect/images/registry_client.go (line 46):
   - Implement custom CA certificate loading
   - Add x509 import and certificate parsing logic
   - Enables image collection from private registries with custom CAs

2. cmd/troubleshoot/cli/diff.go (line 209):
   - Implement bundle file count functionality
   - Add tar/gzip imports and getFileCountFromBundle() function
   - Properly counts files in support bundle archives (.gz/.tgz)

3. cmd/troubleshoot/cli/run.go (line 338):
   - Replace TODO with clarifying comment about RemoteCollectors usage
   - Confirmed RemoteCollectors are still actively used in preflights

The 4th TODO (diff.go line 196) is left as-is since it's explicitly marked
as Phase 4 future work (Support Bundle Differencing implementation).

Addresses PR review feedback about unimplemented TODO comments.

---------

Co-authored-by: Benjamin Yang <benjaminyang@Benjamins-MacBook-Pro.local>

* resetting make targets and github workflows to support v1beta3 releas… (#1853)

* resetting make targets and github workflows to support v1beta3 release later

* removing generate

* remove

* removing

* removing

* Support bundle diff (#1855)

implemented support bundle diff command

* Preflight docs and template subcommands (#1847)

* Added docs and template subcommands with test files

* uses helm templating preflight yaml files

* merge doc requirements for multiple inputs

* Helm aware rendering and markdown output

* v1beta3 yaml structure better mirrors beta2

* Update sample-preflight-templated.yaml

* Added docs and template subcommands with test files

* uses helm templating preflight yaml files

* merge doc requirements for multiple inputs

* Helm aware rendering and markdown output

* v1beta3 yaml structure better mirrors beta2

* Update sample-preflight-templated.yaml

* Added/updated documentation on subcommands

* Update docs.go

* commit to trigger actions

* Updated yaml spec (#1851)

* v1beta3 spec can be read by preflight

* added test files for ease of testing

* updated v1beta3 guide doc and added tests

* fixed not removing tmp files from v1beta3 processing

* created v1beta2 to v1beta3 converter

* Updated yaml spec (#1863)

* v1beta3 spec can be read by preflight

* added test files for ease of testing

* v1beta3 renderer fixes

* fixed gitignore issue

* Auto support bundle upload (#1860)

* basic auto uploading support bundles

* added upload command

* added default vendor endpoint

* added auth system from replicated cli

* fixed case sensitivity issue in YAML parsing

* support bundle uploads for end customers

* app slug flag and detection without licenseID

* moved v1beta3 examples to proper directory

* does not auto update for package managers (#1850)

* V1beta3 cleanup (#1869)

* moving some files around

* more cleanup

* removing more unused

* update ci for v1beta3 (#1870)

* fmt:

* removing unused examples

* add a v1beta3 fixture

* removing coverage reporting

* adding brew (#1872)

* Fixing testing errors (#1871)

fix: resolve failing unit tests and diff consistency in v1beta3

- Fix readLinesFromReader to return lines WITH newlines (like difflib.SplitLines)
- Update test expectations to match correct function behavior with newlines
- This ensures consistency between streaming and non-streaming diff paths
- Fix timeout test by changing from 10ms to 500ms to eliminate flaky failures

Fixes TestReadLinesFromReader and Test_loadSupportBundleSpecsFromURIs_TimeoutError
Resolves diff output inconsistency between code paths

* Fix/exec textanalyze path clean (#1865)

* created roadmap and yaml claude agent

* Update roadmap.md

* Fix textAnalyze analyzer to auto-match exec collector nested paths

- Auto-detect exec output files (*-stdout.txt, *-stderr.txt, *-errors.json)
- Convert simple filenames to wildcard patterns automatically
- Preserve existing wildcard patterns
- Fixes 'No matching file' errors for exec + textAnalyze workflows

---------

Co-authored-by: Noah Campbell <noah.edward.campbell@gmail.com>

* bump goreleaser to v2

* remove collect binary and risc binary

* remove this check

* add debug logging

* larger runner for release

* dropping goreleaser

* fix syntax

* fix syntax

* goreleaser

* larger

* prerelease auto and more

* publish to directory:

* some more goreleaser/homebrew stuffs

* removing risc

* bump example

* Advanced analysis clean (#1868)

* created roadmap and yaml claude agent

* Update roadmap.md

* feat: Clean advanced analysis implementation - core agents, engine, artifacts

* Remove unrelated files - keep only advanced analysis implementation

* fix: Fix goroutine leak in hosted agent rate limiter

- Added stop channel and stopped flag to RateLimiter struct
- Modified replenishTokens to listen for stop signal and exit cleanly
- Added Stop() method to gracefully shutdown rate limiter
- Added Stop() method to HostedAgent to cleanup rate limiter on shutdown

Fixes cursor bot issue: Rate Limiter Goroutine Leak

* fix: Fix analyzer config and model validation bugs

Bug 1: Analyzer Config Missing File Path
- Added filePath to DeploymentStatus analyzer config in convertAnalyzerToSpec
- Sets namespace-specific path (cluster-resources/deployments/{namespace}.json)
- Falls back to generic path (cluster-resources/deployments.json) if no namespace
- Fixes LocalAgent.analyzeDeploymentStatus backward compatibility

Bug 2: HealthCheck Fails Model Validation
- Changed Ollama model validation from prefix match to exact match
- Prevents false positives where llama2:13b would match request for llama2:7b
- Ensures agent only reports healthy when exact model is available

Both fixes address cursor bot reported issues and maintain backward compatibility.

* fixing lint errors

* fixing lint errors

* adding CLI flags

* fix: resolve linting errors for CI

- Remove unnecessary nil check in host_kernel_configs.go (len() for nil slices is zero)
- Remove unnecessary fmt.Sprintf() calls in ceph.go for static strings
- Apply go fmt formatting fixes

Fixes failing lint CI check

* fix: resolve CI failures in build-test workflow and Ollama tests

1. Fix GitHub Actions workflow logic error:
   - Replace problematic contains() expression with explicit job result checks
   - Properly handle failure and cancelled states for each job
   - Prevents false positive failures in success summary job

2. Fix Ollama agent parseLLMResponse panics:
   - Add proper error handling for malformed JSON in LLM responses
   - Return error when JSON is found but invalid (instead of silent fallback)
   - Add error when no meaningful content can be parsed from response
   - Prevents nil pointer dereference in test assertions

Fixes failing build-test/success and build-test/test CI checks

* fix: resolve all CI failures and cursor bot issues

1. Fix disable-ollama flag logic bug:
   - Remove disable-ollama from advanced analysis trigger condition
   - Prevents unintended advanced analysis mode when no agents registered
   - Allows proper fallback to legacy analysis

2. Fix diff test consistency:
   - Update test expectations to match function behavior (lines with newlines)
   - Ensures consistency between streaming and non-streaming diff paths

3. Fix Ollama agent error handling:
   - Add proper error return for malformed JSON in LLM responses
   - Add meaningful content validation for markdown parsing
   - Prevents nil pointer panics in test assertions

4. Fix analysis engine mock agent:
   - Mock agent now processes and returns results for all provided analyzers
   - Fixes test expectation mismatch (expected 8 results, got 1)

Resolves all failing CI checks: lint, test, and success workflow logic

---------

Co-authored-by: Noah Campbell <noah.edward.campbell@gmail.com>

* Auto-Collect (#1867)

* Fix auto-collector missing files issue

- Add KOTS-aware detection for diagnostic files
- Replace silent RBAC filtering with user warnings
- Enhance error file collection for troubleshooting
- Achieve parity with traditional support bundles

Resolves issue where auto-collector was missing:
- KOTS diagnostic files (now 4 vs 3)
- ConfigMaps (now 6 vs 6)
- Maintains superior log collection (24 vs 0)

Final result: [SUCCESS] comprehensive collection achieved

* fixing bugbog

* fix: resolve production readiness issues in auto-collect branch

1. Fix diff test expectations (lines should have newlines for difflib consistency)
2. Fix preflight tests to use existing v1beta3 example file
3. Fix autodiscovery test context parameter (function signature update)

Resolves TestReadLinesFromReader and preflight v1beta3 test failures

* fix: resolve autodiscovery tests and cursor bot image matching issues

1. Fix cursor bot image matching bug in isKotsadmImage:
   - Replace flawed prefix matching with proper image component detection
   - Handle private registries correctly (registry.company.com/kotsadm/kotsadm:v1.0.0)
   - Prevent false positives with proper delimiter checking
   - Add helper functions: containsImageComponent, splitImagePath, removeTagAndDigest

2. Fix autodiscovery test failures:
   - Add TestMode flag to DiscoveryOptions to control KOTS diagnostic collection
   - Tests use TestMode=true to get only foundational collectors (no KOTS diagnostics)
   - Preserves production behavior while enabling clean testing

Resolves failing TestDiscoverer_DiscoverFoundational tests and cursor bot issues

* Cron job clean (#1862)

* created roadmap and yaml claude agent

* Update roadmap.md

* chore(deps): bump sigstore/cosign-installer from 3.9.2 to 3.10.0 (#1857)

Bumps [sigstore/cosign-installer](https://github.com/sigstore/cosign-installer) from 3.9.2 to 3.10.0.
- [Release notes](https://github.com/sigstore/cosign-installer/releases)
- [Commits](https://github.com/sigstore/cosign-installer/compare/v3.9.2...v3.10.0)

---
updated-dependencies:
- dependency-name: sigstore/cosign-installer
  dependency-version: 3.10.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump the security group with 2 updates (#1858)

Bumps the security group with 2 updates: [github.com/vmware-tanzu/velero](https://github.com/vmware-tanzu/velero) and [helm.sh/helm/v3](https://github.com/helm/helm).


Updates `github.com/vmware-tanzu/velero` from 1.16.2 to 1.17.0
- [Release notes](https://github.com/vmware-tanzu/velero/releases)
- [Changelog](https://github.com/vmware-tanzu/velero/blob/main/CHANGELOG.md)
- [Commits](https://github.com/vmware-tanzu/velero/compare/v1.16.2...v1.17.0)

Updates `helm.sh/helm/v3` from 3.18.6 to 3.19.0
- [Release notes](https://github.com/helm/helm/releases)
- [Commits](https://github.com/helm/helm/compare/v3.18.6...v3.19.0)

---
updated-dependencies:
- dependency-name: github.com/vmware-tanzu/velero
  dependency-version: 1.17.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: helm.sh/helm/v3
  dependency-version: 3.19.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): bump helm.sh/helm/v3 from 3.18.6 to 3.19.0 in /examples/sdk/helm-template in the security group (#1859)

chore(deps): bump helm.sh/helm/v3

Bumps the security group in /examples/sdk/helm-template with 1 update: [helm.sh/helm/v3](https://github.com/helm/helm).


Updates `helm.sh/helm/v3` from 3.18.6 to 3.19.0
- [Release notes](https://github.com/helm/helm/releases)
- [Commits](https://github.com/helm/helm/compare/v3.18.6...v3.19.0)

---
updated-dependencies:
- dependency-name: helm.sh/helm/v3
  dependency-version: 3.19.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add cron job support bundle scheduler

Complete implementation with K8s integration:
- pkg/schedule/job.go: Job management and persistence
- pkg/schedule/daemon.go: Real-time scheduler daemon
- pkg/schedule/cli.go: CLI commands (create, list, delete, daemon)
- pkg/schedule/schedule_test.go: Comprehensive unit tests
- cmd/troubleshoot/cli/root.go: CLI integration

* fixing bugbot

* Fix all bugbot errors: auto-update stability, job cooldown timing, and daemon execution

* Deleting Agent

* removed unused flags

* fixing auto-upload

* fixing markdown files

* namespace not required flag for auto collectors to work

* loosened cron job validation

* writes logs to logfile

* fix: resolve autoFromEnv variable scoping issue for CI

- Ensure autoFromEnv variable and its usage are in correct scope
- Fix build errors: declared and not used / undefined variable
- All functionality preserved and tested locally
- Force add to override gitignore

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Noah Campbell <noah.edward.campbell@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat: clean tokenization system implementation (#1874)

Core tokenization functionality with minimal file changes:

 Core Features:
- Intelligent tokenization engine (tokenizer.go)
- Context-aware secret classification (PASSWORD, APIKEY, DATABASE, etc.)
- Cross-file correlation with deterministic HMAC-SHA256 tokens
- Optional encrypted mapping for token→original value resolution

 Integration:
- CLI flags: --tokenize, --redaction-map, --encrypt-redaction-map
- Updated all redactor types: literal, single-line, multi-line, YAML
- Support bundle integration with auto-upload compatibility
- Backward compatibility: preserves ***HIDDEN*** when disabled

 Production Ready:
- Only 11 essential files (vs 31 in original PR)
- No excessive test files or documentation
- Clean build, all functionality verified
- Maintains existing redaction behavior by default

Token format: ***TOKEN_<TYPE>_<HASH>*** (e.g., ***TOKEN_PASSWORD_A1B2C3***)

* Removes silent failing (#1877)

* preserves stdout and stderr from collectors

* Delete eliminate-silent-failures.md

* Update host_kernel_modules_test.go

* added error logs when a collector fails to start

* Update host_filesystem_performance_linux.go

* fixed error saving logic inconsistency

* Update collect.go

* Improved error handling for support bundles and redactors for windows (#1878)

* improved error handling and window locking

* Delete all-windows-collectors.yaml

* addressing bugbot concerns

* Update host_tcpportstatus.go

* Update redact.go

* Add regression test suite to github actions

* Update regression-test.yaml

* Update regression-test.yaml

* Update regression-test.yaml

* create test/output directory

* handle node-specific files and multiple report arguments

* simplify comparison to detect code regressions only

* handle empty structural_compare rules

* removed v1beta3 branch from github workflow

* Update Makefile

* removed outdated actions

* Update Makefile

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Noah Campbell <noah.edward.campbell@gmail.com>
Co-authored-by: Benjamin Yang <82779168+bennyyang11@users.noreply.github.com>
Co-authored-by: Benjamin Yang <benjaminyang@Benjamins-MacBook-Pro.local>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-08 10:22:11 -07:00
Danil Grigorev
fcf46d44f0 bug: Respect provided kubeconfig in helm collector (#1833)
Respect provided kubeconfig in helm collector

Signed-off-by: Danil-Grigorev <daniil.grigorev.dev@gmail.com>
Co-authored-by: Ethan Mosbaugh <emosbaugh@gmail.com>
2025-10-02 12:25:13 -07:00
Ethan Mosbaugh
f352396e2e fix(collect): add context timeout to registry collector (#1846)
* fix(collect): add context timeout to registry collector

* f

* f
2025-09-10 11:47:58 -07:00
João Antunes
5cd98acdc1 fix(cluster_resources): pod disruption budgets for policy v1 not being collected (#1843)
* fix(cluster_resources): pod disruption budgets for policy v1 not being collected

* fix: e2e test

* fix: now actually fix the e2e test
2025-09-10 16:30:06 +01:00
Ash
dd48aadf7f Allow filtering node resources on taint. (#1840)
* allow filtering node resources on taint
2025-09-09 14:33:51 +01:00
Ethan Mosbaugh
6e62251904 chore(deps): bump the security group with 16 updates (#1835)
* chore(deps): bump the security group with 16 updates

Bumps the security group with 16 updates:

| Package | From | To |
| --- | --- | --- |
| [github.com/shirou/gopsutil/v4](https://github.com/shirou/gopsutil) | `4.25.7` | `4.25.8` |
| [github.com/spf13/cobra](https://github.com/spf13/cobra) | `1.9.1` | `1.10.1` |
| [github.com/spf13/pflag](https://github.com/spf13/pflag) | `1.0.7` | `1.0.9` |
| [github.com/stretchr/testify](https://github.com/stretchr/testify) | `1.11.0` | `1.11.1` |
| [go.opentelemetry.io/otel](https://github.com/open-telemetry/opentelemetry-go) | `1.37.0` | `1.38.0` |
| [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go) | `1.37.0` | `1.38.0` |
| [k8s.io/api](https://github.com/kubernetes/api) | `0.33.4` | `0.34.0` |
| [k8s.io/apiextensions-apiserver](https://github.com/kubernetes/apiextensions-apiserver) | `0.33.4` | `0.34.0` |
| [k8s.io/apimachinery](https://github.com/kubernetes/apimachinery) | `0.33.4` | `0.34.0` |
| [k8s.io/apiserver](https://github.com/kubernetes/apiserver) | `0.33.4` | `0.34.0` |
| [k8s.io/cli-runtime](https://github.com/kubernetes/cli-runtime) | `0.33.4` | `0.34.0` |
| [k8s.io/client-go](https://github.com/kubernetes/client-go) | `0.33.4` | `0.34.0` |
| [sigs.k8s.io/controller-runtime](https://github.com/kubernetes-sigs/controller-runtime) | `0.21.0` | `0.22.0` |
| [k8s.io/kubelet](https://github.com/kubernetes/kubelet) | `0.33.4` | `0.34.0` |
| [k8s.io/metrics](https://github.com/kubernetes/metrics) | `0.33.4` | `0.34.0` |
| [k8s.io/utils](https://github.com/kubernetes/utils) | `0.0.0-20241104100929-3ea5e8cea738` | `0.0.0-20250604170112-4c0f3b243397` |


Updates `github.com/shirou/gopsutil/v4` from 4.25.7 to 4.25.8
- [Release notes](https://github.com/shirou/gopsutil/releases)
- [Commits](https://github.com/shirou/gopsutil/compare/v4.25.7...v4.25.8)

Updates `github.com/spf13/cobra` from 1.9.1 to 1.10.1
- [Release notes](https://github.com/spf13/cobra/releases)
- [Commits](https://github.com/spf13/cobra/compare/v1.9.1...v1.10.1)

Updates `github.com/spf13/pflag` from 1.0.7 to 1.0.9
- [Release notes](https://github.com/spf13/pflag/releases)
- [Commits](https://github.com/spf13/pflag/compare/v1.0.7...v1.0.9)

Updates `github.com/stretchr/testify` from 1.11.0 to 1.11.1
- [Release notes](https://github.com/stretchr/testify/releases)
- [Commits](https://github.com/stretchr/testify/compare/v1.11.0...v1.11.1)

Updates `go.opentelemetry.io/otel` from 1.37.0 to 1.38.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.37.0...v1.38.0)

Updates `go.opentelemetry.io/otel/sdk` from 1.37.0 to 1.38.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.37.0...v1.38.0)

Updates `k8s.io/api` from 0.33.4 to 0.34.0
- [Commits](https://github.com/kubernetes/api/compare/v0.33.4...v0.34.0)

Updates `k8s.io/apiextensions-apiserver` from 0.33.4 to 0.34.0
- [Release notes](https://github.com/kubernetes/apiextensions-apiserver/releases)
- [Commits](https://github.com/kubernetes/apiextensions-apiserver/compare/v0.33.4...v0.34.0)

Updates `k8s.io/apimachinery` from 0.33.4 to 0.34.0
- [Commits](https://github.com/kubernetes/apimachinery/compare/v0.33.4...v0.34.0)

Updates `k8s.io/apiserver` from 0.33.4 to 0.34.0
- [Commits](https://github.com/kubernetes/apiserver/compare/v0.33.4...v0.34.0)

Updates `k8s.io/cli-runtime` from 0.33.4 to 0.34.0
- [Commits](https://github.com/kubernetes/cli-runtime/compare/v0.33.4...v0.34.0)

Updates `k8s.io/client-go` from 0.33.4 to 0.34.0
- [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/kubernetes/client-go/compare/v0.33.4...v0.34.0)

Updates `sigs.k8s.io/controller-runtime` from 0.21.0 to 0.22.0
- [Release notes](https://github.com/kubernetes-sigs/controller-runtime/releases)
- [Changelog](https://github.com/kubernetes-sigs/controller-runtime/blob/main/RELEASE.md)
- [Commits](https://github.com/kubernetes-sigs/controller-runtime/compare/v0.21.0...v0.22.0)

Updates `k8s.io/kubelet` from 0.33.4 to 0.34.0
- [Commits](https://github.com/kubernetes/kubelet/compare/v0.33.4...v0.34.0)

Updates `k8s.io/metrics` from 0.33.4 to 0.34.0
- [Commits](https://github.com/kubernetes/metrics/compare/v0.33.4...v0.34.0)

Updates `k8s.io/utils` from 0.0.0-20241104100929-3ea5e8cea738 to 0.0.0-20250604170112-4c0f3b243397
- [Commits](https://github.com/kubernetes/utils/commits)

---
updated-dependencies:
- dependency-name: github.com/shirou/gopsutil/v4
  dependency-version: 4.25.8
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: security
- dependency-name: github.com/spf13/cobra
  dependency-version: 1.10.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: github.com/spf13/pflag
  dependency-version: 1.0.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: security
- dependency-name: github.com/stretchr/testify
  dependency-version: 1.11.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: security
- dependency-name: go.opentelemetry.io/otel
  dependency-version: 1.38.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-version: 1.38.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/api
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/apiextensions-apiserver
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/apimachinery
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/apiserver
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/cli-runtime
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/client-go
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: sigs.k8s.io/controller-runtime
  dependency-version: 0.22.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/kubelet
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/metrics
  dependency-version: 0.34.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: security
- dependency-name: k8s.io/utils
  dependency-version: 0.0.0-20250604170112-4c0f3b243397
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: security
...

Signed-off-by: dependabot[bot] <support@github.com>

* f

* f

* f

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 09:32:56 -07:00
Diamon Wiggins
2861425293 feat(run-pod): allow image pull retries in run pod collector (#1811)
* allow image pull retries in run pod collector

* fix formatting
2025-07-25 11:30:14 -04:00
Diamon Wiggins
7a6bffeff5 chore: fix noisy info logs (#1808)
* refine logging

* keep progress message at level 0
2025-07-09 20:58:47 -04:00
Ethan Mosbaugh
38d8a45171 fix(host-analyze): certificate analyzer wrong file path (#1807) 2025-07-09 09:32:04 -04:00
Diamon Wiggins
989780af69 feat: allow secrets collector to retreive all key data if specified (#1801)
* allow secrets collector retreival all key data if specified

* add new line

* remove unneeded comments
2025-06-30 10:06:14 -04:00
Ethan Mosbaugh
a4a387eb0e chore: CVE-2024-0406 remove github.com/mholt/archiver/v3 dependency (#1793) 2025-06-06 11:35:56 -07:00
Dmitriy Ivolgin
03efedf714 Follow logs when using runDaemonSet and runPod collectors (#1783)
Follow logs when using runDaemonSet collector

Signed-off-by: divolgin <dmitriy@replicated.com>
2025-05-09 12:54:28 -07:00
dependabot[bot]
9bca9c5245 chore(deps): bump github.com/distribution/distribution/v3 from 3.0.0-rc.3 to 3.0.0 (#1771)
* chore(deps): bump github.com/distribution/distribution/v3

Bumps [github.com/distribution/distribution/v3](https://github.com/distribution/distribution) from 3.0.0-rc.3 to 3.0.0.
- [Release notes](https://github.com/distribution/distribution/releases)
- [Commits](https://github.com/distribution/distribution/compare/v3.0.0-rc.3...v3.0.0)

---
updated-dependencies:
- dependency-name: github.com/distribution/distribution/v3
  dependency-version: 3.0.0
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* update go

* use constant format strings

* f

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Lavery <laverya@umich.edu>
2025-04-21 16:45:47 +00:00
Luke Amdor
3a457d11fc feat: add timestamps flag to logs collector (#1776)
* feat: add timestamps falg to logs collector

Kubernetes logs can be transmitted with the captured timestamps. This is useful for containers that do not log with timestamps. So I'm exposing that as a flag.

* fix: update schemas
2025-04-17 10:51:07 -04:00
Andrew Lavery
ab7f50d0ce improve detection of tanzu clusters (#1769) 2025-04-03 11:25:44 -07:00
Ethan Mosbaugh
641c195db3 fix(collect.runPod): does not delete image pull secrets without name in spec (#1761)
* fix(collect.runPod): fix deleting image pull secrets

* f

* f
2025-03-17 16:21:28 -05:00
Johannes Tuchscherer
ef1cd66b1e Handling the case when the Cluster Analyzer doesn't find a resource (#1760)
* Handling the case when the Cluster Analyzer doesn't find a resource

* Add namespace information to Resource not found fail message
2025-03-14 11:22:49 -07:00
Greg Schofield
64c63d3f7a Log namespace when analyzing deployment status (#1757) 2025-03-12 13:49:15 +00:00
Andrew Lavery
9d9b3c565c add additional test cases to the host os info analyzer (#1754) 2025-03-06 16:57:59 -06:00
Johannes Tuchscherer
3665d25abf Http comperators (#1753)
* Allowing more comperators for the http analyzer

* test

* Update pkg/analyze/host_http.go

Co-authored-by: Andrew Lavery <laverya@umich.edu>

---------

Co-authored-by: Andrew Lavery <laverya@umich.edu>
2025-03-06 21:40:47 +00:00
Salah Al Saleh
97dcae9fc7 Ability to use sprig functions in analyzer templates (#1745)
* Ability to use sprig functions in analyzer templates
2025-02-21 08:10:46 -08:00
Ethan Mosbaugh
b80f38a9a0 fix(redact): multi-line redactors strip empty lines (#1742) 2025-02-20 21:55:05 -05:00