789 Commits

Author SHA1 Message Date
Diamon Wiggins
989780af69 feat: allow secrets collector to retrieve all key data if specified (#1801)
* allow secrets collector to retrieve all key data if specified

* add new line

* remove unneeded comments
2025-06-30 10:06:14 -04:00
Ethan Mosbaugh
a4a387eb0e chore: CVE-2024-0406 remove github.com/mholt/archiver/v3 dependency (#1793) 2025-06-06 11:35:56 -07:00
Dmitriy Ivolgin
03efedf714 Follow logs when using runDaemonSet and runPod collectors (#1783)
Follow logs when using runDaemonSet collector

Signed-off-by: divolgin <dmitriy@replicated.com>
2025-05-09 12:54:28 -07:00
dependabot[bot]
9bca9c5245 chore(deps): bump github.com/distribution/distribution/v3 from 3.0.0-rc.3 to 3.0.0 (#1771)
* chore(deps): bump github.com/distribution/distribution/v3

Bumps [github.com/distribution/distribution/v3](https://github.com/distribution/distribution) from 3.0.0-rc.3 to 3.0.0.
- [Release notes](https://github.com/distribution/distribution/releases)
- [Commits](https://github.com/distribution/distribution/compare/v3.0.0-rc.3...v3.0.0)

---
updated-dependencies:
- dependency-name: github.com/distribution/distribution/v3
  dependency-version: 3.0.0
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* update go

* use constant format strings

* f

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Lavery <laverya@umich.edu>
2025-04-21 16:45:47 +00:00
Luke Amdor
3a457d11fc feat: add timestamps flag to logs collector (#1776)
* feat: add timestamps flag to logs collector

Kubernetes logs can be transmitted with the captured timestamps. This is useful for containers that do not log with timestamps. So I'm exposing that as a flag.

* fix: update schemas
2025-04-17 10:51:07 -04:00
Andrew Lavery
ab7f50d0ce improve detection of tanzu clusters (#1769) 2025-04-03 11:25:44 -07:00
Ethan Mosbaugh
641c195db3 fix(collect.runPod): does not delete image pull secrets without name in spec (#1761)
* fix(collect.runPod): fix deleting image pull secrets

* f

* f
2025-03-17 16:21:28 -05:00
Johannes Tuchscherer
ef1cd66b1e Handling the case when the Cluster Analyzer doesn't find a resource (#1760)
* Handling the case when the Cluster Analyzer doesn't find a resource

* Add namespace information to Resource not found fail message
2025-03-14 11:22:49 -07:00
Greg Schofield
64c63d3f7a Log namespace when analyzing deployment status (#1757) 2025-03-12 13:49:15 +00:00
Andrew Lavery
9d9b3c565c add additional test cases to the host os info analyzer (#1754) 2025-03-06 16:57:59 -06:00
Johannes Tuchscherer
3665d25abf Http comparators (#1753)
* Allowing more comparators for the http analyzer

* test

* Update pkg/analyze/host_http.go

Co-authored-by: Andrew Lavery <laverya@umich.edu>

---------

Co-authored-by: Andrew Lavery <laverya@umich.edu>
2025-03-06 21:40:47 +00:00
Salah Al Saleh
97dcae9fc7 Ability to use sprig functions in analyzer templates (#1745)
* Ability to use sprig functions in analyzer templates
2025-02-21 08:10:46 -08:00
Ethan Mosbaugh
b80f38a9a0 fix(redact): multi-line redactors strip empty lines (#1742) 2025-02-20 21:55:05 -05:00
Andrew Lavery
dca4e675fa update shirou/gopsutil to v4 (#1744) 2025-02-20 16:04:03 -08:00
Andrew Lavery
fb9ea281cb improve the host OS collector and analyzer (#1743)
The OS version analyzer did not allow checking for things like "redhat 8.x" - this equates to >= 8 && < 9 in the new code.

Also, we previously only collected the OS name (like redhat, centos, or ubuntu) not the OS family (which would be rhel, rhel, and debian for the previous OSes) - this greatly reduces the number of cases required in an analyzer.
2025-02-20 13:03:53 -08:00
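The "8.x" matching described in fb9ea281cb can be sketched as follows. This is a minimal illustration assuming a pattern of the form `N.x`; the helper name `matchesMajorWildcard` is hypothetical and is not taken from the analyzer's code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// matchesMajorWildcard reports whether version satisfies a pattern such as
// "8.x", which the commit describes as ">= 8 && < 9".
// Hypothetical helper for illustration only.
func matchesMajorWildcard(pattern, version string) (bool, error) {
	if !strings.HasSuffix(pattern, ".x") {
		return false, fmt.Errorf("pattern %q is not of the form N.x", pattern)
	}
	want, err := strconv.Atoi(strings.TrimSuffix(pattern, ".x"))
	if err != nil {
		return false, fmt.Errorf("pattern %q has a non-numeric major version: %w", pattern, err)
	}
	// Only the major component is compared; minor and patch are ignored.
	gotMajor := strings.SplitN(version, ".", 2)[0]
	got, err := strconv.Atoi(gotMajor)
	if err != nil {
		return false, fmt.Errorf("version %q has a non-numeric major version: %w", version, err)
	}
	// With integer majors this is the same as got == want; it is written as a
	// range to mirror the ">= 8 && < 9" wording.
	return got >= want && got < want+1, nil
}

func main() {
	for _, v := range []string{"7.9", "8.0", "8.10", "9.0"} {
		ok, err := matchesMajorWildcard("8.x", v)
		fmt.Println(v, ok, err)
	}
}
```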
Ethan Mosbaugh
51c3a0c40f fix(host-preflights): builtin kernel modules file from wrong path (#1741)
* fix(host-preflights): builtin kernel modules file from wrong path

* f

* f

* f

* f
2025-02-18 16:19:58 -05:00
Ethan Mosbaugh
8e1dc9c5cb fix(preflights): builtin kernel modules file may not be found (#1738) 2025-02-17 15:54:38 -08:00
Ethan Mosbaugh
923293e79a fix(preflights): support for builtin kernel modules (#1737)
* fix(preflights): support for builtin kernel modules

* f
2025-02-17 16:57:44 -06:00
Ethan Mosbaugh
ae2b5d1311 fix(ci): remove windows build from goreleaser (#1736) 2025-02-13 21:55:12 +00:00
Salah Al Saleh
d5a6b19417 Add a host analyzer to check if a subnet contains an IP address (#1735)
* Add a host collector / analyzer to check if a subnet contains an IP address
2025-02-13 13:16:59 -08:00
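The check named in d5a6b19417 (does a subnet contain an IP address) can be sketched with the standard library's `net/netip` package. The helper name `subnetContainsIP` and the sample addresses are assumptions for illustration, not the analyzer's actual code.

```go
package main

import (
	"fmt"
	"net/netip"
)

// subnetContainsIP reports whether ip falls inside the CIDR range cidr.
// Hypothetical helper for illustration only.
func subnetContainsIP(cidr, ip string) (bool, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return false, fmt.Errorf("parse subnet %q: %w", cidr, err)
	}
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false, fmt.Errorf("parse IP %q: %w", ip, err)
	}
	return prefix.Contains(addr), nil
}

func main() {
	for _, ip := range []string{"10.0.0.17", "10.0.1.5"} {
		ok, err := subnetContainsIP("10.0.0.0/24", ip)
		fmt.Println(ip, ok, err)
	}
}
```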
Ethan Mosbaugh
716dda221d fix(host.kernelModules): /lib/modules does not exist in a container (#1734) 2025-02-13 11:54:33 -06:00
Dexter Yan
683391522e fix(window): improve rename file process and remove windows release (#1728) 2025-02-11 17:33:08 +13:00
Ash
de791e951c Enable Daemonsets in ClusterResources analyzer (#1729) 2025-02-06 13:55:39 -05:00
Gerard Nguyen
fa5365cfae fix: [sc-118962] Unable to Retrieve TLS Parameters from Kubernetes Secrets with the Postgres Collector (#1724)
* use Data instead of StringData
2025-01-28 21:39:39 +11:00
Xav Paice
86b7e54466 Revert "feat: save YAML spec used to generate support bundle/preflight" (#1715)
Revert "feat: save YAML spec used to generate support bundle/preflight (#1713)"

This reverts commit f6f51acbd5.
2025-01-06 09:42:58 +11:00
Gerard Nguyen
f6f51acbd5 feat: save YAML spec used to generate support bundle/preflight (#1713)
* save YAML spec of support bundle

* save YAML spec of preflight

* add unit test

* redact TLS private key by default in output spec

* update YAML path for HTTP TLS redactor
2025-01-04 11:35:43 +11:00
Dexter Yan
64ee9e5596 feat(nodeResources): add GPU support (#1708)
* feat(nodeResources): add GPU support

* add resourceCapacity and sum test

* update with make schemas

* Correct tests names

Signed-off-by: Evans Mungai <evans@replicated.com>

---------

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-authored-by: Evans Mungai <evans@replicated.com>
2025-01-03 15:11:10 +13:00
Gerard Nguyen
a6fbf144b8 feat: container statuses analyzer (#1698)
* new schema for analyzer ClusterContainerStatuses
2024-12-04 10:36:23 +11:00
Miguel Varela Ramos
8e2647077d feat: add support for matchExpressions when filtering for nodes (#1697)
* feat: add support for matchExpressions when filtering for nodes

* fix: make generate
2024-11-30 23:15:26 +11:00
Ash
ecc92b1e3e [bug] Quick fix for handling non 200 status codes when loading specs from URI (#1695)
* Quick fix for handling non 200 status codes when loading specs from URI

Go http client already handles 3xx responses for us

* note
2024-11-25 15:04:38 +00:00
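The behaviour described in ecc92b1e3e can be sketched roughly as below: since Go's `net/http` client follows redirects by default, the loader only needs to reject responses whose final status is not 200. The names `loadSpecFromURI` and `newDemoServer` and the demo paths are hypothetical; the real loader may differ.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// loadSpecFromURI fetches a spec and rejects any non-200 response.
// Hypothetical helper for illustration only.
func loadSpecFromURI(client *http.Client, uri string) ([]byte, error) {
	resp, err := client.Get(uri)
	if err != nil {
		return nil, fmt.Errorf("fetch %s: %w", uri, err)
	}
	defer resp.Body.Close()

	// Redirects were already followed by the client, so this is the final status.
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("fetch %s: unexpected status %d", uri, resp.StatusCode)
	}
	return io.ReadAll(resp.Body)
}

// newDemoServer starts a local server with one spec, one redirect to it,
// and a 404 for every other path.
func newDemoServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/spec.yaml", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "apiVersion: troubleshoot.sh/v1beta2\n")
	})
	mux.HandleFunc("/moved", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/spec.yaml", http.StatusFound)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newDemoServer()
	defer srv.Close()

	for _, path := range []string{"/spec.yaml", "/moved", "/missing"} {
		body, err := loadSpecFromURI(srv.Client(), srv.URL+path)
		fmt.Printf("%s: %d bytes, err=%v\n", path, len(body), err)
	}
}
```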
Ricardo Maraschini
9f5f0633cf feat: rename templating variables (#1693)
when templating the output of the namespace connectivity check we were
referring to the 'fromCIDR' as 'fromNamespace'. it makes way more sense
to refer to it as 'fromCIDR' as this is how it is provided in the input
for the collector.

as this is a brand new feature it is very unlikely that anyone is using
this feature (except for the embedded cluster that still needs to be
patched accordingly).

this is how the analyser was defined before:

```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
    name: ec-cluster-preflight
spec:
    analyzers:
        - networkNamespaceConnectivity:
            collectorName: check-network-connectivity
            outcomes:
            - pass:
                message: "Communication between {{ .FromNamespace }} and {{ .ToNamespace }} is working"
            - fail:
                message: "{{ .ErrorMessage }}"
```

and this is how it is now:

```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
    name: ec-cluster-preflight
spec:
    analyzers:
        - networkNamespaceConnectivity:
            collectorName: check-network-connectivity
            outcomes:
            - pass:
                message: "Communication between {{ .FromCIDR }} and {{ .ToCIDR }} is working"
            - fail:
                message: "{{ .ErrorMessage }}"

```
2024-11-21 16:03:50 +01:00
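How the renamed variables from 9f5f0633cf resolve can be sketched with Go's `text/template`. The struct `connectivityResult` and the helper `renderMessage` are hypothetical stand-ins; only the field names `FromCIDR`, `ToCIDR` and `ErrorMessage` come from the commit message. In this sketch a template that still uses `.FromNamespace` fails at execution time, because the struct has no such field.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// connectivityResult is a hypothetical data type exposing the template
// variables named in the commit message.
type connectivityResult struct {
	FromCIDR     string
	ToCIDR       string
	ErrorMessage string
}

// renderMessage renders an outcome message template against the result.
func renderMessage(tmpl string, data connectivityResult) (string, error) {
	t, err := template.New("outcome").Parse(tmpl)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", fmt.Errorf("execute template: %w", err)
	}
	return buf.String(), nil
}

func main() {
	data := connectivityResult{FromCIDR: "10.0.0.0/24", ToCIDR: "10.0.1.0/24"}

	msg, err := renderMessage("Communication between {{ .FromCIDR }} and {{ .ToCIDR }} is working", data)
	fmt.Println(msg, err)

	// The old variable name no longer exists on the struct in this sketch.
	_, err = renderMessage("Communication between {{ .FromNamespace }} and {{ .ToNamespace }} is working", data)
	fmt.Println(err)
}
```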
Dexter Yan
6167fd8a5e fix(collector): fix dns collector limited to 63 chars (#1690) 2024-11-19 17:47:24 +13:00
Gerard Nguyen
7bb88e6b83 feat: ensure Copy collector runs last (#1688)
* ensure Copy collector runs last

* add unit test
* reorder in Preflight as well
2024-11-15 10:59:38 +11:00
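One way to get the ordering named in 7bb88e6b83 is a stable sort that moves Copy collectors to the end while leaving everything else in its original order. The `collector` type, its `Kind` values and the helper `moveCopyLast` are hypothetical; this is a sketch of the idea, not the project's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// collector is a hypothetical stand-in for a collector spec entry.
type collector struct {
	Kind string
	Name string
}

// moveCopyLast returns a reordered copy of collectors in which every
// "copy" collector comes after all others. Relative order within each
// group is preserved because the sort is stable.
func moveCopyLast(collectors []collector) []collector {
	ordered := make([]collector, len(collectors))
	copy(ordered, collectors)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Kind != "copy" && ordered[j].Kind == "copy"
	})
	return ordered
}

func main() {
	in := []collector{
		{Kind: "copy", Name: "a"},
		{Kind: "logs", Name: "b"},
		{Kind: "copy", Name: "c"},
		{Kind: "runPod", Name: "d"},
	}
	for _, c := range moveCopyLast(in) {
		fmt.Println(c.Kind, c.Name)
	}
}
```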
Dexter Yan
1a828fa90b fix(analyzer): add missing warning in outcome (#1687) 2024-11-13 16:32:54 +13:00
Ash
deeeea7cec exec remote host collectors in a daemonset (#1671)
Co-authored-by: Gerard Nguyen <gerard@replicated.com>
Co-authored-by: Dexter Yan <yanshaocong@gmail.com>
2024-11-12 08:47:24 +13:00
João Antunes
197f6de425 feat(host_analyzer): add host sysctl analyzer (#1681)
* feat(host_analyzer): add host sysctl analyzer

* chore: add e2e tests to support bundle collection

* chore: missing spec e2e test update

* chore: cleanup remote collector and use parse operator

* chore: update schemas
2024-11-08 18:55:24 +00:00
Evans Mungai
d25aa7d0ea fix: Do not fail analysis if node list does not exist (#1678)
* fix: Do not error if node list does not exist

Signed-off-by: Evans Mungai <evans@replicated.com>

* fix test fail

---------

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-authored-by: Dexter Yan <yanshaocong@gmail.com>
2024-11-08 09:53:03 +13:00
João Antunes
77c9968ff6 feat(host_sysctl): add host sysctl collector (#1676)
* feat(host_sysctl): add host sysctl collector

* chore: add examples

* Update pkg/collect/host_sysctl.go

Co-authored-by: Evans Mungai <evans@replicated.com>

* chore: use sysctl package vs exec calls

* chore: make linter happy

* chore: make schemas

* chore: go back to sysctl exec

* chore: make linter happy

---------

Co-authored-by: Evans Mungai <evans@replicated.com>
2024-11-07 18:18:11 +00:00
Diamon Wiggins
06506ed95d Fix remote host collection RBAC checks (#1672)
* fix remote host collection rbac checks

* move saveNodeList into collectRemoteHost function

* fix resource attribute list and retrieve namespace from kubeconfig

* revert change to set a default namespace from kubeconfig

* remove duplicate code
2024-11-07 10:07:27 -05:00
Ricardo Maraschini
e272683bce feat: implement collector and analyser for network namespace connectivity (#1670)
* feat: implement collector and analyser for network namespace connectivity

checks if two network namespaces can talk to each other on udp and tcp.
its usage is as follows:

```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: test
spec:
  hostCollectors:
  - networkNamespaceConnectivity:
      collectorName: check-network-connectivity
      fromCIDR: 10.0.0.0/24
      toCIDR: 10.0.1.0/24
  hostAnalyzers:
  - networkNamespaceConnectivity:
      collectorName: check-network-connectivity
      outcomes:
      - pass:
          message: "Communication between 10.0.0.0/24 and 10.0.1.0/24 is working"
      - fail:
          message: "Communication between 10.0.0.0/24 and 10.0.1.0/24 isn't working"
```

if this fails then you may need to enable `forwarding` with:

```bash
sysctl -w net.ipv4.ip_forward=1
```

if it still fails then you may need to configure firewalld to allow the
traffic or simply disable it for the sake of testing.

* chore: rebuild schemas

* chore: remove unused property

* chore: disable namespaces for other platforms

* chore: make sure we timeout temporary servers

* feat: analyzer now supports multi-node collection

* feat: check both udp and tcp even on failure

check both protocols even if one fails. this commit also introduces a
timeout that can be set by the user.

* feat: add templating to the failure outcome

allow users to dump the errors found during the analysis.

* chore: addressing pr comments

* feat: delete interface pair before namespace

even though the interface pair is deleted every time we delete the
namespace in my tests, we had better delete it before we delete the
namespace.

this comes out of a review comment where some people seem to still be
able to see the interface pair even after the namespace is deleted.

i.e. better safe than sorry.

* chore: fix typo on comment
2024-11-06 11:30:13 +01:00
Ash
ea900a1881 chore: Refactor host cpu analyzer for remote collection (#1664)
* Refactor host cpu analyzer for remote collection

---------

Co-authored-by: Gerard Nguyen <gerard@replicated.com>
2024-11-06 14:43:27 +11:00
Gerard Nguyen
f0b8de68ae feat: multiple nodes analyzers (#1667)
* implement refactor for multiple node analyzers

---------

Co-authored-by: Diamon Wiggins <38189728+diamonwiggins@users.noreply.github.com>
2024-11-04 14:17:39 +11:00
Ash
544a700062 [sc-114813] copy HostCollector fails to copy binary files when run in cluster (#1669)
* Don't convert output bytes to string

This prevents binary files getting mangled when the collector output is being passed around between functions

* Update pkg/collect/runner.go

Co-authored-by: Evans Mungai <evans@replicated.com>

* organise imports

---------

Co-authored-by: Evans Mungai <evans@replicated.com>
2024-10-31 10:44:35 +00:00
Dexter Yan
059b5d14d2 fix(collector): limit run pod collector to delete only one related secret (#1668)
* fix(collector): limit run pod collector to delete only related secret

* change to ctx
2024-10-30 14:19:30 +00:00
Evans Mungai
deda4ce98c feat: Do not prompt users to save support bundle analysis results (#1662)
In interactive mode, do not prompt users to save support
bundle analysis results. Users end up providing this file
instead of the support bundle archive. The analysis results
are contained in the support bundle archive already

Signed-off-by: Evans Mungai <evans@replicated.com>
2024-10-25 13:03:16 +01:00
Dexter Yan
350418c6e9 feat(host-collector): add progress for host collector (#1659) 2024-10-25 15:34:09 +13:00
ada mancini
eacff7112f support adding a CA cert to http collector (#1624)
* add a TLS parameter for cacert

* pass a ca cert into http request

* test preflight

* make schemas

* log extra information from http request

* pass a proxy into the collector spec

* hitting a segfault; breakpoint

* accept a dir, file, or a string-literal as CA

* move tls params into get, put, post methods

* test for cert untrusted response

* make generate

* make schemas

* more test cases

* make schemas

* dont include system certs

* make generate && make schemas

* resolve gosec G402 warning

* remove old check for system certs

* ignore errcheck "return value not checked" linter errors
2024-10-23 18:15:08 -04:00
Dexter Yan
0d21eed5f8 fix(support): add missing host collectors for ParseSupportBundle (#1656)
* fix(support): add missing host collectors for ParseSupportBundle

* update

* add host analyzers
2024-10-22 13:07:44 +13:00
Diamon Wiggins
b88bc8ddf7 Refactor Multi Node Analyzers (#1646)
* initial refactor of host os analyzer

* refactor remote collect analysis

---------

Signed-off-by: Evans Mungai <evans@replicated.com>
Co-authored-by: Gerard Nguyen <gerard@replicated.com>
Co-authored-by: Evans Mungai <evans@replicated.com>
2024-10-22 10:45:50 +13:00
Evans Mungai
9c24ab6067 chore: Remove preempted deprecation warnings (#1655)
Signed-off-by: Evans Mungai <evans@replicated.com>
2024-10-22 08:35:36 +11:00