when templating the output of the namespace connectivity check we were
exposing the 'fromCIDR' value as 'FromNamespace'. it makes more sense
to expose it as 'FromCIDR', as this is how it is provided in the input
for the collector.
as this is a brand new feature it is very unlikely that anyone is using
it yet (except for the embedded cluster, which still needs to be
patched accordingly).
this is how the analyser was defined before:
```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: ec-cluster-preflight
spec:
  analyzers:
    - networkNamespaceConnectivity:
        collectorName: check-network-connectivity
        outcomes:
          - pass:
              message: "Communication between {{ .FromNamespace }} and {{ .ToNamespace }} is working"
          - fail:
              message: "{{ .ErrorMessage }}"
```
and this is how it is now:
```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: ec-cluster-preflight
spec:
  analyzers:
    - networkNamespaceConnectivity:
        collectorName: check-network-connectivity
        outcomes:
          - pass:
              message: "Communication between {{ .FromCIDR }} and {{ .ToCIDR }} is working"
          - fail:
              message: "{{ .ErrorMessage }}"
```
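The renamed fields can be illustrated with Go's `text/template`. This is a minimal sketch, not the analyser's actual code: the `connectivityResult` struct and the `renderMessage` helper are hypothetical names, and only the field names `FromCIDR`, `ToCIDR` and `ErrorMessage` come from the spec above.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// connectivityResult is a hypothetical stand-in for the data exposed to
// outcome messages; only the field names are taken from the spec.
type connectivityResult struct {
	FromCIDR     string
	ToCIDR       string
	ErrorMessage string
}

// renderMessage expands an outcome message written as a Go template.
func renderMessage(message string, data connectivityResult) (string, error) {
	tmpl, err := template.New("outcome").Parse(message)
	if err != nil {
		return "", fmt.Errorf("parse outcome message: %w", err)
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", fmt.Errorf("render outcome message: %w", err)
	}
	return out.String(), nil
}

func main() {
	msg, err := renderMessage(
		"Communication between {{ .FromCIDR }} and {{ .ToCIDR }} is working",
		connectivityResult{FromCIDR: "10.0.0.0/24", ToCIDR: "10.0.1.0/24"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg)
}
```

In this sketch a message that still references `{{ .FromNamespace }}` fails at execution time, because the struct has no such field. Whether the real analyser fails or renders an empty value depends on how it passes data to the template, which is not stated here.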
* fix remote host collection rbac checks
* move saveNodeList into collectRemoteHost function
* fix resource attribute list and retrieve namespace from kubeconfig
* revert change to set a default namespace from kubeconfig
* remove duplicate code
* feat: implement collector and analyser for network namespace connectivity
checks if two network namespaces can talk to each other on udp and tcp.
its usage is as follows:
```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: test
spec:
  hostCollectors:
    - networkNamespaceConnectivity:
        collectorName: check-network-connectivity
        fromCIDR: 10.0.0.0/24
        toCIDR: 10.0.1.0/24
  hostAnalyzers:
    - networkNamespaceConnectivity:
        collectorName: check-network-connectivity
        outcomes:
          - pass:
              message: "Communication between 10.0.0.0/24 and 10.0.1.0/24 is working"
          - fail:
              message: "Communication between 10.0.0.0/24 and 10.0.1.0/24 isn't working"
```
if this fails then you may need to enable `forwarding` with:
```bash
sysctl -w net.ipv4.ip_forward=1
```
if it still fails then you may need to configure firewalld to allow the
traffic, or simply disable it for the sake of testing.
* chore: rebuild schemas
* chore: remove unused property
* chore: disable namespaces for other platforms
* chore: make sure we timeout temporary servers
* feat: analyzer now supports multi-node collection
* feat: check both udp and tcp even on failure
check both protocols even if one fails. this commit also introduces a
timeout that can be set by the user.
* feat: add templating to the failure outcome
allow users to dump the errors found during the analysis.
* chore: addressing pr comments
* feat: delete interface pair before namespace
even though the interface pair is deleted every time we delete the
namespace in my tests, it is safer to delete it explicitly before we
delete the namespace.
this comes out of a review comment where some people seem to still be
able to see the interface pair even after the namespace is deleted.
i.e. better safe than sorry.
* chore: fix typo on comment
* Don't convert output bytes to string
This prevents binary files from getting mangled when the collector output is passed around between functions.
* Update pkg/collect/runner.go
Co-authored-by: Evans Mungai <evans@replicated.com>
* organise imports
---------
Co-authored-by: Evans Mungai <evans@replicated.com>
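The "check both udp and tcp even on failure" change listed above can be pictured roughly as follows. This is a sketch under assumptions, not the collector's code: `protocolCheck`, `runCheck`, `checkAll` and `fakeCheck` are hypothetical names, and the probes are fakes rather than real sockets.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// protocolCheck is a hypothetical probe for one protocol ("tcp" or "udp").
type protocolCheck struct {
	name string
	run  func(ctx context.Context) error
}

// runCheck enforces the user-supplied timeout on a single probe.
func runCheck(check protocolCheck, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	done := make(chan error, 1) // buffered so a late probe never blocks
	go func() { done <- check.run(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return fmt.Errorf("timed out after %s", timeout)
	}
}

// checkAll runs every probe, even after an earlier one has failed, and
// returns one error per failed protocol.
func checkAll(checks []protocolCheck, timeout time.Duration) []error {
	var failures []error
	for _, check := range checks {
		if err := runCheck(check, timeout); err != nil {
			failures = append(failures, fmt.Errorf("%s: %w", check.name, err))
		}
	}
	return failures
}

// fakeCheck builds a demo probe: it fails with msg when msg is non-empty
// and counts how often it was invoked.
func fakeCheck(name, msg string, calls *int) protocolCheck {
	return protocolCheck{name: name, run: func(ctx context.Context) error {
		*calls++
		if msg != "" {
			return errors.New(msg)
		}
		return nil
	}}
}

func main() {
	calls := 0
	failures := checkAll([]protocolCheck{
		fakeCheck("udp", "no reply", &calls),
		fakeCheck("tcp", "", &calls),
	}, time.Second)
	fmt.Println("probes run:", calls, "failures:", failures)
}
```

Collecting the failures instead of returning on the first one is what lets a templated failure message report every protocol that did not get through.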
In interactive mode, do not prompt users to save support
bundle analysis results. Users end up providing this file
instead of the support bundle archive. The analysis results
are contained in the support bundle archive already.
Signed-off-by: Evans Mungai <evans@replicated.com>
* add a TLS parameter for cacert
* pass a ca cert into http request
* test preflight
* make schemas
* log extra information from http request
* pass a proxy into the collector spec
* hitting a segfault; breakpoint
* accept a dir, file, or a string-literal as CA
* move tls params into get, put, post methods
* test for cert untrusted response
* make generate
* make schemas
* more test cases
* make schemas
* don't include system certs
* make generate && make schemas
* resolve gosec G402 warning
* remove old check for system certs
* ignore errcheck "return value not checked" linter errors
* Add image parameter to the goldpinger collector
* Pass image directly as a function arg
Also allow util image to be set in spec
* Remove pointless util image override
* Update pkg/collect/goldpinger.go
Co-authored-by: Evans Mungai <evans@replicated.com>
* Simplify image override
---------
Co-authored-by: Evans Mungai <evans@replicated.com>
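For the CA handling described in the TLS commits above (accept a dir, file, or string literal as CA; don't include system certs), one possible shape is sketched below. `loadCAPool` is a hypothetical helper, and the order in which the three input forms are tried is an assumption rather than the collector's documented behaviour.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// loadCAPool builds a certificate pool from ca, which may be a directory
// of PEM files, a single PEM file, or a PEM string literal. It starts from
// an empty pool, so system certificates are not trusted implicitly.
func loadCAPool(ca string) (*x509.CertPool, error) {
	pool := x509.NewCertPool()

	info, err := os.Stat(ca)
	switch {
	case err == nil && info.IsDir():
		entries, err := os.ReadDir(ca)
		if err != nil {
			return nil, fmt.Errorf("read CA directory: %w", err)
		}
		added := false
		for _, entry := range entries {
			if entry.IsDir() {
				continue
			}
			pem, err := os.ReadFile(filepath.Join(ca, entry.Name()))
			if err != nil {
				return nil, fmt.Errorf("read CA file: %w", err)
			}
			if pool.AppendCertsFromPEM(pem) {
				added = true
			}
		}
		if !added {
			return nil, errors.New("no PEM certificates found in directory")
		}
	case err == nil:
		pem, err := os.ReadFile(ca)
		if err != nil {
			return nil, fmt.Errorf("read CA file: %w", err)
		}
		if !pool.AppendCertsFromPEM(pem) {
			return nil, errors.New("file contains no PEM certificates")
		}
	default:
		// Not a path on disk: treat the value as a PEM literal.
		if !pool.AppendCertsFromPEM([]byte(ca)) {
			return nil, errors.New("value is neither a readable path nor valid PEM")
		}
	}
	return pool, nil
}

func main() {
	if _, err := loadCAPool("not a certificate"); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

The resulting pool would typically be assigned to `RootCAs` in a `tls.Config` used by the HTTP client, so that only the supplied CA is trusted for the request.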
allow users to check if specific cpu flags are supported by the host.
```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: ec-cluster-preflight
spec:
  collectors:
    - cpu: {}
  analyzers:
    - cpu:
        checkName: CPU
        outcomes:
          - pass:
              when: hasFlags cmov,cx8,fpu,fxsr,mmx
              message: CPU supports all required flags
          - fail:
              message: CPU not supported
```
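To make the `hasFlags` condition concrete, here is a small sketch of how a comma-separated flag requirement could be evaluated. It assumes the flags are read from `/proc/cpuinfo`-style text, which may not be how the collector actually gathers them; `parseCPUFlags` and `missingFlags` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseCPUFlags extracts the flag set from /proc/cpuinfo-style text by
// reading the first "flags" line it finds.
func parseCPUFlags(cpuinfo string) map[string]bool {
	flags := map[string]bool{}
	scanner := bufio.NewScanner(strings.NewReader(cpuinfo))
	for scanner.Scan() {
		key, value, found := strings.Cut(scanner.Text(), ":")
		if !found || strings.TrimSpace(key) != "flags" {
			continue
		}
		for _, flag := range strings.Fields(value) {
			flags[flag] = true
		}
		break
	}
	return flags
}

// missingFlags evaluates a requirement such as "cmov,cx8,fpu,fxsr,mmx"
// and returns the flags the host lacks.
func missingFlags(required string, available map[string]bool) []string {
	var missing []string
	for _, flag := range strings.Split(required, ",") {
		flag = strings.TrimSpace(flag)
		if flag != "" && !available[flag] {
			missing = append(missing, flag)
		}
	}
	return missing
}

func main() {
	cpuinfo := "processor\t: 0\nflags\t\t: fpu vme de pse cx8 cmov mmx fxsr sse sse2\n"
	flags := parseCPUFlags(cpuinfo)
	fmt.Println("missing:", missingFlags("cmov,cx8,fpu,fxsr,mmx", flags))
	fmt.Println("missing:", missingFlags("cmov,avx2", flags))
}
```

A `pass` outcome would then correspond to an empty result from `missingFlags`.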
allows troubleshoot to collect and analyze the CPU microarchitecture. this
is a usage example:
```yaml
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: ec-cluster-preflight
spec:
  collectors:
    - cpu: {}
  analyzers:
    - cpu:
        checkName: CPU
        outcomes:
          - pass:
              when: 'supports x86-64-v2'
              message: CPU supports x86-64-v2
          - fail:
              message: CPU does not support x86-64-v2
```
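One way to read `supports x86-64-v2` is as a check that the host advertises every feature of that microarchitecture level. The sketch below assumes the level definitions from the x86-64 psABI and Linux `/proc/cpuinfo` flag names (for example `pni` for SSE3); how the analyser really derives the level is not stated here, and `supportsV2` is a hypothetical name.

```go
package main

import "fmt"

// Flag names as they appear in Linux /proc/cpuinfo. The grouping follows
// the x86-64 psABI microarchitecture levels.
var (
	baselineFlags = []string{"cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2"}
	v2Flags       = []string{"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
)

// supportsV2 reports whether hostFlags covers the baseline and every
// feature added by x86-64-v2.
func supportsV2(hostFlags []string) bool {
	have := make(map[string]bool, len(hostFlags))
	for _, flag := range hostFlags {
		have[flag] = true
	}
	for _, group := range [][]string{baselineFlags, v2Flags} {
		for _, required := range group {
			if !have[required] {
				return false
			}
		}
	}
	return true
}

func main() {
	host := []string{
		"cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2",
		"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3",
	}
	fmt.Println("x86-64-v2:", supportsV2(host))
}
```

The baseline group matches most of the flags used in the `hasFlags` example, which suggests the two checks are meant to complement each other.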
Stop re-analysing preflight results when uploadResultsTo is present, which was leading to duplicate results.
Signed-off-by: Evans Mungai <evans@replicated.com>
* feat: Install goldpinger if one does not exist when running goldpinger collector
- Deploy goldpinger daemonset if one is not detected in the cluster
- Clean up all deployed resources
- Add delay to allow users to wait for goldpinger to perform checks
Signed-off-by: Evans Mungai <evans@replicated.com>
* Add missing test data file
Signed-off-by: Evans Mungai <evans@replicated.com>
* Better naming of create resource functions
Signed-off-by: Evans Mungai <evans@replicated.com>
---------
Signed-off-by: Evans Mungai <evans@replicated.com>
* add struct for host dns collector
* add miekg/dns
* add more logs
* nit
* new field names
* use Hostnames instead of Names
* misc update
* make schemas
* no error when there is no resolv.conf
* query all searches
* add summary.json file
* merge summary into result file
* query AAAA and CNAME as well
* update schema for hostnames to be required
* store DNS collector in JSON output for analyze later
* fix incorrect path
* configurable dns image
* make non resolvable domain configurable
* nit update address field
* update dns util image
* add unit test
* new schema for etcd collector
* add placeholder
* wip
* get supported distribution
* add exec implementation
* wait for etcd pod to be ready
* misc
* update k0s etcd certs path
* fix unit tests
* address code reviews
* update from code review
* add etcdctl version
Add a Linux control groups (cgroups) host collector that detects whether the specified mountPoint is a cgroup filesystem and which version it is. The collector also collects information about the configured cgroup controllers.
Signed-off-by: Evans Mungai <evans@replicated.com>
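A common way to tell whether a mount point is a cgroup filesystem is to compare its `statfs` type against the magic numbers from `linux/magic.h`. The sketch below takes that approach as an assumption; the collector may detect it differently. `classifyCgroup` and `detectCgroup` are hypothetical names, the code is Linux-only, and controller discovery is left out.

```go
package main

import (
	"fmt"
	"syscall"
)

const (
	cgroup2SuperMagic = 0x63677270 // CGROUP2_SUPER_MAGIC: unified (v2) hierarchy
	cgroupSuperMagic  = 0x27e0eb   // CGROUP_SUPER_MAGIC: a v1 controller mount
	tmpfsMagic        = 0x01021994 // TMPFS_MAGIC: v1 hierarchies usually hang off a tmpfs
)

// classifyCgroup maps a statfs filesystem type to a cgroup version label.
// Treating tmpfs as "v1" is a heuristic that only makes sense for paths
// such as /sys/fs/cgroup.
func classifyCgroup(fsType int64) string {
	switch fsType {
	case cgroup2SuperMagic:
		return "v2"
	case cgroupSuperMagic, tmpfsMagic:
		return "v1"
	default:
		return "none"
	}
}

// detectCgroup inspects the filesystem mounted at mountPoint.
func detectCgroup(mountPoint string) (string, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(mountPoint, &st); err != nil {
		return "", fmt.Errorf("statfs %s: %w", mountPoint, err)
	}
	return classifyCgroup(int64(st.Type)), nil
}

func main() {
	version, err := detectCgroup("/sys/fs/cgroup")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("cgroup version:", version)
}
```

On a v2 host the available controllers are listed in the `cgroup.controllers` file under the mount point, which is one place the controller information could come from.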