introduces a new option to limit the size of a pod log when it is added to the bundle. This keeps the support bundle from growing to an unacceptable size and from carrying information that is too old.
The maximum size of a pod log in a bundle defaults to 5MB, and the default can be changed if the need arises.
BREAKING CHANGE: any logs that are collected by the logs collector are now limited by default to 5MB unless a different size limit is specified. Folks expecting log files larger than that to be collected without truncation will need to adjust their support bundle specs.
Fixes: #878
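A minimal sketch of the size limit described above, not the collector's actual code. The names `maxPodLogBytes` and `truncateLog` are hypothetical, 5MB is assumed to mean 5,000,000 bytes, and keeping the newest bytes (the tail) is an assumption; the commit message does not say which end of the log is kept.

```go
package main

import "fmt"

// maxPodLogBytes mirrors the 5MB default from the commit message.
// Hypothetical name; the exact byte count is an assumption.
const maxPodLogBytes = 5 * 1000 * 1000

// truncateLog returns at most limit bytes of log.
// Assumption: when the log is too large, the newest bytes (the tail) are kept.
// A non-positive limit disables truncation.
func truncateLog(log []byte, limit int) []byte {
	if limit <= 0 || len(log) <= limit {
		return log
	}
	return log[len(log)-limit:]
}

func main() {
	log := []byte("old line\nnew line\n")
	// With a 9-byte limit only the last line survives.
	fmt.Printf("%q\n", truncateLog(log, 9)) // prints "new line\n"
	// Logs under the default limit are returned unchanged.
	fmt.Println(len(truncateLog(log, maxPodLogBytes))) // prints 18
}
```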
* filter on cpu architecture
* filter by cpu architecture
* fail if we don't have a label match too
* add tests for cpu arch filter
* update for make schemas
* First draft of a generic cluster-resource analyzer
* Add more resource mappings
* Support some cluster-scoped resources
the structure of this could probably be a bit tidier, but this now
allows us to target non-namespaced resources simply by not specifying
the namespace in the analyzer.
* General tidy up
* pull resource selection into its own function
* remove pointless pointer to string
* Export findResource function
This lets other analyzers use it.
* Add tests for cluster resources analyzer
* Update schemas
* Address some of @banjoh's comments
* rework resource selection
thanks @banjoh
* Replace FindFiles with GetFile
Since we already know where we're looking for files,
it doesn't make sense to have to loop over a single item slice.
* Use assert instead of require
* format
* Change default behaviour for no namespace
Now not providing a namespace causes us to default to "default", with an
explicit bool to toggle cluster-scoped resource checking.
This should feel somewhat more intuitive when writing analyzers that use
this function
* Generate schemas
* Value → expectedValue
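The namespace defaulting described in "Change default behaviour for no namespace" can be sketched roughly as follows. `resolveNamespace` and its parameters are hypothetical names chosen for illustration; the real analyzer may structure this differently.

```go
package main

import "fmt"

// resolveNamespace is a hypothetical helper illustrating the rule from the
// commit message: an empty namespace falls back to "default", unless the
// explicit cluster-scoped toggle is set, in which case no namespace applies.
func resolveNamespace(namespace string, clusterScoped bool) string {
	if clusterScoped {
		return "" // cluster-scoped resources are looked up without a namespace
	}
	if namespace == "" {
		return "default"
	}
	return namespace
}

func main() {
	fmt.Printf("%q\n", resolveNamespace("", false))            // prints "default"
	fmt.Printf("%q\n", resolveNamespace("kube-system", false)) // prints "kube-system"
	fmt.Printf("%q\n", resolveNamespace("", true))             // prints ""
}
```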
feat(collectors): Add mTLS parameters to the redis collector
For a redis collector spec targeting a redis server configured to accept
(m)TLS connections we need to pass in the necessary TLS parameters in order
to successfully connect to the server. Both preflight and support bundle
specs use this collector.
This change allows us to pass in the necessary TLS parameters via inlined
TLS configuration or via a secret reference.
Fixes #746
* feat(analyze): add ExcludeFiles field to textAnalyze
* feat(analyze): fix test for getFiles
* feat(analyze): change function name to excludeFilePaths
* feat(analyze): fix preflight test fail
* feat(analyze): add tests for excludeFiles
* feat(schemas): run make schemas
* feat(analyze): use getChildCollectedFileContents function prototype
* feat(analyze): reduce time complexity
* feat(longhorn): add getFileContents as getCollectedFileContents
* Add MySQL variables to collector
* Cleanup row scanning and a few updates based on feedback
* Close db connection
* Move defer db.Close
* Updates based on feedback
* Use vars in loop instead of struct
* Only pull parameters specified in collector config
Co-authored-by: Ethan Mosbaugh <ethan@replicated.com>
* add strict flag to Analyzer/AnalyzerMeta
and regenerate schemas and controller-gen code
* map analyzer strict to result
* Update stdout for human and json format
* fix review comment
* update interactive result
* update interactive results
* Update types.go
* Update upload_results.go
* print strict only when true
* collector/analyzer for host operating system
* address cr comments
* cleanup
* fix invoking the analyzer
code cleanup
* fix cr comments
* add corner case unit-test
* fix kernel version parsing
* address review comments
* add default case
* parse using regex
* added more testcases and fixed the bug found in cr
* few small things
* Add collect command and remote host collectors
Adds the ability to run a host collector on a set of remote k8s nodes.
Target nodes can be filtered using the --selector flag, with the same
syntax as kubectl. Existing flags for --collector-image,
--collector-pullpolicy and --request-timeout are used. To run on a
specified node, --selector="kubernetes.io/hostname=kind-worker2" could
be used.
The collect command is used by the remote collector to output the
results using a "raw" format, which uses the filename as the key and
the output as the value, encoded as an escaped JSON string. When run
manually it defaults to fully decoded JSON. The existing block devices,
ipv4interfaces and services host collectors don't decode properly - the
fix is to convert their slice output to a map (fix not included as
unsure what depends on the existing format).
The collect command is also useful for troubleshooting preflight issues.
Examples are included to show remote collector usage.
```
bin/collect --collector-image=croomes/troubleshoot:latest examples/collect/remote/memory.yaml --namespace test
{
  "kind-control-plane": {
    "system/memory.json": {
      "total": 1304207360
    }
  },
  "kind-worker": {
    "system/memory.json": {
      "total": 1695780864
    }
  },
  "kind-worker2": {
    "system/memory.json": {
      "total": 1726353408
    }
  }
}
```
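To make the difference between the "raw" format and the decoded output above concrete, here is a small sketch using only `encoding/json`. `encodeRaw` and `decodeRaw` are hypothetical names, and the exact wire format used by the remote collector is an assumption based on the description of filename keys with escaped JSON string values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeRaw maps each filename to its collector output stored as a JSON
// string, so the nested JSON ends up escaped. Hypothetical helper.
func encodeRaw(files map[string][]byte) ([]byte, error) {
	raw := make(map[string]string, len(files))
	for name, content := range files {
		raw[name] = string(content)
	}
	return json.Marshal(raw)
}

// decodeRaw reverses encodeRaw and parses each value back into structured
// JSON, which corresponds to the fully decoded form.
func decodeRaw(data []byte) (map[string]interface{}, error) {
	var raw map[string]string
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	decoded := make(map[string]interface{}, len(raw))
	for name, content := range raw {
		var v interface{}
		if err := json.Unmarshal([]byte(content), &v); err != nil {
			return nil, fmt.Errorf("decoding %s: %w", name, err)
		}
		decoded[name] = v
	}
	return decoded, nil
}

func main() {
	out, err := encodeRaw(map[string][]byte{
		"system/memory.json": []byte(`{"total":1304207360}`),
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // prints {"system/memory.json":"{\"total\":1304207360}"}

	decoded, err := decodeRaw(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(decoded)) // prints 1
}
```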
The preflight command has been updated to run remote collectors. To run
a host collector remotely it must be specified in the spec as a
`remoteCollector`:
```
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: memory
spec:
  remoteCollectors:
    - memory:
        collectorName: memory
  analyzers:
    - memory:
        outcomes:
          - fail:
              when: "< 8Gi"
              message: At least 8Gi of memory is required
          - warn:
              when: "< 32Gi"
              message: At least 32Gi of memory is recommended
          - pass:
              message: The system has sufficient memory
```
Results for each node are analyzed separately, with the node name
appended to the title:
```
bin/preflight --interactive=false --collector-image=croomes/troubleshoot:latest examples/preflight/remote/memory.yaml --format=json
{memory running 0 1}
{memory completed 1 1}
{
  "fail": [
    {
      "title": "Amount of Memory (kind-worker2)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-worker)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-control-plane)",
      "message": "At least 8Gi of memory is required"
    }
  ]
}
```
Also added a host collector to allow preflight checks of required kernel
modules, which is the main driver for this change.
* add a unique id to each host preflight
* auto generated files
* updated schemas for the new field id
* keeping it consistent with the rest of the spec