210 Commits

Author SHA1 Message Date
xavpaice
6fe887df62 Merge pull request #638 from replicatedhq/xav/go_modules/periph.io/x/periph-3.7.2
Bump periph.io/x/periph from 3.6.8+incompatible to 3.7.2
2022-07-27 07:28:08 +12:00
dependabot[bot]
057be29943 Bump periph.io/x/periph from 3.6.8+incompatible to 3.7.2
- [Release notes](https://github.com/google/periph/releases)

updated-dependencies:
- dependency-name: periph.io/x/periph
  dependency-type: direct:production
  update-type: version-update:semver-minor

Module periph.io/x/periph re-arranged its package locations between
3.6.8 and 3.7.2.  This change reworks the code to take advantage of the
new layout.
2022-07-26 18:33:22 +12:00
Martin Hrabovcin
642d098238 fix: use kube-aware version selector for crs collector 2022-07-15 08:36:16 +02:00
divolgin
8e7ea022f7 Adding some utility interfaces for collectors 2022-07-08 14:53:46 -07:00
Ethan Mosbaugh
1820c40c43 Make sure to clean up resources when done including run pods (#615) 2022-07-02 14:40:55 -07:00
Diamon Wiggins
cab3fc7f4e Redact Host Collectors in Support Bundles (#614)
Add redactors for Host Collectors
2022-07-01 18:57:58 -04:00
Diamon Wiggins
c9c305570b Host Run Collector (#606)
Host Run Collector
2022-06-29 12:14:56 -04:00
Tarun Gupta Akirala
43a936a0d7 fix: use storedVersions to determine specific CR version 2022-06-24 12:09:10 -07:00
Ethan Mosbaugh
74b4802b46 Add support for k8s 1.24+ 2022-05-24 11:05:59 -07:00
Ethan Mosbaugh
30fb4e2108 Fix run collector text analyze file path mismatch 2022-05-16 23:33:23 +00:00
diamonwiggins
5ec3524bde fixing host filesystem perf function for windows and darwin 2022-05-12 20:09:03 +00:00
diamonwiggins
a471ad5e74 fixing filepath for kernel modules test 2022-05-12 19:49:04 +00:00
diamonwiggins
7b30e283ea fixing filepath for kernel modules test 2022-05-12 19:43:45 +00:00
diamonwiggins
8c62aadcfc using subdirectory for all host collectors in support bundle 2022-05-12 16:46:24 +00:00
diamonwiggins
e7f2685ed8 fix output file for diskUsage collector 2022-05-12 04:20:24 +00:00
diamonwiggins
3b1ba08a6b hardcoding system hostcollector filenames 2022-05-12 03:39:19 +00:00
diamonwiggins
17fe3db79f adding host collectors to support bundles 2022-05-11 22:50:03 +00:00
Diamon Wiggins
9f527ee6a5 Merge branch 'main' into diamonwiggins/sc-44286/run-pod-spec 2022-05-06 11:08:15 -04:00
Ethan Mosbaugh
2c9a37a4f1 BoolOrString pollutes marshalling, does not respect omitempty (#566)
* BoolOrString pollutes marshalling, does not respect omitempty

* fix panic
2022-05-05 16:10:05 -07:00
Edgar Ochoa
7289134757 Add MySQL variables to collector (#562)
* Add MySQL variables to collector

* Cleanup row scanning and a few updates based on feedback

* Close db connection

* Move defer db.close

* Updates based on feedback

* Use vars in loop instead of struct

* Only pull parameters specified in collector config

Co-authored-by: Ethan Mosbaugh <ethan@replicated.com>
2022-05-04 10:42:37 -07:00
Jalaja Ganapathy
63362d32ee Include osd disk usage (#563)
* Include osd disk usage

* add timeouts to ceph commands

* removed df, collect from host
2022-05-02 09:51:49 -07:00
diamonwiggins
6cdcb36127 change naming for runPod collector object 2022-05-02 03:12:07 +00:00
diamonwiggins
dfe5538132 remove podLabels from run function 2022-05-02 02:53:29 +00:00
diamonwiggins
42902405cd adding new runpod collector and refactoring old run collector to use new code 2022-05-02 02:44:02 +00:00
diamonwiggins
2516924a92 moving imagepullsecret creation into non pod spec declaration 2022-04-19 20:41:28 +00:00
diamonwiggins
ce4165f69e fixing typo 2022-04-19 20:29:28 +00:00
diamonwiggins
648f9b8d35 allow entire podspec to be passed in run collector 2022-04-19 16:25:59 +00:00
Ethan Mosbaugh
8d1a0f6d24 return ReadCloser 2022-04-01 14:22:44 -07:00
Ethan Mosbaugh
0cfd431274 Fix missing longhorn logs 2022-04-01 14:13:32 -07:00
diamonwiggins
2b774e16d7 adding serviceaccountname parameter to run collector 2022-03-03 06:12:42 +00:00
divolgin
3351c289ab Add GVK to k8s objects in cluster-resources files 2022-02-04 01:31:07 +00:00
Salah Aldeen Al Saleh
6f0cf6550d find total memory instead of available (#525) 2022-01-11 17:02:44 -08:00
Salah Aldeen Al Saleh
d1f341b8ed host system packages collector/analyzer (#506)
* host system packages collector/analyzer
2021-12-10 12:05:21 -08:00
Salah Aldeen Al Saleh
e100e7c478 get container logs for unhealthy pods (#469)
* get container logs for unhealthy pods

Co-authored-by: divolgin <dmitriy@replicated.com>
Co-authored-by: divolgin <divolgin@users.noreply.github.com>
2021-10-28 09:21:14 -07:00
divolgin
ada35eb31c Replicaset collector and analyzer 2021-10-27 20:24:14 +00:00
Salah Aldeen Al Saleh
5a8561a31f include logs for init containers as well (#467) 2021-10-27 10:20:55 -07:00
divolgin
20f1b60f11 Include pod logs for pods that are failing 2021-10-26 00:01:26 +00:00
divolgin
072d2d7a36 Fix ceph collector 2021-10-22 23:01:13 +00:00
Andrew Reed
7b36e6a1f8 Copy in longhorn client (#454) 2021-10-22 15:24:07 -05:00
Rishabh Bohra
cf03503216 feat: Collect custom resources (#447)
* feat: Collect custom resources
Co-authored-by: Martin Hrabovcin <mhrabovcin@users.noreply.github.com>

Co-authored-by: Andrew Reed <andrew@replicated.com>
2021-10-21 16:49:59 -05:00
Jalaja Ganapathy
372454651e collector/analyzer for host operating system (#443)
* collector/analyzer for host operating system

* address cr comments

* cleanup

* fix invoking the analyzer
code cleanup

* fix cr comments

* add corner case unit-test

* fix kernel version parsing

* address review comments

* add default case

* parse using regex

* added more testcases and fixed the bug found in cr

* few small things
2021-10-12 14:42:23 -07:00
Simon Croome
dc8b38d249 Handle k8s api deprecations 2021-10-07 18:55:51 +01:00
Simon Croome
977fc438ea Remote host collectors (#392)
* Add collect command and remote host collectors

Adds the ability to run a host collector on a set of remote k8s nodes.
Target nodes can be filtered using the --selector flag, with the same
syntax as kubectl.  Existing flags for --collector-image,
--collector-pullpolicy and --request-timeout are used.  To run on a
specified node, --selector="kubernetes.io/hostname=kind-worker2" could
be used.
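
As a rough illustration of the node filtering described above, the
sketch below matches node labels against an equality-based selector.
It is not the project's implementation: parseSelector and matches are
hypothetical helpers, only comma-separated key=value terms are handled,
and the set-based and != forms that kubectl also accepts are rejected.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSelector is a hypothetical helper. It understands only
// comma-separated "key=value" (or "key==value") terms.
func parseSelector(selector string) (map[string]string, error) {
	want := make(map[string]string)
	for _, term := range strings.Split(selector, ",") {
		term = strings.TrimSpace(term)
		if term == "" {
			continue
		}
		key, value, ok := strings.Cut(term, "=")
		if !ok || key == "" || strings.HasSuffix(key, "!") {
			return nil, fmt.Errorf("unsupported selector term %q", term)
		}
		// Accept "key==value" as a synonym for "key=value".
		want[key] = strings.TrimPrefix(value, "=")
	}
	return want, nil
}

// matches reports whether labels contain every key/value pair in want.
func matches(want, labels map[string]string) bool {
	for key, value := range want {
		if got, ok := labels[key]; !ok || got != value {
			return false
		}
	}
	return true
}

func main() {
	// Node names taken from the example output in this message; the
	// label sets are assumed for illustration.
	nodes := []struct {
		name   string
		labels map[string]string
	}{
		{"kind-control-plane", map[string]string{"kubernetes.io/hostname": "kind-control-plane"}},
		{"kind-worker", map[string]string{"kubernetes.io/hostname": "kind-worker"}},
		{"kind-worker2", map[string]string{"kubernetes.io/hostname": "kind-worker2"}},
	}

	want, err := parseSelector("kubernetes.io/hostname=kind-worker2")
	if err != nil {
		panic(err)
	}
	for _, node := range nodes {
		if matches(want, node.labels) {
			fmt.Println(node.name)
		}
	}
}
```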

The collect command is used by the remote collector to output the
results in a "raw" format, which uses the filename as the key and the
collector output, encoded as an escaped JSON string, as the value.  When
run manually it defaults to fully decoded JSON.  The existing block
devices, ipv4interfaces and services host collectors don't decode
properly.  The fix is to convert their slice output to a map (fix not
included, as it is unclear what depends on the existing format).
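
To make the difference between the "raw" and decoded formats concrete,
here is a minimal sketch of the conversion.  It assumes the raw form can
be modelled as a map from filename to a JSON-encoded string; decodeRaw
is a hypothetical name, not a function from this change.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeRaw (hypothetical) turns the "raw" shape, filename -> JSON text,
// into the decoded shape, filename -> parsed JSON value.
func decodeRaw(raw map[string]string) (map[string]interface{}, error) {
	decoded := make(map[string]interface{}, len(raw))
	for name, escaped := range raw {
		var value interface{}
		if err := json.Unmarshal([]byte(escaped), &value); err != nil {
			return nil, fmt.Errorf("decode %s: %w", name, err)
		}
		decoded[name] = value
	}
	return decoded, nil
}

func main() {
	// One file from one node, using a value from the example output.
	raw := map[string]string{
		"system/memory.json": `{"total": 1304207360}`,
	}

	decoded, err := decodeRaw(raw)
	if err != nil {
		panic(err)
	}

	out, err := json.MarshalIndent(decoded, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because each value is parsed generically, a collector that emits a
top-level slice would still parse here; the decoding problem mentioned
above is specific to how the collect command handles those collectors.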

The collect command is also useful for troubleshooting preflight issues.

Examples are included to show remote collector usage.

```
bin/collect --collector-image=croomes/troubleshoot:latest  examples/collect/remote/memory.yaml --namespace test
{
  "kind-control-plane": {
    "system/memory.json": {
      "total": 1304207360
    }
  },
  "kind-worker": {
    "system/memory.json": {
      "total": 1695780864
    }
  },
  "kind-worker2": {
    "system/memory.json": {
      "total": 1726353408
    }
  }
}
```

The preflight command has been updated to run remote collectors.  To run
a host collector remotely it must be specified in the spec as a
`remoteCollector`:

```
apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: memory
spec:
  remoteCollectors:
    - memory:
        collectorName: memory
  analyzers:
    - memory:
        outcomes:
          - fail:
              when: "< 8Gi"
              message: At least 8Gi of memory is required
          - warn:
              when: "< 32Gi"
              message: At least 32Gi of memory is recommended
          - pass:
              message: The system has sufficient memory
```

Results for each node are analyzed separately, with the node name
appended to the title:

```
bin/preflight --interactive=false --collector-image=croomes/troubleshoot:latest examples/preflight/remote/memory.yaml --format=json
{memory running 0 1}
{memory completed 1 1}
{
  "fail": [
    {
      "title": "Amount of Memory (kind-worker2)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-worker)",
      "message": "At least 8Gi of memory is required"
    },
    {
      "title": "Amount of Memory (kind-control-plane)",
      "message": "At least 8Gi of memory is required"
    }
  ]
}
```
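
The per-node analysis shown above can be sketched as follows.  This is
an illustration under stated assumptions, not the analyzer's code: the
result type and analyzeMemory are hypothetical, the thresholds are
copied from the example spec, and outcomes are assumed to be checked in
order with the first match winning.

```go
package main

import "fmt"

// gi is one gibibyte in bytes.
const gi = int64(1) << 30

// result is a hypothetical stand-in for one analyzer outcome.
type result struct {
	Severity string
	Title    string
	Message  string
}

// analyzeMemory (hypothetical) applies the thresholds from the example
// spec to one node and appends the node name to the title.
func analyzeMemory(node string, totalBytes int64) result {
	title := fmt.Sprintf("Amount of Memory (%s)", node)
	switch {
	case totalBytes < 8*gi:
		return result{Severity: "fail", Title: title, Message: "At least 8Gi of memory is required"}
	case totalBytes < 32*gi:
		return result{Severity: "warn", Title: title, Message: "At least 32Gi of memory is recommended"}
	default:
		return result{Severity: "pass", Title: title, Message: "The system has sufficient memory"}
	}
}

func main() {
	// Totals taken from the collect example output earlier in this message.
	nodes := []struct {
		name  string
		total int64
	}{
		{"kind-control-plane", 1304207360},
		{"kind-worker", 1695780864},
		{"kind-worker2", 1726353408},
	}

	for _, node := range nodes {
		r := analyzeMemory(node.name, node.total)
		fmt.Printf("%s: %s: %s\n", r.Severity, r.Title, r.Message)
	}
}
```

All three example nodes report less than 2Gi of memory, so each one
falls into the fail outcome, which matches the preflight output above.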

Also added a host collector to allow preflight checks of required kernel
modules, which is the main driver for this change.
2021-10-06 09:03:53 -05:00
Andrew Reed
4d52760d35 Collector and analyzer for sysctl parameters (#441)
Collector and analyzer for sysctl parameters
2021-10-01 13:43:26 -05:00
divolgin
ca51e92878 Allow memory writers 2021-09-30 18:25:52 +00:00
divolgin
6d0a57b16e Don't panic when no data is collected 2021-09-29 21:25:28 +00:00
divolgin
299497c0c0 Merge pull request #429 from danbudris/copyFromHostForCpNodes
add toleration to copy-from-host daemonset to allow collection from CP nodes
2021-09-29 09:01:14 -07:00
divolgin
0e8bedc281 Save collector data to disk directly 2021-09-29 00:15:02 +00:00
danbudris
67987a4432 add toleration to allow copy-from-host daemonset to run on CP nodes 2021-09-23 17:53:57 -04:00
Salah Aldeen Al Saleh
880c7dc3ea ability to specify a list of namespaces for the cluster resources collector (#424)
* ability to specify a list of namespaces for the cluster resources collector
2021-09-23 08:02:05 -07:00