mirror of
https://github.com/kubeshark/kubeshark.git
synced 2026-05-30 13:04:29 +00:00
Compare commits
56 Commits
docs/add-m
...
v53.3.0
| SHA1 |
|---|
| f97866f747 |
| b2a0fb0cea |
| 2475f6e260 |
| cd13d8f89e |
| ad9dfbf5f9 |
| ed1d2e1a4d |
| 7b5954ea00 |
| 8186b7891b |
| ab81b0c3a7 |
| 9f5a1a41c0 |
| fef3e8fb05 |
| 7ae81ccc4b |
| 27111e48d3 |
| 863be8f47a |
| 9e4059bc4d |
| f79885bd35 |
| 31129e570a |
| 3a1ad64b4c |
| fa03da2fd4 |
| 4de0ac6abd |
| 9b5ac2821f |
| 1ba6ed94e0 |
| 4695acb41e |
| b80723edfb |
| ddc2e57f12 |
| e80fc3319b |
| 868b4c1f36 |
| c63740ec45 |
| 10dbedf356 |
| 963b3e4ac2 |
| b2813e02bd |
| 707d7351b6 |
| 23c86be773 |
| 3f8a067f9b |
| 33f5310e8e |
| 5f2f34e826 |
| f9a5fbbb78 |
| 73f8e3585d |
| a6daefc567 |
| e6a67cc3b7 |
| eb7dc42b6e |
| d266408377 |
| 40ae6c626b |
| e3283327f9 |
| a46f05c4aa |
| dbfd17d901 |
| 95c18b57a4 |
| 6fd2e4b1b2 |
| 686c7eba54 |
| 1ad61798f6 |
| 318b35e785 |
| fecf290a25 |
| a01f7bed74 |
| 633a17a0e0 |
| 8fac9a5ad5 |
| 76c5eb6b59 |
33
.claude-plugin/README.md
Normal file
@@ -0,0 +1,33 @@
# Kubeshark Claude Code Plugin

This directory contains the [Claude Code plugin](https://docs.anthropic.com/en/docs/claude-code/plugins) configuration for Kubeshark.

## What's here

| File | Purpose |
|------|---------|
| `plugin.json` | Plugin manifest — name, version, description, metadata |
| `marketplace.json` | Marketplace index — allows discovery via `/plugin marketplace add` |

## Installing the plugin

```
/plugin marketplace add kubeshark/kubeshark
/plugin install kubeshark
```

This loads the Kubeshark AI skills and MCP configuration. Skills appear as
`/kubeshark:network-rca` and `/kubeshark:kfl`.

## What the plugin includes

- **Skills** from [`skills/`](../skills/) — network root cause analysis and KFL filter expertise
- **MCP configuration** from [`.mcp.json`](../.mcp.json) — connects to the Kubeshark MCP server

## Local development

Test the plugin without installing:

```bash
claude --plugin-dir /path/to/kubeshark
```
15
.claude-plugin/marketplace.json
Normal file
@@ -0,0 +1,15 @@
{
  "name": "kubeshark",
  "description": "Kubeshark network observability skills for Kubernetes",
  "plugins": [
    {
      "name": "kubeshark",
      "description": "Network observability skills powered by Kubeshark MCP — root cause analysis, KFL traffic filtering, snapshot forensics, PCAP extraction.",
      "source": {
        "source": "github",
        "owner": "kubeshark",
        "repo": "kubeshark"
      }
    }
  ]
}
24
.claude-plugin/plugin.json
Normal file
@@ -0,0 +1,24 @@
{
  "name": "kubeshark",
  "version": "1.0.0",
  "description": "Kubernetes network observability skills powered by Kubeshark MCP. Root cause analysis, traffic filtering, snapshot forensics, PCAP extraction, and more.",
  "author": {
    "name": "Kubeshark",
    "url": "https://kubeshark.com"
  },
  "homepage": "https://kubeshark.com",
  "repository": "https://github.com/kubeshark/kubeshark",
  "license": "Apache-2.0",
  "keywords": [
    "kubeshark",
    "kubernetes",
    "network",
    "observability",
    "traffic",
    "mcp",
    "rca",
    "pcap",
    "kfl",
    "ebpf"
  ]
}
2
.github/workflows/mcp-publish.yml
vendored
@@ -168,7 +168,7 @@ jobs:
       - name: Login to MCP Registry
         if: github.event_name != 'workflow_dispatch' || github.event.inputs.dry_run != 'true'
         shell: bash
-        run: mcp-publisher login github
+        run: mcp-publisher login github-oidc
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
24
.github/workflows/release-tag.yml
vendored
Normal file
@@ -0,0 +1,24 @@
name: Auto-tag release

on:
  pull_request:
    types: [closed]
    branches: [master]

jobs:
  tag:
    if: github.event.pull_request.merged == true && startsWith(github.event.pull_request.head.ref, 'release/v')
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0

      - name: Create and push tag
        run: |
          VERSION="${GITHUB_HEAD_REF#release/}"
          echo "Creating tag $VERSION on master"
          git tag "$VERSION"
          git push origin "$VERSION"
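Note on the `Create and push tag` step: the tag name comes from the PR's head branch via POSIX prefix-stripping parameter expansion. The sketch below reproduces only that expansion outside of GitHub Actions. `GITHUB_HEAD_REF` is set by hand here (the runner normally provides it on `pull_request` events), and both branch names are made-up examples, not values taken from a workflow run.

```shell
#!/bin/sh
# Stand-in for the variable the Actions runner sets on pull_request events.
# Hypothetical branch name, chosen only for illustration.
GITHUB_HEAD_REF="release/v53.3.0"

# "${var#pattern}" removes the shortest match of pattern from the start of var.
VERSION="${GITHUB_HEAD_REF#release/}"
echo "$VERSION"

# A ref without the prefix is returned unchanged, which is why the job-level
# startsWith(..., 'release/v') guard matters.
OTHER="feature/x"
echo "${OTHER#release/}"
```

Because the expansion leaves non-matching refs untouched, the `if:` condition on the job is what keeps arbitrary branch names from being pushed as tags.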
51
.github/workflows/test.yml
vendored
@@ -15,7 +15,7 @@ jobs:
     timeout-minutes: 20
     steps:
       - name: Check out code into the Go module directory
-        uses: actions/checkout@v3
+        uses: actions/checkout@v5
         with:
           fetch-depth: 2
@@ -29,3 +29,52 @@ jobs:
 
       - name: Upload coverage to Codecov
         uses: codecov/codecov-action@v3
+
+  helm-tests:
+    name: Helm Chart Tests
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    steps:
+      - name: Check out code
+        uses: actions/checkout@v5
+
+      - name: Set up Helm
+        uses: azure/setup-helm@v4
+
+      - name: Helm lint (default values)
+        run: helm lint ./helm-chart
+
+      - name: Helm lint (S3 values)
+        run: helm lint ./helm-chart -f ./helm-chart/tests/fixtures/values-s3.yaml
+
+      - name: Helm lint (Azure Blob values)
+        run: helm lint ./helm-chart -f ./helm-chart/tests/fixtures/values-azblob.yaml
+
+      - name: Helm lint (GCS values)
+        run: helm lint ./helm-chart -f ./helm-chart/tests/fixtures/values-gcs.yaml
+
+      - name: Helm lint (cloud refs values)
+        run: helm lint ./helm-chart -f ./helm-chart/tests/fixtures/values-cloud-refs.yaml
+
+      - name: Install helm-unittest plugin
+        run: helm plugin install https://github.com/helm-unittest/helm-unittest --verify=false
+
+      - name: Run helm unit tests
+        run: helm unittest ./helm-chart
+
+      - name: Install kubeconform
+        run: |
+          curl -sL https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz | tar xz
+          sudo mv kubeconform /usr/local/bin/
+
+      - name: Validate default template
+        run: helm template kubeshark ./helm-chart | kubeconform -strict -kubernetes-version 1.35.0 -summary
+
+      - name: Validate S3 template
+        run: helm template kubeshark ./helm-chart -f ./helm-chart/tests/fixtures/values-s3.yaml | kubeconform -strict -kubernetes-version 1.35.0 -summary
+
+      - name: Validate Azure Blob template
+        run: helm template kubeshark ./helm-chart -f ./helm-chart/tests/fixtures/values-azblob.yaml | kubeconform -strict -kubernetes-version 1.35.0 -summary
+
+      - name: Validate GCS template
+        run: helm template kubeshark ./helm-chart -f ./helm-chart/tests/fixtures/values-gcs.yaml | kubeconform -strict -kubernetes-version 1.35.0 -summary
6
.gitignore
vendored
@@ -63,4 +63,8 @@ bin
 scripts/
 
 # CWD config YAML
-kubeshark.yaml
+kubeshark.yaml
+
+# Claude Code
+CLAUDE.md
+.claude/
8
.mcp.json
Normal file
@@ -0,0 +1,8 @@
{
  "mcpServers": {
    "kubeshark": {
      "command": "kubeshark",
      "args": ["mcp"]
    }
  }
}
147
Makefile
@@ -137,6 +137,16 @@ test-integration-short: ## Run quick integration tests (skips long-running tests
 	rm -f $$LOG_FILE; \
 	exit $$status
 
+helm-test: ## Run Helm lint and unit tests.
+	helm lint ./helm-chart
+	helm unittest ./helm-chart
+
+helm-test-full: helm-test ## Run Helm tests with kubeconform schema validation.
+	helm template kubeshark ./helm-chart | kubeconform -strict -kubernetes-version 1.35.0 -summary
+	helm template kubeshark ./helm-chart -f ./helm-chart/tests/fixtures/values-s3.yaml | kubeconform -strict -kubernetes-version 1.35.0 -summary
+	helm template kubeshark ./helm-chart -f ./helm-chart/tests/fixtures/values-azblob.yaml | kubeconform -strict -kubernetes-version 1.35.0 -summary
+	helm template kubeshark ./helm-chart -f ./helm-chart/tests/fixtures/values-gcs.yaml | kubeconform -strict -kubernetes-version 1.35.0 -summary
+
 lint: ## Lint the source code.
 	golangci-lint run
 
@@ -242,31 +252,136 @@ proxy:
 port-forward:
 	kubectl port-forward $$(kubectl get pods | awk '$$1 ~ /^$(POD_PREFIX)/' | awk 'END {print $$1}') $(SRC_PORT):$(DST_PORT)
 
-release:
-	@cd ../worker && git checkout master && git pull && git tag -d v$(VERSION); git tag v$(VERSION) && git push origin --tags
-	@cd ../tracer && git checkout master && git pull && git tag -d v$(VERSION); git tag v$(VERSION) && git push origin --tags
-	@cd ../hub && git checkout master && git pull && git tag -d v$(VERSION); git tag v$(VERSION) && git push origin --tags
-	@cd ../front && git checkout master && git pull && git tag -d v$(VERSION); git tag v$(VERSION) && git push origin --tags
-	@cd ../kubeshark && git checkout master && git pull && sed -i "s/^version:.*/version: \"$(shell echo $(VERSION) | sed -E 's/^([0-9]+\.[0-9]+\.[0-9]+)\..*/\1/')\"/" helm-chart/Chart.yaml && make
+release: ## Print release workflow instructions.
+	@echo "Release workflow — each step is idempotent and can be rerun on its own:"
+	@echo ""
+	@echo " 1. make release-siblings VERSION=x.y.z"
+	@echo " Tag worker, hub, front with vx.y.z. Also run standalone when"
+	@echo " rebuilding docker images without cutting a full release."
+	@echo ""
+	@echo " 2. make release-pr-kubeshark VERSION=x.y.z"
+	@echo " Bump Helm Chart.yaml, build, open release PR on kubeshark."
+	@echo ""
+	@echo " 3. make release-pr-helm VERSION=x.y.z"
+	@echo " Sync helm-chart/ into kubeshark.github.io, open helm PR."
+	@echo " Requires release/vx.y.z branch (created by step 2)."
+	@echo ""
+	@echo " Shortcut: make release-pr VERSION=x.y.z runs 1 → 2 → 3."
+	@echo ""
+	@echo " After both PRs merge: tag is created automatically,"
+	@echo " or run: make release-tag VERSION=x.y.z"
+
+# Internal: validate VERSION before any release-* target runs.
+_release-check-version:
+	@if [ -z "$(VERSION)" ]; then echo "ERROR: VERSION is required. Usage: make <target> VERSION=x.y.z"; exit 1; fi
+	@echo "$(VERSION)" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+' || { echo "ERROR: VERSION must be semver (e.g. 53.2.4)"; exit 1; }
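Note on `_release-check-version`: the `grep -Eq` pattern is anchored at the start but not at the end, so it accepts any string that begins with `MAJOR.MINOR.PATCH`, including four-part versions. The sketch below exercises that pattern together with the `sed -E` expression that `release-pr-kubeshark` uses to derive the chart version. The function names `is_release_version` and `chart_version` are hypothetical helpers introduced for illustration; they do not exist in the Makefile.

```shell
#!/bin/sh
# Hypothetical helper mirroring the grep check in _release-check-version.
is_release_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+'
}

# Hypothetical helper mirroring the sed expression in release-pr-kubeshark:
# keep only the leading MAJOR.MINOR.PATCH for Chart.yaml.
chart_version() {
  printf '%s\n' "$1" | sed -E 's/^([0-9]+\.[0-9]+\.[0-9]+).*/\1/'
}

for v in 53.2.4 53.2.4.1 v53.2.4 53.2; do
  if is_release_version "$v"; then
    echo "$v -> accepted, chart version $(chart_version "$v")"
  else
    echo "$v -> rejected"
  fi
done
```

With these inputs, `53.2.4` and `53.2.4.1` pass and both map to chart version `53.2.4`, while `v53.2.4` and `53.2` are rejected. If strict three-part versions were wanted, appending `$` to the pattern would be one option, at the cost of rejecting the four-part form.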
+
+release-siblings: _release-check-version ## Tag worker, hub, front with v$(VERSION). Idempotent; standalone for docker-image-only updates.
+	@for repo in worker hub front; do \
+		echo "==> $$repo: ensuring v$(VERSION) tag"; \
+		(cd ../$$repo && git checkout master && git pull) || exit 1; \
+		if (cd ../$$repo && git ls-remote --tags origin "refs/tags/v$(VERSION)" | grep -q .); then \
+			echo " v$(VERSION) already on origin — skipping"; \
+		else \
+			(cd ../$$repo && git tag -d v$(VERSION) 2>/dev/null; git tag v$(VERSION) && git push origin "refs/tags/v$(VERSION)") || exit 1; \
+		fi; \
+	done
+
+release-pr-kubeshark: _release-check-version ## Bump Chart.yaml, build, open release PR on kubeshark.
+	@cd ../kubeshark && git checkout master && git pull
+	@NEW=$$(echo $(VERSION) | sed -E 's/^([0-9]+\.[0-9]+\.[0-9]+).*/\1/'); \
+	CUR=$$(awk '/^version:/ {gsub(/"/,"",$$2); print $$2; exit}' helm-chart/Chart.yaml); \
+	if [ "$$CUR" != "$$NEW" ]; then \
+		sed -i '' "s/^version:.*/version: \"$$NEW\"/" helm-chart/Chart.yaml; \
+	else \
+		echo "Chart.yaml already at $$NEW"; \
+	fi
+	@$(MAKE) build VER=$(VERSION)
 	@if [ "$(shell uname)" = "Darwin" ]; then \
 		codesign --sign - --force --preserve-metadata=entitlements,requirements,flags,runtime ./bin/kubeshark__; \
 	fi
-	@make generate-helm-values && make generate-manifests
-	@git add -A . && git commit -m ":bookmark: Bump the Helm chart version to $(VERSION)" && git push
-	@git tag -d v$(VERSION); git tag v$(VERSION) && git push origin --tags
-	@rm -rf ../kubeshark.github.io/charts/chart && mkdir ../kubeshark.github.io/charts/chart && cp -r helm-chart/ ../kubeshark.github.io/charts/chart/
-	@cd ../kubeshark.github.io/ && git add -A . && git commit -m ":sparkles: Update the Helm chart" && git push
-	@cd ../kubeshark
+	@$(MAKE) generate-helm-values && $(MAKE) generate-manifests
+	@if git show-ref --verify --quiet refs/heads/release/v$(VERSION); then \
+		git branch -D release/v$(VERSION); \
+	fi
+	@git checkout -b release/v$(VERSION)
+	@git add -A .
+	@if ! git diff --cached --quiet; then \
+		git commit -m ":bookmark: Bump the Helm chart version to $(VERSION)"; \
+	else \
+		echo "nothing to commit"; \
+	fi
+	@git push --force-with-lease -u origin release/v$(VERSION)
+	@if gh pr view release/v$(VERSION) --json number >/dev/null 2>&1; then \
+		echo "PR already exists for release/v$(VERSION)"; \
+	else \
+		gh pr create --title ":bookmark: Release v$(VERSION)" \
+			--body "Automated release PR for v$(VERSION)." \
+			--base master \
+			--reviewer corest; \
+	fi
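Note on `release-pr-kubeshark`: the `sed -i ''` call is the BSD/macOS spelling of in-place editing. GNU sed takes `-i` without a separate empty argument, so this recipe would likely fail on Linux, whereas the removed `release` target used the GNU form. One way to sidestep the difference is to avoid `-i` and write through a temporary file. The sketch below is an assumption about how that could look, not the project's implementation; `bump_chart_version` is a hypothetical name and the demo file stands in for `helm-chart/Chart.yaml`.

```shell
#!/bin/sh
# Hypothetical helper: set the top-level "version:" field of a Chart.yaml
# without relying on sed -i, whose syntax differs between GNU and BSD sed.
bump_chart_version() {
  chart="$1"
  new="$2"
  # Same extraction as the Makefile: first "version:" line, quotes stripped.
  cur=$(awk '/^version:/ {gsub(/"/,"",$2); print $2; exit}' "$chart")
  if [ "$cur" = "$new" ]; then
    echo "Chart.yaml already at $new"
    return 0
  fi
  tmp=$(mktemp) || return 1
  # Write the edited copy to a temp file, then copy it back over the original
  # so the original file's permissions are kept.
  sed "s/^version:.*/version: \"$new\"/" "$chart" > "$tmp" \
    && cat "$tmp" > "$chart" \
    && rm -f "$tmp"
}

# Demo on a throwaway file with made-up contents.
demo=$(mktemp)
printf 'apiVersion: v2\nname: kubeshark\nversion: "53.2.3"\n' > "$demo"
bump_chart_version "$demo" 53.2.4
grep '^version:' "$demo"
```

Running the helper a second time with the same version takes the early-return branch, which matches the idempotent behaviour the new target aims for.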
+
+release-pr-helm: _release-check-version ## Sync helm-chart/ to kubeshark.github.io and open the helm PR. Requires release/v$(VERSION) branch (step 2).
+	@git fetch origin "refs/heads/release/v$(VERSION):refs/heads/release/v$(VERSION)" 2>/dev/null || true
+	@if ! git show-ref --verify --quiet refs/heads/release/v$(VERSION); then \
+		echo "ERROR: release/v$(VERSION) branch not found locally or on origin."; \
+		echo "Run 'make release-pr-kubeshark VERSION=$(VERSION)' first."; \
+		exit 1; \
+	fi
+	@git checkout release/v$(VERSION)
+	@cd ../kubeshark.github.io && git checkout master && git pull \
+		&& rm -rf charts/chart && mkdir -p charts/chart \
+		&& cp -r ../kubeshark/helm-chart/ charts/chart/
+	@cd ../kubeshark.github.io && \
+		if git show-ref --verify --quiet refs/heads/helm-v$(VERSION); then \
+			git branch -D helm-v$(VERSION); \
+		fi && \
+		git checkout -b helm-v$(VERSION) && \
+		git add -A . && \
+		if ! git diff --cached --quiet; then \
+			git commit -m ":sparkles: Update the Helm chart to v$(VERSION)"; \
+		else \
+			echo "nothing to commit"; \
+		fi && \
+		git push --force-with-lease -u origin helm-v$(VERSION) && \
+		if ! gh pr view helm-v$(VERSION) --json number >/dev/null 2>&1; then \
+			gh pr create --title ":sparkles: Helm chart v$(VERSION)" \
+				--body "Update Helm chart for release v$(VERSION)." \
+				--base master \
+				--reviewer corest; \
+		else \
+			echo "PR already exists for helm-v$(VERSION)"; \
+		fi && \
+		git checkout master
+	@cd ../kubeshark && git checkout master && git pull
+
+release-pr: release-siblings release-pr-kubeshark release-pr-helm ## Run release-siblings, release-pr-kubeshark, and release-pr-helm in sequence.
+	@echo ""
+	@echo "Release PRs created (or already present):"
+	@echo " - kubeshark: Review and merge the release PR."
+	@echo " - kubeshark.github.io: Review and merge the helm chart PR."
+	@echo "Tag will be created automatically, or run: make release-tag VERSION=$(VERSION)"
+
+release-tag: ## Step 2 (fallback): Tag master after release PR is merged.
+	@echo "Verifying release PR was merged..."
+	@if ! gh pr list --state merged --head release/v$(VERSION) --json number --jq '.[0].number' | grep -q .; then \
+		echo "Error: No merged PR found for release/v$(VERSION). Merge the PR first."; \
+		exit 1; \
+	fi
+	@git checkout master && git pull
+	@git tag -d v$(VERSION) 2>/dev/null; git tag v$(VERSION) && git push origin --tags
+	@echo ""
+	@echo "Tagged v$(VERSION) on master. GitHub Actions will build the release."
 
 release-dry-run:
 	@cd ../worker && git checkout master && git pull
-	@cd ../tracer && git checkout master && git pull
+	# @cd ../tracer && git checkout master && git pull
 	@cd ../hub && git checkout master && git pull
 	@cd ../front && git checkout master && git pull
 	@cd ../kubeshark && sed -i "s/^version:.*/version: \"$(shell echo $(VERSION) | sed -E 's/^([0-9]+\.[0-9]+\.[0-9]+)\..*/\1/')\"/" helm-chart/Chart.yaml && make
-	@if [ "$(shell uname)" = "Darwin" ]; then \
-		codesign --sign - --force --preserve-metadata=entitlements,requirements,flags,runtime ./bin/kubeshark__; \
-	fi
+	# @if [ "$(shell uname)" = "Darwin" ]; then \
+	#	codesign --sign - --force --preserve-metadata=entitlements,requirements,flags,runtime ./bin/kubeshark__; \
+	# fi
 	@make generate-helm-values && make generate-manifests
 	@rm -rf ../kubeshark.github.io/charts/chart && mkdir ../kubeshark.github.io/charts/chart && cp -r helm-chart/ ../kubeshark.github.io/charts/chart/
 	@cd ../kubeshark.github.io/
221
README.md
@@ -1,120 +1,151 @@
|
||||
<p align="center">
|
||||
<img src="https://raw.githubusercontent.com/kubeshark/assets/master/svg/kubeshark-logo.svg" alt="Kubeshark: Traffic analyzer for Kubernetes." height="128px"/>
|
||||
<img src="https://raw.githubusercontent.com/kubeshark/assets/master/svg/kubeshark-logo.svg" alt="Kubeshark" height="120px"/>
|
||||
</p>
|
||||
|
||||
<p align="center">
|
||||
<a href="https://github.com/kubeshark/kubeshark/releases/latest">
|
||||
<img alt="GitHub Latest Release" src="https://img.shields.io/github/v/release/kubeshark/kubeshark?logo=GitHub&style=flat-square">
|
||||
</a>
|
||||
<a href="https://hub.docker.com/r/kubeshark/worker">
|
||||
<img alt="Docker pulls" src="https://img.shields.io/docker/pulls/kubeshark/worker?color=%23099cec&logo=Docker&style=flat-square">
|
||||
</a>
|
||||
<a href="https://hub.docker.com/r/kubeshark/worker">
|
||||
<img alt="Image size" src="https://img.shields.io/docker/image-size/kubeshark/kubeshark/latest?logo=Docker&style=flat-square">
|
||||
</a>
|
||||
<a href="https://discord.gg/WkvRGMUcx7">
|
||||
<img alt="Discord" src="https://img.shields.io/discord/1042559155224973352?logo=Discord&style=flat-square&label=discord">
|
||||
</a>
|
||||
<a href="https://join.slack.com/t/kubeshark/shared_invite/zt-3jdcdgxdv-1qNkhBh9c6CFoE7bSPkpBQ">
|
||||
<img alt="Slack" src="https://img.shields.io/badge/slack-join_chat-green?logo=Slack&style=flat-square&label=slack">
|
||||
</a>
|
||||
<a href="https://github.com/kubeshark/kubeshark/releases/latest"><img alt="Release" src="https://img.shields.io/github/v/release/kubeshark/kubeshark?logo=GitHub&style=flat-square"></a>
|
||||
<a href="https://hub.docker.com/r/kubeshark/worker"><img alt="Docker pulls" src="https://img.shields.io/docker/pulls/kubeshark/worker?color=%23099cec&logo=Docker&style=flat-square"></a>
|
||||
<a href="https://discord.gg/WkvRGMUcx7"><img alt="Discord" src="https://img.shields.io/discord/1042559155224973352?logo=Discord&style=flat-square&label=discord"></a>
|
||||
<a href="https://join.slack.com/t/kubeshark/shared_invite/zt-3jdcdgxdv-1qNkhBh9c6CFoE7bSPkpBQ"><img alt="Slack" src="https://img.shields.io/badge/slack-join_chat-green?logo=Slack&style=flat-square"></a>
|
||||
</p>
|
||||
|
||||
<p align="center"><b>Network Observability for SREs & AI Agents</b></p>
|
||||
|
||||
<p align="center">
|
||||
<b>
|
||||
Want to see Kubeshark in action right now? Visit this
|
||||
<a href="https://demo.kubeshark.com/">live demo deployment</a> of Kubeshark.
|
||||
</b>
|
||||
<a href="https://demo.kubeshark.com/">Live Demo</a> · <a href="https://docs.kubeshark.com">Docs</a>
|
||||
</p>
|
||||
|
||||
**Kubeshark** is an API traffic analyzer for Kubernetes, providing deep packet inspection with complete API and Kubernetes contexts, retaining cluster-wide L4 traffic (PCAP), and using minimal production compute resources.
|
||||
---
|
||||
|
||||

|
||||
Kubeshark indexes cluster-wide network traffic at the kernel level using eBPF — delivering instant answers to any query using network, API, and Kubernetes semantics.
|
||||
|
||||
Think [TCPDump](https://en.wikipedia.org/wiki/Tcpdump) and [Wireshark](https://www.wireshark.org/) reimagined for Kubernetes.
|
||||
**What you can do:**
|
||||
|
||||
Access cluster-wide PCAP traffic by pressing a single button, without the need to install `tcpdump` or manually copy files. Understand the traffic context in relation to the API and Kubernetes contexts.
|
||||
- **Download Retrospective PCAPs** — cluster-wide packet captures filtered by nodes, time, workloads, and IPs. Store PCAPs for long-term retention and later investigation.
|
||||
- **Visualize Network Data** — explore traffic matching queries with API, Kubernetes, or network semantics through a real-time dashboard.
|
||||
- **See Encrypted Traffic in Plain Text** — automatically decrypt TLS/mTLS traffic using eBPF, with no key management or sidecars required.
|
||||
- **Integrate with AI** — connect your favorite AI assistant (e.g. Claude, Copilot) to include network data in AI-driven workflows like incident response and root cause analysis.
|
||||
|
||||
#### Service-Map w/Kubernetes Context
|
||||

|
||||
|
||||

|
||||
---
|
||||
|
||||
#### Export Cluster-Wide L4 Traffic (PCAP)
|
||||
## Get Started
|
||||
|
||||
Imagine having a cluster-wide [TCPDump](https://www.tcpdump.org/)-like capability—exporting a single [PCAP](https://www.ietf.org/archive/id/draft-gharris-opsawg-pcap-01.html) file that consolidates traffic from multiple nodes, all accessible with a single click.
|
||||
|
||||
1. Go to the **Snapshots** tab
|
||||
2. Create a new snapshot
|
||||
3. **Optionally** select the nodes (default: all nodes)
|
||||
4. **Optionally** select the time frame (default: last one hour)
|
||||
5. Press **Create**
|
||||
|
||||
<img width="3342" height="1206" alt="image" src="https://github.com/user-attachments/assets/e8e47996-52b7-4028-9698-f059a13ffdb7" />
|
||||
|
||||
|
||||
Once the snapshot is ready, click the PCAP file to export its contents and open it in Wireshark.
|
||||
|
||||
#### AI-Powered Network Analysis (MCP)
|
||||
|
||||
Connect your AI assistant to Kubeshark and query your cluster's network traffic using natural language. Kubeshark implements the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)—an open standard for connecting AI assistants to external data sources.
|
||||
|
||||
```shell
|
||||
# Add Kubeshark to Claude Code
|
||||
claude mcp add kubeshark -- kubeshark mcp --proxy
|
||||
|
||||
# Then ask questions like:
|
||||
# "Show me all HTTP 500 errors in the last hour"
|
||||
# "Which services communicate with payment-service?"
|
||||
# "Investigate why checkout is failing"
|
||||
```
|
||||
|
||||
**What AI can access:**
|
||||
- L7 API transactions (HTTP, gRPC, Redis, Kafka, etc.) with full request/response payloads
|
||||
- L4 TCP/UDP flows with connection metrics and TCP handshake RTT
|
||||
- Kubernetes context for every request (pod, service, namespace)
|
||||
- Snapshots and PCAP exports for forensic analysis
|
||||
|
||||
Works with Claude Code, Claude Desktop, Cursor, GitHub Copilot, and any MCP-compatible AI assistant. See the [MCP documentation](https://docs.kubeshark.com/en/mcp) for setup guides and use cases.
|
||||
|
||||
## Getting Started
|
||||
Download **Kubeshark**'s binary distribution [latest release](https://github.com/kubeshark/kubeshark/releases/latest) or use one of the following methods to deploy **Kubeshark**. The [web-based dashboard](https://docs.kubeshark.com/en/ui) should open in your browser, showing a real-time view of your cluster's traffic.
|
||||
|
||||
### Homebrew
|
||||
|
||||
[Homebrew](https://brew.sh/) :beer: users can install the Kubeshark CLI with:
|
||||
|
||||
```shell
|
||||
brew install kubeshark
|
||||
kubeshark tap
|
||||
```
|
||||
|
||||
To clean up:
|
||||
```shell
|
||||
kubeshark clean
|
||||
```
|
||||
|
||||
### Helm
|
||||
|
||||
Add the Helm repository and install the chart:
|
||||
|
||||
```shell
|
||||
```bash
|
||||
helm repo add kubeshark https://helm.kubeshark.com
|
||||
helm install kubeshark kubeshark/kubeshark
|
||||
```
|
||||
Follow the on-screen instructions how to connect to the dashboard.
|
||||
|
||||
To clean up:
|
||||
```shell
|
||||
helm uninstall kubeshark
|
||||
kubectl port-forward svc/kubeshark-front 8899:80
|
||||
```
|
||||
|
||||
## Building From Source
|
||||
Open `http://localhost:8899` in your browser. You're capturing traffic.
|
||||
|
||||
Clone this repository and run the `make` command to build it. After the build is complete, the executable can be found at `./bin/kubeshark`.
|
||||
> For production use, we recommend using an [ingress controller](https://docs.kubeshark.com/en/ingress) instead of port-forward.
|
||||
|
||||
## Documentation
|
||||
**Connect an AI agent** via MCP:
|
||||
|
||||
To learn more, read the [documentation](https://docs.kubeshark.com).
|
||||
```bash
|
||||
brew install kubeshark
|
||||
claude mcp add kubeshark -- kubeshark mcp
|
||||
```
|
||||
|
||||
[MCP setup guide →](https://docs.kubeshark.com/en/mcp)
|
||||
|
||||
---
|
||||
|
||||
### Network Data for AI Agents
|
||||
|
||||
Kubeshark exposes cluster-wide network data via [MCP](https://docs.kubeshark.com/en/mcp) — enabling AI agents to query traffic, investigate API calls, and perform root cause analysis through natural language.
|
||||
|
||||
> *"Why did checkout fail at 2:15 PM?"*
|
||||
> *"Which services have error rates above 1%?"*
|
||||
> *"Show TCP retransmission rates across all node-to-node paths"*
|
||||
> *"Trace request abc123 through all services"*
|
||||
|
||||
Works with Claude Code, Cursor, and any MCP-compatible AI.
|
||||
|
||||

|
||||
|
||||
[MCP setup guide →](https://docs.kubeshark.com/en/mcp)
|
||||
|
||||
### AI Skills
|
||||
|
||||
Open-source, reusable skills that teach AI agents domain-specific workflows on top of Kubeshark's MCP tools:
|
||||
|
||||
| Skill | Description |
|
||||
|-------|-------------|
|
||||
| **[Network RCA](skills/network-rca/)** | Retrospective root cause analysis — snapshots, dissection, PCAP extraction, trend comparison |
|
||||
| **[KFL](skills/kfl/)** | KFL (Kubeshark Filter Language) expert — writes, debugs, and optimizes traffic filters |
|
||||
|
||||
Install as a Claude Code plugin:
|
||||
|
||||
```
|
||||
/plugin marketplace add kubeshark/kubeshark
|
||||
/plugin install kubeshark
|
||||
```
|
||||
|
||||
Or clone and use directly — skills trigger automatically based on conversation context.
|
||||
|
||||
[AI Skills docs →](https://docs.kubeshark.com/en/mcp/skills)
|
||||
|
||||
---
|
||||
|
||||
### Query with API, Kubernetes, and Network Semantics
|
||||
|
||||
Kubeshark indexes cluster-wide network traffic by parsing it according to protocol specifications, with support for HTTP, gRPC, Redis, Kafka, DNS, and more. A single [KFL query](https://docs.kubeshark.com/en/v2/kfl2) can combine all three semantic layers — Kubernetes identity, API context, and network attributes — to pinpoint exactly the traffic you need. No code instrumentation required.
|
||||
|
||||

|
||||
|
||||
[KFL reference →](https://docs.kubeshark.com/en/v2/kfl2) · [Traffic indexing →](https://docs.kubeshark.com/en/v2/l7_api_dissection)
|
||||
|
||||
### Workload Dependency Map
|
||||
|
||||
A visual map of how workloads communicate, showing dependencies, traffic volume, and protocol usage across the cluster.
|
||||
|
||||

|
||||
|
||||
[Learn more →](https://docs.kubeshark.com/en/v2/service_map)
|
||||
|
||||
### Traffic Retention & PCAP Export
|
||||
|
||||
Capture and retain raw network traffic cluster-wide, including decrypted TLS. Download PCAPs scoped by time range, nodes, workloads, and IPs — ready for Wireshark or any PCAP-compatible tool. Store snapshots in cloud storage (S3, Azure Blob, GCS) for long-term retention and cross-cluster sharing.
|
||||
|
||||

|
||||
|
||||
[Snapshots guide →](https://docs.kubeshark.com/en/v2/traffic_snapshots) · [Cloud storage →](https://docs.kubeshark.com/en/snapshots_cloud_storage)
|
||||
|
||||
---
|
||||
|
||||
## Features
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| [**Traffic Snapshots**](https://docs.kubeshark.com/en/v2/traffic_snapshots) | Point-in-time snapshots with cloud storage (S3, Azure Blob, GCS), PCAP export for Wireshark |
|
||||
| [**Traffic Indexing**](https://docs.kubeshark.com/en/v2/l7_api_dissection) | Real-time and delayed L7 indexing with request/response matching and full payloads |
|
||||
| [**Protocol Support**](https://docs.kubeshark.com/en/protocols) | HTTP, gRPC, GraphQL, Redis, Kafka, DNS, and more |
|
||||
| [**TLS Decryption**](https://docs.kubeshark.com/en/encrypted_traffic) | eBPF-based decryption without key management, included in snapshots |
|
||||
| [**AI Integration**](https://docs.kubeshark.com/en/mcp) | MCP server + open-source AI skills for network RCA and traffic filtering |
|
||||
| [**KFL Query Language**](https://docs.kubeshark.com/en/v2/kfl2) | CEL-based query language with Kubernetes, API, and network semantics |
|
||||
| [**100% On-Premises**](https://docs.kubeshark.com/en/air_gapped) | Air-gapped support, no external dependencies |
|
||||
|
||||
---
|
||||
|
||||
## Install
|
||||
|
||||
| Method | Command |
|
||||
|--------|---------|
|
||||
| Helm | `helm repo add kubeshark https://helm.kubeshark.com && helm install kubeshark kubeshark/kubeshark` |
|
||||
| Homebrew | `brew install kubeshark && kubeshark tap` |
|
||||
| Binary | [Download](https://github.com/kubeshark/kubeshark/releases/latest) |
|
||||
|
||||
[Installation guide →](https://docs.kubeshark.com/en/install)
|
||||
|
||||
---
|
||||
|
||||
## Contributing
|
||||
|
||||
We :heart: pull requests! See [CONTRIBUTING.md](CONTRIBUTING.md) for the contribution guide.
|
||||
We welcome contributions. See [CONTRIBUTING.md](CONTRIBUTING.md).
|
||||
|
||||
## License
|
||||
|
||||
[Apache-2.0](LICENSE)
|
||||
|
||||
202
cmd/mcpRunner.go
@@ -10,6 +10,7 @@ import (
|
||||
"net/http"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
@@ -85,9 +86,9 @@ type mcpContent struct {
|
||||
}
|
||||
|
||||
type mcpPrompt struct {
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description,omitempty"`
|
||||
Arguments []mcpPromptArg `json:"arguments,omitempty"`
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description,omitempty"`
|
||||
Arguments []mcpPromptArg `json:"arguments,omitempty"`
|
||||
}
|
||||
|
||||
type mcpPromptArg struct {
|
||||
@@ -116,11 +117,11 @@ type mcpGetPromptResult struct {
|
||||
// Hub MCP API response types
|
||||
|
||||
type hubMCPResponse struct {
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description"`
|
||||
Version string `json:"version"`
|
||||
Tools []hubMCPTool `json:"tools"`
|
||||
Prompts []hubMCPPrompt `json:"prompts"`
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description"`
|
||||
Version string `json:"version"`
|
||||
Tools []hubMCPTool `json:"tools"`
|
||||
Prompts []hubMCPPrompt `json:"prompts"`
|
||||
}
|
||||
|
||||
type hubMCPTool struct {
|
||||
@@ -130,9 +131,9 @@ type hubMCPTool struct {
|
||||
}
|
||||
|
||||
type hubMCPPrompt struct {
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description,omitempty"`
|
||||
Arguments []hubMCPPromptArg `json:"arguments,omitempty"`
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description,omitempty"`
|
||||
Arguments []hubMCPPromptArg `json:"arguments,omitempty"`
|
||||
}
|
||||
|
||||
type hubMCPPromptArg struct {
|
||||
@@ -150,10 +151,10 @@ type mcpServer struct {
|
||||
stdout io.Writer
|
||||
backendInitialized bool
|
||||
backendMu sync.Mutex
|
||||
setFlags []string // --set flags to pass to 'kubeshark tap' when starting
|
||||
directURL string // If set, connect directly to this URL (no kubectl/proxy)
|
||||
urlMode bool // True when using direct URL mode
|
||||
allowDestructive bool // If true, enable start/stop tools
|
||||
setFlags []string // --set flags to pass to 'kubeshark tap' when starting
|
||||
directURL string // If set, connect directly to this URL (no kubectl/proxy)
|
||||
urlMode bool // True when using direct URL mode
|
||||
allowDestructive bool // If true, enable start/stop tools
|
||||
cachedHubMCP *hubMCPResponse // Cached tools/prompts from Hub
|
||||
cachedAt time.Time // When the cache was populated
|
||||
hubMCPMu sync.Mutex
|
||||
@@ -324,6 +325,16 @@ func (s *mcpServer) invalidateHubMCPCache() {
 	s.cachedHubMCP = nil
 }
 
+// getBaseURL returns the hub API base URL by stripping /mcp from hubBaseURL.
+// The hub URL is always the frontend URL + /api, and hubBaseURL is frontendURL/api/mcp.
+// Ensures backend connection is established first.
+func (s *mcpServer) getBaseURL() (string, error) {
+	if errMsg := s.ensureBackendConnection(); errMsg != "" {
+		return "", fmt.Errorf("%s", errMsg)
+	}
+	return strings.TrimSuffix(s.hubBaseURL, "/mcp"), nil
+}
+
 func writeErrorToStderr(format string, args ...any) {
 	fmt.Fprintf(os.Stderr, format+"\n", args...)
 }
@@ -379,6 +390,14 @@ func (s *mcpServer) handleRequest(req *jsonRPCRequest) {
|
||||
|
||||
func (s *mcpServer) handleInitialize(req *jsonRPCRequest) {
|
||||
var instructions string
|
||||
fileDownloadInstructions := `
|
||||
|
||||
Downloading files (e.g., PCAP exports):
|
||||
When a tool like export_snapshot_pcap returns a relative file path, you MUST use the file tools to retrieve the file:
|
||||
- get_file_url: Resolves the relative path to a full download URL you can share with the user.
|
||||
- download_file: Downloads the file to the local filesystem so it can be opened or analyzed.
|
||||
Typical workflow: call export_snapshot_pcap → receive a relative path → call download_file with that path → share the local file path with the user.`
|
||||
|
||||
if s.urlMode {
|
||||
instructions = fmt.Sprintf(`Kubeshark MCP Server - Connected to: %s
|
||||
|
||||
@@ -392,7 +411,7 @@ Available tools for traffic analysis:
|
||||
- get_api_stats: Get aggregated API statistics
|
||||
- And more - use tools/list to see all available tools
|
||||
|
||||
Use the MCP tools directly - do NOT use kubectl or curl to access Kubeshark.`, s.directURL)
|
||||
Use the MCP tools directly - do NOT use kubectl or curl to access Kubeshark.`, s.directURL) + fileDownloadInstructions
|
||||
} else if s.allowDestructive {
|
||||
instructions = `Kubeshark MCP Server - Proxy Mode (Destructive Operations ENABLED)
|
||||
|
||||
@@ -410,7 +429,7 @@ Safe operations:
|
||||
Traffic analysis tools (require Kubeshark to be running):
|
||||
- list_workloads, list_api_calls, list_l4_flows, get_api_stats, and more
|
||||
|
||||
Use the MCP tools - do NOT use kubectl, helm, or curl directly.`
|
||||
Use the MCP tools - do NOT use kubectl, helm, or curl directly.` + fileDownloadInstructions
|
||||
} else {
|
||||
instructions = `Kubeshark MCP Server - Proxy Mode (Read-Only)
|
||||
|
||||
@@ -425,7 +444,7 @@ Available operations:
|
||||
Traffic analysis tools (require Kubeshark to be running):
|
||||
- list_workloads, list_api_calls, list_l4_flows, get_api_stats, and more
|
||||
|
||||
Use the MCP tools - do NOT use kubectl, helm, or curl directly.`
|
||||
Use the MCP tools - do NOT use kubectl, helm, or curl directly.` + fileDownloadInstructions
|
||||
}
|
||||
|
||||
result := mcpInitializeResult{
|
||||
@@ -456,6 +475,40 @@ func (s *mcpServer) handleListTools(req *jsonRPCRequest) {
|
||||
}`),
|
||||
})
|
||||
|
||||
// Add file URL and download tools - available in all modes
|
||||
tools = append(tools, mcpTool{
|
||||
Name: "get_file_url",
|
||||
Description: "When a tool (e.g., export_snapshot_pcap) returns a relative file path, use this tool to resolve it into a fully-qualified download URL. The URL can be shared with the user for manual download.",
|
||||
InputSchema: json.RawMessage(`{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"path": {
|
||||
"type": "string",
|
||||
"description": "The relative file path returned by a Hub tool (e.g., '/snapshots/abc/data.pcap')"
|
||||
}
|
||||
},
|
||||
"required": ["path"]
|
||||
}`),
|
||||
})
|
||||
tools = append(tools, mcpTool{
|
||||
Name: "download_file",
|
||||
Description: "When a tool (e.g., export_snapshot_pcap) returns a relative file path, use this tool to download the file to the local filesystem. This is the preferred way to retrieve PCAP exports and other files from Kubeshark.",
|
||||
InputSchema: json.RawMessage(`{
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"path": {
|
||||
"type": "string",
|
||||
"description": "The relative file path returned by a Hub tool (e.g., '/snapshots/abc/data.pcap')"
|
||||
},
|
||||
"dest": {
|
||||
"type": "string",
|
||||
"description": "Local destination file path. If not provided, uses the filename from the path in the current directory."
|
||||
}
|
||||
},
|
||||
"required": ["path"]
|
||||
}`),
|
||||
})
|
||||
|
||||
// Add destructive tools only if --allow-destructive flag was set (and not in URL mode)
|
||||
if !s.urlMode && s.allowDestructive {
|
||||
tools = append(tools, mcpTool{
|
||||
@@ -653,6 +706,20 @@ func (s *mcpServer) handleCallTool(req *jsonRPCRequest) {
|
||||
IsError: isError,
|
||||
})
|
||||
return
|
||||
case "get_file_url":
|
||||
result, isError = s.callGetFileURL(params.Arguments)
|
||||
s.sendResult(req.ID, mcpCallToolResult{
|
||||
Content: []mcpContent{{Type: "text", Text: result}},
|
||||
IsError: isError,
|
||||
})
|
||||
return
|
||||
case "download_file":
|
||||
result, isError = s.callDownloadFile(params.Arguments)
|
||||
s.sendResult(req.ID, mcpCallToolResult{
|
||||
Content: []mcpContent{{Type: "text", Text: result}},
|
||||
IsError: isError,
|
||||
})
|
||||
return
|
||||
}
|
||||
|
||||
// Forward Hub tools to the API
|
||||
@@ -671,7 +738,7 @@ func (s *mcpServer) callHubTool(toolName string, args map[string]any) (string, b
 
 	// Build the request body
 	requestBody := map[string]any{
-		"tool":      toolName,
+		"name":      toolName,
 		"arguments": args,
 	}
 
@@ -705,6 +772,90 @@ func (s *mcpServer) callHubTool(toolName string, args map[string]any) (string, b
 	return prettyJSON.String(), false
 }
 
+func (s *mcpServer) callGetFileURL(args map[string]any) (string, bool) {
+	filePath, _ := args["path"].(string)
+	if filePath == "" {
+		return "Error: 'path' parameter is required", true
+	}
+
+	baseURL, err := s.getBaseURL()
+	if err != nil {
+		return fmt.Sprintf("Error: %v", err), true
+	}
+
+	// Ensure path starts with /
+	if !strings.HasPrefix(filePath, "/") {
+		filePath = "/" + filePath
+	}
+
+	fullURL := strings.TrimSuffix(baseURL, "/") + filePath
+	return fullURL, false
+}
+
+func (s *mcpServer) callDownloadFile(args map[string]any) (string, bool) {
+	filePath, _ := args["path"].(string)
+	if filePath == "" {
+		return "Error: 'path' parameter is required", true
+	}
+
+	baseURL, err := s.getBaseURL()
+	if err != nil {
+		return fmt.Sprintf("Error: %v", err), true
+	}
+
+	// Ensure path starts with /
+	if !strings.HasPrefix(filePath, "/") {
+		filePath = "/" + filePath
+	}
+
+	fullURL := strings.TrimSuffix(baseURL, "/") + filePath
+
+	// Determine destination file path
+	dest, _ := args["dest"].(string)
+	if dest == "" {
+		dest = path.Base(filePath)
+	}
+
+	// Use a dedicated HTTP client for file downloads.
+	// The default s.httpClient has a 30s total timeout which would fail for large files (up to 10GB).
+	// This client sets only connection-level timeouts and lets the body stream without a deadline.
+	downloadClient := &http.Client{
+		Transport: &http.Transport{
+			TLSHandshakeTimeout:   10 * time.Second,
+			ResponseHeaderTimeout: 30 * time.Second,
+		},
+	}
+
+	resp, err := downloadClient.Get(fullURL)
+	if err != nil {
+		return fmt.Sprintf("Error downloading file: %v", err), true
+	}
+	defer func() { _ = resp.Body.Close() }()
+
+	if resp.StatusCode >= 400 {
+		return fmt.Sprintf("Error downloading file: HTTP %d", resp.StatusCode), true
+	}
+
+	// Write to destination
+	outFile, err := os.Create(dest)
+	if err != nil {
+		return fmt.Sprintf("Error creating file %s: %v", dest, err), true
+	}
+	defer func() { _ = outFile.Close() }()
+
+	written, err := io.Copy(outFile, resp.Body)
+	if err != nil {
+		return fmt.Sprintf("Error writing file %s: %v", dest, err), true
+	}
+
+	result := map[string]any{
+		"url":  fullURL,
+		"path": dest,
+		"size": written,
+	}
+	resultBytes, _ := json.MarshalIndent(result, "", " ")
+	return string(resultBytes), false
+}
+
 func (s *mcpServer) callStartKubeshark(args map[string]any) (string, bool) {
 	// Build the kubeshark tap command
@@ -717,8 +868,8 @@ func (s *mcpServer) callStartKubeshark(args map[string]any) (string, bool) {
 
 	// Add namespaces if provided
 	if v, ok := args["namespaces"].(string); ok && v != "" {
-		namespaces := strings.Split(v, ",")
-		for _, ns := range namespaces {
+		namespaces := strings.SplitSeq(v, ",")
+		for ns := range namespaces {
 			ns = strings.TrimSpace(ns)
 			if ns != "" {
 				cmdArgs = append(cmdArgs, "-n", ns)
@@ -913,6 +1064,11 @@ func listMCPTools(directURL string) {
|
||||
fmt.Printf("URL Mode: %s\n\n", directURL)
|
||||
fmt.Println("Cluster management tools disabled (Kubeshark managed externally)")
|
||||
fmt.Println()
|
||||
fmt.Println("Local Tools:")
|
||||
fmt.Println(" check_kubeshark_status Check if Kubeshark is running")
|
||||
fmt.Println(" get_file_url Resolve a relative path to a full download URL")
|
||||
fmt.Println(" download_file Download a file from Kubeshark to local disk")
|
||||
fmt.Println()
|
||||
|
||||
hubURL := strings.TrimSuffix(directURL, "/") + "/api/mcp"
|
||||
fetchAndDisplayTools(hubURL, 30*time.Second)
|
||||
@@ -925,6 +1081,10 @@ func listMCPTools(directURL string) {
|
||||
fmt.Println(" start_kubeshark Start Kubeshark to capture traffic")
|
||||
fmt.Println(" stop_kubeshark Stop Kubeshark and clean up resources")
|
||||
fmt.Println()
|
||||
fmt.Println("File Tools:")
|
||||
fmt.Println(" get_file_url Resolve a relative path to a full download URL")
|
||||
fmt.Println(" download_file Download a file from Kubeshark to local disk")
|
||||
fmt.Println()
|
||||
|
||||
// Establish proxy connection to Kubeshark
|
||||
fmt.Println("Connecting to Kubeshark...")
|
||||
|
||||
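The URL handling shared by `get_file_url` and `download_file` in the diff above can be sketched in isolation. This is a minimal illustration, not the project's code: `resolveFileURL` is a hypothetical helper name, and it assumes, as the `getBaseURL` comment states, that the Hub MCP URL always ends in `/api/mcp`.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveFileURL is a hypothetical helper (it does not exist in Kubeshark).
// It mirrors the steps shown in the diff: trim a trailing "/mcp" from the
// Hub MCP URL to get the API base, make sure the relative path starts with
// "/", and join the two without doubling the slash.
func resolveFileURL(hubMCPURL, relPath string) string {
	base := strings.TrimSuffix(hubMCPURL, "/mcp")
	if !strings.HasPrefix(relPath, "/") {
		relPath = "/" + relPath
	}
	return strings.TrimSuffix(base, "/") + relPath
}

func main() {
	// Proxy-mode style URL with an absolute-looking relative path.
	fmt.Println(resolveFileURL("http://127.0.0.1:8899/api/mcp", "/snapshots/abc/data.pcap"))
	// URL-mode style host; the missing leading slash is added.
	fmt.Println(resolveFileURL("https://kubeshark.example.com/api/mcp", "snapshots/xyz/export.pcap"))
}
```

Note that plain string concatenation, as used here and in the diff, does not escape or normalize the path; a caller that accepts untrusted paths would likely want additional validation.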
207
cmd/mcp_test.go
@@ -5,6 +5,8 @@ import (
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
"net/http/httptest"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
@@ -126,8 +128,18 @@ func TestMCP_ToolsList_CLIOnly(t *testing.T) {
|
||||
t.Fatalf("Unexpected error: %v", resp.Error)
|
||||
}
|
||||
tools := resp.Result.(map[string]any)["tools"].([]any)
|
||||
if len(tools) != 1 || tools[0].(map[string]any)["name"] != "check_kubeshark_status" {
|
||||
t.Error("Expected only check_kubeshark_status tool")
|
||||
// Should have check_kubeshark_status + get_file_url + download_file = 3 tools
|
||||
if len(tools) != 3 {
|
||||
t.Errorf("Expected 3 tools, got %d", len(tools))
|
||||
}
|
||||
toolNames := make(map[string]bool)
|
||||
for _, tool := range tools {
|
||||
toolNames[tool.(map[string]any)["name"].(string)] = true
|
||||
}
|
||||
for _, expected := range []string{"check_kubeshark_status", "get_file_url", "download_file"} {
|
||||
if !toolNames[expected] {
|
||||
t.Errorf("Missing expected tool: %s", expected)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -163,9 +175,9 @@ func TestMCP_ToolsList_WithHubBackend(t *testing.T) {
|
||||
t.Fatalf("Unexpected error: %v", resp.Error)
|
||||
}
|
||||
tools := resp.Result.(map[string]any)["tools"].([]any)
|
||||
// Should have CLI tools (3) + Hub tools (2) = 5 tools
|
||||
if len(tools) < 5 {
|
||||
t.Errorf("Expected at least 5 tools, got %d", len(tools))
|
||||
// Should have CLI tools (3) + file tools (2) + Hub tools (2) = 7 tools
|
||||
if len(tools) < 7 {
|
||||
t.Errorf("Expected at least 7 tools, got %d", len(tools))
|
||||
}
|
||||
}
|
||||
|
||||
@@ -218,7 +230,7 @@ func newTestMCPServerWithMockBackend(handler http.HandlerFunc) (*mcpServer, *htt
 }
 
 type hubToolCallRequest struct {
-	Tool      string         `json:"tool"`
+	Tool      string         `json:"name"`
 	Arguments map[string]any `json:"arguments"`
 }
 
@@ -405,7 +417,7 @@ func TestMCP_CommandArgs(t *testing.T) {
|
||||
cmdArgs = append(cmdArgs, v)
|
||||
}
|
||||
if v, _ := tc.args["namespaces"].(string); v != "" {
|
||||
for _, ns := range strings.Split(v, ",") {
|
||||
for ns := range strings.SplitSeq(v, ",") {
|
||||
cmdArgs = append(cmdArgs, "-n", strings.TrimSpace(ns))
|
||||
}
|
||||
}
|
||||
@@ -463,6 +475,187 @@ func TestMCP_BackendInitialization_Concurrent(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestMCP_GetFileURL_ProxyMode(t *testing.T) {
|
||||
s := &mcpServer{
|
||||
httpClient: &http.Client{},
|
||||
stdin: &bytes.Buffer{},
|
||||
stdout: &bytes.Buffer{},
|
||||
hubBaseURL: "http://127.0.0.1:8899/api/mcp",
|
||||
backendInitialized: true,
|
||||
}
|
||||
resp := parseResponse(t, sendRequest(s, "tools/call", 1, mcpCallToolParams{
|
||||
Name: "get_file_url",
|
||||
Arguments: map[string]any{"path": "/snapshots/abc/data.pcap"},
|
||||
}))
|
||||
if resp.Error != nil {
|
||||
t.Fatalf("Unexpected error: %v", resp.Error)
|
||||
}
|
||||
text := resp.Result.(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
|
||||
expected := "http://127.0.0.1:8899/api/snapshots/abc/data.pcap"
|
||||
if text != expected {
|
||||
t.Errorf("Expected %q, got %q", expected, text)
|
||||
}
|
||||
}
|
||||
|
||||
func TestMCP_GetFileURL_URLMode(t *testing.T) {
|
||||
s := &mcpServer{
|
||||
httpClient: &http.Client{},
|
||||
stdin: &bytes.Buffer{},
|
||||
stdout: &bytes.Buffer{},
|
||||
hubBaseURL: "https://kubeshark.example.com/api/mcp",
|
||||
backendInitialized: true,
|
||||
urlMode: true,
|
||||
directURL: "https://kubeshark.example.com",
|
||||
}
|
||||
resp := parseResponse(t, sendRequest(s, "tools/call", 1, mcpCallToolParams{
|
||||
Name: "get_file_url",
|
||||
Arguments: map[string]any{"path": "/snapshots/xyz/export.pcap"},
|
||||
}))
|
||||
if resp.Error != nil {
|
||||
t.Fatalf("Unexpected error: %v", resp.Error)
|
||||
}
|
||||
text := resp.Result.(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
|
||||
expected := "https://kubeshark.example.com/api/snapshots/xyz/export.pcap"
|
||||
if text != expected {
|
||||
t.Errorf("Expected %q, got %q", expected, text)
|
||||
}
|
||||
}
|
||||
|
||||
func TestMCP_GetFileURL_MissingPath(t *testing.T) {
|
||||
s := &mcpServer{
|
||||
httpClient: &http.Client{},
|
||||
stdin: &bytes.Buffer{},
|
||||
stdout: &bytes.Buffer{},
|
||||
hubBaseURL: "http://127.0.0.1:8899/api/mcp",
|
||||
backendInitialized: true,
|
||||
}
|
||||
resp := parseResponse(t, sendRequest(s, "tools/call", 1, mcpCallToolParams{
|
||||
Name: "get_file_url",
|
||||
Arguments: map[string]any{},
|
||||
}))
|
||||
result := resp.Result.(map[string]any)
|
||||
if !result["isError"].(bool) {
|
||||
t.Error("Expected isError=true when path is missing")
|
||||
}
|
||||
text := result["content"].([]any)[0].(map[string]any)["text"].(string)
|
||||
if !strings.Contains(text, "path") {
|
||||
t.Error("Error message should mention 'path'")
|
||||
}
|
||||
}
|
||||
|
||||
func TestMCP_DownloadFile(t *testing.T) {
|
||||
fileContent := "test pcap data content"
|
||||
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
if r.URL.Path == "/api/snapshots/abc/data.pcap" {
|
||||
_, _ = w.Write([]byte(fileContent))
|
||||
} else {
|
||||
w.WriteHeader(http.StatusNotFound)
|
||||
}
|
||||
}))
|
||||
defer mockServer.Close()
|
||||
|
||||
// Use temp dir for download destination
|
||||
tmpDir := t.TempDir()
|
||||
dest := filepath.Join(tmpDir, "downloaded.pcap")
|
||||
|
||||
s := &mcpServer{
|
||||
httpClient: &http.Client{},
|
||||
stdin: &bytes.Buffer{},
|
||||
stdout: &bytes.Buffer{},
|
||||
hubBaseURL: mockServer.URL + "/api/mcp",
|
||||
backendInitialized: true,
|
||||
}
|
||||
resp := parseResponse(t, sendRequest(s, "tools/call", 1, mcpCallToolParams{
|
||||
Name: "download_file",
|
||||
Arguments: map[string]any{"path": "/snapshots/abc/data.pcap", "dest": dest},
|
||||
}))
|
||||
if resp.Error != nil {
|
||||
t.Fatalf("Unexpected error: %v", resp.Error)
|
||||
}
|
||||
result := resp.Result.(map[string]any)
|
||||
if result["isError"] != nil && result["isError"].(bool) {
|
||||
t.Fatalf("Expected no error, got: %v", result["content"])
|
||||
}
|
||||
|
||||
text := result["content"].([]any)[0].(map[string]any)["text"].(string)
|
||||
var downloadResult map[string]any
|
||||
if err := json.Unmarshal([]byte(text), &downloadResult); err != nil {
|
||||
t.Fatalf("Failed to parse download result JSON: %v", err)
|
||||
}
|
||||
if downloadResult["path"] != dest {
|
||||
t.Errorf("Expected path %q, got %q", dest, downloadResult["path"])
|
||||
}
|
||||
if downloadResult["size"].(float64) != float64(len(fileContent)) {
|
||||
t.Errorf("Expected size %d, got %v", len(fileContent), downloadResult["size"])
|
||||
}
|
||||
|
||||
// Verify the file was actually written
|
||||
content, err := os.ReadFile(dest)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to read downloaded file: %v", err)
|
||||
}
|
||||
if string(content) != fileContent {
|
||||
t.Errorf("Expected file content %q, got %q", fileContent, string(content))
|
||||
}
|
||||
}
|
||||
|
||||
func TestMCP_DownloadFile_CustomDest(t *testing.T) {
|
||||
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
_, _ = w.Write([]byte("data"))
|
||||
}))
|
||||
defer mockServer.Close()
|
||||
|
||||
tmpDir := t.TempDir()
|
||||
customDest := filepath.Join(tmpDir, "custom-name.pcap")
|
||||
|
||||
s := &mcpServer{
|
||||
httpClient: &http.Client{},
|
||||
stdin: &bytes.Buffer{},
|
||||
stdout: &bytes.Buffer{},
|
||||
hubBaseURL: mockServer.URL + "/api/mcp",
|
||||
backendInitialized: true,
|
||||
}
|
||||
resp := parseResponse(t, sendRequest(s, "tools/call", 1, mcpCallToolParams{
|
||||
Name: "download_file",
|
||||
Arguments: map[string]any{"path": "/snapshots/abc/export.pcap", "dest": customDest},
|
||||
}))
|
||||
result := resp.Result.(map[string]any)
|
||||
if result["isError"] != nil && result["isError"].(bool) {
|
||||
t.Fatalf("Expected no error, got: %v", result["content"])
|
||||
}
|
||||
|
||||
text := result["content"].([]any)[0].(map[string]any)["text"].(string)
|
||||
var downloadResult map[string]any
|
||||
if err := json.Unmarshal([]byte(text), &downloadResult); err != nil {
|
||||
t.Fatalf("Failed to parse download result JSON: %v", err)
|
||||
}
|
||||
if downloadResult["path"] != customDest {
|
||||
t.Errorf("Expected path %q, got %q", customDest, downloadResult["path"])
|
||||
}
|
||||
|
||||
if _, err := os.Stat(customDest); os.IsNotExist(err) {
|
||||
t.Error("Expected file to exist at custom destination")
|
||||
}
|
||||
}
|
||||
|
||||
func TestMCP_ToolsList_IncludesFileTools(t *testing.T) {
|
||||
s := newTestMCPServer()
|
||||
resp := parseResponse(t, sendRequest(s, "tools/list", 1, nil))
|
||||
if resp.Error != nil {
|
||||
t.Fatalf("Unexpected error: %v", resp.Error)
|
||||
}
|
||||
tools := resp.Result.(map[string]any)["tools"].([]any)
|
||||
toolNames := make(map[string]bool)
|
||||
for _, tool := range tools {
|
||||
toolNames[tool.(map[string]any)["name"].(string)] = true
|
||||
}
|
||||
for _, expected := range []string{"get_file_url", "download_file"} {
|
||||
if !toolNames[expected] {
|
||||
t.Errorf("Missing expected tool: %s", expected)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestMCP_FullConversation(t *testing.T) {
|
||||
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||
if r.URL.Path == "/" {
|
||||
|
||||
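The download tests above exercise a client that bounds the TLS handshake and the wait for response headers but places no deadline on the body. A self-contained sketch of that pattern follows, under stated assumptions: `downloadTo` is a hypothetical name, the timeout values are copied from the diff, and the `httptest` server stands in for the Hub.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"time"
)

// downloadTo is a hypothetical helper. It streams the response body to dest
// using a client with connection-level timeouts only, so a large body is not
// cut off by an overall request deadline.
func downloadTo(url, dest string) (int64, error) {
	client := &http.Client{
		Transport: &http.Transport{
			TLSHandshakeTimeout:   10 * time.Second,
			ResponseHeaderTimeout: 30 * time.Second,
		},
	}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 400 {
		return 0, fmt.Errorf("HTTP %d", resp.StatusCode)
	}
	out, err := os.Create(dest)
	if err != nil {
		return 0, err
	}
	n, copyErr := io.Copy(out, resp.Body)
	// Report a close error too, since it can signal an incomplete write.
	if closeErr := out.Close(); copyErr == nil {
		copyErr = closeErr
	}
	return n, copyErr
}

func main() {
	// A local test server plays the role of the Hub file endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, _ = w.Write([]byte("test pcap data content"))
	}))
	defer srv.Close()

	dir, err := os.MkdirTemp("", "download-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	n, err := downloadTo(srv.URL+"/api/snapshots/abc/data.pcap", filepath.Join(dir, "data.pcap"))
	fmt.Println(n, err)
}
```

Unlike the function in the diff, this sketch surfaces the error from closing the output file; that is a design choice of the example, not a description of the project's behavior.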
@@ -18,7 +18,6 @@ import (
|
||||
corev1 "k8s.io/api/core/v1"
|
||||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
|
||||
"k8s.io/client-go/kubernetes"
|
||||
clientk8s "k8s.io/client-go/kubernetes"
|
||||
"k8s.io/client-go/rest"
|
||||
"k8s.io/client-go/tools/remotecommand"
|
||||
)
|
||||
@@ -39,7 +38,7 @@ type PodFileInfo struct {
|
||||
}
|
||||
|
||||
// listWorkerPods fetches all worker pods from multiple namespaces
|
||||
func listWorkerPods(ctx context.Context, clientset *clientk8s.Clientset, namespaces []string) ([]*PodFileInfo, error) {
|
||||
func listWorkerPods(ctx context.Context, clientset *kubernetes.Clientset, namespaces []string) ([]*PodFileInfo, error) {
|
||||
var podFileInfos []*PodFileInfo
|
||||
var errs []error
|
||||
labelSelector := label
|
||||
@@ -65,7 +64,7 @@ func listWorkerPods(ctx context.Context, clientset *clientk8s.Clientset, namespa
|
||||
}
|
||||
|
||||
// listFilesInPodDir lists all files in the specified directory inside the pod across multiple namespaces
|
||||
func listFilesInPodDir(ctx context.Context, clientset *clientk8s.Clientset, config *rest.Config, pod *PodFileInfo, cutoffTime *time.Time) error {
|
||||
func listFilesInPodDir(ctx context.Context, clientset *kubernetes.Clientset, config *rest.Config, pod *PodFileInfo, cutoffTime *time.Time) error {
|
||||
nodeName := pod.Pod.Spec.NodeName
|
||||
srcFilePath := filepath.Join("data", nodeName, srcDir)
|
||||
|
||||
|
||||
@@ -62,4 +62,5 @@ func init() {
|
||||
tapCmd.Flags().Bool(configStructs.TelemetryEnabledLabel, defaultTapConfig.Telemetry.Enabled, "Enable/disable Telemetry")
|
||||
tapCmd.Flags().Bool(configStructs.ResourceGuardEnabledLabel, defaultTapConfig.ResourceGuard.Enabled, "Enable/disable resource guard")
|
||||
tapCmd.Flags().Bool(configStructs.WatchdogEnabled, defaultTapConfig.Watchdog.Enabled, "Enable/disable watchdog")
|
||||
tapCmd.Flags().String(configStructs.HelmChartPathLabel, defaultTapConfig.Release.HelmChartPath, "Path to a local Helm chart folder (overrides the remote Helm repo)")
|
||||
}
|
||||
|
||||
@@ -40,9 +40,11 @@ type Readiness struct {
|
||||
}
|
||||
|
||||
var ready *Readiness
|
||||
var proxyOnce sync.Once
|
||||
|
||||
func tap() {
|
||||
ready = &Readiness{}
|
||||
proxyOnce = sync.Once{}
|
||||
state.startTime = time.Now()
|
||||
log.Info().Str("registry", config.Config.Tap.Docker.Registry).Str("tag", config.Config.Tap.Docker.Tag).Msg("Using Docker:")
|
||||
|
||||
@@ -147,11 +149,21 @@ func printNoPodsFoundSuggestion(targetNamespaces []string) {
 	log.Warn().Msg(fmt.Sprintf("Did not find any currently running pods that match the regex argument, %s will automatically target matching pods if any are created later%s", misc.Software, suggestionStr))
 }
 
+func isPodReady(pod *core.Pod) bool {
+	for _, condition := range pod.Status.Conditions {
+		if condition.Type == core.PodReady {
+			return condition.Status == core.ConditionTrue
+		}
+	}
+	return false
+}
+
 func watchHubPod(ctx context.Context, kubernetesProvider *kubernetes.Provider, cancel context.CancelFunc) {
 	podExactRegex := regexp.MustCompile(fmt.Sprintf("^%s", kubernetes.HubPodName))
 	podWatchHelper := kubernetes.NewPodWatchHelper(kubernetesProvider, podExactRegex)
 	eventChan, errorChan := kubernetes.FilteredWatch(ctx, podWatchHelper, []string{config.Config.Tap.Release.Namespace}, podWatchHelper)
-	isPodReady := false
+	podReady := false
+	podRunning := false
 
 	timeAfter := time.After(120 * time.Second)
 	for {
@@ -183,26 +195,30 @@ func watchHubPod(ctx context.Context, kubernetesProvider *kubernetes.Provider, c
 					Interface("containers-statuses", modifiedPod.Status.ContainerStatuses).
 					Msg("Watching pod.")
 
-				if modifiedPod.Status.Phase == core.PodRunning && !isPodReady {
-					isPodReady = true
+				if isPodReady(modifiedPod) && !podReady {
+					podReady = true
 
 					ready.Lock()
 					ready.Hub = true
 					ready.Unlock()
 					log.Info().Str("pod", kubernetes.HubPodName).Msg("Ready.")
+				} else if modifiedPod.Status.Phase == core.PodRunning && !podRunning {
+					podRunning = true
+					log.Info().Str("pod", kubernetes.HubPodName).Msg("Waiting for readiness...")
 				}
 
 				ready.Lock()
-				proxyDone := ready.Proxy
 				hubPodReady := ready.Hub
 				frontPodReady := ready.Front
 				ready.Unlock()
 
-				if !proxyDone && hubPodReady && frontPodReady {
-					ready.Lock()
-					ready.Proxy = true
-					ready.Unlock()
-					postFrontStarted(ctx, kubernetesProvider, cancel)
+				if hubPodReady && frontPodReady {
+					proxyOnce.Do(func() {
+						ready.Lock()
+						ready.Proxy = true
+						ready.Unlock()
+						postFrontStarted(ctx, kubernetesProvider, cancel)
+					})
 				}
 			case kubernetes.EventBookmark:
 				break
@@ -223,7 +239,7 @@ func watchHubPod(ctx context.Context, kubernetesProvider *kubernetes.Provider, c
|
||||
cancel()
|
||||
|
||||
case <-timeAfter:
|
||||
if !isPodReady {
|
||||
if !podReady {
|
||||
log.Error().
|
||||
Str("pod", kubernetes.HubPodName).
|
||||
Msg("Pod was not ready in time.")
|
||||
@@ -242,7 +258,8 @@ func watchFrontPod(ctx context.Context, kubernetesProvider *kubernetes.Provider,
|
||||
podExactRegex := regexp.MustCompile(fmt.Sprintf("^%s", kubernetes.FrontPodName))
|
||||
podWatchHelper := kubernetes.NewPodWatchHelper(kubernetesProvider, podExactRegex)
|
||||
eventChan, errorChan := kubernetes.FilteredWatch(ctx, podWatchHelper, []string{config.Config.Tap.Release.Namespace}, podWatchHelper)
|
||||
isPodReady := false
|
||||
podReady := false
|
||||
podRunning := false
|
||||
|
||||
timeAfter := time.After(120 * time.Second)
|
||||
for {
|
||||
@@ -274,25 +291,29 @@ func watchFrontPod(ctx context.Context, kubernetesProvider *kubernetes.Provider,
 					Interface("containers-statuses", modifiedPod.Status.ContainerStatuses).
 					Msg("Watching pod.")
 
-				if modifiedPod.Status.Phase == core.PodRunning && !isPodReady {
-					isPodReady = true
+				if isPodReady(modifiedPod) && !podReady {
+					podReady = true
 					ready.Lock()
 					ready.Front = true
 					ready.Unlock()
 					log.Info().Str("pod", kubernetes.FrontPodName).Msg("Ready.")
+				} else if modifiedPod.Status.Phase == core.PodRunning && !podRunning {
+					podRunning = true
+					log.Info().Str("pod", kubernetes.FrontPodName).Msg("Waiting for readiness...")
 				}
 
 				ready.Lock()
-				proxyDone := ready.Proxy
 				hubPodReady := ready.Hub
 				frontPodReady := ready.Front
 				ready.Unlock()
 
-				if !proxyDone && hubPodReady && frontPodReady {
-					ready.Lock()
-					ready.Proxy = true
-					ready.Unlock()
-					postFrontStarted(ctx, kubernetesProvider, cancel)
+				if hubPodReady && frontPodReady {
+					proxyOnce.Do(func() {
+						ready.Lock()
+						ready.Proxy = true
+						ready.Unlock()
+						postFrontStarted(ctx, kubernetesProvider, cancel)
+					})
 				}
 			case kubernetes.EventBookmark:
 				break
@@ -312,7 +333,7 @@ func watchFrontPod(ctx context.Context, kubernetesProvider *kubernetes.Provider,
|
||||
Msg("Failed creating pod.")
|
||||
|
||||
case <-timeAfter:
|
||||
if !isPodReady {
|
||||
if !podReady {
|
||||
log.Error().
|
||||
Str("pod", kubernetes.FrontPodName).
|
||||
Msg("Pod was not ready in time.")
|
||||
@@ -429,9 +450,6 @@ func postFrontStarted(ctx context.Context, kubernetesProvider *kubernetes.Provid
 		watchScripts(ctx, kubernetesProvider, false)
 	}
 
-	if config.Config.Scripting.Console {
-		go runConsoleWithoutProxy()
-	}
 }
 
 func updateConfig(kubernetesProvider *kubernetes.Provider) {
|
||||
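The change above replaces a check-then-set on `ready.Proxy` with `proxyOnce.Do`, presumably so that two watcher goroutines cannot both pass the check and both call `postFrontStarted`. The sketch below shows only the `sync.Once` guarantee; `runWatchers` is a hypothetical stand-in and does not reproduce the Kubeshark watchers.

```go
package main

import (
	"fmt"
	"sync"
)

// runWatchers is a hypothetical stand-in for several watcher goroutines that
// all reach the "both pods are ready" branch at about the same time. Each one
// calls once.Do, and sync.Once guarantees the function runs exactly once.
func runWatchers(n int) int {
	var once sync.Once
	var wg sync.WaitGroup
	starts := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Only the first caller executes the function; the others
			// block until it has finished, then return without running it.
			once.Do(func() { starts++ })
		}()
	}

	wg.Wait()
	return starts
}

func main() {
	fmt.Println(runWatchers(8)) // prints 1
}
```

A `sync.Once` cannot be re-armed in place, which is consistent with the diff assigning a fresh `sync.Once{}` to `proxyOnce` at the start of `tap()`.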
@@ -102,22 +102,21 @@ func CreateDefaultConfig() ConfigStruct {
|
||||
},
|
||||
},
|
||||
Auth: configStructs.AuthConfig{
|
||||
Saml: configStructs.SamlConfig{
|
||||
RoleAttribute: "role",
|
||||
Roles: map[string]configStructs.Role{
|
||||
"admin": {
|
||||
Filter: "",
|
||||
CanDownloadPCAP: true,
|
||||
CanUseScripting: true,
|
||||
ScriptingPermissions: configStructs.ScriptingPermissions{
|
||||
CanSave: true,
|
||||
CanActivate: true,
|
||||
CanDelete: true,
|
||||
},
|
||||
CanUpdateTargetedPods: true,
|
||||
CanStopTrafficCapturing: true,
|
||||
ShowAdminConsoleLink: true,
|
||||
RolesClaim: "role",
|
||||
Roles: map[string]configStructs.Role{
|
||||
"admin": {
|
||||
Filter: "",
|
||||
CanDownloadPCAP: true,
|
||||
CanUseScripting: true,
|
||||
ScriptingPermissions: configStructs.ScriptingPermissions{
|
||||
CanSave: true,
|
||||
CanActivate: true,
|
||||
CanDelete: true,
|
||||
},
|
||||
CanUpdateTargetedPods: true,
|
||||
CanStopTrafficCapturing: true,
|
||||
CanControlDissection: true,
|
||||
ShowAdminConsoleLink: true,
|
||||
},
|
||||
},
|
||||
},
|
||||
@@ -127,35 +126,44 @@ func CreateDefaultConfig() ConfigStruct {
|
||||
"http",
|
||||
"icmp",
|
||||
"kafka",
|
||||
"mongodb",
|
||||
"mysql",
|
||||
"postgresql",
|
||||
"redis",
|
||||
// "sctp",
|
||||
// "syscall",
|
||||
// "tcp",
|
||||
// "udp",
|
||||
"ws",
|
||||
// "tlsx",
|
||||
"tlsx",
|
||||
"ldap",
|
||||
"radius",
|
||||
"diameter",
|
||||
"udp-flow",
|
||||
"tcp-flow",
|
||||
"udp-flow-full",
|
||||
"tcp-flow-full",
|
||||
"udp-conn",
|
||||
"tcp-conn",
|
||||
},
|
||||
PortMapping: configStructs.PortMapping{
|
||||
HTTP: []uint16{80, 443, 8080},
|
||||
AMQP: []uint16{5671, 5672},
|
||||
KAFKA: []uint16{9092},
|
||||
REDIS: []uint16{6379},
|
||||
MONGODB: []uint16{27017},
|
||||
MYSQL: []uint16{3306},
|
||||
POSTGRESQL: []uint16{5432},
|
||||
REDIS: []uint16{6379},
|
||||
LDAP: []uint16{389},
|
||||
DIAMETER: []uint16{3868},
|
||||
},
|
||||
Dashboard: configStructs.DashboardConfig{
|
||||
CompleteStreamingEnabled: true,
|
||||
ClusterWideMapEnabled: false,
|
||||
},
|
||||
Capture: configStructs.CaptureConfig{
|
||||
Stopped: false,
|
||||
StopAfter: "5m",
|
||||
Dissection: configStructs.DissectionConfig{
|
||||
Enabled: true,
|
||||
StopAfter: "5m",
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
@@ -12,6 +12,7 @@ import (
|
||||
)
|
||||
|
||||
type ScriptingConfig struct {
|
||||
Enabled bool `yaml:"enabled" json:"enabled" default:"false"`
|
||||
Env map[string]interface{} `yaml:"env" json:"env" default:"{}"`
|
||||
Source string `yaml:"source" json:"source" default:""`
|
||||
Sources []string `yaml:"sources" json:"sources" default:"[]"`
|
||||
|
||||
@@ -45,6 +45,7 @@ const (
|
||||
PcapDumpEnabled = "enabled"
|
||||
PcapTime = "time"
|
||||
WatchdogEnabled = "watchdogEnabled"
|
||||
HelmChartPathLabel = "release-helmChartPath"
|
||||
)
|
||||
|
||||
type ResourceLimitsHub struct {
|
||||
@@ -167,21 +168,34 @@ type Role struct {
|
||||
ScriptingPermissions ScriptingPermissions `yaml:"scriptingPermissions" json:"scriptingPermissions"`
|
||||
CanUpdateTargetedPods bool `yaml:"canUpdateTargetedPods" json:"canUpdateTargetedPods" default:"false"`
|
||||
CanStopTrafficCapturing bool `yaml:"canStopTrafficCapturing" json:"canStopTrafficCapturing" default:"false"`
|
||||
CanControlDissection bool `yaml:"canControlDissection" json:"canControlDissection" default:"false"`
|
||||
ShowAdminConsoleLink bool `yaml:"showAdminConsoleLink" json:"showAdminConsoleLink" default:"false"`
|
||||
}
|
||||
|
||||
type SamlConfig struct {
|
||||
IdpMetadataUrl string `yaml:"idpMetadataUrl" json:"idpMetadataUrl"`
|
||||
X509crt string `yaml:"x509crt" json:"x509crt"`
|
||||
X509key string `yaml:"x509key" json:"x509key"`
|
||||
RoleAttribute string `yaml:"roleAttribute" json:"roleAttribute"`
|
||||
Roles map[string]Role `yaml:"roles" json:"roles"`
|
||||
IdpMetadataUrl string `yaml:"idpMetadataUrl" json:"idpMetadataUrl"`
|
||||
X509crt string `yaml:"x509crt" json:"x509crt"`
|
||||
X509key string `yaml:"x509key" json:"x509key"`
|
||||
}
|
||||
|
||||
type AuthConfig struct {
|
||||
Enabled bool `yaml:"enabled" json:"enabled" default:"false"`
|
||||
Type string `yaml:"type" json:"type" default:"saml"`
|
||||
Saml SamlConfig `yaml:"saml" json:"saml"`
|
||||
Enabled bool `yaml:"enabled" json:"enabled" default:"false"`
|
||||
// Type selects the authentication backend. Valid values:
|
||||
// saml — SAML 2.0 SSO
|
||||
// oidc — generic OIDC (Dex, Okta, Auth0, Keycloak, Azure AD, …)
|
||||
// dex — permanent alias of oidc (kept for back-compat)
|
||||
// descope — Descope SDK
|
||||
// default — also routes to Descope (kept, not deprecated)
|
||||
//
|
||||
// NOTE: prior releases routed `oidc` to Descope. If you were using `oidc`
|
||||
// to mean Descope, switch to `descope` (or `default`). The rename is a
|
||||
// breaking change documented in the release notes.
|
||||
Type string `yaml:"type" json:"type" default:"saml"`
|
||||
Roles map[string]Role `yaml:"roles" json:"roles"`
|
||||
RolesClaim string `yaml:"rolesClaim" json:"rolesClaim"`
|
||||
DefaultRole string `yaml:"defaultRole" json:"defaultRole"`
|
||||
DefaultFilter string `yaml:"defaultFilter" json:"defaultFilter"`
|
||||
Saml SamlConfig `yaml:"saml" json:"saml"`
|
||||
}
|
||||
|
||||
type IngressConfig struct {
|
||||
@@ -200,6 +214,8 @@ type RoutingConfig struct {
|
||||
type DashboardConfig struct {
|
||||
StreamingType string `yaml:"streamingType" json:"streamingType" default:"connect-rpc"`
|
||||
CompleteStreamingEnabled bool `yaml:"completeStreamingEnabled" json:"completeStreamingEnabled" default:"true"`
|
||||
ClusterWideMapEnabled bool `yaml:"clusterWideMapEnabled" json:"clusterWideMapEnabled" default:"false"`
|
||||
EntriesLimit string `yaml:"entriesLimit" json:"entriesLimit" default:"300000"`
|
||||
}
|
||||
|
||||
type FrontRoutingConfig struct {
|
||||
@@ -207,9 +223,10 @@ type FrontRoutingConfig struct {
|
||||
}
|
||||
|
||||
type ReleaseConfig struct {
|
||||
Repo string `yaml:"repo" json:"repo" default:"https://helm.kubeshark.com"`
|
||||
Name string `yaml:"name" json:"name" default:"kubeshark"`
|
||||
Namespace string `yaml:"namespace" json:"namespace" default:"default"`
|
||||
Repo string `yaml:"repo" json:"repo" default:"https://helm.kubeshark.com"`
|
||||
Name string `yaml:"name" json:"name" default:"kubeshark"`
|
||||
Namespace string `yaml:"namespace" json:"namespace" default:"default"`
|
||||
HelmChartPath string `yaml:"helmChartPath" json:"helmChartPath" default:""`
|
||||
}
|
||||
|
||||
type TelemetryConfig struct {
|
||||
@@ -260,6 +277,8 @@ type MiscConfig struct {
|
||||
DuplicateTimeframe string `yaml:"duplicateTimeframe" json:"duplicateTimeframe" default:"200ms"`
|
||||
DetectDuplicates bool `yaml:"detectDuplicates" json:"detectDuplicates" default:"false"`
|
||||
StaleTimeoutSeconds int `yaml:"staleTimeoutSeconds" json:"staleTimeoutSeconds" default:"30"`
|
||||
TcpFlowTimeout int `yaml:"tcpFlowTimeout" json:"tcpFlowTimeout" default:"1200"`
|
||||
UdpFlowTimeout int `yaml:"udpFlowTimeout" json:"udpFlowTimeout" default:"1200"`
|
||||
}
|
||||
|
||||
type PcapDumpConfig struct {
|
||||
@@ -276,7 +295,10 @@ type PortMapping struct {
|
||||
HTTP []uint16 `yaml:"http" json:"http"`
|
||||
AMQP []uint16 `yaml:"amqp" json:"amqp"`
|
||||
KAFKA []uint16 `yaml:"kafka" json:"kafka"`
|
||||
REDIS []uint16 `yaml:"redis" json:"redis"`
|
||||
MONGODB []uint16 `yaml:"mongodb" json:"mongodb"`
|
||||
MYSQL []uint16 `yaml:"mysql" json:"mysql"`
|
||||
POSTGRESQL []uint16 `yaml:"postgresql" json:"postgresql"`
|
||||
REDIS []uint16 `yaml:"redis" json:"redis"`
|
||||
LDAP []uint16 `yaml:"ldap" json:"ldap"`
|
||||
DIAMETER []uint16 `yaml:"diameter" json:"diameter"`
|
||||
}
|
||||
@@ -305,20 +327,61 @@ type RawCaptureConfig struct {
|
||||
StorageSize string `yaml:"storageSize" json:"storageSize" default:"1Gi"`
|
||||
}
|
||||
|
||||
type SnapshotsConfig struct {
|
||||
type SnapshotsLocalConfig struct {
|
||||
StorageClass string `yaml:"storageClass" json:"storageClass" default:""`
|
||||
StorageSize string `yaml:"storageSize" json:"storageSize" default:"20Gi"`
|
||||
}
|
||||
|
||||
type SnapshotsCloudS3Config struct {
|
||||
Bucket string `yaml:"bucket" json:"bucket" default:""`
|
||||
Region string `yaml:"region" json:"region" default:""`
|
||||
AccessKey string `yaml:"accessKey" json:"accessKey" default:""`
|
||||
SecretKey string `yaml:"secretKey" json:"secretKey" default:""`
|
||||
RoleArn string `yaml:"roleArn" json:"roleArn" default:""`
|
||||
ExternalId string `yaml:"externalId" json:"externalId" default:""`
|
||||
}
|
||||
|
||||
type SnapshotsCloudAzblobConfig struct {
|
||||
StorageAccount string `yaml:"storageAccount" json:"storageAccount" default:""`
|
||||
Container string `yaml:"container" json:"container" default:""`
|
||||
StorageKey string `yaml:"storageKey" json:"storageKey" default:""`
|
||||
}
|
||||
|
||||
type SnapshotsCloudGCSConfig struct {
|
||||
Bucket string `yaml:"bucket" json:"bucket" default:""`
|
||||
Project string `yaml:"project" json:"project" default:""`
|
||||
CredentialsJson string `yaml:"credentialsJson" json:"credentialsJson" default:""`
|
||||
}
|
||||
|
||||
type SnapshotsCloudConfig struct {
|
||||
Provider string `yaml:"provider" json:"provider" default:""`
|
||||
Prefix string `yaml:"prefix" json:"prefix" default:""`
|
||||
ConfigMaps []string `yaml:"configMaps" json:"configMaps" default:"[]"`
|
||||
Secrets []string `yaml:"secrets" json:"secrets" default:"[]"`
|
||||
S3 SnapshotsCloudS3Config `yaml:"s3" json:"s3"`
|
||||
Azblob SnapshotsCloudAzblobConfig `yaml:"azblob" json:"azblob"`
|
||||
GCS SnapshotsCloudGCSConfig `yaml:"gcs" json:"gcs"`
|
||||
}
|
||||
|
||||
type SnapshotsConfig struct {
|
||||
Local SnapshotsLocalConfig `yaml:"local" json:"local"`
|
||||
Cloud SnapshotsCloudConfig `yaml:"cloud" json:"cloud"`
|
||||
}
|
||||
|
||||
type DelayedDissectionConfig struct {
|
||||
Image string `yaml:"image" json:"image" default:"kubeshark/worker:master"`
|
||||
CPU string `yaml:"cpu" json:"cpu" default:"1"`
|
||||
Memory string `yaml:"memory" json:"memory" default:"4Gi"`
|
||||
CPU string `yaml:"cpu" json:"cpu" default:"1"`
|
||||
Memory string `yaml:"memory" json:"memory" default:"4Gi"`
|
||||
StorageSize string `yaml:"storageSize" json:"storageSize" default:""`
|
||||
StorageClass string `yaml:"storageClass" json:"storageClass" default:""`
|
||||
}
|
||||
|
||||
type DissectionConfig struct {
|
||||
Enabled bool `yaml:"enabled" json:"enabled" default:"true"`
|
||||
StopAfter string `yaml:"stopAfter" json:"stopAfter" default:"5m"`
|
||||
}
|
||||
|
||||
type CaptureConfig struct {
|
||||
Stopped bool `yaml:"stopped" json:"stopped" default:"false"`
|
||||
StopAfter string `yaml:"stopAfter" json:"stopAfter" default:"5m"`
|
||||
Dissection DissectionConfig `yaml:"dissection" json:"dissection"`
|
||||
CaptureSelf bool `yaml:"captureSelf" json:"captureSelf" default:"false"`
|
||||
Raw RawCaptureConfig `yaml:"raw" json:"raw"`
|
||||
DbMaxSize string `yaml:"dbMaxSize" json:"dbMaxSize" default:"500Mi"`
|
||||
@@ -367,7 +430,6 @@ type TapConfig struct {
|
||||
Gitops GitopsConfig `yaml:"gitops" json:"gitops"`
|
||||
Sentry SentryConfig `yaml:"sentry" json:"sentry"`
|
||||
DefaultFilter string `yaml:"defaultFilter" json:"defaultFilter" default:""`
|
||||
LiveConfigMapChangesDisabled bool `yaml:"liveConfigMapChangesDisabled" json:"liveConfigMapChangesDisabled" default:"false"`
|
||||
GlobalFilter string `yaml:"globalFilter" json:"globalFilter" default:""`
|
||||
EnabledDissectors []string `yaml:"enabledDissectors" json:"enabledDissectors"`
|
||||
PortMapping PortMapping `yaml:"portMapping" json:"portMapping"`
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
apiVersion: v2
|
||||
name: kubeshark
|
||||
version: "52.12.0"
|
||||
version: "53.3.0"
|
||||
description: The API Traffic Analyzer for Kubernetes
|
||||
home: https://kubeshark.com
|
||||
keywords:
|
||||
|
||||
@@ -138,13 +138,34 @@ Example for overriding image names:
|
||||
| `tap.namespaces` | Target pods in namespaces | `[]` |
|
||||
| `tap.excludedNamespaces` | Exclude pods in namespaces | `[]` |
|
||||
| `tap.bpfOverride` | When using AF_PACKET as a traffic capture backend, override any existing pod targeting rules and set explicit BPF expression (e.g. `net 0.0.0.0/0`). | `[]` |
|
||||
| `tap.capture.stopped` | Set to `false` to have traffic processing start automatically. When set to `true`, traffic processing is stopped by default, resulting in almost no resource consumption (e.g. Kubeshark is dormant). This property can be dynamically control via the dashboard. | `false` |
|
||||
| `tap.capture.stopAfter` | Set to a duration (e.g. `30s`) to have traffic processing stop after no websocket activity between worker and hub. | `30s` |
|
||||
| `tap.capture.dissection.enabled` | Set to `true` to have L7 protocol dissection start automatically. When set to `false`, dissection is disabled by default. This property can be dynamically controlled via the dashboard. | `true` |
|
||||
| `tap.capture.dissection.stopAfter` | Set to a duration (e.g. `30s`) to have L7 dissection stop after no activity. | `5m` |
|
||||
| `tap.capture.raw.enabled` | Enable raw capture of packets and syscalls to disk for offline analysis | `true` |
|
||||
| `tap.capture.raw.storageSize` | Maximum storage size for raw capture files (supports K8s quantity format: `1Gi`, `500Mi`, etc.) | `1Gi` |
|
||||
| `tap.capture.dbMaxSize` | Maximum size for capture database (e.g., `4Gi`, `2000Mi`). When empty, automatically uses 80% of allocated storage (`tap.storageLimit`). | `""` |
|
||||
| `tap.snapshots.storageClass` | Storage class for snapshots volume. When empty, uses `emptyDir`. When set, creates a PVC with this storage class | `""` |
|
||||
| `tap.snapshots.storageSize` | Storage size for snapshots volume (supports K8s quantity format: `1Gi`, `500Mi`, etc.) | `10Gi` |
|
||||
| `tap.capture.captureSelf` | Include Kubeshark's own traffic in capture | `false` |
|
||||
| `tap.capture.dbMaxSize` | Maximum size for capture database (e.g., `4Gi`, `2000Mi`). | `500Mi` |
|
||||
| `tap.snapshots.local.storageClass` | Storage class for local snapshots volume. When empty, uses `emptyDir`. When set, creates a PVC with this storage class | `""` |
|
||||
| `tap.snapshots.local.storageSize` | Storage size for local snapshots volume (supports K8s quantity format: `1Gi`, `500Mi`, etc.) | `20Gi` |
|
||||
| `tap.snapshots.cloud.provider` | Cloud storage provider for snapshots: `s3`, `azblob`, or `gcs`. Empty string disables cloud storage. See [Cloud Storage docs](docs/snapshots_cloud_storage.md). | `""` |
|
||||
| `tap.snapshots.cloud.prefix` | Key prefix in the bucket/container (e.g. `snapshots/`). See [Cloud Storage docs](docs/snapshots_cloud_storage.md). | `""` |
|
||||
| `tap.snapshots.cloud.configMaps` | Names of pre-existing ConfigMaps with cloud storage env vars. Alternative to inline `s3`/`azblob`/`gcs` values below. See [Cloud Storage docs](docs/snapshots_cloud_storage.md). | `[]` |
|
||||
| `tap.snapshots.cloud.secrets` | Names of pre-existing Secrets with cloud storage credentials. Alternative to inline `s3`/`azblob`/`gcs` values below. See [Cloud Storage docs](docs/snapshots_cloud_storage.md). | `[]` |
|
||||
| `tap.snapshots.cloud.s3.bucket` | S3 bucket name. When set, the chart auto-creates a ConfigMap with `SNAPSHOT_AWS_BUCKET`. | `""` |
|
||||
| `tap.snapshots.cloud.s3.region` | AWS region for the S3 bucket. | `""` |
|
||||
| `tap.snapshots.cloud.s3.accessKey` | AWS access key ID. When set, the chart auto-creates a Secret with `SNAPSHOT_AWS_ACCESS_KEY`. | `""` |
|
||||
| `tap.snapshots.cloud.s3.secretKey` | AWS secret access key. When set, the chart auto-creates a Secret with `SNAPSHOT_AWS_SECRET_KEY`. | `""` |
|
||||
| `tap.snapshots.cloud.s3.roleArn` | IAM role ARN to assume via STS for cross-account S3 access. | `""` |
|
||||
| `tap.snapshots.cloud.s3.externalId` | External ID for the STS AssumeRole call. | `""` |
|
||||
| `tap.snapshots.cloud.azblob.storageAccount` | Azure storage account name. When set, the chart auto-creates a ConfigMap with `SNAPSHOT_AZBLOB_STORAGE_ACCOUNT`. | `""` |
|
||||
| `tap.snapshots.cloud.azblob.container` | Azure blob container name. | `""` |
|
||||
| `tap.snapshots.cloud.azblob.storageKey` | Azure storage account access key. When set, the chart auto-creates a Secret with `SNAPSHOT_AZBLOB_STORAGE_KEY`. | `""` |
|
||||
| `tap.snapshots.cloud.gcs.bucket` | GCS bucket name. When set, the chart auto-creates a ConfigMap with `SNAPSHOT_GCS_BUCKET`. | `""` |
|
||||
| `tap.snapshots.cloud.gcs.project` | GCP project ID. | `""` |
|
||||
| `tap.snapshots.cloud.gcs.credentialsJson` | Service account JSON key. When set, the chart auto-creates a Secret with `SNAPSHOT_GCS_CREDENTIALS_JSON`. | `""` |
|
||||
| `tap.delayedDissection.cpu` | CPU allocation for delayed dissection jobs | `1` |
|
||||
| `tap.delayedDissection.memory` | Memory allocation for delayed dissection jobs | `4Gi` |
|
||||
| `tap.delayedDissection.storageSize` | Storage size for dissection job PVC. When empty, falls back to `tap.snapshots.local.storageSize`. When the resolved value is non-empty, a PVC is created; otherwise an `emptyDir` is used. | `""` |
|
||||
| `tap.delayedDissection.storageClass` | Storage class for dissection job PVC. When empty, falls back to `tap.snapshots.local.storageClass`. | `""` |
|
||||
| `tap.release.repo` | URL of the Helm chart repository | `https://helm.kubeshark.com` |
|
||||
| `tap.release.name` | Helm release name | `kubeshark` |
|
||||
| `tap.release.namespace` | Helm release namespace | `default` |
|
||||
@@ -152,30 +173,30 @@ Example for overriding image names:
|
||||
| `tap.persistentStorageStatic` | Use static persistent volume provisioning (explicitly defined `PersistentVolume` ) | `false` |
|
||||
| `tap.persistentStoragePvcVolumeMode` | Set the pvc volume mode (Filesystem\|Block) | `Filesystem` |
|
||||
| `tap.efsFileSytemIdAndPath` | [EFS file system ID and, optionally, subpath and/or access point](https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/examples/kubernetes/access_points/README.md) `<FileSystemId>:<Path>:<AccessPointId>` | "" |
|
||||
| `tap.storageLimit` | Limit of either the `emptyDir` or `persistentVolumeClaim` | `5Gi` |
|
||||
| `tap.storageLimit` | Limit of either the `emptyDir` or `persistentVolumeClaim` | `10Gi` |
|
||||
| `tap.storageClass` | Storage class of the `PersistentVolumeClaim` | `standard` |
|
||||
| `tap.dryRun` | Preview of all pods matching the regex, without tapping them | `false` |
|
||||
| `tap.dnsConfig.nameservers` | Nameservers to use for DNS resolution | `[]` |
|
||||
| `tap.dnsConfig.searches` | Search domains to use for DNS resolution | `[]` |
|
||||
| `tap.dnsConfig.options` | DNS options to use for DNS resolution | `[]` |
|
||||
| `tap.dns.nameservers` | Nameservers to use for DNS resolution | `[]` |
|
||||
| `tap.dns.searches` | Search domains to use for DNS resolution | `[]` |
|
||||
| `tap.dns.options` | DNS options to use for DNS resolution | `[]` |
|
||||
| `tap.resources.hub.limits.cpu` | CPU limit for hub | `""` (no limit) |
|
||||
| `tap.resources.hub.limits.memory` | Memory limit for hub | `5Gi` |
|
||||
| `tap.resources.hub.requests.cpu` | CPU request for hub | `50m` |
|
||||
| `tap.resources.hub.requests.memory` | Memory request for hub | `50Mi` |
|
||||
| `tap.resources.sniffer.limits.cpu` | CPU limit for sniffer | `""` (no limit) |
|
||||
| `tap.resources.sniffer.limits.memory` | Memory limit for sniffer | `3Gi` |
|
||||
| `tap.resources.sniffer.limits.memory` | Memory limit for sniffer | `5Gi` |
|
||||
| `tap.resources.sniffer.requests.cpu` | CPU request for sniffer | `50m` |
|
||||
| `tap.resources.sniffer.requests.memory` | Memory request for sniffer | `50Mi` |
|
||||
| `tap.resources.tracer.limits.cpu` | CPU limit for tracer | `""` (no limit) |
|
||||
| `tap.resources.tracer.limits.memory` | Memory limit for tracer | `3Gi` |
|
||||
| `tap.resources.tracer.limits.memory` | Memory limit for tracer | `5Gi` |
|
||||
| `tap.resources.tracer.requests.cpu` | CPU request for tracer | `50m` |
|
||||
| `tap.resources.tracer.requests.memory` | Memory request for tracer | `50Mi` |
|
||||
| `tap.probes.hub.initialDelaySeconds` | Initial delay before probing the hub | `15` |
|
||||
| `tap.probes.hub.periodSeconds` | Period between probes for the hub | `10` |
|
||||
| `tap.probes.hub.initialDelaySeconds` | Initial delay before probing the hub | `5` |
|
||||
| `tap.probes.hub.periodSeconds` | Period between probes for the hub | `5` |
|
||||
| `tap.probes.hub.successThreshold` | Number of successful probes before considering the hub healthy | `1` |
|
||||
| `tap.probes.hub.failureThreshold` | Number of failed probes before considering the hub unhealthy | `3` |
|
||||
| `tap.probes.sniffer.initialDelaySeconds` | Initial delay before probing the sniffer | `15` |
|
||||
| `tap.probes.sniffer.periodSeconds` | Period between probes for the sniffer | `10` |
|
||||
| `tap.probes.sniffer.initialDelaySeconds` | Initial delay before probing the sniffer | `5` |
|
||||
| `tap.probes.sniffer.periodSeconds` | Period between probes for the sniffer | `5` |
|
||||
| `tap.probes.sniffer.successThreshold` | Number of successful probes before considering the sniffer healthy | `1` |
|
||||
| `tap.probes.sniffer.failureThreshold` | Number of failed probes before considering the sniffer unhealthy | `3` |
|
||||
| `tap.serviceMesh` | Capture traffic from service meshes like Istio, Linkerd, Consul, etc. | `true` |
|
||||
@@ -191,14 +212,15 @@ Example for overriding image names:
|
||||
| `tap.tolerations.hub` | Tolerations for hub component | `[]` |
|
||||
| `tap.tolerations.front` | Tolerations for front-end component | `[]` |
|
||||
| `tap.auth.enabled` | Enable authentication | `false` |
|
||||
| `tap.auth.type` | Authentication type (1 option available: `saml`) | `saml` |
|
||||
| `tap.auth.type` | Authentication backend. Valid values: `saml`, `oidc` (generic OIDC — Dex, Okta, Auth0, Keycloak, Azure AD, Google, …), `dex` (permanent alias of `oidc`), `descope`, `default` (also routes to Descope). **Breaking**: prior releases routed `oidc` to Descope — if you were using it for Descope, switch to `descope` or `default`. | `saml` |
|
||||
| `tap.auth.approvedEmails` | List of approved email addresses for authentication | `[]` |
|
||||
| `tap.auth.approvedDomains` | List of approved email domains for authentication | `[]` |
|
||||
| `tap.auth.saml.idpMetadataUrl` | SAML IDP metadata URL <br/>(effective, if `tap.auth.type = saml`) | `` |
|
||||
| `tap.auth.rolesClaim` | Name of the JWT claim (OIDC) or SAML attribute carrying role memberships. | `role` |
|
||||
| `tap.auth.defaultRole` | Optional role name inside `tap.auth.roles` applied as fallback when an authenticated user has no matching role. Empty string = no fallback, zero-valued permissions. | `""` |
|
||||
| `tap.auth.roles` | Backend-neutral role map shared by SAML and OIDC. Each role's `namespaces` is a comma-separated list controlling which Kubernetes namespaces the role's users see traffic for: `""` = deny all, `"*"` = allow all, `"foo"` = literal namespace, `"foo,bar"` = OR over literals, `"foo-*"` = glob expansion against the cluster's known namespaces. Empty/unset `tap.auth.roles` grants nothing — admins opt into elevated access by populating this map. | `{"admin":{"namespaces":"*","canDownloadPCAP":true,"canUpdateTargetedPods":true,"canUseScripting":true,"scriptingPermissions":{"canSave":true,"canActivate":true,"canDelete":true},"canStopTrafficCapturing":true,"canControlDissection":true,"showAdminConsoleLink":true}}` |
|
||||
| `tap.auth.saml.idpMetadataUrl` | SAML IDP metadata URL <br/>(effective, if `tap.auth.type = saml`) | `` |
|
||||
| `tap.auth.saml.x509crt` | A self-signed X.509 `.cert` contents <br/>(effective, if `tap.auth.type = saml`) | `` |
|
||||
| `tap.auth.saml.x509key` | A self-signed X.509 `.key` contents <br/>(effective, if `tap.auth.type = saml`) | `` |
|
||||
| `tap.auth.saml.roleAttribute` | A SAML attribute name corresponding to user's authorization role <br/>(effective, if `tap.auth.type = saml`) | `role` |
|
||||
| `tap.auth.saml.roles` | A list of SAML authorization roles and their permissions <br/>(effective, if `tap.auth.type = saml`) | `{"admin":{"canDownloadPCAP":true,"canUpdateTargetedPods":true,"canUseScripting":true, "scriptingPermissions":{"canSave":true, "canActivate":true, "canDelete":true}, "canStopTrafficCapturing":true, "filter":"","showAdminConsoleLink":true}}` |
|
||||
| `tap.ingress.enabled` | Enable `Ingress` | `false` |
|
||||
| `tap.ingress.className` | Ingress class name | `""` |
|
||||
| `tap.ingress.host` | Host of the `Ingress` | `ks.svc.cluster.local` |
|
||||
@@ -210,16 +232,20 @@ Example for overriding image names:
|
||||
| `tap.telemetry.enabled` | Enable anonymous usage statistics collection | `true` |
|
||||
| `tap.resourceGuard.enabled` | Enable resource guard worker process, which watches RAM/disk usage and enables/disables traffic capture based on available resources | `false` |
|
||||
| `tap.secrets` | List of secrets to be used as source for environment variables (e.g. `kubeshark-license`) | `[]` |
|
||||
| `tap.sentry.enabled` | Enable sending of error logs to Sentry | `true` (only for qualified users) |
|
||||
| `tap.sentry.enabled` | Enable sending of error logs to Sentry | `false` |
|
||||
| `tap.sentry.environment` | Sentry environment to label error logs with | `production` |
|
||||
| `tap.defaultFilter` | Sets the default dashboard KFL filter (e.g. `http`). By default, this value is set to filter out noisy protocols such as DNS, UDP, ICMP and TCP. The user can easily change this, **temporarily**, in the Dashboard. For a permanent change, you should change this value in the `values.yaml` or `config.yaml` file. | `""` |
|
||||
| `tap.liveConfigMapChangesDisabled` | If set to `true`, all user functionality (scripting, targeting settings, global & default KFL modification, traffic recording, traffic capturing on/off, protocol dissectors) involving dynamic ConfigMap changes from UI will be disabled | `false` |
|
||||
| `tap.globalFilter` | Prepends to any KFL filter and can be used to limit what is visible in the dashboard. For example, `redact("request.headers.Authorization")` will redact the appropriate field. Another example `!dns` will not show any DNS traffic. | `""` |
|
||||
| `tap.metrics.port` | Pod port used to expose Prometheus metrics | `49100` |
|
||||
| `tap.enabledDissectors` | This is an array of strings representing the list of supported protocols. Remove or comment out redundant protocols (e.g., dns).| The default list excludes: `udp` and `tcp` |
|
||||
| `tap.mountBpf` | BPF filesystem needs to be mounted for eBPF to work properly. This helm value determines whether Kubeshark will attempt to mount the filesystem. This option is not required if the filesystem is already mounted. | `true` |
|
||||
| `tap.hostNetwork` | Enable host network mode for worker DaemonSet pods. When enabled, worker pods use the host's network namespace for direct network access. | `true` |
|
||||
| `tap.packetCapture` | Packet capture backend: `best`, `af_packet`, or `pf_ring` | `best` |
|
||||
| `tap.misc.trafficSampleRate` | Percentage of traffic to process (0-100) | `100` |
|
||||
| `tap.misc.tcpStreamChannelTimeoutMs` | Timeout in milliseconds for TCP stream channel | `10000` |
|
||||
| `tap.gitops.enabled` | Enable GitOps functionality. This will allow you to use GitOps to manage your Kubeshark configuration. | `false` |
|
||||
| `tap.misc.tcpFlowTimeout` | TCP flow aggregation timeout in seconds. Controls how long the worker waits before finalizing a TCP flow. | `1200` |
|
||||
| `tap.misc.udpFlowTimeout` | UDP flow aggregation timeout in seconds. Controls how long the worker waits before finalizing a UDP flow. | `1200` |
|
||||
| `logs.file` | Logs dump path | `""` |
|
||||
| `pcapdump.enabled` | Enable recording of all traffic captured according to other parameters. Whatever Kubeshark captures, considering pod targeting rules, will be stored in pcap files ready to be viewed by tools | `false` |
|
||||
| `pcapdump.maxTime` | The time window into the past that will be stored. Older traffic will be discarded. | `2h` |
|
||||
@@ -229,6 +255,7 @@ Example for overriding image names:
|
||||
| `dumpLogs` | Enable dumping of logs | `false` |
|
||||
| `headless` | Enable running in headless mode | `false` |
|
||||
| `license` | License key for the Pro/Enterprise edition | `""` |
|
||||
| `scripting.enabled` | Enables scripting | `false` |
|
||||
| `scripting.env` | Environment variables for the scripting | `{}` |
|
||||
| `scripting.source` | Source directory of the scripts | `""` |
|
||||
| `scripting.watchScripts` | Enable watch mode for the scripts in source directory | `true` |
|
||||
@@ -236,10 +263,6 @@ Example for overriding image names:
|
||||
| `supportChatEnabled` | Enable real-time support chat channel based on Intercom | `false` |
|
||||
| `internetConnectivity` | Turns off API requests that are dependent on Internet connectivity such as `telemetry` and `online-support`. | `true` |
|
||||
|
||||
KernelMapping pairs kernel versions with a
|
||||
DriverContainer image. Kernel versions can be matched
|
||||
literally or using a regular expression
|
||||
|
||||
# Installing with SAML enabled
|
||||
|
||||
### Prerequisites:
|
||||
@@ -355,8 +378,8 @@ Add these helm values to set up OIDC authentication powered by your Dex IdP:
|
||||
tap:
|
||||
auth:
|
||||
enabled: true
|
||||
type: dex
|
||||
dexOidc:
|
||||
type: oidc # canonical; `dex` is accepted as a permanent alias
|
||||
oidc:
|
||||
issuer: <put Dex IdP issuer URL here>
|
||||
clientId: kubeshark
|
||||
clientSecret: create your own client password
|
||||
@@ -368,7 +391,7 @@ tap:
|
||||
---
|
||||
|
||||
**Note:**<br/>
|
||||
Set `tap.auth.dexOidc.bypassSslCaCheck: true`
|
||||
Set `tap.auth.oidc.bypassSslCaCheck: true`
|
||||
to allow Kubeshark communication with Dex IdP having an unknown SSL Certificate Authority.
|
||||
|
||||
This setting allows you to prevent such SSL CA-related errors:<br/>
|
||||
@@ -407,7 +430,7 @@ The following Dex settings will have these values:
|
||||
|
||||
| Setting | Value |
|
||||
|-------------------------------------------------------|----------------------------------------------|
|
||||
| `tap.auth.dexOidc.issuer` | `https://ks.example.com/dex` |
|
||||
| `tap.auth.oidc.issuer` | `https://ks.example.com/dex` |
|
||||
| `tap.auth.dexConfig.issuer` | `https://ks.example.com/dex` |
|
||||
| `tap.auth.dexConfig.staticClients -> redirectURIs` | `https://ks.example.com/api/oauth2/callback` |
|
||||
| `tap.auth.dexConfig.connectors -> config.redirectURI` | `https://ks.example.com/dex/callback` |
|
||||
@@ -425,16 +448,16 @@ Please, make sure to prepare the following things first.
|
||||
- You will need to specify storage settings in `tap.auth.dexConfig.storage`
|
||||
- default: `memory`
|
||||
3. Decide on the OAuth2 `?state=` param expiration time:
|
||||
- field: `tap.auth.dexOidc.oauth2StateParamExpiry`
|
||||
- field: `tap.auth.oidc.oauth2StateParamExpiry`
|
||||
- default: `10m` (10 minutes)
|
||||
- valid time units are `s`, `m`, `h`
|
||||
4. Decide on the refresh token expiration:
|
||||
- field 1: `tap.auth.dexOidc.expiry.refreshTokenLifetime`
|
||||
- field 1: `tap.auth.oidc.expiry.refreshTokenLifetime`
|
||||
- field 2: `tap.auth.dexConfig.expiry.refreshTokens.absoluteLifetime`
|
||||
- default: `3960h` (165 days)
|
||||
- valid time units are `s`, `m`, `h`
|
||||
5. Create a unique & secure password to set in these fields:
|
||||
- field 1: `tap.auth.dexOidc.clientSecret`
|
||||
- field 1: `tap.auth.oidc.clientSecret`
|
||||
- field 2: `tap.auth.dexConfig.staticClients -> secret`
|
||||
- password must be the same for these 2 fields
|
||||
6. Discover more possibilities of **[Dex Configuration](https://dexidp.io/docs/configuration/)**
|
||||
@@ -456,8 +479,8 @@ Helm `values.yaml`:
|
||||
tap:
|
||||
auth:
|
||||
enabled: true
|
||||
type: dex
|
||||
dexOidc:
|
||||
type: oidc # canonical; `dex` is accepted as a permanent alias
|
||||
oidc:
|
||||
issuer: https://<your-ingress-hostname>/dex
|
||||
|
||||
# Client ID/secret must be taken from `tap.auth.dexConfig.staticClients -> id/secret`
|
||||
|
||||
583
helm-chart/docs/snapshots_cloud_storage.md
Normal file
@@ -0,0 +1,583 @@
|
||||
# Cloud Storage for Snapshots
|
||||
|
||||
Kubeshark can upload and download snapshots to cloud object storage, enabling cross-cluster sharing, backup/restore, and long-term retention.
|
||||
|
||||
Supported providers: **Amazon S3** (`s3`), **Azure Blob Storage** (`azblob`), and **Google Cloud Storage** (`gcs`).
|
||||
|
||||
## Helm Values
|
||||
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: ""      # "s3", "azblob", or "gcs" (empty = disabled)
      prefix: ""        # key prefix in the bucket/container (e.g. "snapshots/")
      configMaps: []    # names of pre-existing ConfigMaps with cloud config env vars
      secrets: []       # names of pre-existing Secrets with cloud credentials
      s3:
        bucket: ""
        region: ""
        accessKey: ""
        secretKey: ""
        roleArn: ""
        externalId: ""
      azblob:
        storageAccount: ""
        container: ""
        storageKey: ""
      gcs:
        bucket: ""
        project: ""
        credentialsJson: ""
```
|
||||
|
||||
- `provider` selects which cloud backend to use. Leave empty to disable cloud storage.
|
||||
- `configMaps` and `secrets` are lists of names of existing ConfigMap/Secret resources. They are mounted as `envFrom` on the hub pod, injecting all their keys as environment variables.
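
As a minimal sketch of the external-reference approach, the values below point the chart at a ConfigMap and a Secret that already exist. The resource names `kubeshark-s3-config` and `kubeshark-s3-creds` are hypothetical, and the sketch assumes both live in the same namespace as the hub pod, since `envFrom` can only reference resources in the pod's own namespace.

```yaml
tap:
  snapshots:
    cloud:
      provider: "s3"
      configMaps:
        - kubeshark-s3-config   # hypothetical ConfigMap name
      secrets:
        - kubeshark-s3-creds    # hypothetical Secret name
```

The referenced resources would carry the provider's environment variables as keys. For S3 that might look like the following, with placeholder values:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-s3-config     # hypothetical name, must match the values above
data:
  SNAPSHOT_AWS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_AWS_REGION: us-east-1
---
apiVersion: v1
kind: Secret
metadata:
  name: kubeshark-s3-creds      # hypothetical name, must match the values above
type: Opaque
stringData:
  SNAPSHOT_AWS_ACCESS_KEY: "<access key ID>"
  SNAPSHOT_AWS_SECRET_KEY: "<secret access key>"
```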
|
||||
|
||||
### Inline Values (Alternative to External ConfigMaps/Secrets)
|
||||
|
||||
Instead of creating ConfigMap and Secret resources manually, you can set cloud storage configuration directly in `values.yaml` or via `--set` flags. The Helm chart will automatically create the necessary ConfigMap and Secret resources.
|
||||
|
||||
Both approaches can be used together — inline values are additive to external `configMaps`/`secrets` references.
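
One way to combine the two approaches, sketched under the assumption that a Secret named `kubeshark-s3-creds` (a hypothetical name) already holds `SNAPSHOT_AWS_ACCESS_KEY` and `SNAPSHOT_AWS_SECRET_KEY`: keep the non-sensitive settings inline and leave the credentials in the external Secret.

```yaml
tap:
  snapshots:
    cloud:
      provider: "s3"
      prefix: "snapshots/"
      secrets:
        - kubeshark-s3-creds    # hypothetical pre-existing Secret with the credentials
      s3:
        bucket: my-kubeshark-snapshots   # non-sensitive, set inline
        region: us-east-1
```

This keeps credentials out of `values.yaml` while still letting the chart generate the ConfigMap for the bucket and region.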
|
||||
|
||||
---
|
||||
|
||||
## Amazon S3
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Required | Description |
|----------|----------|-------------|
| `SNAPSHOT_AWS_BUCKET` | Yes | S3 bucket name |
| `SNAPSHOT_AWS_REGION` | No | AWS region (uses SDK default if empty) |
| `SNAPSHOT_AWS_ACCESS_KEY` | No | Static access key ID (empty = use default credential chain) |
| `SNAPSHOT_AWS_SECRET_KEY` | No | Static secret access key |
| `SNAPSHOT_AWS_ROLE_ARN` | No | IAM role ARN to assume via STS (for cross-account access) |
| `SNAPSHOT_AWS_EXTERNAL_ID` | No | External ID for the STS AssumeRole call |
| `SNAPSHOT_CLOUD_PREFIX` | No | Key prefix in the bucket (e.g. `snapshots/`) |
|
||||
|
||||
### Authentication Methods
|
||||
|
||||
Credentials are resolved in this order:

1. **Static credentials** -- If `SNAPSHOT_AWS_ACCESS_KEY` is set, static credentials are used directly.
2. **STS AssumeRole** -- If `SNAPSHOT_AWS_ROLE_ARN` is also set, the static (or default-chain) credentials are used to assume the given IAM role. This is useful for cross-account S3 access.
3. **AWS default credential chain** -- When no static credentials are provided, the SDK default chain is used:
   - **IRSA** (EKS service account token) -- recommended for production on EKS
   - EC2 instance profile
   - Standard AWS environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, etc.)
   - Shared credentials file (`~/.aws/credentials`)
|
||||
|
||||
The provider validates bucket access on startup via `HeadBucket`. If the bucket is inaccessible, the hub will fail to start.
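
The resolution order above can be summarized as a small decision function. This is a reading aid written against the documented behavior, not the hub's actual code; `s3_auth_mode` is a hypothetical name.

```bash
# Hypothetical helper: report which credential path the documented rules
# would select, based only on the SNAPSHOT_AWS_* variables described above.
s3_auth_mode() {
  if [ -n "${SNAPSHOT_AWS_ACCESS_KEY:-}" ]; then
    base_mode="static"
  else
    base_mode="default-chain"
  fi
  # A role ARN layers STS AssumeRole on top of whichever base was chosen.
  if [ -n "${SNAPSHOT_AWS_ROLE_ARN:-}" ]; then
    echo "${base_mode}+assume-role"
  else
    echo "${base_mode}"
  fi
}
```

To check bucket access by hand with the same credentials, `aws s3api head-bucket --bucket my-kubeshark-snapshots` issues the same `HeadBucket` request.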
|
||||
|
||||
### Example: Inline Values (simplest approach)
|
||||
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: "s3"
      s3:
        bucket: my-kubeshark-snapshots
        region: us-east-1
```

Or with static credentials via `--set`:

```bash
helm install kubeshark kubeshark/kubeshark \
  --set tap.snapshots.cloud.provider=s3 \
  --set tap.snapshots.cloud.s3.bucket=my-kubeshark-snapshots \
  --set tap.snapshots.cloud.s3.region=us-east-1 \
  --set tap.snapshots.cloud.s3.accessKey=AKIA... \
  --set tap.snapshots.cloud.s3.secretKey=wJal...
```
|
||||
|
||||
### Example: IRSA (recommended for EKS)
|
||||
|
||||
[IAM Roles for Service Accounts (IRSA)](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) lets EKS pods assume an IAM role without static credentials. EKS injects a short-lived token into the pod automatically.
|
||||
|
||||
**Prerequisites:**
|
||||
|
||||
1. Your EKS cluster must have an [OIDC provider](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html) associated with it.
|
||||
2. An IAM role with a trust policy that allows the Kubeshark service account to assume it.
|
||||
|
||||
**Step 1 — Create an IAM policy scoped to your bucket:**
|
||||
|
||||
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:GetObjectVersion",
        "s3:DeleteObjectVersion",
        "s3:ListBucket",
        "s3:ListBucketVersions",
        "s3:GetBucketLocation",
        "s3:GetBucketVersioning"
      ],
      "Resource": [
        "arn:aws:s3:::my-kubeshark-snapshots",
        "arn:aws:s3:::my-kubeshark-snapshots/*"
      ]
    }
  ]
}
```

> Save this policy as `bucket-policy.json`; Step 2 attaches it to the role with `--policy-document file://bucket-policy.json`. For read-only access, remove `s3:PutObject`, `s3:DeleteObject`, and `s3:DeleteObjectVersion`.
|
||||
|
||||
**Step 2 — Create an IAM role with IRSA trust policy:**
|
||||
|
||||
```bash
# Get your cluster's OIDC provider URL
OIDC_PROVIDER=$(aws eks describe-cluster --name CLUSTER_NAME \
  --query "cluster.identity.oidc.issuer" --output text | sed 's|https://||')

# Create a trust policy
# The default K8s SA name is "<release-name>-service-account" (e.g. "kubeshark-service-account")
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:NAMESPACE:kubeshark-service-account",
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF

# Create the role and attach your policy
aws iam create-role \
  --role-name KubesharkS3Role \
  --assume-role-policy-document file://trust-policy.json

aws iam put-role-policy \
  --role-name KubesharkS3Role \
  --policy-name KubesharkSnapshotsBucketAccess \
  --policy-document file://bucket-policy.json
```
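
The two string manipulations in Step 2, stripping the scheme from the issuer URL and building the `sub` claim, are easy to get wrong by hand. Below is a minimal sketch, assuming the default service account naming noted in the comment above (`<release-name>-service-account`); both function names are hypothetical.

```bash
# Strip the "https://" scheme from an OIDC issuer URL. For issuers that
# start with https://, this gives the same result as the sed call in Step 2.
oidc_provider_from_issuer() {
  printf '%s\n' "${1#https://}"
}

# Build the "sub" claim value used in the trust policy condition.
# $1 = Kubernetes namespace, $2 = Helm release name
irsa_subject() {
  printf 'system:serviceaccount:%s:%s-service-account\n' "$1" "$2"
}
```

If the service account name has been overridden in your installation, the second helper no longer applies and the actual name should be used instead.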
|
||||
|
||||
**Step 3 — Create a ConfigMap with bucket configuration:**
|
||||
|
||||
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-s3-config
data:
  SNAPSHOT_AWS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_AWS_REGION: us-east-1
```
|
||||
|
||||
**Step 4 — Set Helm values with `tap.annotations` to annotate the service account:**
|
||||
|
||||
```yaml
tap:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT_ID:role/KubesharkS3Role
  snapshots:
    cloud:
      provider: "s3"
      configMaps:
        - kubeshark-s3-config
```

Or via `--set`:

```bash
helm install kubeshark kubeshark/kubeshark \
  --set tap.snapshots.cloud.provider=s3 \
  --set tap.snapshots.cloud.s3.bucket=my-kubeshark-snapshots \
  --set tap.snapshots.cloud.s3.region=us-east-1 \
  --set tap.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::ACCOUNT_ID:role/KubesharkS3Role
```
|
||||
|
||||
No `accessKey`/`secretKey` is needed — EKS injects credentials automatically via the IRSA token.
|
||||
|
||||
### Example: Static Credentials
|
||||
|
||||
Create a Secret with credentials:
|
||||
|
||||
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kubeshark-s3-creds
type: Opaque
stringData:
  SNAPSHOT_AWS_ACCESS_KEY: AKIA...
  SNAPSHOT_AWS_SECRET_KEY: wJal...
```

Create a ConfigMap with bucket configuration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-s3-config
data:
  SNAPSHOT_AWS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_AWS_REGION: us-east-1
```

Set Helm values:

```yaml
tap:
  snapshots:
    cloud:
      provider: "s3"
      configMaps:
        - kubeshark-s3-config
      secrets:
        - kubeshark-s3-creds
```
|
||||
|
||||
### Example: Cross-Account Access via AssumeRole
|
||||
|
||||
Add the role ARN to your ConfigMap:
|
||||
|
||||
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-s3-config
data:
  SNAPSHOT_AWS_BUCKET: other-account-bucket
  SNAPSHOT_AWS_REGION: eu-west-1
  SNAPSHOT_AWS_ROLE_ARN: arn:aws:iam::123456789012:role/KubesharkCrossAccountRole
  SNAPSHOT_AWS_EXTERNAL_ID: my-external-id  # optional, if required by the trust policy
```
|
||||
|
||||
The hub will first authenticate using its own credentials (IRSA, static, or default chain), then assume the specified role to access the bucket.
|
||||
|
||||
---
|
||||
|
||||
## Azure Blob Storage
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Required | Description |
|----------|----------|-------------|
| `SNAPSHOT_AZBLOB_STORAGE_ACCOUNT` | Yes | Azure storage account name |
| `SNAPSHOT_AZBLOB_CONTAINER` | Yes | Blob container name |
| `SNAPSHOT_AZBLOB_STORAGE_KEY` | No | Storage account access key (empty = use DefaultAzureCredential) |
| `SNAPSHOT_CLOUD_PREFIX` | No | Key prefix in the container (e.g. `snapshots/`) |
|
||||
|
||||
### Authentication Methods
|
||||
|
||||
Credentials are resolved in this order:

1. **Shared Key** -- If `SNAPSHOT_AZBLOB_STORAGE_KEY` is set, the storage account key is used directly.
2. **DefaultAzureCredential** -- When no storage key is provided, the Azure SDK default credential chain is used:
   - **Workload Identity** (AKS pod identity) -- recommended for production on AKS
   - Managed Identity (system or user-assigned)
   - Azure CLI credentials
   - Environment variables (`AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_CLIENT_SECRET`)
|
||||
|
||||
The provider validates container access on startup via `GetProperties`. If the container is inaccessible, the hub will fail to start.
|
||||
|
||||
### Example: Inline Values
|
||||
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: "azblob"
      azblob:
        storageAccount: mykubesharksa
        container: snapshots
        storageKey: "base64-encoded-storage-key..."  # optional, omit for DefaultAzureCredential
```
|
||||
|
||||
### Example: Workload Identity (recommended for AKS)
|
||||
|
||||
Create a ConfigMap with storage configuration:
|
||||
|
||||
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-azblob-config
data:
  SNAPSHOT_AZBLOB_STORAGE_ACCOUNT: mykubesharksa
  SNAPSHOT_AZBLOB_CONTAINER: snapshots
```

Set Helm values:

```yaml
tap:
  snapshots:
    cloud:
      provider: "azblob"
      configMaps:
        - kubeshark-azblob-config
```
|
||||
|
||||
The hub pod's service account must be configured for AKS Workload Identity with a managed identity that has the **Storage Blob Data Contributor** role on the container.
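
A role assignment limited to one container uses a resource scope nested under the storage account. The sketch below builds that scope string; the subscription ID and resource group are values you must supply, and `azblob_container_scope` is a hypothetical helper rather than anything shipped with Kubeshark.

```bash
# Build the Azure resource scope for a single blob container.
# $1 = subscription ID, $2 = resource group, $3 = storage account, $4 = container
azblob_container_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s/blobServices/default/containers/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Possible use (IDENTITY_CLIENT_ID is a placeholder for your managed identity):
#   az role assignment create \
#     --assignee IDENTITY_CLIENT_ID \
#     --role "Storage Blob Data Contributor" \
#     --scope "$(azblob_container_scope SUBSCRIPTION_ID RESOURCE_GROUP mykubesharksa snapshots)"
```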
|
||||
|
||||
### Example: Storage Account Key
|
||||
|
||||
Create a Secret with the storage key:
|
||||
|
||||
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kubeshark-azblob-creds
type: Opaque
stringData:
  SNAPSHOT_AZBLOB_STORAGE_KEY: "base64-encoded-storage-key..."
```

Create a ConfigMap with storage configuration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-azblob-config
data:
  SNAPSHOT_AZBLOB_STORAGE_ACCOUNT: mykubesharksa
  SNAPSHOT_AZBLOB_CONTAINER: snapshots
```

Set Helm values:

```yaml
tap:
  snapshots:
    cloud:
      provider: "azblob"
      configMaps:
        - kubeshark-azblob-config
      secrets:
        - kubeshark-azblob-creds
```
|
||||
|
||||
---
|
||||
|
||||
## Google Cloud Storage
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Required | Description |
|----------|----------|-------------|
| `SNAPSHOT_GCS_BUCKET` | Yes | GCS bucket name |
| `SNAPSHOT_GCS_PROJECT` | No | GCP project ID |
| `SNAPSHOT_GCS_CREDENTIALS_JSON` | No | Service account JSON key (empty = use Application Default Credentials) |
| `SNAPSHOT_CLOUD_PREFIX` | No | Key prefix in the bucket (e.g. `snapshots/`) |
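
Across the three provider tables in this document, only a few variables are marked as required. The following sketch reports which of them are unset for a given provider; `missing_required_vars` is a hypothetical helper, not something the hub or the chart provides.

```bash
# Print the required variables (per the tables in this document) that are
# empty or unset for the given provider: s3, azblob, or gcs.
missing_required_vars() {
  case "$1" in
    s3)     set -- SNAPSHOT_AWS_BUCKET ;;
    azblob) set -- SNAPSHOT_AZBLOB_STORAGE_ACCOUNT SNAPSHOT_AZBLOB_CONTAINER ;;
    gcs)    set -- SNAPSHOT_GCS_BUCKET ;;
    *)      echo "unknown provider: $1" >&2; return 2 ;;
  esac
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}
```

An empty result means every required variable for that provider is present in the current environment.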
|
||||
|
||||
### Authentication Methods
|
||||
|
||||
Credentials are resolved in this order:

1. **Service Account JSON Key** -- If `SNAPSHOT_GCS_CREDENTIALS_JSON` is set, the provided JSON key is used directly.
2. **Application Default Credentials** -- When no JSON key is provided, the GCP SDK default credential chain is used:
   - **Workload Identity** (GKE pod identity) -- recommended for production on GKE
   - GCE instance metadata (Compute Engine default service account)
   - Standard GCP environment variables (`GOOGLE_APPLICATION_CREDENTIALS`)
   - `gcloud` CLI credentials
|
||||
|
||||
The provider validates bucket access on startup via `Bucket.Attrs()`. If the bucket is inaccessible, the hub will fail to start.
|
||||
|
||||
### Required IAM Permissions
|
||||
|
||||
The service account needs different IAM roles depending on the access level:
|
||||
|
||||
**Read-only** (download, list, and sync snapshots from cloud):
|
||||
|
||||
| Role | Permissions provided | Purpose |
|------|---------------------|---------|
| `roles/storage.legacyBucketReader` | `storage.buckets.get`, `storage.objects.list` | Hub startup (bucket validation) + listing snapshots |
| `roles/storage.objectViewer` | `storage.objects.get`, `storage.objects.list` | Downloading snapshots, checking existence, reading metadata |

```bash
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.legacyBucketReader"
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.objectViewer"
```
|
||||
|
||||
**Read-write** (upload and delete snapshots in addition to read):
|
||||
|
||||
Grant `roles/storage.objectAdmin` instead of `roles/storage.objectViewer` to also allow `storage.objects.create` and `storage.objects.delete`:

| Role | Permissions provided | Purpose |
|------|---------------------|---------|
| `roles/storage.legacyBucketReader` | `storage.buckets.get`, `storage.objects.list` | Hub startup (bucket validation) + listing snapshots |
| `roles/storage.objectAdmin` | `storage.objects.*` | Full object CRUD (upload, download, delete, list, metadata) |

```bash
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.legacyBucketReader"
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.objectAdmin"
```
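
The two access levels differ only in the object-level role. The sketch below prints the roles to bind for each level, mirroring the tables above; `gcs_roles_for` is a hypothetical helper and the level names `read-only` and `read-write` are labels chosen for this example.

```bash
# Print the IAM roles to bind for an access level, one per line.
# $1 = "read-only" or "read-write"
gcs_roles_for() {
  case "$1" in
    read-only)  object_role="roles/storage.objectViewer" ;;
    read-write) object_role="roles/storage.objectAdmin" ;;
    *) echo "unknown access level: $1" >&2; return 1 ;;
  esac
  # The bucket-level role is the same for both levels.
  echo "roles/storage.legacyBucketReader"
  echo "$object_role"
}

# Possible use with the commands shown above:
#   for role in $(gcs_roles_for read-write); do
#     gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
#       --member="serviceAccount:SA_EMAIL" --role="$role"
#   done
```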
|
||||
|
||||
### Example: Inline Values (simplest approach)
|
||||
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: "gcs"
      gcs:
        bucket: my-kubeshark-snapshots
        project: my-gcp-project
```

Or with a service account key via `--set`:

```bash
helm install kubeshark kubeshark/kubeshark \
  --set tap.snapshots.cloud.provider=gcs \
  --set tap.snapshots.cloud.gcs.bucket=my-kubeshark-snapshots \
  --set tap.snapshots.cloud.gcs.project=my-gcp-project \
  --set-file tap.snapshots.cloud.gcs.credentialsJson=service-account.json
```
|
||||
|
||||
### Example: Workload Identity (recommended for GKE)
|
||||
|
||||
Create a ConfigMap with bucket configuration:
|
||||
|
||||
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-gcs-config
data:
  SNAPSHOT_GCS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_GCS_PROJECT: my-gcp-project
```

Set Helm values:

```yaml
tap:
  snapshots:
    cloud:
      provider: "gcs"
      configMaps:
        - kubeshark-gcs-config
```
|
||||
|
||||
Configure GKE Workload Identity to allow the Kubernetes service account to impersonate the GCP service account:
|
||||
|
||||
```bash
# Ensure the GKE cluster has Workload Identity enabled
# (--workload-pool=PROJECT_ID.svc.id.goog at cluster creation)

# Create a GCP service account (if not already created)
gcloud iam service-accounts create kubeshark-gcs \
  --display-name="Kubeshark GCS Snapshots"

# Grant bucket access (read-write; see Required IAM Permissions above)
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.legacyBucketReader"
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# Allow the K8s service account to impersonate the GCP service account
# Note: the K8s SA name is "<release-name>-service-account" (default: "kubeshark-service-account")
gcloud iam service-accounts add-iam-policy-binding \
  kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/kubeshark-service-account]"
```
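
The `--member` value in the last command follows a fixed pattern. The sketch below assembles it from its parts, assuming the default `<release-name>-service-account` naming mentioned in the comment above; `workload_identity_member` is a hypothetical helper.

```bash
# Build the Workload Identity member string for the Kubeshark service account.
# $1 = GCP project ID, $2 = Kubernetes namespace, $3 = Helm release name
workload_identity_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s-service-account]\n' "$1" "$2" "$3"
}
```

Quote the result when passing it to `gcloud`, since the square brackets are glob characters in most shells.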
|
||||
|
||||
Set Helm values — the `tap.annotations` field adds the Workload Identity annotation to the service account:
|
||||
|
||||
```yaml
tap:
  annotations:
    iam.gke.io/gcp-service-account: kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com
  snapshots:
    cloud:
      provider: "gcs"
      configMaps:
        - kubeshark-gcs-config
```

Or via `--set`:

```bash
helm install kubeshark kubeshark/kubeshark \
  --set tap.snapshots.cloud.provider=gcs \
  --set tap.snapshots.cloud.gcs.bucket=BUCKET_NAME \
  --set tap.snapshots.cloud.gcs.project=PROJECT_ID \
  --set tap.annotations."iam\.gke\.io/gcp-service-account"=kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com
```
|
||||
|
||||
No `credentialsJson` secret is needed — GKE injects credentials automatically via the Workload Identity metadata server.
|
||||
|
||||
### Example: Service Account Key
|
||||
|
||||
Create a Secret with the service account JSON key:
|
||||
|
||||
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kubeshark-gcs-creds
type: Opaque
stringData:
  SNAPSHOT_GCS_CREDENTIALS_JSON: |
    {
      "type": "service_account",
      "project_id": "my-gcp-project",
      "private_key_id": "...",
      "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
      "client_email": "kubeshark@my-gcp-project.iam.gserviceaccount.com",
      ...
    }
```

Create a ConfigMap with bucket configuration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-gcs-config
data:
  SNAPSHOT_GCS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_GCS_PROJECT: my-gcp-project
```

Set Helm values:

```yaml
tap:
  snapshots:
    cloud:
      provider: "gcs"
      configMaps:
        - kubeshark-gcs-config
      secrets:
        - kubeshark-gcs-creds
```
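
Rather than pasting the JSON key into a manifest, the same Secret could be created directly from the key file. The sketch below only prints the command so it can be reviewed before running; `gcs_secret_command` is a hypothetical wrapper, while `kubectl create secret generic` with `--from-file=KEY=PATH` is standard kubectl syntax.

```bash
# Print a kubectl command that stores a key file under the
# SNAPSHOT_GCS_CREDENTIALS_JSON key of a generic Secret.
# $1 = Secret name, $2 = path to the service account JSON key, $3 = namespace
gcs_secret_command() {
  printf 'kubectl create secret generic %s --namespace %s --from-file=SNAPSHOT_GCS_CREDENTIALS_JSON=%s\n' \
    "$1" "$3" "$2"
}
```

For example, `gcs_secret_command kubeshark-gcs-creds service-account.json NAMESPACE` prints a command that yields a Secret usable in the `secrets` list shown above.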
|
||||
@@ -44,6 +44,12 @@ rules:
|
||||
- create
|
||||
- update
|
||||
- delete
|
||||
- apiGroups:
|
||||
- authentication.k8s.io
|
||||
resources:
|
||||
- tokenreviews
|
||||
verbs:
|
||||
- create
|
||||
---
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: Role
|
||||
@@ -86,6 +92,15 @@ rules:
|
||||
verbs:
|
||||
- create
|
||||
- get
|
||||
- apiGroups:
|
||||
- ""
|
||||
resources:
|
||||
- persistentvolumeclaims
|
||||
verbs:
|
||||
- create
|
||||
- get
|
||||
- list
|
||||
- delete
|
||||
- apiGroups:
|
||||
- batch
|
||||
resources:
|
||||
|
||||
@@ -37,13 +37,17 @@ spec:
|
||||
- -loglevel
|
||||
- '{{ .Values.logLevel | default "warning" }}'
|
||||
- -capture-stop-after
|
||||
- "{{ if hasKey .Values.tap.capture "stopAfter" }}{{ .Values.tap.capture.stopAfter }}{{ else }}5m{{ end }}"
|
||||
- "{{ if hasKey .Values.tap.capture.dissection "stopAfter" }}{{ .Values.tap.capture.dissection.stopAfter }}{{ else }}5m{{ end }}"
|
||||
- -snapshot-size-limit
|
||||
- '{{ .Values.tap.snapshots.storageSize }}'
|
||||
{{- if .Values.tap.delayedDissection.image }}
|
||||
- '{{ .Values.tap.snapshots.local.storageSize }}'
|
||||
- -dissector-image
|
||||
- '{{ .Values.tap.delayedDissection.image }}'
|
||||
{{- end }}
|
||||
{{- if .Values.tap.docker.overrideImage.worker }}
|
||||
- '{{ .Values.tap.docker.overrideImage.worker }}'
|
||||
{{- else if .Values.tap.docker.overrideTag.worker }}
|
||||
- '{{ .Values.tap.docker.registry }}/worker:{{ .Values.tap.docker.overrideTag.worker }}'
|
||||
{{- else }}
|
||||
- '{{ .Values.tap.docker.registry }}/worker:{{ not (eq .Values.tap.docker.tag "") | ternary .Values.tap.docker.tag (include "kubeshark.defaultVersion" .) }}'
|
||||
{{- end }}
|
||||
{{- if .Values.tap.delayedDissection.cpu }}
|
||||
- -dissector-cpu
|
||||
- '{{ .Values.tap.delayedDissection.cpu }}'
|
||||
@@ -52,17 +56,49 @@ spec:
|
||||
- -dissector-memory
|
||||
- '{{ .Values.tap.delayedDissection.memory }}'
|
||||
{{- end }}
|
||||
{{- $dissectorStorageSize := .Values.tap.delayedDissection.storageSize | default .Values.tap.snapshots.local.storageSize }}
|
||||
{{- if $dissectorStorageSize }}
|
||||
- -dissector-storage-size
|
||||
- '{{ $dissectorStorageSize }}'
|
||||
{{- end }}
|
||||
{{- $dissectorStorageClass := .Values.tap.delayedDissection.storageClass | default .Values.tap.snapshots.local.storageClass }}
|
||||
{{- if $dissectorStorageClass }}
|
||||
- -dissector-storage-class
|
||||
- '{{ $dissectorStorageClass }}'
|
||||
{{- end }}
|
||||
{{- if .Values.tap.gitops.enabled }}
|
||||
- -gitops
|
||||
{{- end }}
|
||||
- -cloud-api-url
|
||||
- '{{ .Values.cloudApiUrl }}'
|
||||
{{- if .Values.tap.secrets }}
|
||||
{{- if .Values.tap.snapshots.cloud.provider }}
|
||||
- -cloud-storage-provider
|
||||
- '{{ .Values.tap.snapshots.cloud.provider }}'
|
||||
{{- end }}
|
||||
{{- $hasInlineConfig := or .Values.tap.snapshots.cloud.prefix .Values.tap.snapshots.cloud.s3.bucket .Values.tap.snapshots.cloud.s3.region .Values.tap.snapshots.cloud.s3.roleArn .Values.tap.snapshots.cloud.s3.externalId .Values.tap.snapshots.cloud.azblob.storageAccount .Values.tap.snapshots.cloud.azblob.container .Values.tap.snapshots.cloud.gcs.bucket .Values.tap.snapshots.cloud.gcs.project }}
|
||||
{{- $hasInlineSecrets := or .Values.tap.snapshots.cloud.s3.accessKey .Values.tap.snapshots.cloud.s3.secretKey .Values.tap.snapshots.cloud.azblob.storageKey .Values.tap.snapshots.cloud.gcs.credentialsJson }}
|
||||
{{- if or .Values.tap.secrets .Values.tap.snapshots.cloud.configMaps .Values.tap.snapshots.cloud.secrets $hasInlineConfig $hasInlineSecrets }}
|
||||
envFrom:
|
||||
{{- range .Values.tap.secrets }}
|
||||
- secretRef:
|
||||
name: {{ . }}
|
||||
{{- end }}
|
||||
{{- range .Values.tap.snapshots.cloud.configMaps }}
|
||||
- configMapRef:
|
||||
name: {{ . }}
|
||||
{{- end }}
|
||||
{{- range .Values.tap.snapshots.cloud.secrets }}
|
||||
- secretRef:
|
||||
name: {{ . }}
|
||||
{{- end }}
|
||||
{{- if $hasInlineConfig }}
|
||||
- configMapRef:
|
||||
name: {{ include "kubeshark.name" . }}-cloud-config
|
||||
{{- end }}
|
||||
{{- if $hasInlineSecrets }}
|
||||
- secretRef:
|
||||
name: {{ include "kubeshark.name" . }}-cloud-secret
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
env:
|
||||
- name: POD_NAME
|
||||
@@ -184,10 +220,10 @@ spec:
|
||||
- key: AUTH_SAML_X509_KEY
|
||||
path: kubeshark.key
|
||||
- name: snapshots-volume
|
||||
{{- if .Values.tap.snapshots.storageClass }}
|
||||
{{- if .Values.tap.snapshots.local.storageClass }}
|
||||
persistentVolumeClaim:
|
||||
claimName: {{ include "kubeshark.name" . }}-snapshots-pvc
|
||||
{{- else }}
|
||||
emptyDir:
|
||||
sizeLimit: {{ .Values.tap.snapshots.storageSize }}
|
||||
sizeLimit: {{ .Values.tap.snapshots.local.storageSize }}
|
||||
{{- end }}
|
||||
|
||||
@@ -26,15 +26,15 @@ spec:
|
||||
- env:
|
||||
- name: REACT_APP_AUTH_ENABLED
|
||||
value: '{{- if or (and .Values.cloudLicenseEnabled (not (empty .Values.license))) (not .Values.internetConnectivity) -}}
|
||||
{{ (and .Values.tap.auth.enabled (eq .Values.tap.auth.type "dex")) | ternary true false }}
|
||||
{{ (default false .Values.demoModeEnabled) | ternary true ((and .Values.tap.auth.enabled (or (eq .Values.tap.auth.type "oidc") (eq .Values.tap.auth.type "dex"))) | ternary true false) }}
|
||||
{{- else -}}
|
||||
{{ .Values.cloudLicenseEnabled | ternary "true" .Values.tap.auth.enabled }}
|
||||
{{ .Values.cloudLicenseEnabled | ternary "true" ((default false .Values.demoModeEnabled) | ternary "true" .Values.tap.auth.enabled) }}
|
||||
{{- end }}'
|
||||
- name: REACT_APP_AUTH_TYPE
|
||||
value: '{{- if and .Values.cloudLicenseEnabled (not (eq .Values.tap.auth.type "dex")) -}}
|
||||
value: '{{- if and .Values.cloudLicenseEnabled (not (or (eq .Values.tap.auth.type "oidc") (eq .Values.tap.auth.type "dex"))) -}}
|
||||
default
|
||||
{{- else -}}
|
||||
{{ .Values.tap.auth.type }}
|
||||
{{ (default false .Values.demoModeEnabled) | ternary "default" .Values.tap.auth.type }}
|
||||
{{- end }}'
|
||||
- name: REACT_APP_COMPLETE_STREAMING_ENABLED
|
||||
value: '{{- if and (hasKey .Values.tap "dashboard") (hasKey .Values.tap.dashboard "completeStreamingEnabled") -}}
|
||||
@@ -48,29 +48,29 @@ spec:
|
||||
value: '{{ not (eq .Values.tap.auth.saml.idpMetadataUrl "") | ternary .Values.tap.auth.saml.idpMetadataUrl " " }}'
|
||||
- name: REACT_APP_TIMEZONE
|
||||
value: '{{ not (eq .Values.timezone "") | ternary .Values.timezone " " }}'
|
||||
- name: REACT_APP_SCRIPTING_DISABLED
|
||||
value: '{{- if .Values.tap.liveConfigMapChangesDisabled -}}
|
||||
{{- if .Values.demoModeEnabled -}}
|
||||
{{ .Values.demoModeEnabled | ternary false true }}
|
||||
{{- else -}}
|
||||
true
|
||||
{{- end }}
|
||||
- name: REACT_APP_SCRIPTING_HIDDEN
|
||||
value: '{{- if and .Values.scripting (eq (.Values.scripting.enabled | toString) "false") -}}
|
||||
true
|
||||
{{- else -}}
|
||||
false
|
||||
{{- end }}'
|
||||
- name: REACT_APP_SCRIPTING_DISABLED
|
||||
value: '{{ default false .Values.demoModeEnabled }}'
|
||||
- name: REACT_APP_TARGETED_PODS_UPDATE_DISABLED
|
||||
value: '{{ .Values.tap.liveConfigMapChangesDisabled }}'
|
||||
value: '{{ default false .Values.demoModeEnabled }}'
|
||||
- name: REACT_APP_PRESET_FILTERS_CHANGING_ENABLED
|
||||
value: '{{ .Values.tap.liveConfigMapChangesDisabled | ternary "false" "true" }}'
|
||||
value: '{{ not (default false .Values.demoModeEnabled) }}'
|
||||
- name: REACT_APP_BPF_OVERRIDE_DISABLED
|
||||
value: '{{ eq .Values.tap.packetCapture "af_packet" | ternary "false" "true" }}'
|
||||
- name: REACT_APP_RECORDING_DISABLED
|
||||
value: '{{ .Values.tap.liveConfigMapChangesDisabled }}'
|
||||
- name: REACT_APP_STOP_TRAFFIC_CAPTURING_DISABLED
|
||||
value: '{{- if and .Values.tap.liveConfigMapChangesDisabled .Values.tap.capture.stopped -}}
|
||||
false
|
||||
value: '{{ default false .Values.demoModeEnabled }}'
|
||||
- name: REACT_APP_DISSECTION_ENABLED
|
||||
value: '{{ .Values.tap.capture.dissection.enabled | ternary "true" "false" }}'
|
||||
- name: REACT_APP_DISSECTION_CONTROL_ENABLED
|
||||
value: '{{- if and (not .Values.demoModeEnabled) (not .Values.tap.capture.dissection.enabled) -}}
|
||||
true
|
||||
{{- else -}}
|
||||
{{ .Values.tap.liveConfigMapChangesDisabled | ternary "true" "false" }}
|
||||
{{ (default false .Values.demoModeEnabled) | ternary false true }}
|
||||
{{- end -}}'
|
||||
- name: 'REACT_APP_CLOUD_LICENSE_ENABLED'
|
||||
value: '{{- if or (and .Values.cloudLicenseEnabled (not (empty .Values.license))) (not .Values.internetConnectivity) -}}
|
||||
@@ -83,9 +83,17 @@ spec:
|
||||
- name: REACT_APP_BETA_ENABLED
|
||||
value: '{{ default false .Values.betaEnabled | ternary "true" "false" }}'
|
||||
- name: REACT_APP_DISSECTORS_UPDATING_ENABLED
|
||||
value: '{{ .Values.tap.liveConfigMapChangesDisabled | ternary "false" "true" }}'
|
||||
value: '{{ not (default false .Values.demoModeEnabled) }}'
|
||||
- name: REACT_APP_SNAPSHOTS_UPDATING_ENABLED
|
||||
value: '{{ not (default false .Values.demoModeEnabled) }}'
|
||||
- name: REACT_APP_DEMO_MODE_ENABLED
|
||||
value: '{{ default false .Values.demoModeEnabled }}'
|
||||
- name: REACT_APP_CLUSTER_WIDE_MAP_ENABLED
|
||||
value: '{{ default false (((.Values).tap).dashboard).clusterWideMapEnabled }}'
|
||||
- name: REACT_APP_RAW_CAPTURE_ENABLED
|
||||
value: '{{ .Values.tap.capture.raw.enabled | ternary "true" "false" }}'
|
||||
- name: REACT_APP_ENTRIES_LIMIT
|
||||
value: '{{ default 300000 (((.Values).tap).dashboard).entriesLimit }}'
|
||||
- name: REACT_APP_SENTRY_ENABLED
|
||||
value: '{{ (include "sentry.enabled" .) }}'
|
||||
- name: REACT_APP_SENTRY_ENVIRONMENT
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
---
|
||||
{{- if .Values.tap.snapshots.storageClass }}
|
||||
{{- if .Values.tap.snapshots.local.storageClass }}
|
||||
apiVersion: v1
|
||||
kind: PersistentVolumeClaim
|
||||
metadata:
|
||||
@@ -16,7 +16,7 @@ spec:
|
||||
- ReadWriteOnce
|
||||
resources:
|
||||
requests:
|
||||
storage: {{ .Values.tap.snapshots.storageSize }}
|
||||
storageClassName: {{ .Values.tap.snapshots.storageClass }}
|
||||
storage: {{ .Values.tap.snapshots.local.storageSize }}
|
||||
storageClassName: {{ .Values.tap.snapshots.local.storageClass }}
|
||||
status: {}
|
||||
{{- end }}
|
||||
|
||||
@@ -21,6 +21,7 @@ spec:
|
||||
metadata:
|
||||
labels:
|
||||
app.kubeshark.com/app: worker
|
||||
kubeshark.io/internal-auth: "true"
|
||||
{{- include "kubeshark.labels" . | nindent 8 }}
|
||||
name: kubeshark-worker-daemon-set
|
||||
namespace: kubeshark
|
||||
@@ -99,6 +100,10 @@ spec:
|
||||
- '{{ .Values.tap.misc.resolutionStrategy }}'
|
||||
- -staletimeout
|
||||
- '{{ .Values.tap.misc.staleTimeoutSeconds }}'
|
||||
- -tcp-flow-full-timeout
|
||||
- '{{ .Values.tap.misc.tcpFlowTimeout }}'
|
||||
- -udp-flow-full-timeout
|
||||
- '{{ .Values.tap.misc.udpFlowTimeout }}'
|
||||
- -storage-size
|
||||
- '{{ .Values.tap.storageLimit }}'
|
||||
- -capture-db-max-size
|
||||
@@ -127,6 +132,10 @@ spec:
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: metadata.namespace
|
||||
- name: NODE_NAME
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: spec.nodeName
|
||||
- name: TCP_STREAM_CHANNEL_TIMEOUT_MS
|
||||
value: '{{ .Values.tap.misc.tcpStreamChannelTimeoutMs }}'
|
||||
- name: TCP_STREAM_CHANNEL_TIMEOUT_SHOW
|
||||
@@ -223,6 +232,9 @@ spec:
|
||||
mountPropagation: HostToContainer
|
||||
- mountPath: /app/data
|
||||
name: data
|
||||
{{- if .Values.tap.persistentStorage }}
|
||||
subPathExpr: $(NODE_NAME)
|
||||
{{- end }}
|
||||
{{- if .Values.tap.tls }}
|
||||
- command:
|
||||
- ./tracer
|
||||
@@ -253,6 +265,10 @@ spec:
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: metadata.namespace
|
||||
- name: NODE_NAME
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: spec.nodeName
|
||||
- name: PROFILING_ENABLED
|
||||
value: '{{ .Values.tap.pprof.enabled }}'
|
||||
- name: SENTRY_ENABLED
|
||||
@@ -324,6 +340,9 @@ spec:
|
||||
mountPropagation: HostToContainer
|
||||
- mountPath: /app/data
|
||||
name: data
|
||||
{{- if .Values.tap.persistentStorage }}
|
||||
subPathExpr: $(NODE_NAME)
|
||||
{{- end }}
|
||||
- mountPath: /etc/os-release
|
||||
name: os-release
|
||||
readOnly: true
|
||||
|
||||
@@ -20,6 +20,10 @@ data:
|
||||
client_header_buffer_size 32k;
|
||||
large_client_header_buffers 8 64k;
|
||||
|
||||
proxy_buffer_size 64k;
|
||||
proxy_buffers 4 128k;
|
||||
proxy_busy_buffers_size 128k;
|
||||
|
||||
location {{ default "" (((.Values.tap).routing).front).basePath }}/api {
|
||||
rewrite ^{{ default "" (((.Values.tap).routing).front).basePath }}/api(.*)$ $1 break;
|
||||
proxy_pass http://kubeshark-hub;
|
||||
@@ -30,8 +34,10 @@ data:
|
||||
proxy_set_header Authorization $http_authorization;
|
||||
proxy_pass_header Authorization;
|
||||
proxy_connect_timeout 4s;
|
||||
proxy_read_timeout 120s;
|
||||
proxy_send_timeout 12s;
|
||||
# Disable buffering for gRPC/Connect streaming
|
||||
client_max_body_size 0;
|
||||
proxy_request_buffering off;
|
||||
proxy_buffering off;
|
||||
proxy_pass_request_headers on;
|
||||
}
|
||||
|
||||
@@ -86,4 +92,3 @@ data:
|
||||
root /usr/share/nginx/html;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -11,7 +11,7 @@ data:
|
||||
NAMESPACES: '{{ gt (len .Values.tap.namespaces) 0 | ternary (join "," .Values.tap.namespaces) "" }}'
|
||||
EXCLUDED_NAMESPACES: '{{ gt (len .Values.tap.excludedNamespaces) 0 | ternary (join "," .Values.tap.excludedNamespaces) "" }}'
|
||||
BPF_OVERRIDE: '{{ .Values.tap.bpfOverride }}'
|
||||
STOPPED: '{{ .Values.tap.capture.stopped | ternary "true" "false" }}'
|
||||
DISSECTION_ENABLED: '{{ .Values.tap.capture.dissection.enabled | ternary "true" "false" }}'
|
||||
CAPTURE_SELF: '{{ .Values.tap.capture.captureSelf | ternary "true" "false" }}'
|
||||
SCRIPTING_SCRIPTS: '{}'
|
||||
SCRIPTING_ACTIVE_SCRIPTS: '{{ gt (len .Values.scripting.active) 0 | ternary (join "," .Values.scripting.active) "" }}'
|
||||
@@ -19,48 +19,41 @@ data:
|
||||
INGRESS_HOST: '{{ .Values.tap.ingress.host }}'
|
||||
PROXY_FRONT_PORT: '{{ .Values.tap.proxy.front.port }}'
|
||||
AUTH_ENABLED: '{{- if and .Values.cloudLicenseEnabled (not (empty .Values.license)) -}}
|
||||
{{ and .Values.tap.auth.enabled (eq .Values.tap.auth.type "dex") | ternary true false }}
|
||||
{{ (default false .Values.demoModeEnabled) | ternary true ((and .Values.tap.auth.enabled (or (eq .Values.tap.auth.type "oidc") (eq .Values.tap.auth.type "dex"))) | ternary true false) }}
|
||||
{{- else -}}
|
||||
{{ .Values.cloudLicenseEnabled | ternary "true" (.Values.tap.auth.enabled | ternary "true" "") }}
|
||||
{{ .Values.cloudLicenseEnabled | ternary "true" ((default false .Values.demoModeEnabled) | ternary "true" .Values.tap.auth.enabled) }}
|
||||
{{- end }}'
|
||||
AUTH_TYPE: '{{- if and .Values.cloudLicenseEnabled (not (eq .Values.tap.auth.type "dex")) -}}
|
||||
AUTH_TYPE: '{{- if and .Values.cloudLicenseEnabled (not (or (eq .Values.tap.auth.type "oidc") (eq .Values.tap.auth.type "dex"))) -}}
|
||||
default
|
||||
{{- else -}}
|
||||
{{ .Values.tap.auth.type }}
|
||||
{{ (default false .Values.demoModeEnabled) | ternary "default" .Values.tap.auth.type }}
|
||||
{{- end }}'
|
||||
AUTH_SAML_IDP_METADATA_URL: '{{ .Values.tap.auth.saml.idpMetadataUrl }}'
|
||||
AUTH_SAML_ROLE_ATTRIBUTE: '{{ .Values.tap.auth.saml.roleAttribute }}'
|
||||
AUTH_SAML_ROLES: '{{ .Values.tap.auth.saml.roles | toJson }}'
|
||||
AUTH_OIDC_ISSUER: '{{ default "not set" (((.Values.tap).auth).dexOidc).issuer }}'
|
||||
AUTH_OIDC_REFRESH_TOKEN_LIFETIME: '{{ default "3960h" (((.Values.tap).auth).dexOidc).refreshTokenLifetime }}'
|
||||
AUTH_OIDC_STATE_PARAM_EXPIRY: '{{ default "10m" (((.Values.tap).auth).dexOidc).oauth2StateParamExpiry }}'
|
||||
AUTH_ROLES: '{{ .Values.tap.auth.roles | toJson }}'
|
||||
AUTH_ROLES_CLAIM: '{{ .Values.tap.auth.rolesClaim }}'
|
||||
AUTH_DEFAULT_ROLE: '{{ default "" .Values.tap.auth.defaultRole }}'
|
||||
AUTH_OIDC_ISSUER: '{{ default "not set" (((.Values.tap).auth).oidc).issuer }}'
|
||||
AUTH_OIDC_REFRESH_TOKEN_LIFETIME: '{{ default "3960h" (((.Values.tap).auth).oidc).refreshTokenLifetime }}'
|
||||
AUTH_OIDC_STATE_PARAM_EXPIRY: '{{ default "10m" (((.Values.tap).auth).oidc).oauth2StateParamExpiry }}'
|
||||
AUTH_OIDC_BYPASS_SSL_CA_CHECK: '{{- if and
|
||||
(hasKey .Values.tap "auth")
|
||||
(hasKey .Values.tap.auth "dexOidc")
|
||||
(hasKey .Values.tap.auth.dexOidc "bypassSslCaCheck")
|
||||
(hasKey .Values.tap.auth "oidc")
|
||||
(hasKey .Values.tap.auth.oidc "bypassSslCaCheck")
|
||||
-}}
|
||||
{{ eq .Values.tap.auth.dexOidc.bypassSslCaCheck true | ternary "true" "false" }}
|
||||
{{ eq .Values.tap.auth.oidc.bypassSslCaCheck true | ternary "true" "false" }}
|
||||
{{- else -}}
|
||||
false
|
||||
{{- end }}'
|
||||
TELEMETRY_DISABLED: '{{ not .Values.internetConnectivity | ternary "true" (not .Values.tap.telemetry.enabled | ternary "true" "false") }}'
|
||||
SCRIPTING_DISABLED: '{{- if .Values.tap.liveConfigMapChangesDisabled -}}
|
||||
{{- if .Values.demoModeEnabled -}}
|
||||
{{ .Values.demoModeEnabled | ternary false true }}
|
||||
{{- else -}}
|
||||
true
|
||||
{{- end }}
|
||||
{{- else -}}
|
||||
false
|
||||
{{- end }}'
|
||||
TARGETED_PODS_UPDATE_DISABLED: '{{ .Values.tap.liveConfigMapChangesDisabled | ternary "true" "" }}'
|
||||
PRESET_FILTERS_CHANGING_ENABLED: '{{ .Values.tap.liveConfigMapChangesDisabled | ternary "false" "true" }}'
|
||||
RECORDING_DISABLED: '{{ .Values.tap.liveConfigMapChangesDisabled | ternary "true" "" }}'
|
||||
STOP_TRAFFIC_CAPTURING_DISABLED: '{{- if and .Values.tap.liveConfigMapChangesDisabled .Values.tap.capture.stopped -}}
|
||||
false
|
||||
{{- else -}}
|
||||
{{ .Values.tap.liveConfigMapChangesDisabled | ternary "true" "false" }}
|
||||
{{- end }}'
|
||||
SCRIPTING_DISABLED: '{{ default false .Values.demoModeEnabled }}'
|
||||
TARGETED_PODS_UPDATE_DISABLED: '{{ default false .Values.demoModeEnabled }}'
|
||||
PRESET_FILTERS_CHANGING_ENABLED: '{{ not (default false .Values.demoModeEnabled) }}'
|
||||
RECORDING_DISABLED: '{{ (default false .Values.demoModeEnabled) | ternary true false }}'
|
||||
DISSECTION_CONTROL_ENABLED: '{{- if and (not .Values.demoModeEnabled) (not .Values.tap.capture.dissection.enabled) -}}
|
||||
true
|
||||
{{- else -}}
|
||||
{{ (default false .Values.demoModeEnabled) | ternary false true }}
|
||||
{{- end }}'
|
||||
GLOBAL_FILTER: {{ include "kubeshark.escapeDoubleQuotes" .Values.tap.globalFilter | quote }}
|
||||
DEFAULT_FILTER: {{ include "kubeshark.escapeDoubleQuotes" .Values.tap.defaultFilter | quote }}
|
||||
TRAFFIC_SAMPLE_RATE: '{{ .Values.tap.misc.trafficSampleRate }}'
|
||||
@@ -76,12 +69,14 @@ data:
|
||||
DUPLICATE_TIMEFRAME: '{{ .Values.tap.misc.duplicateTimeframe }}'
|
||||
ENABLED_DISSECTORS: '{{ gt (len .Values.tap.enabledDissectors) 0 | ternary (join "," .Values.tap.enabledDissectors) "" }}'
|
||||
CUSTOM_MACROS: '{{ toJson .Values.tap.customMacros }}'
|
||||
DISSECTORS_UPDATING_ENABLED: '{{ .Values.tap.liveConfigMapChangesDisabled | ternary "false" "true" }}'
|
||||
DISSECTORS_UPDATING_ENABLED: '{{ not (default false .Values.demoModeEnabled) }}'
|
||||
SNAPSHOTS_UPDATING_ENABLED: '{{ not (default false .Values.demoModeEnabled) }}'
|
||||
DEMO_MODE_ENABLED: '{{ default false .Values.demoModeEnabled }}'
|
||||
DETECT_DUPLICATES: '{{ .Values.tap.misc.detectDuplicates | ternary "true" "false" }}'
|
||||
PCAP_DUMP_ENABLE: '{{ .Values.pcapdump.enabled }}'
|
||||
PCAP_TIME_INTERVAL: '{{ .Values.pcapdump.timeInterval }}'
|
||||
PCAP_MAX_TIME: '{{ .Values.pcapdump.maxTime }}'
|
||||
PCAP_MAX_SIZE: '{{ .Values.pcapdump.maxSize }}'
|
||||
PORT_MAPPING: '{{ toJson .Values.tap.portMapping }}'
|
||||
RAW_CAPTURE: '{{ .Values.tap.capture.raw.enabled | ternary "true" "false" }}'
|
||||
RAW_CAPTURE_ENABLED: '{{ .Values.tap.capture.raw.enabled | ternary "true" "false" }}'
|
||||
RAW_CAPTURE_STORAGE_SIZE: '{{ .Values.tap.capture.raw.storageSize }}'
|
||||
|
||||
@@ -9,8 +9,8 @@ metadata:
|
||||
stringData:
|
||||
LICENSE: '{{ .Values.license }}'
|
||||
SCRIPTING_ENV: '{{ .Values.scripting.env | toJson }}'
|
||||
OIDC_CLIENT_ID: '{{ default "not set" (((.Values.tap).auth).dexOidc).clientId }}'
|
||||
OIDC_CLIENT_SECRET: '{{ default "not set" (((.Values.tap).auth).dexOidc).clientSecret }}'
|
||||
OIDC_CLIENT_ID: '{{ default "not set" (((.Values.tap).auth).oidc).clientId }}'
|
||||
OIDC_CLIENT_SECRET: '{{ default "not set" (((.Values.tap).auth).oidc).clientSecret }}'
|
||||
|
||||
---
|
||||
|
||||
|
||||
64
helm-chart/templates/21-cloud-storage.yaml
Normal file
@@ -0,0 +1,64 @@
|
||||
{{- $hasConfigValues := or .Values.tap.snapshots.cloud.prefix .Values.tap.snapshots.cloud.s3.bucket .Values.tap.snapshots.cloud.s3.region .Values.tap.snapshots.cloud.s3.roleArn .Values.tap.snapshots.cloud.s3.externalId .Values.tap.snapshots.cloud.azblob.storageAccount .Values.tap.snapshots.cloud.azblob.container .Values.tap.snapshots.cloud.gcs.bucket .Values.tap.snapshots.cloud.gcs.project -}}
|
||||
{{- $hasSecretValues := or .Values.tap.snapshots.cloud.s3.accessKey .Values.tap.snapshots.cloud.s3.secretKey .Values.tap.snapshots.cloud.azblob.storageKey .Values.tap.snapshots.cloud.gcs.credentialsJson -}}
|
||||
{{- if $hasConfigValues }}
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: ConfigMap
|
||||
metadata:
|
||||
labels:
|
||||
{{- include "kubeshark.labels" . | nindent 4 }}
|
||||
name: {{ include "kubeshark.name" . }}-cloud-config
|
||||
namespace: {{ .Release.Namespace }}
|
||||
data:
|
||||
{{- if .Values.tap.snapshots.cloud.prefix }}
|
||||
SNAPSHOT_CLOUD_PREFIX: {{ .Values.tap.snapshots.cloud.prefix | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.s3.bucket }}
|
||||
SNAPSHOT_AWS_BUCKET: {{ .Values.tap.snapshots.cloud.s3.bucket | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.s3.region }}
|
||||
SNAPSHOT_AWS_REGION: {{ .Values.tap.snapshots.cloud.s3.region | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.s3.roleArn }}
|
||||
SNAPSHOT_AWS_ROLE_ARN: {{ .Values.tap.snapshots.cloud.s3.roleArn | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.s3.externalId }}
|
||||
SNAPSHOT_AWS_EXTERNAL_ID: {{ .Values.tap.snapshots.cloud.s3.externalId | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.azblob.storageAccount }}
|
||||
SNAPSHOT_AZBLOB_STORAGE_ACCOUNT: {{ .Values.tap.snapshots.cloud.azblob.storageAccount | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.azblob.container }}
|
||||
SNAPSHOT_AZBLOB_CONTAINER: {{ .Values.tap.snapshots.cloud.azblob.container | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.gcs.bucket }}
|
||||
SNAPSHOT_GCS_BUCKET: {{ .Values.tap.snapshots.cloud.gcs.bucket | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.gcs.project }}
|
||||
SNAPSHOT_GCS_PROJECT: {{ .Values.tap.snapshots.cloud.gcs.project | quote }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
{{- if $hasSecretValues }}
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
labels:
|
||||
{{- include "kubeshark.labels" . | nindent 4 }}
|
||||
name: {{ include "kubeshark.name" . }}-cloud-secret
|
||||
namespace: {{ .Release.Namespace }}
|
||||
type: Opaque
|
||||
stringData:
|
||||
{{- if .Values.tap.snapshots.cloud.s3.accessKey }}
|
||||
SNAPSHOT_AWS_ACCESS_KEY: {{ .Values.tap.snapshots.cloud.s3.accessKey | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.s3.secretKey }}
|
||||
SNAPSHOT_AWS_SECRET_KEY: {{ .Values.tap.snapshots.cloud.s3.secretKey | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.azblob.storageKey }}
|
||||
SNAPSHOT_AZBLOB_STORAGE_KEY: {{ .Values.tap.snapshots.cloud.azblob.storageKey | quote }}
|
||||
{{- end }}
|
||||
{{- if .Values.tap.snapshots.cloud.gcs.credentialsJson }}
|
||||
SNAPSHOT_GCS_CREDENTIALS_JSON: {{ .Values.tap.snapshots.cloud.gcs.credentialsJson | quote }}
|
||||
{{- end }}
|
||||
{{- end }}
|
||||
248
helm-chart/tests/cloud_storage_test.yaml
Normal file
@@ -0,0 +1,248 @@
|
||||
suite: cloud storage template
|
||||
templates:
|
||||
- templates/21-cloud-storage.yaml
|
||||
tests:
|
||||
- it: should render nothing with default values
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 0
|
||||
|
||||
- it: should render ConfigMap with S3 config only
|
||||
set:
|
||||
tap.snapshots.cloud.s3.bucket: my-bucket
|
||||
tap.snapshots.cloud.s3.region: us-east-1
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 1
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: metadata.name
|
||||
value: RELEASE-NAME-cloud-config
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AWS_BUCKET
|
||||
value: "my-bucket"
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AWS_REGION
|
||||
value: "us-east-1"
|
||||
documentIndex: 0
|
||||
- notExists:
|
||||
path: data.SNAPSHOT_AWS_ACCESS_KEY
|
||||
documentIndex: 0
|
||||
|
||||
- it: should render ConfigMap and Secret with S3 config and credentials
|
||||
set:
|
||||
tap.snapshots.cloud.s3.bucket: my-bucket
|
||||
tap.snapshots.cloud.s3.region: us-east-1
|
||||
tap.snapshots.cloud.s3.accessKey: AKIAIOSFODNN7EXAMPLE
|
||||
tap.snapshots.cloud.s3.secretKey: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 2
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AWS_BUCKET
|
||||
value: "my-bucket"
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AWS_REGION
|
||||
value: "us-east-1"
|
||||
documentIndex: 0
|
||||
- isKind:
|
||||
of: Secret
|
||||
documentIndex: 1
|
||||
- equal:
|
||||
path: metadata.name
|
||||
value: RELEASE-NAME-cloud-secret
|
||||
documentIndex: 1
|
||||
- equal:
|
||||
path: stringData.SNAPSHOT_AWS_ACCESS_KEY
|
||||
value: "AKIAIOSFODNN7EXAMPLE"
|
||||
documentIndex: 1
|
||||
- equal:
|
||||
path: stringData.SNAPSHOT_AWS_SECRET_KEY
|
||||
value: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
|
||||
documentIndex: 1
|
||||
|
||||
- it: should render ConfigMap with Azure Blob config only
|
||||
set:
|
||||
tap.snapshots.cloud.azblob.storageAccount: myaccount
|
||||
tap.snapshots.cloud.azblob.container: mycontainer
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 1
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AZBLOB_STORAGE_ACCOUNT
|
||||
value: "myaccount"
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AZBLOB_CONTAINER
|
||||
value: "mycontainer"
|
||||
documentIndex: 0
|
||||
|
||||
- it: should render ConfigMap and Secret with Azure Blob config and storage key
|
||||
set:
|
||||
tap.snapshots.cloud.azblob.storageAccount: myaccount
|
||||
tap.snapshots.cloud.azblob.container: mycontainer
|
||||
tap.snapshots.cloud.azblob.storageKey: c29tZWtleQ==
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 2
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AZBLOB_STORAGE_ACCOUNT
|
||||
value: "myaccount"
|
||||
documentIndex: 0
|
||||
- isKind:
|
||||
of: Secret
|
||||
documentIndex: 1
|
||||
- equal:
|
||||
path: stringData.SNAPSHOT_AZBLOB_STORAGE_KEY
|
||||
value: "c29tZWtleQ=="
|
||||
documentIndex: 1
|
||||
|
||||
- it: should render ConfigMap with GCS config only
|
||||
set:
|
||||
tap.snapshots.cloud.gcs.bucket: my-gcs-bucket
|
||||
tap.snapshots.cloud.gcs.project: my-gcp-project
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 1
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_GCS_BUCKET
|
||||
value: "my-gcs-bucket"
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_GCS_PROJECT
|
||||
value: "my-gcp-project"
|
||||
documentIndex: 0
|
||||
- notExists:
|
||||
path: data.SNAPSHOT_GCS_CREDENTIALS_JSON
|
||||
documentIndex: 0
|
||||
|
||||
- it: should render ConfigMap and Secret with GCS config and credentials
|
||||
set:
|
||||
tap.snapshots.cloud.gcs.bucket: my-gcs-bucket
|
||||
tap.snapshots.cloud.gcs.project: my-gcp-project
|
||||
tap.snapshots.cloud.gcs.credentialsJson: '{"type":"service_account"}'
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 2
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_GCS_BUCKET
|
||||
value: "my-gcs-bucket"
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_GCS_PROJECT
|
||||
value: "my-gcp-project"
|
||||
documentIndex: 0
|
||||
- isKind:
|
||||
of: Secret
|
||||
documentIndex: 1
|
||||
- equal:
|
||||
path: metadata.name
|
||||
value: RELEASE-NAME-cloud-secret
|
||||
documentIndex: 1
|
||||
- equal:
|
||||
path: stringData.SNAPSHOT_GCS_CREDENTIALS_JSON
|
||||
value: '{"type":"service_account"}'
|
||||
documentIndex: 1
|
||||
|
||||
- it: should render ConfigMap with GCS bucket only (no project)
|
||||
set:
|
||||
tap.snapshots.cloud.gcs.bucket: my-gcs-bucket
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 1
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_GCS_BUCKET
|
||||
value: "my-gcs-bucket"
|
||||
documentIndex: 0
|
||||
- notExists:
|
||||
path: data.SNAPSHOT_GCS_PROJECT
|
||||
documentIndex: 0
|
||||
|
||||
- it: should render ConfigMap with only prefix
|
||||
set:
|
||||
tap.snapshots.cloud.prefix: snapshots/prod
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 1
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_CLOUD_PREFIX
|
||||
value: "snapshots/prod"
|
||||
documentIndex: 0
|
||||
- notExists:
|
||||
path: data.SNAPSHOT_AWS_BUCKET
|
||||
documentIndex: 0
|
||||
- notExists:
|
||||
path: data.SNAPSHOT_AZBLOB_STORAGE_ACCOUNT
|
||||
documentIndex: 0
|
||||
- notExists:
|
||||
path: data.SNAPSHOT_GCS_BUCKET
|
||||
documentIndex: 0
|
||||
|
||||
- it: should render ConfigMap with role ARN without credentials (IAM auth)
|
||||
set:
|
||||
tap.snapshots.cloud.s3.bucket: my-bucket
|
||||
tap.snapshots.cloud.s3.region: us-east-1
|
||||
tap.snapshots.cloud.s3.roleArn: arn:aws:iam::123456789012:role/my-role
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 1
|
||||
- isKind:
|
||||
of: ConfigMap
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AWS_ROLE_ARN
|
||||
value: "arn:aws:iam::123456789012:role/my-role"
|
||||
documentIndex: 0
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AWS_BUCKET
|
||||
value: "my-bucket"
|
||||
documentIndex: 0
|
||||
|
||||
- it: should render ConfigMap with externalId
|
||||
set:
|
||||
tap.snapshots.cloud.s3.bucket: my-bucket
|
||||
tap.snapshots.cloud.s3.externalId: ext-12345
|
||||
asserts:
|
||||
- hasDocuments:
|
||||
count: 1
|
||||
- equal:
|
||||
path: data.SNAPSHOT_AWS_EXTERNAL_ID
|
||||
value: "ext-12345"
|
||||
documentIndex: 0
|
||||
|
||||
- it: should set correct namespace
|
||||
release:
|
||||
namespace: kubeshark-ns
|
||||
set:
|
||||
tap.snapshots.cloud.s3.bucket: my-bucket
|
||||
asserts:
|
||||
- equal:
|
||||
path: metadata.namespace
|
||||
value: kubeshark-ns
|
||||
documentIndex: 0
|
||||
127
helm-chart/tests/dissection_storage_test.yaml
Normal file
@@ -0,0 +1,127 @@
|
||||
suite: dissection storage configuration
|
||||
templates:
|
||||
- templates/04-hub-deployment.yaml
|
||||
tests:
|
||||
- it: should fallback to snapshot storageSize when dissection storageSize is empty
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-size
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: "20Gi"
|
||||
|
||||
- it: should fallback to snapshot storageClass when dissection storageClass is empty
|
||||
set:
|
||||
tap.snapshots.local.storageClass: gp2
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-class
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: gp2
|
||||
|
||||
- it: should not render dissector-storage-class when both dissection and snapshot storageClass are empty
|
||||
asserts:
|
||||
- notContains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-class
|
||||
|
||||
- it: should prefer dissection storageSize over snapshot storageSize
|
||||
set:
|
||||
tap.delayedDissection.storageSize: 100Gi
|
||||
tap.snapshots.local.storageSize: 50Gi
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-size
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: "100Gi"
|
||||
|
||||
- it: should prefer dissection storageClass over snapshot storageClass
|
||||
set:
|
||||
tap.delayedDissection.storageClass: io2
|
||||
tap.snapshots.local.storageClass: gp2
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-class
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: io2
|
||||
|
||||
- it: should fallback to snapshot config for both storageSize and storageClass
|
||||
set:
|
||||
tap.snapshots.local.storageSize: 30Gi
|
||||
tap.snapshots.local.storageClass: gp3
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-size
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: "30Gi"
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-class
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: gp3
|
||||
|
||||
- it: should not render dissector-storage-size when both dissection and snapshot storageSize are empty
|
||||
set:
|
||||
tap.delayedDissection.storageSize: ""
|
||||
tap.snapshots.local.storageSize: ""
|
||||
asserts:
|
||||
- notContains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-size
|
||||
|
||||
- it: should render all dissector args together with custom values
|
||||
set:
|
||||
tap.delayedDissection.cpu: "4"
|
||||
tap.delayedDissection.memory: 8Gi
|
||||
tap.delayedDissection.storageSize: 200Gi
|
||||
tap.delayedDissection.storageClass: local-path
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-cpu
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: "4"
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-memory
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: 8Gi
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-size
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: "200Gi"
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-storage-class
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: local-path
|
||||
|
||||
- it: should still render existing dissector-cpu and dissector-memory args
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-cpu
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: "1"
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -dissector-memory
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: 4Gi
|
||||
9
helm-chart/tests/fixtures/values-azblob.yaml
vendored
Normal file
@@ -0,0 +1,9 @@
|
||||
tap:
|
||||
snapshots:
|
||||
cloud:
|
||||
provider: azblob
|
||||
prefix: snapshots/
|
||||
azblob:
|
||||
storageAccount: kubesharkstore
|
||||
container: snapshots
|
||||
storageKey: c29tZWtleWhlcmU=
|
||||
8
helm-chart/tests/fixtures/values-cloud-refs.yaml
vendored
Normal file
@@ -0,0 +1,8 @@
|
||||
tap:
|
||||
snapshots:
|
||||
cloud:
|
||||
provider: s3
|
||||
configMaps:
|
||||
- my-cloud-config
|
||||
secrets:
|
||||
- my-cloud-secret
|
||||
9
helm-chart/tests/fixtures/values-gcs.yaml
vendored
Normal file
@@ -0,0 +1,9 @@
|
||||
tap:
|
||||
snapshots:
|
||||
cloud:
|
||||
provider: gcs
|
||||
prefix: snapshots/
|
||||
gcs:
|
||||
bucket: kubeshark-snapshots
|
||||
project: my-gcp-project
|
||||
credentialsJson: '{"type":"service_account","project_id":"my-gcp-project"}'
|
||||
10
helm-chart/tests/fixtures/values-s3.yaml
vendored
Normal file
@@ -0,0 +1,10 @@
|
||||
tap:
|
||||
snapshots:
|
||||
cloud:
|
||||
provider: s3
|
||||
prefix: snapshots/
|
||||
s3:
|
||||
bucket: kubeshark-snapshots
|
||||
region: us-east-1
|
||||
accessKey: AKIAIOSFODNN7EXAMPLE
|
||||
secretKey: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
|
||||
167
helm-chart/tests/hub_deployment_test.yaml
Normal file
@@ -0,0 +1,167 @@
|
||||
suite: hub deployment cloud integration
|
||||
templates:
|
||||
- templates/04-hub-deployment.yaml
|
||||
tests:
|
||||
- it: should not render envFrom with default values
|
||||
asserts:
|
||||
- isKind:
|
||||
of: Deployment
|
||||
- notContains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
any: true
|
||||
content:
|
||||
configMapRef:
|
||||
name: RELEASE-NAME-cloud-config
|
||||
|
||||
- it: should render envFrom with inline S3 config
|
||||
set:
|
||||
tap.snapshots.cloud.s3.bucket: my-bucket
|
||||
tap.snapshots.cloud.s3.region: us-east-1
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
configMapRef:
|
||||
name: RELEASE-NAME-cloud-config
|
||||
|
||||
- it: should render envFrom secret ref with inline credentials
|
||||
set:
|
||||
tap.snapshots.cloud.s3.bucket: my-bucket
|
||||
tap.snapshots.cloud.s3.accessKey: AKIAIOSFODNN7EXAMPLE
|
||||
tap.snapshots.cloud.s3.secretKey: secret
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
configMapRef:
|
||||
name: RELEASE-NAME-cloud-config
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
secretRef:
|
||||
name: RELEASE-NAME-cloud-secret
|
||||
|
||||
- it: should render envFrom with inline GCS config
|
||||
set:
|
||||
tap.snapshots.cloud.gcs.bucket: my-gcs-bucket
|
||||
tap.snapshots.cloud.gcs.project: my-gcp-project
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
configMapRef:
|
||||
name: RELEASE-NAME-cloud-config
|
||||
|
||||
- it: should render envFrom secret ref with inline GCS credentials
|
||||
set:
|
||||
tap.snapshots.cloud.gcs.bucket: my-gcs-bucket
|
||||
tap.snapshots.cloud.gcs.credentialsJson: '{"type":"service_account"}'
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
configMapRef:
|
||||
name: RELEASE-NAME-cloud-config
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
secretRef:
|
||||
name: RELEASE-NAME-cloud-secret
|
||||
|
||||
- it: should render cloud-storage-provider arg when provider is gcs
|
||||
set:
|
||||
tap.snapshots.cloud.provider: gcs
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -cloud-storage-provider
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: gcs
|
||||
|
||||
- it: should render envFrom with external configMaps
|
||||
set:
|
||||
tap.snapshots.cloud.configMaps:
|
||||
- my-cloud-config
|
||||
- my-other-config
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
configMapRef:
|
||||
name: my-cloud-config
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
configMapRef:
|
||||
name: my-other-config
|
||||
|
||||
- it: should render envFrom with external secrets
|
||||
set:
|
||||
tap.snapshots.cloud.secrets:
|
||||
- my-cloud-secret
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
secretRef:
|
||||
name: my-cloud-secret
|
||||
|
||||
- it: should render cloud-storage-provider arg when provider is set
|
||||
set:
|
||||
tap.snapshots.cloud.provider: s3
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -cloud-storage-provider
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: s3
|
||||
|
||||
- it: should not render cloud-storage-provider arg with default values
|
||||
asserts:
|
||||
- notContains:
|
||||
path: spec.template.spec.containers[0].command
|
||||
content: -cloud-storage-provider
|
||||
|
||||
- it: should render envFrom with tap.secrets
|
||||
set:
|
||||
tap.secrets:
|
||||
- my-existing-secret
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
secretRef:
|
||||
name: my-existing-secret
|
||||
|
||||
- it: should render both inline and external refs together
|
||||
set:
|
||||
tap.snapshots.cloud.s3.bucket: my-bucket
|
||||
tap.snapshots.cloud.s3.accessKey: key
|
||||
tap.snapshots.cloud.s3.secretKey: secret
|
||||
tap.snapshots.cloud.configMaps:
|
||||
- ext-config
|
||||
tap.snapshots.cloud.secrets:
|
||||
- ext-secret
|
||||
asserts:
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
configMapRef:
|
||||
name: ext-config
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
secretRef:
|
||||
name: ext-secret
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
configMapRef:
|
||||
name: RELEASE-NAME-cloud-config
|
||||
- contains:
|
||||
path: spec.template.spec.containers[0].envFrom
|
||||
content:
|
||||
secretRef:
|
||||
name: RELEASE-NAME-cloud-secret
|
||||
@@ -26,24 +26,48 @@ tap:
|
||||
excludedNamespaces: []
|
||||
bpfOverride: ""
|
||||
capture:
|
||||
stopped: false
|
||||
stopAfter: 5m
|
||||
dissection:
|
||||
enabled: true
|
||||
stopAfter: 5m
|
||||
captureSelf: false
|
||||
raw:
|
||||
enabled: true
|
||||
storageSize: 1Gi
|
||||
dbMaxSize: 500Mi
|
||||
delayedDissection:
|
||||
image: kubeshark/worker:master
|
||||
cpu: "1"
|
||||
memory: 4Gi
|
||||
snapshots:
|
||||
storageSize: ""
|
||||
storageClass: ""
|
||||
storageSize: 20Gi
|
||||
snapshots:
|
||||
local:
|
||||
storageClass: ""
|
||||
storageSize: 20Gi
|
||||
cloud:
|
||||
provider: ""
|
||||
prefix: ""
|
||||
configMaps: []
|
||||
secrets: []
|
||||
s3:
|
||||
bucket: ""
|
||||
region: ""
|
||||
accessKey: ""
|
||||
secretKey: ""
|
||||
roleArn: ""
|
||||
externalId: ""
|
||||
azblob:
|
||||
storageAccount: ""
|
||||
container: ""
|
||||
storageKey: ""
|
||||
gcs:
|
||||
bucket: ""
|
||||
project: ""
|
||||
credentialsJson: ""
|
||||
release:
|
||||
repo: https://helm.kubeshark.com
|
||||
name: kubeshark
|
||||
namespace: default
|
||||
helmChartPath: ""
|
||||
persistentStorage: false
|
||||
persistentStorageStatic: false
|
||||
persistentStoragePvcVolumeMode: FileSystem
|
||||
@@ -129,23 +153,26 @@ tap:
|
||||
auth:
|
||||
enabled: false
|
||||
type: saml
|
||||
roles:
|
||||
admin:
|
||||
filter: ""
|
||||
canDownloadPCAP: true
|
||||
canUseScripting: true
|
||||
scriptingPermissions:
|
||||
canSave: true
|
||||
canActivate: true
|
||||
canDelete: true
|
||||
canUpdateTargetedPods: true
|
||||
canStopTrafficCapturing: true
|
||||
canControlDissection: true
|
||||
showAdminConsoleLink: true
|
||||
rolesClaim: role
|
||||
defaultRole: ""
|
||||
defaultFilter: ""
|
||||
saml:
|
||||
idpMetadataUrl: ""
|
||||
x509crt: ""
|
||||
x509key: ""
|
||||
roleAttribute: role
|
||||
roles:
|
||||
admin:
|
||||
filter: ""
|
||||
canDownloadPCAP: true
|
||||
canUseScripting: true
|
||||
scriptingPermissions:
|
||||
canSave: true
|
||||
canActivate: true
|
||||
canDelete: true
|
||||
canUpdateTargetedPods: true
|
||||
canStopTrafficCapturing: true
|
||||
showAdminConsoleLink: true
|
||||
ingress:
|
||||
enabled: false
|
||||
className: ""
|
||||
@@ -162,6 +189,8 @@ tap:
|
||||
dashboard:
|
||||
streamingType: connect-rpc
|
||||
completeStreamingEnabled: true
|
||||
clusterWideMapEnabled: false
|
||||
entriesLimit: "300000"
|
||||
telemetry:
|
||||
enabled: true
|
||||
resourceGuard:
|
||||
@@ -174,7 +203,6 @@ tap:
|
||||
enabled: false
|
||||
environment: production
|
||||
defaultFilter: ""
|
||||
liveConfigMapChangesDisabled: false
|
||||
globalFilter: ""
|
||||
enabledDissectors:
|
||||
- amqp
|
||||
@@ -182,15 +210,19 @@ tap:
|
||||
- http
|
||||
- icmp
|
||||
- kafka
|
||||
- mongodb
|
||||
- mysql
|
||||
- postgresql
|
||||
- redis
|
||||
- ws
|
||||
- tlsx
|
||||
- ldap
|
||||
- radius
|
||||
- diameter
|
||||
- udp-flow
|
||||
- tcp-flow
|
||||
- tcp-flow-full
|
||||
- udp-flow-full
|
||||
- udp-conn
|
||||
- tcp-conn
|
||||
portMapping:
|
||||
http:
|
||||
- 80
|
||||
@@ -201,6 +233,12 @@ tap:
|
||||
- 5672
|
||||
kafka:
|
||||
- 9092
|
||||
mongodb:
|
||||
- 27017
|
||||
mysql:
|
||||
- 3306
|
||||
postgresql:
|
||||
- 5432
|
||||
redis:
|
||||
- 6379
|
||||
ldap:
|
||||
@@ -226,6 +264,8 @@ tap:
|
||||
duplicateTimeframe: 200ms
|
||||
detectDuplicates: false
|
||||
staleTimeoutSeconds: 30
|
||||
tcpFlowTimeout: 1200
|
||||
udpFlowTimeout: 1200
|
||||
securityContext:
|
||||
privileged: true
|
||||
appArmorProfile:
|
||||
@@ -268,13 +308,14 @@ kube:
|
||||
dumpLogs: false
|
||||
headless: false
|
||||
license: ""
|
||||
cloudApiUrl: "https://api.kubeshark.com"
|
||||
cloudApiUrl: https://api.kubeshark.com
|
||||
cloudLicenseEnabled: true
|
||||
demoModeEnabled: false
|
||||
supportChatEnabled: false
|
||||
betaEnabled: false
|
||||
internetConnectivity: true
|
||||
scripting:
|
||||
enabled: false
|
||||
env: {}
|
||||
source: ""
|
||||
sources: []
|
||||
|
||||
@@ -67,7 +67,10 @@ func (h *Helm) Install() (rel *release.Release, err error) {
|
||||
client.Namespace = h.releaseNamespace
|
||||
client.ReleaseName = h.releaseName
|
||||
|
||||
chartPath := os.Getenv(fmt.Sprintf("%s_HELM_CHART_PATH", strings.ToUpper(misc.Program)))
|
||||
chartPath := config.Config.Tap.Release.HelmChartPath
|
||||
if chartPath == "" {
|
||||
chartPath = os.Getenv(fmt.Sprintf("%s_HELM_CHART_PATH", strings.ToUpper(misc.Program)))
|
||||
}
|
||||
if chartPath == "" {
|
||||
var chartURL string
|
||||
chartURL, err = repo.FindChartInRepoURL(h.repo, h.releaseName, "", "", "", "", getter.All(&cli.EnvSettings{}))
|
||||
|
||||
@@ -4,10 +4,10 @@ apiVersion: networking.k8s.io/v1
|
||||
kind: NetworkPolicy
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-hub-network-policy
|
||||
namespace: default
|
||||
@@ -33,10 +33,10 @@ apiVersion: networking.k8s.io/v1
|
||||
kind: NetworkPolicy
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
annotations:
|
||||
name: kubeshark-front-network-policy
|
||||
@@ -60,10 +60,10 @@ apiVersion: networking.k8s.io/v1
|
||||
kind: NetworkPolicy
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
annotations:
|
||||
name: kubeshark-dex-network-policy
|
||||
@@ -87,10 +87,10 @@ apiVersion: networking.k8s.io/v1
|
||||
kind: NetworkPolicy
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
annotations:
|
||||
name: kubeshark-worker-network-policy
|
||||
@@ -116,10 +116,10 @@ apiVersion: v1
|
||||
kind: ServiceAccount
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-service-account
|
||||
namespace: default
|
||||
@@ -132,10 +132,10 @@ metadata:
|
||||
namespace: default
|
||||
labels:
|
||||
app.kubeshark.com/app: hub
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
stringData:
|
||||
LICENSE: ''
|
||||
@@ -151,10 +151,10 @@ metadata:
|
||||
namespace: default
|
||||
labels:
|
||||
app.kubeshark.com/app: hub
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
stringData:
|
||||
AUTH_SAML_X509_CRT: |
|
||||
@@ -167,10 +167,10 @@ metadata:
|
||||
namespace: default
|
||||
labels:
|
||||
app.kubeshark.com/app: hub
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
stringData:
|
||||
AUTH_SAML_X509_KEY: |
|
||||
@@ -182,10 +182,10 @@ metadata:
|
||||
name: kubeshark-nginx-config-map
|
||||
namespace: default
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
data:
|
||||
default.conf: |
|
||||
@@ -199,6 +199,10 @@ data:
|
||||
client_header_buffer_size 32k;
|
||||
large_client_header_buffers 8 64k;
|
||||
|
||||
proxy_buffer_size 64k;
|
||||
proxy_buffers 4 128k;
|
||||
proxy_busy_buffers_size 128k;
|
||||
|
||||
location /api {
|
||||
rewrite ^/api(.*)$ $1 break;
|
||||
proxy_pass http://kubeshark-hub;
|
||||
@@ -209,8 +213,10 @@ data:
|
||||
proxy_set_header Authorization $http_authorization;
|
||||
proxy_pass_header Authorization;
|
||||
proxy_connect_timeout 4s;
|
||||
proxy_read_timeout 120s;
|
||||
proxy_send_timeout 12s;
|
||||
# Disable buffering for gRPC/Connect streaming
|
||||
client_max_body_size 0;
|
||||
proxy_request_buffering off;
|
||||
proxy_buffering off;
|
||||
proxy_pass_request_headers on;
|
||||
}
|
||||
|
||||
@@ -246,17 +252,18 @@ metadata:
|
||||
namespace: default
|
||||
labels:
|
||||
app.kubeshark.com/app: hub
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
data:
|
||||
POD_REGEX: '.*'
|
||||
NAMESPACES: ''
|
||||
EXCLUDED_NAMESPACES: ''
|
||||
BPF_OVERRIDE: ''
|
||||
STOPPED: 'false'
|
||||
DISSECTION_ENABLED: 'true'
|
||||
CAPTURE_SELF: 'false'
|
||||
SCRIPTING_SCRIPTS: '{}'
|
||||
SCRIPTING_ACTIVE_SCRIPTS: ''
|
||||
INGRESS_ENABLED: 'false'
|
||||
@@ -265,18 +272,19 @@ data:
|
||||
AUTH_ENABLED: 'true'
|
||||
AUTH_TYPE: 'default'
|
||||
AUTH_SAML_IDP_METADATA_URL: ''
|
||||
AUTH_SAML_ROLE_ATTRIBUTE: 'role'
|
||||
AUTH_SAML_ROLES: '{"admin":{"canDownloadPCAP":true,"canStopTrafficCapturing":true,"canUpdateTargetedPods":true,"canUseScripting":true,"filter":"","scriptingPermissions":{"canActivate":true,"canDelete":true,"canSave":true},"showAdminConsoleLink":true}}'
|
||||
AUTH_ROLES: '{"admin":{"canControlDissection":true,"canDownloadPCAP":true,"canStopTrafficCapturing":true,"canUpdateTargetedPods":true,"canUseScripting":true,"filter":"","scriptingPermissions":{"canActivate":true,"canDelete":true,"canSave":true},"showAdminConsoleLink":true}}'
|
||||
AUTH_ROLES_CLAIM: 'role'
|
||||
AUTH_DEFAULT_ROLE: ''
|
||||
AUTH_OIDC_ISSUER: 'not set'
|
||||
AUTH_OIDC_REFRESH_TOKEN_LIFETIME: '3960h'
|
||||
AUTH_OIDC_STATE_PARAM_EXPIRY: '10m'
|
||||
AUTH_OIDC_BYPASS_SSL_CA_CHECK: 'false'
|
||||
TELEMETRY_DISABLED: 'false'
|
||||
SCRIPTING_DISABLED: 'false'
|
||||
TARGETED_PODS_UPDATE_DISABLED: ''
|
||||
TARGETED_PODS_UPDATE_DISABLED: 'false'
|
||||
PRESET_FILTERS_CHANGING_ENABLED: 'true'
|
||||
RECORDING_DISABLED: ''
|
||||
STOP_TRAFFIC_CAPTURING_DISABLED: 'false'
|
||||
RECORDING_DISABLED: 'false'
|
||||
DISSECTION_CONTROL_ENABLED: 'true'
|
||||
GLOBAL_FILTER: ""
|
||||
DEFAULT_FILTER: ""
|
||||
TRAFFIC_SAMPLE_RATE: '100'
|
||||
@@ -285,18 +293,19 @@ data:
|
||||
PCAP_ERROR_TTL: '0'
|
||||
TIMEZONE: ' '
|
||||
CLOUD_LICENSE_ENABLED: 'true'
|
||||
AI_ASSISTANT_ENABLED: 'true'
|
||||
DUPLICATE_TIMEFRAME: '200ms'
|
||||
ENABLED_DISSECTORS: 'amqp,dns,http,icmp,kafka,redis,ws,ldap,radius,diameter,udp-flow,tcp-flow'
|
||||
ENABLED_DISSECTORS: 'amqp,dns,http,icmp,kafka,mongodb,mysql,postgresql,redis,ws,tlsx,ldap,radius,diameter,udp-flow,tcp-flow,udp-conn,tcp-conn'
|
||||
CUSTOM_MACROS: '{"https":"tls and (http or http2)"}'
|
||||
DISSECTORS_UPDATING_ENABLED: 'true'
|
||||
SNAPSHOTS_UPDATING_ENABLED: 'true'
|
||||
DEMO_MODE_ENABLED: 'false'
|
||||
DETECT_DUPLICATES: 'false'
|
||||
PCAP_DUMP_ENABLE: 'false'
|
||||
PCAP_TIME_INTERVAL: '1m'
|
||||
PCAP_MAX_TIME: '1h'
|
||||
PCAP_MAX_SIZE: '500MB'
|
||||
PORT_MAPPING: '{"amqp":[5671,5672],"diameter":[3868],"http":[80,443,8080],"kafka":[9092],"ldap":[389],"redis":[6379]}'
|
||||
RAW_CAPTURE: 'true'
|
||||
PORT_MAPPING: '{"amqp":[5671,5672],"diameter":[3868],"http":[80,443,8080],"kafka":[9092],"ldap":[389],"mongodb":[27017],"mysql":[3306],"postgresql":[5432],"redis":[6379]}'
|
||||
RAW_CAPTURE_ENABLED: 'true'
|
||||
RAW_CAPTURE_STORAGE_SIZE: '1Gi'
|
||||
---
|
||||
# Source: kubeshark/templates/02-cluster-role.yaml
|
||||
@@ -304,10 +313,10 @@ apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: ClusterRole
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-cluster-role-default
|
||||
namespace: default
|
||||
@@ -345,16 +354,22 @@ rules:
|
||||
- create
|
||||
- update
|
||||
- delete
|
||||
- apiGroups:
|
||||
- authentication.k8s.io
|
||||
resources:
|
||||
- tokenreviews
|
||||
verbs:
|
||||
- create
|
||||
---
|
||||
# Source: kubeshark/templates/03-cluster-role-binding.yaml
|
||||
apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: ClusterRoleBinding
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-cluster-role-binding-default
|
||||
namespace: default
|
||||
@@ -372,10 +387,10 @@ apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: Role
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
annotations:
|
||||
name: kubeshark-self-config-role
|
||||
@@ -410,6 +425,15 @@ rules:
|
||||
verbs:
|
||||
- create
|
||||
- get
|
||||
- apiGroups:
|
||||
- ""
|
||||
resources:
|
||||
- persistentvolumeclaims
|
||||
verbs:
|
||||
- create
|
||||
- get
|
||||
- list
|
||||
- delete
|
||||
- apiGroups:
|
||||
- batch
|
||||
resources:
|
||||
@@ -422,10 +446,10 @@ apiVersion: rbac.authorization.k8s.io/v1
|
||||
kind: RoleBinding
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
annotations:
|
||||
name: kubeshark-self-config-role-binding
|
||||
@@ -445,10 +469,10 @@ kind: Service
|
||||
metadata:
|
||||
labels:
|
||||
app.kubeshark.com/app: hub
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-hub
|
||||
namespace: default
|
||||
@@ -466,10 +490,10 @@ apiVersion: v1
|
||||
kind: Service
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-front
|
||||
namespace: default
|
||||
@@ -487,10 +511,10 @@ kind: Service
|
||||
apiVersion: v1
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
annotations:
|
||||
prometheus.io/scrape: 'true'
|
||||
@@ -500,10 +524,10 @@ metadata:
|
||||
spec:
|
||||
selector:
|
||||
app.kubeshark.com/app: worker
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
ports:
|
||||
- name: metrics
|
||||
@@ -516,10 +540,10 @@ kind: Service
|
||||
apiVersion: v1
|
||||
metadata:
|
||||
labels:
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
annotations:
|
||||
prometheus.io/scrape: 'true'
|
||||
@@ -529,10 +553,10 @@ metadata:
|
||||
spec:
|
||||
selector:
|
||||
app.kubeshark.com/app: hub
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
ports:
|
||||
- name: metrics
|
||||
@@ -547,10 +571,10 @@ metadata:
|
||||
labels:
|
||||
app.kubeshark.com/app: worker
|
||||
sidecar.istio.io/inject: "false"
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-worker-daemon-set
|
||||
namespace: default
|
||||
@@ -564,10 +588,11 @@ spec:
|
||||
metadata:
|
||||
labels:
|
||||
app.kubeshark.com/app: worker
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
kubeshark.io/internal-auth: "true"
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-worker-daemon-set
|
||||
namespace: kubeshark
|
||||
@@ -577,7 +602,7 @@ spec:
|
||||
- /bin/sh
|
||||
- -c
|
||||
- mkdir -p /sys/fs/bpf && mount | grep -q '/sys/fs/bpf' || mount -t bpf bpf /sys/fs/bpf
|
||||
image: 'docker.io/kubeshark/worker:v52.12'
|
||||
image: 'docker.io/kubeshark/worker:v53.3'
|
||||
imagePullPolicy: Always
|
||||
name: mount-bpf
|
||||
securityContext:
|
||||
@@ -606,11 +631,17 @@ spec:
|
||||
- 'auto'
|
||||
- -staletimeout
|
||||
- '30'
|
||||
- -tcp-flow-full-timeout
|
||||
- '1200'
|
||||
- -udp-flow-full-timeout
|
||||
- '1200'
|
||||
- -storage-size
|
||||
- '10Gi'
|
||||
- -capture-db-max-size
|
||||
- '500Mi'
|
||||
image: 'docker.io/kubeshark/worker:v52.12'
|
||||
- -cloud-api-url
|
||||
- 'https://api.kubeshark.com'
|
||||
image: 'docker.io/kubeshark/worker:v53.3'
|
||||
imagePullPolicy: Always
|
||||
name: sniffer
|
||||
ports:
|
||||
@@ -626,12 +657,14 @@ spec:
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: metadata.namespace
|
||||
- name: NODE_NAME
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: spec.nodeName
|
||||
- name: TCP_STREAM_CHANNEL_TIMEOUT_MS
|
||||
value: '10000'
|
||||
- name: TCP_STREAM_CHANNEL_TIMEOUT_SHOW
|
||||
value: 'false'
|
||||
- name: KUBESHARK_CLOUD_API_URL
|
||||
value: 'https://api.kubeshark.com'
|
||||
- name: PROFILING_ENABLED
|
||||
value: 'false'
|
||||
- name: SENTRY_ENABLED
|
||||
@@ -684,7 +717,7 @@ spec:
|
||||
- -disable-tls-log
|
||||
- -loglevel
|
||||
- 'warning'
|
||||
image: 'docker.io/kubeshark/worker:v52.12'
|
||||
image: 'docker.io/kubeshark/worker:v53.3'
|
||||
imagePullPolicy: Always
|
||||
name: tracer
|
||||
env:
|
||||
@@ -696,6 +729,10 @@ spec:
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: metadata.namespace
|
||||
- name: NODE_NAME
|
||||
valueFrom:
|
||||
fieldRef:
|
||||
fieldPath: spec.nodeName
|
||||
- name: PROFILING_ENABLED
|
||||
value: 'false'
|
||||
- name: SENTRY_ENABLED
|
||||
@@ -776,10 +813,10 @@ kind: Deployment
|
||||
metadata:
|
||||
labels:
|
||||
app.kubeshark.com/app: hub
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-hub
|
||||
namespace: default
|
||||
@@ -794,10 +831,10 @@ spec:
|
||||
metadata:
|
||||
labels:
|
||||
app.kubeshark.com/app: hub
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
spec:
|
||||
dnsPolicy: ClusterFirstWithHostNet
|
||||
@@ -815,11 +852,15 @@ spec:
|
||||
- -snapshot-size-limit
|
||||
- '20Gi'
|
||||
- -dissector-image
|
||||
- 'kubeshark/worker:master'
|
||||
- 'docker.io/kubeshark/worker:v53.3'
|
||||
- -dissector-cpu
|
||||
- '1'
|
||||
- -dissector-memory
|
||||
- '4Gi'
|
||||
- -dissector-storage-size
|
||||
- '20Gi'
|
||||
- -cloud-api-url
|
||||
- 'https://api.kubeshark.com'
|
||||
env:
|
||||
- name: POD_NAME
|
||||
valueFrom:
|
||||
@@ -833,11 +874,9 @@ spec:
|
||||
value: 'false'
|
||||
- name: SENTRY_ENVIRONMENT
|
||||
value: 'production'
|
||||
- name: KUBESHARK_CLOUD_API_URL
|
||||
value: 'https://api.kubeshark.com'
|
||||
- name: PROFILING_ENABLED
|
||||
value: 'false'
|
||||
image: 'docker.io/kubeshark/hub:v52.12'
|
||||
image: 'docker.io/kubeshark/hub:v53.3'
|
||||
imagePullPolicy: Always
|
||||
readinessProbe:
|
||||
periodSeconds: 5
|
||||
@@ -905,10 +944,10 @@ kind: Deployment
|
||||
metadata:
|
||||
labels:
|
||||
app.kubeshark.com/app: front
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
name: kubeshark-front
|
||||
namespace: default
|
||||
@@ -923,10 +962,10 @@ spec:
|
||||
metadata:
|
||||
labels:
|
||||
app.kubeshark.com/app: front
|
||||
helm.sh/chart: kubeshark-52.12.0
|
||||
helm.sh/chart: kubeshark-53.3.0
|
||||
app.kubernetes.io/name: kubeshark
|
||||
app.kubernetes.io/instance: kubeshark
|
||||
app.kubernetes.io/version: "52.12.0"
|
||||
app.kubernetes.io/version: "53.3.0"
|
||||
app.kubernetes.io/managed-by: Helm
|
||||
spec:
|
||||
containers:
|
||||
@@ -943,6 +982,8 @@ spec:
|
||||
value: ' '
|
||||
- name: REACT_APP_TIMEZONE
|
||||
value: ' '
|
||||
- name: REACT_APP_SCRIPTING_HIDDEN
|
||||
value: 'true'
|
||||
- name: REACT_APP_SCRIPTING_DISABLED
|
||||
value: 'false'
|
||||
- name: REACT_APP_TARGETED_PODS_UPDATE_DISABLED
|
||||
@@ -953,11 +994,11 @@ spec:
|
||||
value: 'true'
|
||||
- name: REACT_APP_RECORDING_DISABLED
|
||||
value: 'false'
|
||||
- name: REACT_APP_STOP_TRAFFIC_CAPTURING_DISABLED
|
||||
value: 'false'
|
||||
- name: 'REACT_APP_CLOUD_LICENSE_ENABLED'
|
||||
- name: REACT_APP_DISSECTION_ENABLED
|
||||
value: 'true'
|
||||
- name: 'REACT_APP_AI_ASSISTANT_ENABLED'
|
||||
- name: REACT_APP_DISSECTION_CONTROL_ENABLED
|
||||
value: 'true'
|
||||
- name: 'REACT_APP_CLOUD_LICENSE_ENABLED'
|
||||
value: 'true'
|
||||
- name: REACT_APP_SUPPORT_CHAT_ENABLED
|
||||
value: 'false'
|
||||
@@ -965,13 +1006,21 @@ spec:
|
||||
value: 'false'
|
||||
- name: REACT_APP_DISSECTORS_UPDATING_ENABLED
|
||||
value: 'true'
|
||||
- name: REACT_APP_SNAPSHOTS_UPDATING_ENABLED
|
||||
value: 'true'
|
||||
- name: REACT_APP_DEMO_MODE_ENABLED
|
||||
value: 'false'
|
||||
- name: REACT_APP_CLUSTER_WIDE_MAP_ENABLED
|
||||
value: 'false'
|
||||
- name: REACT_APP_RAW_CAPTURE_ENABLED
|
||||
value: 'true'
|
||||
- name: REACT_APP_ENTRIES_LIMIT
|
||||
value: '300000'
|
||||
- name: REACT_APP_SENTRY_ENABLED
|
||||
value: 'false'
|
||||
- name: REACT_APP_SENTRY_ENVIRONMENT
|
||||
value: 'production'
|
||||
image: 'docker.io/kubeshark/front:v52.12'
|
||||
image: 'docker.io/kubeshark/front:v53.3'
|
||||
imagePullPolicy: Always
|
||||
name: kubeshark-front
|
||||
livenessProbe:
|
||||
|
||||
@@ -2,6 +2,18 @@
|
||||
|
||||
[Kubeshark](https://kubeshark.com) MCP (Model Context Protocol) server enables AI assistants like Claude Desktop, Cursor, and other MCP-compatible clients to query real-time Kubernetes network traffic.
|
||||
|
||||
## AI Skills
|
||||
|
||||
The MCP provides the tools — [AI skills](../skills/) teach agents how to use them.
|
||||
Skills turn raw MCP capabilities into domain-specific workflows like root cause
|
||||
analysis, traffic filtering, and forensic investigation. See the
|
||||
[skills README](../skills/README.md) for installation and usage.
|
||||
|
||||
| Skill | Description |
|
||||
|-------|-------------|
|
||||
| [`network-rca`](../skills/network-rca/) | Network Root Cause Analysis — snapshot-based retrospective investigation with PCAP and dissection routes |
|
||||
| [`kfl`](../skills/kfl/) | KFL2 filter expert — write, debug, and optimize traffic queries across all supported protocols |
|
||||
|
||||
## Features
|
||||
|
||||
- **L7 API Traffic Analysis**: Query HTTP, gRPC, Redis, Kafka, DNS transactions
|
||||
@@ -34,20 +46,20 @@ Add to your Claude Desktop configuration:
|
||||
**macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
|
||||
**Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
|
||||
|
||||
#### URL Mode (Recommended for existing deployments)
|
||||
#### Default (requires kubectl access / kube context)
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"kubeshark": {
|
||||
"command": "kubeshark",
|
||||
"args": ["mcp", "--url", "https://kubeshark.example.com"]
|
||||
"args": ["mcp"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Proxy Mode (Requires kubectl access)
|
||||
With an explicit kubeconfig path:
|
||||
|
||||
```json
|
||||
{
|
||||
@@ -59,14 +71,18 @@ Add to your Claude Desktop configuration:
|
||||
}
|
||||
}
|
||||
```
|
||||
or:
|
||||
|
||||
#### URL Mode (no kubectl required)
|
||||
|
||||
Use this when the machine doesn't have kubectl access or a kube context.
|
||||
Connect directly to an existing Kubeshark deployment:
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"kubeshark": {
|
||||
"command": "kubeshark",
|
||||
"args": ["mcp"]
|
||||
"args": ["mcp", "--url", "https://kubeshark.example.com"]
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -188,7 +204,7 @@ http and src.namespace == "default" and response.status == 500
|
||||
|
||||
## MCP Registry
|
||||
|
||||
Kubeshark is published to the [MCP Registry](https://registry.mcp.io) automatically on each release.
|
||||
Kubeshark is published to the [MCP Registry](https://registry.modelcontextprotocol.io/) automatically on each release.
|
||||
|
||||
The `server.json` in this directory is a reference file. The actual registry metadata (version, SHA256 hashes) is auto-generated during the release workflow. See [`.github/workflows/release.yml`](../.github/workflows/release.yml) for details.
|
||||
|
||||
@@ -197,7 +213,7 @@ The `server.json` in this directory is a reference file. The actual registry met
|
||||
- [Documentation](https://docs.kubeshark.com/en/mcp)
|
||||
- [GitHub](https://github.com/kubeshark/kubeshark)
|
||||
- [Website](https://kubeshark.com)
|
||||
- [MCP Registry](https://registry.mcp.io)
|
||||
- [MCP Registry](https://registry.modelcontextprotocol.io/)
|
||||
|
||||
## License
|
||||
|
||||
|
||||
121
skills/README.md
Normal file
@@ -0,0 +1,121 @@
|
||||
# Kubeshark AI Skills
|
||||
|
||||
Open-source AI skills that work with the [Kubeshark MCP](https://github.com/kubeshark/kubeshark).
|
||||
Skills teach AI agents how to use Kubeshark's MCP tools for specific workflows
|
||||
like root cause analysis, traffic filtering, and forensic investigation.
|
||||
|
||||
Skills use the open [Agent Skills](https://github.com/anthropics/skills) format
|
||||
and work with Claude Code, OpenAI Codex CLI, Gemini CLI, Cursor, and other
|
||||
compatible agents.
|
||||
|
||||
## Available Skills
|
||||
|
||||
| Skill | Description |
|
||||
|-------|-------------|
|
||||
| [`network-rca`](network-rca/) | Network Root Cause Analysis. Retrospective traffic analysis via snapshots, with two investigation routes: PCAP (for Wireshark/compliance) and Dissection (for AI-driven API-level investigation). |
|
||||
| [`kfl`](kfl/) | KFL2 (Kubeshark Filter Language) expert. Complete reference for writing, debugging, and optimizing CEL-based traffic filters across all supported protocols. |
|
||||
| [`security-audit`](security-audit/) | Network Security Audit. Systematic 8-phase threat detection across MITRE ATT&CK tactics — C2, exfiltration, lateral movement, credential theft, cryptomining, protocol abuse — using snapshot-based traffic analysis. |
|
||||
|
||||
## Prerequisites
|
||||
|
||||
All skills require the Kubeshark MCP:
|
||||
|
||||
```bash
|
||||
# Claude Code
|
||||
claude mcp add kubeshark -- kubeshark mcp
|
||||
|
||||
# Without kubectl access (direct URL)
|
||||
claude mcp add kubeshark -- kubeshark mcp --url https://kubeshark.example.com
|
||||
```
|
||||
|
||||
For Claude Desktop, add to `claude_desktop_config.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"kubeshark": {
|
||||
"command": "kubeshark",
|
||||
"args": ["mcp"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Installation
|
||||
|
||||
### Option 1: Plugin (recommended)
|
||||
|
||||
Install as a Claude Code plugin directly from GitHub:
|
||||
|
||||
```
|
||||
/plugin marketplace add kubeshark/kubeshark
|
||||
/plugin install kubeshark
|
||||
```
|
||||
|
||||
Skills appear as `/kubeshark:network-rca` and `/kubeshark:kfl`. The plugin
|
||||
also bundles the Kubeshark MCP configuration automatically.
|
||||
|
||||
### Option 2: Clone and run
|
||||
|
||||
```bash
|
||||
git clone https://github.com/kubeshark/kubeshark
|
||||
cd kubeshark
|
||||
claude
|
||||
```
|
||||
|
||||
Skills trigger automatically based on your conversation.
|
||||
|
||||
### Option 3: Manual installation
|
||||
|
||||
Clone the repo (if you haven't already), then symlink or copy the skills:
|
||||
|
||||
```bash
|
||||
git clone https://github.com/kubeshark/kubeshark
|
||||
mkdir -p ~/.claude/skills
|
||||
|
||||
# Symlink to stay in sync with the repo (recommended)
|
||||
ln -s "$(pwd)/kubeshark/skills/network-rca" ~/.claude/skills/network-rca
|
||||
ln -s "$(pwd)/kubeshark/skills/kfl" ~/.claude/skills/kfl
|
||||
|
||||
# Or copy to your project (project scope only)
|
||||
mkdir -p .claude/skills
|
||||
cp -r kubeshark/skills/network-rca .claude/skills/
|
||||
cp -r kubeshark/skills/kfl .claude/skills/
|
||||
|
||||
# Or copy for personal use (all your projects)
|
||||
cp -r kubeshark/skills/network-rca ~/.claude/skills/
|
||||
cp -r kubeshark/skills/kfl ~/.claude/skills/
|
||||
```
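After installing, a quick check that each skill directory resolves and contains the required `SKILL.md` can catch dangling symlinks early. The following is a minimal sketch, assuming bash; `check_skills` is a hypothetical helper written for illustration, not something shipped with Kubeshark, and the skill names are the two used in the commands above.

```bash
# check_skills DIR SKILL...
# Hypothetical helper: for each SKILL, report whether DIR/SKILL/SKILL.md
# exists. The -f test follows symlinks, so a dangling link shows as missing.
check_skills() {
  local dir="$1"; shift
  local skill status=0
  for skill in "$@"; do
    if [ -f "$dir/$skill/SKILL.md" ]; then
      echo "ok: $skill"
    else
      echo "missing: $skill"
      status=1
    fi
  done
  # Non-zero exit status if any skill was missing.
  return "$status"
}
```

For a personal install this could be called as `check_skills ~/.claude/skills network-rca kfl`, and for a project-scoped copy as `check_skills .claude/skills network-rca kfl`.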
|
||||
|
||||
## Contributing
|
||||
|
||||
We welcome contributions — whether improving an existing skill or proposing a new one.
|
||||
|
||||
- **Suggest improvements**: Open an issue or PR with changes to an existing skill's `SKILL.md`
|
||||
or reference docs. Better examples, clearer workflows, and additional filter patterns
|
||||
are always appreciated.
|
||||
- **Add a new skill**: Open an issue describing the use case first. New skills should
|
||||
follow the structure below and reference Kubeshark MCP tools by exact name.
|
||||
|
||||
### Skill structure
|
||||
|
||||
```
|
||||
skills/
|
||||
└── <skill-name>/
|
||||
├── SKILL.md # Required. YAML frontmatter + markdown body.
|
||||
└── references/ # Optional. Detailed reference docs.
|
||||
└── *.md
|
||||
```
|
||||
|
||||
### Guidelines
|
||||
|
||||
- Keep `SKILL.md` under 500 lines. Use `references/` for detailed content.
|
||||
- Use imperative tone. Reference MCP tools by exact name.
|
||||
- Include realistic example tool responses.
|
||||
- The `description` frontmatter should be generous with trigger keywords.
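
The 500-line guideline can be checked mechanically before opening a PR. This is a sketch only, assuming bash and a standard `wc`; `skill_md_within_limit` is a hypothetical helper name, not a tool provided by this repository.

```bash
# skill_md_within_limit FILE [LIMIT]
# Hypothetical helper: succeed if FILE has fewer than LIMIT lines
# (LIMIT defaults to 500, matching the guideline above).
skill_md_within_limit() {
  local file="$1" limit="${2:-500}"
  local lines
  # Arithmetic expansion strips the padding some wc implementations add.
  lines=$(( $(wc -l < "$file") ))
  [ "$lines" -lt "$limit" ]
}
```

A possible use is `skill_md_within_limit skills/kfl/SKILL.md || echo "SKILL.md is too long; move detail into references/"`.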
|
||||
|
||||
### Planned skills
|
||||
|
||||
- `api-security` — OWASP API Top 10 assessment against live or snapshot traffic.
|
||||
- `incident-response` — 7-phase forensic incident investigation methodology.
|
||||
- `network-engineering` — Real-time traffic analysis, latency debugging, dependency mapping.
|
||||
489
skills/install/SKILL.md
Normal file
@@ -0,0 +1,489 @@
|
||||
---
|
||||
name: install
|
||||
user-invocable: true
|
||||
description: >
|
||||
Kubeshark installation and deployment skill. Use this skill whenever the user wants
|
||||
to install Kubeshark, deploy Kubeshark to a Kubernetes cluster, set up Kubeshark,
|
||||
configure Kubeshark helm values, generate a Kubeshark config file, customize
|
||||
Kubeshark deployment, troubleshoot Kubeshark installation, upgrade Kubeshark,
|
||||
uninstall Kubeshark, or manage the Kubeshark Helm release. Also trigger when
|
||||
the user mentions "kubeshark tap", "kubeshark clean", "helm install kubeshark",
|
||||
"get kubeshark running", "set up traffic capture", "deploy kubeshark",
|
||||
"kubeshark not starting", "kubeshark pods not ready", "configure namespaces",
|
||||
"persistent storage", "cloud storage for snapshots", "kubeshark ingress",
|
||||
"kubeshark auth", "kubeshark SAML", "kubeshark license", "kubeshark config",
|
||||
"custom helm values", "kubeshark on EKS/GKE/AKS", "kubeshark on OpenShift",
|
||||
"kubeshark on KinD/minikube/k3s", "air-gapped", "offline install",
|
||||
or any request related to getting Kubeshark installed, configured, and running
|
||||
in a Kubernetes cluster.
|
||||
---
|
||||
|
||||
# Kubeshark Installation & Deployment
|
||||
|
||||
You are a Kubeshark deployment specialist. Your job is to help users install,
|
||||
configure, and deploy Kubeshark to their Kubernetes cluster — tailoring the
|
||||
configuration to their specific environment, requirements, and use case.
|
||||
|
||||
Kubeshark deploys via Helm. The CLI (`kubeshark tap`) is a thin wrapper that
|
||||
installs a basic Helm chart and establishes a port-forward — nothing more.
|
||||
For larger or production clusters, use Helm directly with a custom values file.
|
||||
|
||||
## Decision: CLI or Helm?
|
||||
|
||||
**Use the CLI** when:
|
||||
- Quick install on a dev/test cluster (minikube, KinD, k3s)
|
||||
- Personal environment, single user
|
||||
- Just want to try Kubeshark quickly
|
||||
|
||||
**Use Helm directly** when:
|
||||
- Larger cluster (staging, production)
|
||||
- Need custom configuration (ingress, auth, storage, namespaces)
|
||||
- GitOps / infrastructure-as-code workflows
|
||||
- Team environment
|
||||
|
||||
## Path A: CLI (Dev/Test Clusters)
|
||||
|
||||
### Step 1 — Install the CLI
|
||||
|
||||
Check if Kubeshark is already installed:
|
||||
|
||||
```bash
|
||||
kubeshark version
|
||||
```
|
||||
|
||||
If not installed, offer one of these methods:
|
||||
|
||||
**Homebrew (easiest, where available):**
|
||||
|
||||
```bash
|
||||
brew tap kubeshark/kubeshark
|
||||
brew install kubeshark
|
||||
```
|
||||
|
||||
**Binary download:**
|
||||
|
||||
For the full list of platforms and architectures, see https://docs.kubeshark.com/en/install
|
||||
|
||||
```bash
|
||||
# Linux (amd64)
|
||||
curl -Lo kubeshark https://github.com/kubeshark/kubeshark/releases/latest/download/kubeshark_linux_amd64
|
||||
chmod +x kubeshark
|
||||
sudo mv kubeshark /usr/local/bin/
|
||||
|
||||
# Linux (arm64)
|
||||
curl -Lo kubeshark https://github.com/kubeshark/kubeshark/releases/latest/download/kubeshark_linux_arm64
|
||||
chmod +x kubeshark
|
||||
sudo mv kubeshark /usr/local/bin/
|
||||
|
||||
# macOS (Apple Silicon)
|
||||
curl -Lo kubeshark https://github.com/kubeshark/kubeshark/releases/latest/download/kubeshark_darwin_arm64
|
||||
chmod +x kubeshark
|
||||
sudo mv kubeshark /usr/local/bin/
|
||||
|
||||
# macOS (Intel)
|
||||
curl -Lo kubeshark https://github.com/kubeshark/kubeshark/releases/latest/download/kubeshark_darwin_amd64
|
||||
chmod +x kubeshark
|
||||
sudo mv kubeshark /usr/local/bin/
|
||||
```
|
||||
|
||||
### Step 2 — Check for Updates
|
||||
|
||||
**Always check for updates before using the CLI.** This is critical — Kubeshark
|
||||
releases frequently and running an outdated version can cause issues.
|
||||
|
||||
```bash
|
||||
# Homebrew
|
||||
brew upgrade kubeshark
|
||||
|
||||
# Binary — check the latest release and re-download if newer
|
||||
kubeshark version
|
||||
# Compare with https://github.com/kubeshark/kubeshark/releases/latest
|
||||
```
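The version comparison in the last comment can be scripted. The following is a minimal sketch, not part of the Kubeshark CLI: `version_lt` is a hypothetical helper, and it assumes a `sort` that supports `-V` (GNU coreutils and recent macOS).

```bash
# version_lt A B: succeeds when version A is strictly older than version B.
# Hypothetical helper; assumes `sort -V` (version sort) is available.
version_lt() {
  a=${1#v}   # drop an optional leading "v", as in release tags like v53.3.0
  b=${2#v}
  [ "$a" != "$b" ] &&
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$a" ]
}
```

For example, `version_lt "$installed" "$latest" && echo "update available"`, where `installed` comes from the `kubeshark version` output and `latest` from the releases page. How to extract either value is left open here, since their exact output format is not shown above.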
|
||||
|
||||
### Step 3 — Deploy with `kubeshark tap`
|
||||
|
||||
```bash
|
||||
kubeshark tap
|
||||
```
|
||||
|
||||
This installs the Helm chart with defaults and opens the dashboard in your browser.
|
||||
That's it for dev/test clusters.
|
||||
|
||||
### Step 4 — Reconnect if Connection Breaks
|
||||
|
||||
If the port-forward drops (laptop sleep, network change, terminal closed):
|
||||
|
||||
```bash
|
||||
kubeshark proxy
|
||||
```
|
||||
|
||||
This re-establishes the port-forward and reopens the dashboard. It does **not**
|
||||
reinstall — Kubeshark is still running in the cluster.
|
||||
|
||||
### Step 5 — Clean Up After Use
|
||||
|
||||
**Always clean up when done.** Kubeshark runs eBPF probes and DaemonSet workers
|
||||
on every node — leaving it running wastes cluster resources.
|
||||
|
||||
```bash
|
||||
kubeshark clean
|
||||
```
|
||||
|
||||
Always remind the user to run `kubeshark clean` when they're finished. This is
|
||||
easy to forget and important.
|
||||
|
||||
## Path B: Helm (Larger / Production Clusters)
|
||||
|
||||
### Step 1 — Add and Update the Helm Repo
|
||||
|
||||
**Always update the Helm repo first.** This is the most important first step —
|
||||
running an outdated chart can cause issues.
|
||||
|
||||
```bash
|
||||
helm repo add kubeshark https://helm.kubeshark.com
|
||||
helm repo update
|
||||
```
|
||||
|
||||
### Step 2 — Create a Config Directory
|
||||
|
||||
Store all configuration files in `~/.kubeshark/`:
|
||||
|
||||
```bash
|
||||
mkdir -p ~/.kubeshark
|
||||
```
|
||||
|
||||
**Before writing any file to `~/.kubeshark/`, check if it already exists.**
|
||||
If `~/.kubeshark/values.yaml` (or any target filename) already exists, **ask the
|
||||
user** before overwriting. Either:
|
||||
1. Back up the existing file first: `cp ~/.kubeshark/values.yaml ~/.kubeshark/values.yaml.bak.$(date +%s)`
|
||||
2. Use a descriptive name for the new file (e.g., `values-production.yaml`, `values-staging.yaml`)
|
||||
|
||||
The user may have multiple values files for different clusters or environments.
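The check-then-back-up rule above can be wrapped in a small function. This is a sketch under assumptions: `backup_if_exists` is a hypothetical helper name, and the backup suffix follows the `.bak.$(date +%s)` pattern shown in option 1.

```bash
# backup_if_exists FILE: if FILE exists, copy it to FILE.bak.<epoch seconds>
# and print the backup path; otherwise do nothing and succeed.
backup_if_exists() {
  target=$1
  if [ -e "$target" ]; then
    backup="$target.bak.$(date +%s)"
    cp -p "$target" "$backup" && printf '%s\n' "$backup"
  fi
}
```

Calling `backup_if_exists ~/.kubeshark/values.yaml` before writing a new values file keeps the previous version recoverable. It does not replace asking the user first.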
|
||||
|
||||
### Step 3 — Build the Values File
|
||||
|
||||
Walk through the following configuration areas with the user. Each section
|
||||
explains what the value does and what to recommend.
|
||||
|
||||
#### Pod Targeting (CRITICAL)
|
||||
|
||||
```yaml
tap:
  regex: .*
  namespaces: []
  excludedNamespaces: []
```
|
||||
|
||||
**This is one of the most important configuration decisions.** By default,
|
||||
Kubeshark monitors the entire cluster's traffic. On a large cluster this is a
|
||||
huge undertaking that consumes significant CPU and memory on every node.
|
||||
|
||||
**Always set namespace targeting.** Ask the user which namespaces contain the
|
||||
workloads they care about, and set those explicitly:
|
||||
|
||||
```yaml
tap:
  namespaces:
    - production
    - staging
```
|
||||
|
||||
Alternatively, use `excludedNamespaces` to monitor everything except specific
|
||||
namespaces:
|
||||
|
||||
```yaml
tap:
  excludedNamespaces:
    - kube-system
    - monitoring
    - kubeshark
```
|
||||
|
||||
The `regex` field filters by pod name within the targeted namespaces. Leave as
|
||||
`.*` unless the user wants to focus on specific pods.
|
||||
|
||||
Setting pod targeting rules causes Kubeshark to focus only on specific workloads,
|
||||
which moderates compute consumption significantly.
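For quick experiments, namespace targeting can also be passed on the command line instead of through a values file. The sketch below builds the argument for Helm's `--set` flag, which accepts lists in `{a,b}` form. `namespaces_set_arg` is a hypothetical helper, not something Kubeshark ships.

```bash
# namespaces_set_arg NS...: print a Helm --set expression that sets
# tap.namespaces to the given namespaces, e.g. tap.namespaces={a,b}.
# Runs in a subshell so the IFS change does not leak to the caller.
namespaces_set_arg() (
  IFS=,
  printf 'tap.namespaces={%s}\n' "$*"
)
```

A possible use is `helm install kubeshark kubeshark/kubeshark -n kubeshark --create-namespace --set "$(namespaces_set_arg production staging)"`. For anything long-lived, a values file remains the easier option to review and version.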
|
||||
|
||||
#### Docker Registry (Air-Gapped Environments)
|
||||
|
||||
```yaml
tap:
  docker:
    registry: docker.io/kubeshark
    tag: ""
```
|
||||
|
||||
- `tap.docker.registry` — Change this for air-gapped environments where there's
|
||||
no access to `docker.io`. Point to your internal registry. Additional config
|
||||
may be needed (pull secrets, registry credentials).
|
||||
- `tap.docker.tag` — Set a specific version. If a patch version is missing, the
|
||||
latest patch in that minor version is used. **Leave empty (recommended)** to
|
||||
use the version matching the Helm chart.
|
||||
|
||||
For air-gapped clusters, also set:
|
||||
|
||||
```yaml
|
||||
internetConnectivity: false
|
||||
```
|
||||
|
||||
This is the **most important setting for air-gapped clusters** — it disables all
|
||||
outbound connectivity checks (license validation, telemetry, update checks).
|
||||
|
||||
#### Capture & Dissection
|
||||
|
||||
```yaml
tap:
  capture:
    dissection:
      enabled: true
      stopAfter: 5m
    raw:
      enabled: true
      storageSize: 1Gi
    dbMaxSize: 500Mi
```
|
||||
|
||||
**`tap.capture.dissection.enabled`** — Controls real-time dissection (L7 protocol
|
||||
parsing on production nodes). Real-time dissection consumes significant compute
|
||||
resources from production nodes. **Recommend starting with `false` (disabled).**
|
||||
This can be toggled on-demand from the dashboard when needed, so it's used only
|
||||
when necessary and doesn't consume resources the rest of the time.
|
||||
|
||||
Dissection is independent from raw capture + snapshots. Raw capture is lightweight
|
||||
and runs continuously; dissection is the heavy operation.
|
||||
|
||||
**`tap.capture.dissection.stopAfter`** — Time after which dissection automatically
|
||||
disables once all client connections end. Set to `0` to never auto-disable (manual
|
||||
control only).
|
||||
|
||||
**`tap.capture.raw.enabled`** — Keep this `true`. Raw capture consumes very few
production resources yet captures all traffic. This is what powers snapshots and
|
||||
retrospective analysis.
|
||||
|
||||
**`tap.capture.raw.storageSize`** — The FIFO buffer for raw capture per node.
|
||||
**Recommend 100Gi** for production. The larger this is, the further back in time
|
||||
snapshots can reach.
|
||||
|
||||
**`tap.capture.dbMaxSize`** — Size of the database holding dissected API calls.
|
||||
Bigger = more history kept. Adjust based on how much queryable history the user needs.
|
||||
|
||||
**`tap.capture.captureSelf`** — Debug option. Ignore during installation.
|
||||
|
||||
**`tap.bpfOverride`** — Debug option. Ignore during installation.
|
||||
|
||||
#### Delayed Dissection
|
||||
|
||||
```yaml
tap:
  delayedDissection:
    cpu: "1"
    memory: 4Gi
```
|
||||
|
||||
Delayed dissection is the process on the Hub that dissects raw capture data within
|
||||
a snapshot. It runs on the Hub node (not production nodes) and is triggered when
|
||||
a delayed dissection operation is requested on a snapshot.
|
||||
|
||||
**Give this as many resources as possible.** Recommend `cpu: "5"` and `memory: 5Gi`.
|
||||
This speeds up snapshot analysis significantly.
|
||||
|
||||
#### Snapshot Storage (Local)
|
||||
|
||||
```yaml
tap:
  snapshots:
    local:
      storageClass: ""
      storageSize: 20Gi
```
|
||||
|
||||
This is where snapshots are stored locally. **Be very generous with this.**
|
||||
**Recommend 2Ti (2 TiB)** for production environments that will accumulate snapshots.
|
||||
|
||||
**`storageClass`** — Must match a valid storage class in the cluster. Suggest
|
||||
based on the cloud provider:
|
||||
|
||||
| Provider | Recommended Storage Class |
|----------|-------------------------|
| EKS (AWS) | `gp2` or `gp3` |
| GKE (Google) | `standard` or `premium-rwo` |
| AKS (Azure) | `managed-csi` or `managed-premium` |
| OpenShift | Check `kubectl get sc` — varies by provider |
| KinD / minikube | `standard` (default) |
| Private / bare metal | Ask the user for their storage class |
|
||||
|
||||
Always verify available storage classes with `kubectl get sc`.
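Before writing a `storageClass` into the values file, the name can be checked against the cluster. A minimal sketch, assuming `kubectl` is configured for the target cluster; `storage_class_exists` is a hypothetical helper name.

```bash
# storage_class_exists NAME: succeeds when a StorageClass called NAME
# exists in the current cluster; output from kubectl is discarded.
storage_class_exists() {
  kubectl get storageclass "$1" >/dev/null 2>&1
}
```

For example, `storage_class_exists gp3 || echo "gp3 not found; check kubectl get sc"`. Catching a misspelled class here is cheaper than debugging a PVC stuck in `Pending` after install.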
|
||||
|
||||
#### Cloud Storage (Long-Term Retention)
|
||||
|
||||
Cloud storage enables uploading snapshots to S3, GCS, or Azure Blob for long-term
|
||||
retention, cross-cluster sharing, and backup/restore.
|
||||
|
||||
For detailed configuration per provider (including IRSA, Workload Identity, static
|
||||
credentials, and ConfigMap/Secret setup), see `references/cloud-storage.md`.
|
||||
|
||||
Summary of provider values:
|
||||
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: ""     # "s3", "azblob", or "gcs" (empty = disabled)
      prefix: ""       # Key prefix in bucket
      configMaps: []   # Pre-existing ConfigMaps with cloud config
      secrets: []      # Pre-existing Secrets with cloud credentials
```
|
||||
|
||||
Help the user select the right provider based on where their cluster runs and
|
||||
walk them through the authentication setup.
|
||||
|
||||
#### Resources
|
||||
|
||||
For a first installation, **do not change the resource defaults.** Let the user
|
||||
run Kubeshark with defaults first and tune based on actual usage patterns later.
|
||||
|
||||
The defaults are reasonable starting points. Resource consumption depends heavily
|
||||
on how much traffic is processed, which is controlled by pod targeting rules.
|
||||
|
||||
#### Node Selectors
|
||||
|
||||
```yaml
tap:
  nodeSelectorTerms:
    workers:
      - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values: [linux]
```
|
||||
|
||||
Use `nodeSelectorTerms` when the user wants to focus on specific nodes. The less
|
||||
workload processed by Kubeshark, the less CPU and memory it consumes. The goal is
|
||||
to process workloads of interest, not the entire cluster.
|
||||
|
||||
#### Ingress (STRONGLY RECOMMENDED)
|
||||
|
||||
```yaml
tap:
  ingress:
    enabled: false
    className: ""
    host: ks.svc.cluster.local
    path: /
    tls: []
    annotations: {}
```
|
||||
|
||||
**Ingress is the strongly preferred access method.** While port-forward is available,
it is **strongly discouraged** for anything beyond quick local testing. Port-forward
is fragile, drops connections, and doesn't scale for team use.
|
||||
|
||||
**Always help the user configure ingress.** Ask them about their ingress controller
|
||||
(nginx, ALB, Traefik, etc.) and build the ingress config:
|
||||
|
||||
```yaml
tap:
  ingress:
    enabled: true
    className: nginx
    host: kubeshark.example.com
    tls:
      - secretName: kubeshark-tls
        hosts:
          - kubeshark.example.com
    annotations: {}
```
|
||||
|
||||
For ALB on AWS:
|
||||
|
||||
```yaml
tap:
  ingress:
    enabled: true
    className: alb
    host: kubeshark.example.com
    annotations:
      alb.ingress.kubernetes.io/scheme: internal
      alb.ingress.kubernetes.io/target-type: ip
```
|
||||
|
||||
#### Air-Gapped Clusters
|
||||
|
||||
For air-gapped environments, two settings are essential:
|
||||
|
||||
```yaml
tap:
  docker:
    registry: your-internal-registry.example.com/kubeshark
internetConnectivity: false
```
|
||||
|
||||
`internetConnectivity: false` is the **single most important option** for
|
||||
air-gapped clusters. Without it, Kubeshark will attempt outbound connections
|
||||
that will fail and cause issues.
|
||||
|
||||
### Step 4 — Install
|
||||
|
||||
```bash
|
||||
helm install kubeshark kubeshark/kubeshark \
  -f ~/.kubeshark/values.yaml \
  -n kubeshark --create-namespace
|
||||
```
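After the install returns, it helps to confirm that the pods actually become ready. The sketch below uses `kubectl wait` with the `app.kubernetes.io/name=kubeshark` label that the Troubleshooting section relies on. `wait_for_kubeshark` and its default timeout are assumptions for illustration.

```bash
# wait_for_kubeshark [NAMESPACE] [TIMEOUT]: block until all Kubeshark pods
# in NAMESPACE (default: kubeshark) report Ready, or TIMEOUT (default: 180s)
# expires. Returns kubectl's exit status.
wait_for_kubeshark() {
  ns=${1:-kubeshark}
  timeout=${2:-180s}
  kubectl wait --for=condition=Ready pod \
    -l app.kubernetes.io/name=kubeshark \
    -n "$ns" --timeout="$timeout"
}
```

If `wait_for_kubeshark` times out, continue with the pod checks listed under Troubleshooting.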
|
||||
|
||||
### Step 5 — Upgrade
|
||||
|
||||
When upgrading, **always update the Helm repo first**:
|
||||
|
||||
```bash
|
||||
helm repo update
helm upgrade kubeshark kubeshark/kubeshark \
  -f ~/.kubeshark/values.yaml \
  -n kubeshark
|
||||
```
|
||||
|
||||
## Uninstalling
|
||||
|
||||
**Via CLI:**
|
||||
|
||||
```bash
|
||||
kubeshark clean
|
||||
kubeshark clean -s kubeshark # Specific namespace
|
||||
```
|
||||
|
||||
**Via Helm:**
|
||||
|
||||
```bash
|
||||
helm uninstall kubeshark -n kubeshark
|
||||
```
|
||||
|
||||
PersistentVolumeClaims are not deleted by default. Remove manually if needed:
|
||||
|
||||
```bash
|
||||
kubectl delete pvc -l app.kubernetes.io/name=kubeshark -n kubeshark
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **Pods not starting**: Check `kubectl get pods -l app.kubernetes.io/name=kubeshark -n <ns>`
|
||||
and `kubectl describe pod`. Common: ImagePullBackOff (registry), Pending (storage/resources),
|
||||
CrashLoopBackOff (check `kubectl logs`).
|
||||
- **No traffic**: Verify namespaces have running pods, check pod regex, ensure eBPF is supported
  (kernel 4.14+, 5.4+ recommended).
|
||||
- **Permissions**: Requires privileged containers with NET_RAW, NET_ADMIN, SYS_ADMIN,
|
||||
SYS_PTRACE, SYS_RESOURCE, IPC_LOCK capabilities.
|
||||
- **Storage**: Verify storage class exists (`kubectl get sc`), PVC is bound (`kubectl get pvc`).
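The kernel requirement from the "No traffic" item can be checked with a short script. This is a sketch only: `kernel_at_least` is a hypothetical helper, it assumes `sort -V` is available, and it has to run on a cluster node (for example from a debug shell), not on the workstation.

```bash
# kernel_at_least MIN [RELEASE]: succeeds when RELEASE (default: `uname -r`)
# is at least MIN. Hypothetical helper; assumes `sort -V` is available.
kernel_at_least() {
  min=$1
  release=${2:-$(uname -r)}
  release=${release%%-*}   # "5.15.0-91-generic" -> "5.15.0"
  [ "$(printf '%s\n%s\n' "$min" "$release" | sort -V | head -n 1)" = "$min" ]
}
```

For example, `kernel_at_least 4.14 || echo "kernel too old for eBPF capture"`, and `kernel_at_least 5.4` to test against the recommended version.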
|
||||
|
||||
## Setup Reference
|
||||
|
||||
### Kubeshark MCP for AI Agents
|
||||
|
||||
After installation, connect the Kubeshark MCP so AI agents can interact with Kubeshark:
|
||||
|
||||
```bash
|
||||
# Claude Code
|
||||
claude mcp add kubeshark -- kubeshark mcp
|
||||
|
||||
# Direct URL (no kubectl needed)
|
||||
claude mcp add kubeshark -- kubeshark mcp --url https://kubeshark.example.com
|
||||
```
|
||||
96
skills/install/references/cloud-storage.md
Normal file
@@ -0,0 +1,96 @@
|
||||
# Cloud Storage for Snapshots
|
||||
|
||||
This is a pointer to the authoritative cloud storage documentation maintained in
|
||||
the Helm chart:
|
||||
|
||||
**Source of truth**: `helm-chart/docs/snapshots_cloud_storage.md`
|
||||
|
||||
Always read that file for the latest configuration details, including:
|
||||
|
||||
- Amazon S3 (static credentials, IRSA, cross-account AssumeRole)
|
||||
- Azure Blob Storage (storage key, Workload Identity / DefaultAzureCredential)
|
||||
- Google Cloud Storage (service account JSON, GKE Workload Identity)
|
||||
- IAM permissions and trust policy examples
|
||||
- ConfigMap and Secret setup patterns
|
||||
- Inline values vs. external ConfigMap/Secret approaches
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Helm Values Structure
|
||||
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: ""       # "s3", "azblob", or "gcs" (empty = disabled)
      prefix: ""         # Key prefix in the bucket/container
      configMaps: []     # Pre-existing ConfigMaps with cloud config env vars
      secrets: []        # Pre-existing Secrets with cloud credentials
      s3:
        bucket: ""
        region: ""
        accessKey: ""
        secretKey: ""
        roleArn: ""
        externalId: ""
      azblob:
        storageAccount: ""
        container: ""
        storageKey: ""
      gcs:
        bucket: ""
        project: ""
        credentialsJson: ""
```
|
||||
|
||||
### Recommended Auth Per Provider
|
||||
|
||||
| Provider | Production Recommendation |
|----------|-------------------------|
| S3 (EKS) | IRSA (IAM Roles for Service Accounts) — no static credentials |
| S3 (non-EKS) | Static credentials via Secret, or default AWS credential chain |
| Azure Blob (AKS) | Workload Identity / Managed Identity |
| Azure Blob (non-AKS) | Storage account key via Secret |
| GCS (GKE) | GKE Workload Identity — no JSON key file |
| GCS (non-GKE) | Service account JSON key via Secret |
|
||||
|
||||
### Inline Values (Simplest Approach)
|
||||
|
||||
Set the provider settings, and optionally static credentials, directly in
`values.yaml`. The Helm chart creates the necessary ConfigMap/Secret resources
automatically.
|
||||
|
||||
**S3:**
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: "s3"
      s3:
        bucket: my-kubeshark-snapshots
        region: us-east-1
```
|
||||
|
||||
**GCS:**
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: "gcs"
      gcs:
        bucket: my-kubeshark-snapshots
        project: my-gcp-project
```
|
||||
|
||||
**Azure Blob:**
|
||||
```yaml
tap:
  snapshots:
    cloud:
      provider: "azblob"
      azblob:
        storageAccount: mykubesharksa
        container: snapshots
```
|
||||
|
||||
For production setups with proper IAM integration, see the full documentation
|
||||
in `helm-chart/docs/snapshots_cloud_storage.md`.
|
||||
376
skills/install/references/helm-values.md
Normal file
@@ -0,0 +1,376 @@
|
||||
# Kubeshark Helm Values Reference
|
||||
|
||||
Complete reference for all Kubeshark Helm chart values. Use this when building
|
||||
custom `values.yaml` files or `--set` flags.
|
||||
|
||||
## Docker Images
|
||||
|
||||
```yaml
tap:
  docker:
    registry: docker.io/kubeshark    # Docker registry
    tag: ""                          # Image tag (empty = chart appVersion)
    tagLocked: true                  # Lock to specific tag
    imagePullPolicy: Always          # Always, IfNotPresent, Never
    imagePullSecrets: []             # Registry pull secrets
    overrideImage:                   # Override individual component images
      worker: ""
      hub: ""
      front: ""
    overrideTag:                     # Override individual component tags
      worker: ""
      hub: ""
      front: ""
```
|
||||
|
||||
## Proxy / Port-Forward
|
||||
|
||||
```yaml
tap:
  proxy:
    worker:
      srvPort: 48999
    hub:
      srvPort: 8898
    front:
      port: 8899           # Local port for port-forward
    host: 127.0.0.1        # Bind address
```
|
||||
|
||||
## Pod Targeting
|
||||
|
||||
```yaml
tap:
  regex: .*                  # Pod name regex filter
  namespaces: []             # Target namespaces (empty = all)
  excludedNamespaces: []     # Namespaces to exclude
  bpfOverride: ""            # Custom BPF filter override
```
|
||||
|
||||
## Capture & Dissection
|
||||
|
||||
```yaml
tap:
  capture:
    dissection:
      enabled: true        # Enable L7 dissection
      stopAfter: 5m        # Auto-stop dissection after duration
    captureSelf: false     # Capture Kubeshark's own traffic
    raw:
      enabled: true        # Enable raw packet capture (needed for snapshots)
      storageSize: 1Gi     # FIFO buffer size per node
    dbMaxSize: 500Mi       # Max L7 database size per node
  delayedDissection:
    cpu: "1"               # CPU for delayed dissection jobs
    memory: 4Gi            # Memory for delayed dissection jobs
    storageSize: ""        # Storage for delayed dissection
    storageClass: ""       # Storage class for delayed dissection
```
|
||||
|
||||
## Snapshots
|
||||
|
||||
```yaml
tap:
  snapshots:
    local:
      storageClass: ""         # Storage class for local snapshots
      storageSize: 20Gi        # PVC size for local snapshots
    cloud:
      provider: ""             # s3, gcs, or azblob
      prefix: ""               # Path prefix in bucket
      configMaps: []           # Additional ConfigMaps to mount
      secrets: []              # Additional Secrets to mount
      s3:
        bucket: ""
        region: ""
        accessKey: ""
        secretKey: ""
        roleArn: ""            # IAM role ARN (IRSA)
        externalId: ""         # STS external ID
      azblob:
        storageAccount: ""
        container: ""
        storageKey: ""
      gcs:
        bucket: ""
        project: ""
        credentialsJson: ""    # Service account JSON
```
|
||||
|
||||
## Helm Release
|
||||
|
||||
```yaml
tap:
  release:
    repo: https://helm.kubeshark.com   # Helm chart repository
    name: kubeshark                    # Release name
    namespace: default                 # Release namespace
    helmChartPath: ""                  # Path to local chart (overrides repo)
```
|
||||
|
||||
## Storage
|
||||
|
||||
```yaml
tap:
  persistentStorage: false                     # Enable PVC for worker data
  persistentStorageStatic: false               # Static provisioning
  persistentStoragePvcVolumeMode: FileSystem   # FileSystem or Block
  efsFileSytemIdAndPath: ""                    # EFS file system ID (EKS)
  secrets: []                                  # Additional secrets to mount
  storageLimit: 10Gi                           # Max storage per node
  storageClass: standard                       # Default storage class
```
|
||||
|
||||
## Resources
|
||||
|
||||
```yaml
tap:
  resources:
    hub:
      limits:
        cpu: "0"        # 0 = no limit
        memory: 5Gi
      requests:
        cpu: 50m
        memory: 50Mi
    sniffer:
      limits:
        cpu: "0"
        memory: 5Gi
      requests:
        cpu: 50m
        memory: 50Mi
    tracer:
      limits:
        cpu: "0"
        memory: 5Gi
      requests:
        cpu: 50m
        memory: 50Mi
```
|
||||
|
||||
## Health Probes
|
||||
|
||||
```yaml
tap:
  probes:
    hub:
      initialDelaySeconds: 5
      periodSeconds: 5
      successThreshold: 1
      failureThreshold: 3
    sniffer:
      initialDelaySeconds: 5
      periodSeconds: 5
      successThreshold: 1
      failureThreshold: 3
```
|
||||
|
||||
## TLS & Service Mesh
|
||||
|
||||
```yaml
tap:
  serviceMesh: true       # Capture mTLS traffic (service mesh)
  tls: true               # Capture OpenSSL/Go TLS traffic
  disableTlsLog: true     # Suppress TLS debug logging
  packetCapture: best     # Capture method: best, af_packet, pcap
```
|
||||
|
||||
## Labels, Annotations & Scheduling
|
||||
|
||||
```yaml
tap:
  labels: {}                    # Additional labels for all pods
  annotations: {}               # Additional annotations for all pods
  nodeSelectorTerms:
    hub:                        # Hub pod node selector
      - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values: [linux]
    workers:                    # Worker DaemonSet node selector
      - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values: [linux]
    front:                      # Frontend pod node selector
      - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values: [linux]
  tolerations:
    hub: []
    workers:
      - operator: Exists
        effect: NoExecute       # Workers tolerate NoExecute by default
    front: []
  priorityClass: ""             # PriorityClassName for pods
```
|
||||
|
||||
## Authentication
|
||||
|
||||
```yaml
|
||||
tap:
|
||||
auth:
|
||||
enabled: false
|
||||
type: saml # Only SAML supported currently
|
||||
roles:
|
||||
admin:
|
||||
filter: "" # KFL filter restricting visible traffic
|
||||
canDownloadPCAP: true
|
||||
canUseScripting: true
|
||||
scriptingPermissions:
|
||||
canSave: true
|
||||
canActivate: true
|
||||
canDelete: true
|
||||
canUpdateTargetedPods: true
|
||||
canStopTrafficCapturing: true
|
||||
canControlDissection: true
|
||||
showAdminConsoleLink: true
|
||||
rolesClaim: role # SAML attribute for role mapping
|
||||
defaultRole: "" # Role for users without a role claim
|
||||
defaultFilter: "" # Default KFL filter for all users
|
||||
saml:
|
||||
idpMetadataUrl: "" # SAML IdP metadata URL
|
||||
x509crt: "" # SP certificate
|
||||
x509key: "" # SP private key
|
||||
```
|
||||
|
||||
## Ingress
|
||||
|
||||
```yaml
tap:
  ingress:
    enabled: false
    className: ""                 # nginx, alb, traefik, etc.
    host: ks.svc.cluster.local
    path: /
    tls: []                       # TLS configuration
    annotations: {}               # Ingress annotations
```
|
||||
|
||||
## Protocol Dissectors
|
||||
|
||||
```yaml
tap:
  enabledDissectors:
    - amqp
    - dns
    - http
    - icmp
    - kafka
    - mongodb
    - mysql
    - postgresql
    - redis
    - ws
    - ldap
    - radius
    - diameter
    - udp-flow
    - tcp-flow
    - udp-conn
    - tcp-conn
  portMapping:                    # Default port-to-protocol mappings
    http: [80, 443, 8080]
    amqp: [5671, 5672]
    kafka: [9092]
    mongodb: [27017]
    mysql: [3306]
    postgresql: [5432]
    redis: [6379]
    ldap: [389]
    diameter: [3868]
  customMacros:
    https: "tls and (http or http2)"
```
|
||||
|
||||
## Networking & Security
|
||||
|
||||
```yaml
tap:
  hostNetwork: true       # Use host network (required for capture)
  ipv6: true              # Enable IPv6 support
  mountBpf: true          # Mount BPF filesystem
  securityContext:
    privileged: true
    appArmorProfile:
      type: ""
      localhostProfile: ""
    seLinuxOptions:
      level: ""
      role: ""
      type: ""
      user: ""
    capabilities:
      networkCapture: [NET_RAW, NET_ADMIN]
      serviceMeshCapture: [SYS_ADMIN, SYS_PTRACE, DAC_OVERRIDE]
      ebpfCapture: [SYS_ADMIN, SYS_PTRACE, SYS_RESOURCE, IPC_LOCK]
```
|
||||
|
||||
## Dashboard
|
||||
|
||||
```yaml
|
||||
tap:
|
||||
dashboard:
|
||||
streamingType: connect-rpc
|
||||
completeStreamingEnabled: true
|
||||
clusterWideMapEnabled: false
|
||||
entriesLimit: "300000"
|
||||
routing:
|
||||
front:
|
||||
basePath: "" # Base path for reverse proxy
|
||||
```
|
||||
|
||||
## Scripting
|
||||
|
||||
```yaml
scripting:
  enabled: false
  env: {}                # Environment variables for scripts
  source: ""             # Git repo for scripts
  sources: []            # Multiple script sources
  watchScripts: true     # Watch for script changes
  active: []             # Active scripts
  console: true          # Enable script console
```
|
||||
|
||||
## Misc
|
||||
|
||||
```yaml
|
||||
tap:
|
||||
dryRun: false # Preview targeted pods without deploying
|
||||
debug: false # Enable debug mode
|
||||
telemetry:
|
||||
enabled: true # Anonymous usage telemetry
|
||||
resourceGuard:
|
||||
enabled: false # Resource usage guard
|
||||
watchdog:
|
||||
enabled: false # Watchdog process
|
||||
gitops:
|
||||
enabled: false # GitOps mode
|
||||
defaultFilter: "" # Default KFL display filter
|
||||
globalFilter: "" # Global KFL filter (cannot be overridden)
|
||||
dns:
|
||||
nameservers: [] # Custom DNS nameservers
|
||||
searches: [] # Custom DNS search domains
|
||||
options: [] # Custom DNS options
|
||||
misc:
|
||||
jsonTTL: 5m # TTL for JSON entries
|
||||
pcapTTL: "0" # TTL for PCAP files (0 = no TTL)
|
||||
trafficSampleRate: 100 # Traffic sampling rate (1-100)
|
||||
resolutionStrategy: auto # IP resolution: auto, dns, k8s
|
||||
detectDuplicates: false # Detect duplicate packets
|
||||
staleTimeoutSeconds: 30 # Timeout for stale connections
|
||||
tcpFlowTimeout: 1200 # TCP flow idle timeout (seconds)
|
||||
udpFlowTimeout: 1200 # UDP flow idle timeout (seconds)
|
||||
|
||||
headless: false # Suppress browser auto-open
|
||||
license: "" # Kubeshark Pro license key
|
||||
timezone: "" # Override timezone
|
||||
logLevel: warning # Log level: debug, info, warning, error
|
||||
|
||||
kube:
|
||||
configPath: "" # Custom kubeconfig path
|
||||
context: "" # Kubernetes context name
|
||||
```
|
||||
401
skills/kfl/SKILL.md
Normal file
@@ -0,0 +1,401 @@
|
||||
---
|
||||
name: kfl
|
||||
user-invocable: false
|
||||
description: >
  KFL2 (Kubeshark Filter Language) reference. This skill MUST be loaded before
  writing, constructing, or suggesting any KFL filter expression. KFL is statically
  typed — incorrect field names or syntax will fail silently or error. Do not guess
  at KFL syntax without this skill loaded. Trigger on any mention of KFL, CEL filters,
  traffic filtering, display filters, query syntax, filter expressions, write a filter,
  construct a query, build a KFL, create a filter expression, "how do I filter",
  "show me only", "find traffic where", protocol-specific queries (HTTP status codes,
  DNS lookups, Redis commands, Kafka topics), Kubernetes-aware filtering (by namespace,
  pod, service, label, annotation), L4 connection/flow filters, time-based queries,
  or any request to slice/search/narrow network traffic in Kubeshark. Also trigger
  when other skills need to construct filters — KFL is the query language for all
  Kubeshark traffic analysis.
|
||||
last-updated: 2026-05-08
|
||||
---
|
||||
|
||||
# KFL2 — Kubeshark Filter Language
|
||||
|
||||
You are a KFL2 expert. KFL2 is built on Google's CEL (Common Expression Language)
|
||||
and is the query language for all Kubeshark traffic analysis. It operates as a
|
||||
**display filter** — it doesn't affect what's captured, only what you see.
|
||||
|
||||
Think of KFL the way you think of SQL for databases or Google search syntax for
|
||||
the web. Kubeshark captures and indexes all cluster traffic; KFL is how you
|
||||
search it.
|
||||
|
||||
For the complete variable and field reference, see `references/kfl2-reference.md`.
|
||||
|
||||
## Core Syntax
|
||||
|
||||
KFL expressions are boolean CEL expressions. An empty filter matches everything.
|
||||
|
||||
### Operators
|
||||
|
||||
| Category | Operators |
|----------|-----------|
| Comparison | `==`, `!=`, `<`, `<=`, `>`, `>=` |
| Logical | `&&`, `\|\|`, `!` |
| Arithmetic | `+`, `-`, `*`, `/`, `%` |
| Membership | `in` |
| Ternary | `condition ? true_val : false_val` |
|
||||
|
||||
### String Functions
|
||||
|
||||
```
|
||||
str.contains(substring) // Substring search
|
||||
str.startsWith(prefix) // Prefix match
|
||||
str.endsWith(suffix) // Suffix match
|
||||
str.matches(regex) // Regex match
|
||||
size(str) // String length
|
||||
```
|
||||
|
||||
### Collection Functions
|
||||
|
||||
```
|
||||
size(collection) // List/map/string length
|
||||
key in map // Key existence
|
||||
map[key] // Value access
|
||||
map_get(map, key, default) // Safe access with default
|
||||
value in list // List membership
|
||||
```
|
||||
|
||||
### Time Functions
|
||||
|
||||
```
|
||||
timestamp("2026-03-14T22:00:00Z") // Parse ISO timestamp
|
||||
duration("5m") // Parse duration
|
||||
now() // Current time (snapshot at filter creation)
|
||||
```
|
||||
|
||||
### Negation
|
||||
|
||||
```
|
||||
!http // Everything that is NOT HTTP
|
||||
http && status_code != 200 // HTTP responses that aren't 200
|
||||
http && !path.contains("/health") // Exclude health checks
|
||||
!(src.pod.namespace == "kube-system") // Exclude system namespace
|
||||
```
|
||||
|
||||
## Protocol Detection
|
||||
|
||||
Boolean flags that indicate which protocol was detected. Use these as the first
|
||||
filter term — they're fast and narrow the search space immediately.
|
||||
|
||||
| Flag | Protocol | Flag | Protocol |
|
||||
|------|----------|------|----------|
|
||||
| `http` | HTTP/1.1, HTTP/2 | `redis` | Redis |
|
||||
| `dns` | DNS | `kafka` | Kafka |
|
||||
| `tls` | eBPF TLS interception | `amqp` | AMQP |
|
||||
| `tcp` | TCP | `ldap` | LDAP |
|
||||
| `udp` | UDP | `ws` | WebSocket |
|
||||
| `sctp` | SCTP | `gql` | GraphQL (v1+v2) |
|
||||
| `icmp` | ICMP | `gqlv1` / `gqlv2` | GraphQL version-specific |
|
||||
| `grpc` | gRPC (HTTP/2 sub-protocol) | `mongodb` | MongoDB |
|
||||
| `mysql` | MySQL | `postgresql` | PostgreSQL |
|
||||
| `radius` | RADIUS | `tcp_flow` / `udp_flow` | Transport-specific flows |
|
||||
| `diameter` | Diameter | `conn` / `flow` | L4 connection/flow tracking |
|
||||
| | | `tcp_conn` / `udp_conn` | Transport-specific connections |
|
||||
|
||||
## Kubernetes Context
|
||||
|
||||
The most common starting point. Filter by where traffic originates or terminates.
|
||||
|
||||
### Pod and Service Fields
|
||||
|
||||
```
|
||||
src.pod.name == "orders-594487879c-7ddxf"
|
||||
dst.pod.namespace == "production"
|
||||
src.service.name == "api-gateway"
|
||||
dst.service.namespace == "payments"
|
||||
```
|
||||
|
||||
Pod fields fall back to service data when pod info is unavailable, so
|
||||
`dst.pod.namespace` works even for service-level entries.
|
||||
|
||||
### Summary Name and Namespace
|
||||
|
||||
Convenience variables that pick the best available identity for a peer:
|
||||
|
||||
```
|
||||
src.name == "api-gateway" // pod > service > dns > process
|
||||
dst.name.contains("payment") // works across identity types
|
||||
src.namespace == "production" // pod namespace, falls back to service
|
||||
dst.namespace != "kube-system" // exclude system namespace
|
||||
```
|
||||
|
||||
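The precedence noted in the first comment above (pod > service > dns > process) can be modeled as a first-non-empty lookup. The Python sketch below only illustrates that rule: `summary_name` and its parameters are hypothetical names, and it assumes an unavailable identity is represented as an empty string, which the source does not state.

```python
def summary_name(pod: str = "", service: str = "", dns: str = "", process: str = "") -> str:
    """Return the first available identity, in the documented order.

    Hypothetical helper: models the `src.name` / `dst.name` precedence
    (pod > service > dns > process); it is not Kubeshark's implementation.
    """
    for candidate in (pod, service, dns, process):
        if candidate:  # assumption: a missing identity is an empty string
            return candidate
    return ""


# A peer with no pod info but a known service resolves to the service name.
print(summary_name(service="api-gateway", dns="api.example.com"))
```

This is why `dst.name.contains("payment")` can match entries whether the peer was identified as a pod, a service, a DNS name, or a process.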
### Aggregate Collections
|
||||
|
||||
Match against any direction (src or dst):
|
||||
|
||||
```
|
||||
"production" in namespaces // Any namespace match
|
||||
"orders" in pods // Any pod name match
|
||||
"api-gateway" in services // Any service name match
|
||||
```
|
||||
|
||||
### Labels and Annotations
|
||||
|
||||
```
|
||||
map_get(local_labels, "app", "") == "checkout" // Safe access with default
|
||||
map_get(remote_labels, "version", "") == "canary"
|
||||
"tier" in local_labels // Label existence check
|
||||
```
|
||||
|
||||
Always use `map_get()` for labels and annotations — direct access like
|
||||
`local_labels["app"]` errors if the key doesn't exist.
|
||||
|
||||
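When filters are generated by a script or another skill, the `map_get()` rule can be applied mechanically. The following Python sketch builds the safe form of a label comparison; `label_equals` is a hypothetical helper, and using `json.dumps` to quote values assumes JSON string escaping is acceptable as CEL string escaping, which holds for ordinary label keys and values.

```python
import json


def label_equals(label_map: str, key: str, expected: str) -> str:
    """Build a `map_get(...) == ...` clause for a label or annotation map.

    json.dumps produces a double-quoted, escaped string literal; treating
    that as a valid CEL literal is an assumption of this sketch.
    """
    return f'map_get({label_map}, {json.dumps(key)}, "") == {json.dumps(expected)}'


print(label_equals("local_labels", "app", "checkout"))
```

Because the default is the empty string, entries without the label simply fail the comparison instead of raising a missing-key error.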
### Node and Process
|
||||
|
||||
```
|
||||
node_name == "ip-10-0-25-170.ec2.internal"
|
||||
local_process_name == "nginx"
|
||||
remote_process_name.contains("postgres")
|
||||
```
|
||||
|
||||
### DNS Resolution
|
||||
|
||||
```
|
||||
src.dns == "api.example.com"
|
||||
dst.dns.contains("redis")
|
||||
```
|
||||
|
||||
## HTTP Filtering
|
||||
|
||||
HTTP is the most common protocol for API-level investigation.
|
||||
|
||||
### Fields
|
||||
|
||||
| Field | Type | Example |
|
||||
|-------|------|---------|
|
||||
| `method` | string | `"GET"`, `"POST"`, `"PUT"`, `"DELETE"` |
|
||||
| `url` | string | Full path + query: `"/api/users?id=123"` |
|
||||
| `path` | string | Path only: `"/api/users"` |
|
||||
| `status_code` | int | `200`, `404`, `500` |
|
||||
| `http_version` | string | `"HTTP/1.1"`, `"HTTP/2"` |
|
||||
| `request.headers` | map | `request.headers["content-type"]` |
|
||||
| `response.headers` | map | `response.headers["server"]` |
|
||||
| `request.cookies` | map | `request.cookies["session"]` |
|
||||
| `response.cookies` | map | `response.cookies["token"]` |
|
||||
| `query_string` | map | `query_string["id"]` |
|
||||
| `request_body_size` | int | Request body bytes |
|
||||
| `response_body_size` | int | Response body bytes |
|
||||
| `elapsed_time` | int | Duration in **microseconds** |
|
||||
|
||||
### Common Patterns
|
||||
|
||||
```
|
||||
// Error investigation
|
||||
http && status_code >= 500 // Server errors
|
||||
http && status_code == 429 // Rate limiting
|
||||
http && status_code >= 400 && status_code < 500 // Client errors
|
||||
|
||||
// Endpoint targeting
|
||||
http && method == "POST" && path.contains("/orders")
|
||||
http && url.matches(".*/api/v[0-9]+/users.*")
|
||||
|
||||
// Performance
|
||||
http && elapsed_time > 5000000 // > 5 seconds
|
||||
http && response_body_size > 1000000 // > 1MB responses
|
||||
|
||||
// Header inspection
|
||||
http && "authorization" in request.headers
|
||||
http && "content-type" in request.headers && request.headers["content-type"] == "application/json"
|
||||
|
||||
// GraphQL (subset of HTTP)
|
||||
gql && method == "POST" && status_code >= 400
|
||||
|
||||
// Only eBPF-intercepted TLS traffic (decrypted HTTPS)
|
||||
tls && http && status_code >= 500
|
||||
```
|
||||
|
||||
> **Note on `tls`**: The `tls` flag is an alias for `capture_source == "ebpf_tls"`.
|
||||
> It indicates traffic captured via eBPF TLS interception, not TLS protocol dissection.
|
||||
|
||||
## DNS Filtering
|
||||
|
||||
DNS issues are often the hidden root cause of outages.
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `dns_questions` | []string | Question domain names |
|
||||
| `dns_answers` | []string | Answer domain names |
|
||||
| `dns_question_types` | []string | Record types: A, AAAA, CNAME, MX, TXT, SRV, PTR |
|
||||
| `dns_request` | bool | Is request |
|
||||
| `dns_response` | bool | Is response |
|
||||
| `dns_request_length` | int | Request size |
|
||||
| `dns_response_length` | int | Response size |
|
||||
|
||||
```
|
||||
dns && "api.external-service.com" in dns_questions
|
||||
dns && dns_response && status_code != 0 // Failed lookups
|
||||
dns && "A" in dns_question_types // A record queries
|
||||
dns && size(dns_questions) > 1 // Multi-question
|
||||
```
|
||||
|
||||
## Database and Messaging Protocols
|
||||
|
||||
### Redis
|
||||
|
||||
```
|
||||
redis && redis_type == "GET" // Command type
|
||||
redis && redis_key.startsWith("session:") // Key pattern
|
||||
redis && redis_command.contains("DEL") // Command search
|
||||
redis && redis_total_size > 10000 // Large operations
|
||||
```
|
||||
|
||||
### Kafka
|
||||
|
||||
```
|
||||
kafka && kafka_api_key_name == "PRODUCE" // Produce operations
|
||||
kafka && kafka_client_id == "payment-processor" // Client filtering
|
||||
kafka && kafka_request_summary.contains("orders") // Topic filtering
|
||||
kafka && kafka_size > 10000 // Large messages
|
||||
```
|
||||
|
||||
### MongoDB
|
||||
|
||||
```
|
||||
mongodb && mongodb_command == "find" // Find operations
|
||||
mongodb && mongodb_collection == "users" // Collection filtering
|
||||
mongodb && mongodb_database == "mydb" // Database filtering
|
||||
mongodb && !mongodb_success // Failed operations
|
||||
mongodb && mongodb_error_code != 0 // Error code filtering
|
||||
mongodb && mongodb_total_size > 10000 // Large operations
|
||||
```
|
||||
|
||||
### MySQL
|
||||
|
||||
```
|
||||
mysql && mysql_command == "COM_QUERY" // SQL queries
|
||||
mysql && mysql_query.contains("SELECT") // SELECT statements
|
||||
mysql && mysql_database == "orders_db" // Database filtering
|
||||
mysql && !mysql_success // Failed queries
|
||||
mysql && mysql_error_code != 0 // Error code filtering
|
||||
mysql && mysql_total_size > 10000 // Large queries
|
||||
```
|
||||
|
||||
### PostgreSQL
|
||||
|
||||
```
|
||||
postgresql && postgresql_command == "SELECT"         // Command tag filtering
|
||||
postgresql && postgresql_query.contains("SELECT") // SELECT statements
|
||||
postgresql && postgresql_database == "orders_db" // Database filtering
|
||||
postgresql && postgresql_user == "admin" // User filtering
|
||||
postgresql && !postgresql_success // Failed queries
|
||||
postgresql && postgresql_error_code != "" // Error code filtering (SQLSTATE string)
|
||||
postgresql && postgresql_total_size > 10000 // Large queries
|
||||
```
|
||||
|
||||
> **Note**: `postgresql_error_code` is a **string** (SQLSTATE code like `"23505"`),
|
||||
> not an int. This differs from MySQL's `mysql_error_code` which is an int.
|
||||
|
||||
### gRPC
|
||||
|
||||
gRPC is a sub-protocol of HTTP/2. All HTTP variables are also available on gRPC entries.
|
||||
|
||||
```
|
||||
grpc && grpc_method == "SayHello" // Method filtering
|
||||
grpc && grpc_status != 0 // Non-OK status codes
|
||||
grpc && grpc_status == 14 // UNAVAILABLE
|
||||
grpc && grpc_method.contains("Create") // Method pattern
|
||||
grpc && elapsed_time > 1000000 // Slow gRPC calls (>1s)
|
||||
```
|
||||
|
||||
### AMQP, LDAP, RADIUS, Diameter
|
||||
|
||||
```
|
||||
amqp && amqp_method == "basic.publish" // AMQP publish
|
||||
ldap && ldap_type == "bind" // LDAP bind requests
|
||||
radius && radius_code_name == "Access-Request" // RADIUS auth
|
||||
diameter && diameter_method.contains("Credit") // Diameter credit control
|
||||
```
|
||||
|
||||
For the full variable list for these protocols, see `references/kfl2-reference.md`.
|
||||
|
||||
## Transport Layer (L4)
|
||||
|
||||
### TCP/UDP Fields
|
||||
|
||||
```
|
||||
tcp && tcp_error_type != "" // TCP errors
|
||||
udp && udp_length > 1000 // Large UDP packets
|
||||
```
|
||||
|
||||
### Connection Tracking
|
||||
|
||||
```
|
||||
conn && conn_state == "open" // Active connections
|
||||
conn && conn_local_bytes > 1000000 // High-volume
|
||||
conn && "HTTP" in conn_l7_detected // L7 protocol detection
|
||||
tcp_conn && conn_state == "closed" // Closed TCP connections
|
||||
```
|
||||
|
||||
### Flow Tracking (with Rate Metrics)
|
||||
|
||||
```
|
||||
flow && flow_local_pps > 1000 // High packet rate
|
||||
flow && flow_local_bps > 1000000 // High bandwidth
|
||||
flow && flow_state == "closed" && "TLS" in flow_l7_detected
|
||||
tcp_flow && flow_local_bps > 5000000 // High-throughput TCP
|
||||
```
|
||||
|
||||
## Network Layer
|
||||
|
||||
```
|
||||
src.ip == "10.0.53.101"
|
||||
dst.ip.startsWith("192.168.")
|
||||
src.port == 8080
|
||||
dst.port >= 8000 && dst.port <= 9000
|
||||
```
|
||||
|
||||
## Time-Based Filtering
|
||||
|
||||
```
|
||||
timestamp > timestamp("2026-03-14T22:00:00Z")
|
||||
timestamp >= timestamp("2026-03-14T22:00:00Z") && timestamp <= timestamp("2026-03-14T23:00:00Z")
|
||||
timestamp > now() - duration("5m") // Last 5 minutes
|
||||
elapsed_time > 2000000 // Latency > 2 seconds
|
||||
```
|
||||
|
||||
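Absolute time ranges are easy to get wrong by hand, so a script can render them instead. The Python sketch below formats timezone-aware datetimes into the `timestamp("...")` form used above; `kfl_timestamp` and `time_window` are hypothetical helper names, not part of Kubeshark.

```python
from datetime import datetime, timezone


def kfl_timestamp(moment: datetime) -> str:
    """Render a timezone-aware datetime as a KFL `timestamp("...")` call in UTC."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time surprises")
    text = moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f'timestamp("{text}")'


def time_window(start: datetime, end: datetime) -> str:
    """Build an inclusive time-range clause on the `timestamp` variable."""
    return f"timestamp >= {kfl_timestamp(start)} && timestamp <= {kfl_timestamp(end)}"


start = datetime(2026, 3, 14, 22, 0, tzinfo=timezone.utc)
end = datetime(2026, 3, 14, 23, 0, tzinfo=timezone.utc)
print(time_window(start, end))
```

Rejecting naive datetimes is a design choice of this sketch: entry timestamps are UTC, so silently assuming local time would shift the window.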
## Building Filters: Progressive Narrowing
|
||||
|
||||
The most effective investigation technique — start broad, add constraints:
|
||||
|
||||
```
|
||||
// Step 1: Protocol + namespace
|
||||
http && dst.pod.namespace == "production"
|
||||
|
||||
// Step 2: Add error condition
|
||||
http && dst.pod.namespace == "production" && status_code >= 500
|
||||
|
||||
// Step 3: Narrow to service
|
||||
http && dst.pod.namespace == "production" && status_code >= 500 && dst.service.name == "payment-service"
|
||||
|
||||
// Step 4: Narrow to endpoint
|
||||
http && dst.pod.namespace == "production" && status_code >= 500 && dst.service.name == "payment-service" && path.contains("/charge")
|
||||
|
||||
// Step 5: Add timing
|
||||
http && dst.pod.namespace == "production" && status_code >= 500 && dst.service.name == "payment-service" && path.contains("/charge") && elapsed_time > 2000000
|
||||
```
|
||||
|
||||
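The narrowing steps above are cumulative: each step is the previous filter plus one more clause joined with `&&`. A minimal Python sketch of that pattern follows; `narrowing_steps` is a hypothetical helper, and the clauses are copied from the example above.

```python
from itertools import accumulate


def narrowing_steps(clauses: list[str]) -> list[str]:
    """Return cumulative filters: each one adds the next clause with `&&`."""
    return list(accumulate(clauses, lambda acc, clause: f"{acc} && {clause}"))


clauses = [
    "http",
    'dst.pod.namespace == "production"',
    "status_code >= 500",
    'dst.service.name == "payment-service"',
    'path.contains("/charge")',
    "elapsed_time > 2000000",
]

# Skip the bare protocol flag so the output starts at "Step 1" above.
for number, step in enumerate(narrowing_steps(clauses)[1:], start=1):
    print(f"Step {number}: {step}")
```

Keeping the clauses in a list makes it cheap to back out one constraint when a step narrows the results to nothing.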
## Performance Tips
|
||||
|
||||
1. **Protocol flags first** — `http && ...` is faster than `... && http`
|
||||
2. **`startsWith`/`endsWith` over `contains`** — prefix/suffix checks are faster
|
||||
3. **Specific ports before string ops** — `dst.port == 80` is cheaper than `url.contains(...)`
|
||||
4. **Use `map_get` for labels** — avoids errors on missing keys
|
||||
5. **Keep filters simple** — CEL short-circuits on `&&`, so put cheap checks first
|
||||
|
||||
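The ordering tips above can be approximated by sorting clauses on a rough cost rank before joining them. The Python sketch below is a heuristic only: the ranks are assumptions derived from the tips, not measured costs, and `clause_cost` / `order_clauses` are hypothetical names.

```python
import re

# A bare identifier such as `http` or `!dns` is treated as a protocol flag.
PROTOCOL_FLAG = re.compile(r"^!?[a-z][a-z0-9_]*$")


def clause_cost(clause: str) -> int:
    """Rank a clause from cheapest (0) to most expensive (4); rough heuristic."""
    if PROTOCOL_FLAG.match(clause):
        return 0
    if ".matches(" in clause:
        return 4
    if ".contains(" in clause:
        return 3
    if ".startsWith(" in clause or ".endsWith(" in clause:
        return 2
    return 1  # plain comparisons such as `dst.port == 80`


def order_clauses(clauses: list[str]) -> str:
    """Join clauses with `&&`, cheapest first; equal-cost clauses keep their order."""
    return " && ".join(sorted(clauses, key=clause_cost))


print(order_clauses(['url.contains("/orders")', "dst.port == 80", "http"]))
```

The sort is stable, so clauses the author already ordered deliberately within one cost tier are left alone.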
## Type Safety
|
||||
|
||||
KFL2 is statically typed. Common gotchas:
|
||||
|
||||
- `status_code` is `int`, not string — use `status_code == 200`, not `"200"`
|
||||
- `elapsed_time` is in **microseconds** — 5 seconds = `5000000`
|
||||
- `timestamp` requires `timestamp()` function — not a raw string
|
||||
- Map access on missing keys errors — use `key in map` or `map_get()` first
|
||||
- List membership uses `value in list` — not `list.contains(value)`
|
||||
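The microsecond unit is the easiest of these gotchas to trip over. A small conversion helper, sketched below in Python, keeps thresholds readable in seconds; `latency_filter` is a hypothetical name and the helper only formats text.

```python
def latency_filter(protocol_flag: str, seconds: float) -> str:
    """Build `<flag> && elapsed_time > N`, converting seconds to microseconds."""
    micros = round(seconds * 1_000_000)  # elapsed_time is an int in microseconds
    return f"{protocol_flag} && elapsed_time > {micros}"


print(latency_filter("http", 5))
print(latency_filter("grpc", 1))
```

The two printed filters match the 5-second HTTP and 1-second gRPC thresholds used in the examples earlier in this document.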
491
skills/kfl/references/kfl2-reference.md
Normal file
@@ -0,0 +1,491 @@
|
||||
# KFL2 Complete Variable and Field Reference
|
||||
|
||||
> Last synced with [kfl2 repo](https://github.com/kubeshark/kfl2): 2026-05-08
|
||||
|
||||
This is the exhaustive reference for every variable available in KFL2 filters.
|
||||
KFL2 is built on Google's CEL (Common Expression Language) and evaluates against
|
||||
Kubeshark's protobuf-based `BaseEntry` structure.
|
||||
|
||||
## Most Commonly Used Variables
|
||||
|
||||
These are the variables you'll reach for in 90% of investigations:
|
||||
|
||||
| Variable | Type | What it's for |
|
||||
|----------|------|---------------|
|
||||
| `status_code` | int | HTTP response status (200, 404, 500) |
|
||||
| `method` | string | HTTP method (GET, POST, PUT, DELETE) |
|
||||
| `path` | string | URL path without query string |
|
||||
| `dst.pod.namespace` | string | Where traffic is going (namespace) |
|
||||
| `dst.service.name` | string | Where traffic is going (service) |
|
||||
| `src.pod.name` | string | Where traffic comes from (pod) |
|
||||
| `elapsed_time` | int | Request duration in microseconds |
|
||||
| `dns_questions` | []string | DNS domains being queried |
|
||||
| `namespaces` | []string | All namespaces involved (src + dst) |
|
||||
|
||||
## Network-Level Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `src.ip` | string | Source IP address | `"10.0.53.101"` |
|
||||
| `dst.ip` | string | Destination IP address | `"192.168.1.1"` |
|
||||
| `src.port` | int | Source port number | `43210` |
|
||||
| `dst.port` | int | Destination port number | `8080` |
|
||||
| `protocol` | string | Detected protocol type | `"HTTP"`, `"DNS"` |
|
||||
|
||||
## Identity and Metadata Variables
|
||||
|
||||
| Variable | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `id` | int | BaseEntry unique identifier (assigned by sniffer) |
|
||||
| `node_id` | string | Node identifier (assigned by hub) |
|
||||
| `index` | int | Entry index for stream uniqueness |
|
||||
| `stream` | string | Stream identifier (hex string) |
|
||||
| `timestamp` | timestamp | Event time (UTC), use with `timestamp()` function |
|
||||
| `elapsed_time` | int | Response-request latency in microseconds |
|
||||
| `worker` | string | Worker identifier |
|
||||
|
||||
## Cross-Reference Variables
|
||||
|
||||
| Variable | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `conn_id` | int | L7 to L4 connection cross-reference ID |
|
||||
| `flow_id` | int | L7 to L4 flow cross-reference ID |
|
||||
| `has_pcap` | bool | Whether PCAP data is available for this entry |
|
||||
|
||||
## Capture Source Variables
|
||||
|
||||
| Variable | Type | Description | Values |
|
||||
|----------|------|-------------|--------|
|
||||
| `capture_source` | string | Canonical capture source | `"unspecified"`, `"af_packet"`, `"ebpf"`, `"ebpf_tls"` |
|
||||
| `capture_backend` | string | Backend family | `"af_packet"`, `"ebpf"` |
|
||||
| `capture_source_code` | int | Numeric enum | 0=unspecified, 1=af_packet, 2=ebpf, 3=ebpf_tls |
|
||||
| `capture` | map | Nested map access | `capture["source"]`, `capture["backend"]` |
|
||||
|
||||
## Protocol Detection Flags
|
||||
|
||||
Boolean variables indicating detected protocol. Use as first filter term for performance.
|
||||
|
||||
| Variable | Protocol | Variable | Protocol |
|
||||
|----------|----------|----------|----------|
|
||||
| `http` | HTTP/1.1, HTTP/2 | `redis` | Redis |
|
||||
| `dns` | DNS | `kafka` | Kafka |
|
||||
| `tls` | eBPF TLS interception | `amqp` | AMQP messaging |
|
||||
| `tcp` | TCP transport | `ldap` | LDAP directory |
|
||||
| `udp` | UDP transport | `ws` | WebSocket |
|
||||
| `sctp` | SCTP streaming | `gql` | GraphQL (v1 or v2) |
|
||||
| `icmp` | ICMP | `gqlv1` | GraphQL v1 only |
|
||||
| `grpc` | gRPC (HTTP/2 sub-protocol) | `gqlv2` | GraphQL v2 only |
|
||||
| `mongodb` | MongoDB | `mysql` | MySQL |
|
||||
| `postgresql` | PostgreSQL | `diameter` | Diameter |
|
||||
| `radius` | RADIUS auth | `conn` | L4 connection tracking |
|
||||
| `flow` | L4 flow tracking | `tcp_conn` | TCP connection tracking |
|
||||
| `tcp_flow` | TCP flow tracking | `udp_conn` | UDP connection tracking |
|
||||
| `udp_flow` | UDP flow tracking | | |
|
||||
|
||||
## HTTP Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `method` | string | HTTP method | `"GET"`, `"POST"`, `"PUT"`, `"DELETE"`, `"PATCH"` |
|
||||
| `url` | string | Full URL path and query string | `"/api/users?id=123"` |
|
||||
| `path` | string | URL path component (no query) | `"/api/users"` |
|
||||
| `status_code` | int | HTTP response status code | `200`, `404`, `500` |
|
||||
| `http_version` | string | HTTP protocol version | `"HTTP/1.1"`, `"HTTP/2"` |
|
||||
| `query_string` | map[string]string | Parsed URL query parameters | `query_string["id"]` → `"123"` |
|
||||
| `request.headers` | map[string]string | Request HTTP headers | `request.headers["content-type"]` |
|
||||
| `response.headers` | map[string]string | Response HTTP headers | `response.headers["server"]` |
|
||||
| `request.cookies` | map[string]string | Request cookies | `request.cookies["session"]` |
|
||||
| `response.cookies` | map[string]string | Response cookies | `response.cookies["token"]` |
|
||||
| `request_headers_size` | int | Request headers size in bytes | |
|
||||
| `request_body_size` | int | Request body size in bytes | |
|
||||
| `response_headers_size` | int | Response headers size in bytes | |
|
||||
| `response_body_size` | int | Response body size in bytes | |
|
||||
|
||||
GraphQL requests have `gql` (or `gqlv1`/`gqlv2`) set to true and all HTTP
|
||||
variables available.
|
||||
|
||||
**Example**: `http && method == "POST" && status_code >= 500 && path.contains("/api")`
|
||||
|
||||
## DNS Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `dns_questions` | []string | Question domain names (request + response) | `["example.com"]` |
|
||||
| `dns_answers` | []string | Answer domain names | `["1.2.3.4"]` |
|
||||
| `dns_question_types` | []string | Record types in questions | `["A"]`, `["AAAA"]`, `["CNAME"]` |
|
||||
| `dns_request` | bool | Is DNS request message | |
|
||||
| `dns_response` | bool | Is DNS response message | |
|
||||
| `dns_request_length` | int | DNS request size in bytes (0 if absent) | |
|
||||
| `dns_response_length` | int | DNS response size in bytes (0 if absent) | |
|
||||
| `dns_total_size` | int | Sum of request + response sizes | |
|
||||
|
||||
Supported question types: A, AAAA, NS, CNAME, SOA, MX, TXT, SRV, PTR, ANY.
|
||||
|
||||
**Example**: `dns && dns_response && status_code != 0` (failed DNS lookups)
|
||||
|
||||
## TLS Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `tls` | bool | eBPF TLS interception (alias for `capture_source == "ebpf_tls"`) | |
|
||||
| `tls_summary` | string | TLS handshake summary | `"ClientHello"`, `"ServerHello"` |
|
||||
| `tls_info` | string | TLS connection details | `"TLS 1.3, AES-256-GCM"` |
|
||||
| `tls_request_size` | int | TLS request size in bytes | |
|
||||
| `tls_response_size` | int | TLS response size in bytes | |
|
||||
| `tls_total_size` | int | Sum of request + response (computed if not provided) | |
|
||||
|
||||
## TCP Variables
|
||||
|
||||
| Variable | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `tcp` | bool | TCP payload detected |
|
||||
| `tcp_method` | string | TCP method information |
|
||||
| `tcp_payload` | bytes | Raw TCP payload data |
|
||||
| `tcp_error_type` | string | TCP error type (empty if none) |
|
||||
| `tcp_error_message` | string | TCP error message (empty if none) |
|
||||
|
||||
## UDP Variables
|
||||
|
||||
| Variable | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `udp` | bool | UDP payload detected |
|
||||
| `udp_length` | int | UDP packet length |
|
||||
| `udp_checksum` | int | UDP checksum value |
|
||||
| `udp_payload` | bytes | Raw UDP payload data |
|
||||
|
||||
## SCTP Variables
|
||||
|
||||
| Variable | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `sctp` | bool | SCTP payload detected |
|
||||
| `sctp_checksum` | int | SCTP checksum value |
|
||||
| `sctp_chunk_type` | string | SCTP chunk type |
|
||||
| `sctp_length` | int | SCTP chunk length |
|
||||
|
||||
## ICMP Variables
|
||||
|
||||
| Variable | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `icmp` | bool | ICMP payload detected |
|
||||
| `icmp_type` | string | ICMP type code |
|
||||
| `icmp_version` | int | ICMP version (4 or 6) |
|
||||
| `icmp_length` | int | ICMP message length |
|
||||
|
||||
## WebSocket Variables
|
||||
|
||||
| Variable | Type | Description | Values |
|
||||
|----------|------|-------------|--------|
|
||||
| `ws` | bool | WebSocket payload detected | |
|
||||
| `ws_opcode` | string | WebSocket operation code | `"text"`, `"binary"`, `"close"`, `"ping"`, `"pong"` |
|
||||
| `ws_request` | bool | Is WebSocket request | |
|
||||
| `ws_response` | bool | Is WebSocket response | |
|
||||
| `ws_request_payload_data` | string | Request payload (safely truncated) | |
|
||||
| `ws_request_payload_length` | int | Request payload length in bytes | |
|
||||
| `ws_response_payload_length` | int | Response payload length in bytes | |
|
||||
|
||||
## Redis Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `redis` | bool | Redis payload detected | |
|
||||
| `redis_type` | string | Redis command verb | `"GET"`, `"SET"`, `"DEL"`, `"HGET"` |
|
||||
| `redis_command` | string | Full Redis command line | `"GET session:1234"` |
|
||||
| `redis_key` | string | Key (truncated to 64 bytes) | `"session:1234"` |
|
||||
| `redis_request_size` | int | Request size (0 if absent) | |
|
||||
| `redis_response_size` | int | Response size (0 if absent) | |
|
||||
| `redis_total_size` | int | Sum of request + response | |
|
||||
|
||||
**Example**: `redis && redis_type == "GET" && redis_key.startsWith("session:")`
|
||||
|
||||
## Kafka Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `kafka` | bool | Kafka payload detected | |
|
||||
| `kafka_api_key` | int | Kafka API key number | 0=PRODUCE, 1=FETCH |
|
||||
| `kafka_api_key_name` | string | Human-readable API operation | `"PRODUCE"`, `"FETCH"` |
|
||||
| `kafka_client_id` | string | Kafka client identifier | `"payment-processor"` |
|
||||
| `kafka_size` | int | Message size (request preferred, else response) | |
|
||||
| `kafka_request` | bool | Is Kafka request | |
|
||||
| `kafka_response` | bool | Is Kafka response | |
|
||||
| `kafka_request_summary` | string | Request summary/topic | `"orders-topic"` |
|
||||
| `kafka_request_size` | int | Request size (0 if absent) | |
|
||||
| `kafka_response_size` | int | Response size (0 if absent) | |
|
||||
|
||||
**Example**: `kafka && kafka_api_key_name == "PRODUCE" && kafka_request_summary.contains("orders")`
|
||||
|
||||
## AMQP Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `amqp` | bool | AMQP payload detected | |
|
||||
| `amqp_method` | string | AMQP method name | `"basic.publish"`, `"channel.open"` |
|
||||
| `amqp_summary` | string | Operation summary | |
|
||||
| `amqp_request` | bool | Is AMQP request | |
|
||||
| `amqp_response` | bool | Is AMQP response | |
|
||||
| `amqp_request_length` | int | Request length (0 if absent) | |
|
||||
| `amqp_response_length` | int | Response length (0 if absent) | |
|
||||
| `amqp_total_size` | int | Sum of request + response | |
|
||||
|
||||
## LDAP Variables
|
||||
|
||||
| Variable | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `ldap` | bool | LDAP payload detected |
|
||||
| `ldap_type` | string | LDAP operation type (request preferred) |
|
||||
| `ldap_summary` | string | Operation summary |
|
||||
| `ldap_request` | bool | Is LDAP request |
|
||||
| `ldap_response` | bool | Is LDAP response |
|
||||
| `ldap_request_length` | int | Request length (0 if absent) |
|
||||
| `ldap_response_length` | int | Response length (0 if absent) |
|
||||
| `ldap_total_size` | int | Sum of request + response |
|
||||
|
||||
## RADIUS Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `radius` | bool | RADIUS payload detected | |
|
||||
| `radius_code` | int | RADIUS code (request preferred) | |
|
||||
| `radius_code_name` | string | Code name | `"Access-Request"` |
|
||||
| `radius_request` | bool | Is RADIUS request | |
|
||||
| `radius_response` | bool | Is RADIUS response | |
|
||||
| `radius_request_authenticator` | string | Request authenticator (hex) | |
|
||||
| `radius_request_length` | int | Request size (0 if absent) | |
|
||||
| `radius_response_length` | int | Response size (0 if absent) | |
|
||||
| `radius_total_size` | int | Sum of request + response | |
|
||||
|
||||
## Diameter Variables
|
||||
|
||||
| Variable | Type | Description |
|
||||
|----------|------|-------------|
|
||||
| `diameter` | bool | Diameter payload detected |
|
||||
| `diameter_method` | string | Method name (request preferred) |
|
||||
| `diameter_summary` | string | Operation summary |
|
||||
| `diameter_request` | bool | Is Diameter request |
|
||||
| `diameter_response` | bool | Is Diameter response |
|
||||
| `diameter_request_length` | int | Request size (0 if absent) |
|
||||
| `diameter_response_length` | int | Response size (0 if absent) |
|
||||
| `diameter_total_size` | int | Sum of request + response |
|
||||
|
||||
## MongoDB Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `mongodb` | bool | MongoDB payload detected | |
|
||||
| `mongodb_command` | string | Operation type | `"find"`, `"insert"`, `"update"`, `"delete"` |
|
||||
| `mongodb_database` | string | Database name | `"mydb"` |
|
||||
| `mongodb_collection` | string | Collection name | `"users"` |
|
||||
| `mongodb_opcode` | string | Operation opcode name | |
|
||||
| `mongodb_request_size` | int | Request size in bytes | |
|
||||
| `mongodb_response_size` | int | Response size in bytes | |
|
||||
| `mongodb_total_size` | int | Combined request + response size | |
|
||||
| `mongodb_success` | bool | Operation success status | |
|
||||
| `mongodb_error_code` | int | Error code | |
|
||||
| `mongodb_error_message` | string | Error description | |
|
||||
| `mongodb_error_code_name` | string | Named error code | |
|
||||
|
||||
**Example**: `mongodb && mongodb_command == "find" && mongodb_collection == "users"`
|
||||
|
||||
## MySQL Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `mysql` | bool | MySQL payload detected | |
|
||||
| `mysql_command` | string | SQL command name | `"COM_QUERY"`, `"COM_STMT_PREPARE"` |
|
||||
| `mysql_query` | string | Full SQL query text | `"SELECT * FROM users"` |
|
||||
| `mysql_database` | string | Active database name | `"orders_db"` |
|
||||
| `mysql_statement_id` | int | Prepared statement identifier | |
|
||||
| `mysql_request_size` | int | Request payload size in bytes | |
|
||||
| `mysql_response_size` | int | Response payload size in bytes | |
|
||||
| `mysql_total_size` | int | Combined request + response size | |
|
||||
| `mysql_success` | bool | Response OK status | |
|
||||
| `mysql_error_code` | int | MySQL error code | |
|
||||
| `mysql_error_message` | string | Error description | |
|
||||
|
||||
**Example**: `mysql && mysql_query.contains("SELECT") && !mysql_success`
|
||||
|
||||
## PostgreSQL Variables
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `postgresql` | bool | PostgreSQL payload detected | |
|
||||
| `postgresql_command` | string | Command tag | `"SELECT"`, `"INSERT"`, `"UPDATE"` |
|
||||
| `postgresql_query` | string | Full SQL query text | `"SELECT * FROM users WHERE id = 1"` |
|
||||
| `postgresql_database` | string | Active database name | `"orders_db"` |
|
||||
| `postgresql_user` | string | Authenticated user name | `"app_service"` |
|
||||
| `postgresql_request_size` | int | Request payload size in bytes | |
|
||||
| `postgresql_response_size` | int | Response payload size in bytes | |
|
||||
| `postgresql_total_size` | int | Combined request + response size | |
|
||||
| `postgresql_success` | bool | Response OK status | |
|
||||
| `postgresql_error_code` | **string** | SQLSTATE error code (NOT int) | `"23505"` (unique violation), `"42P01"` (undefined table) |
|
||||
| `postgresql_error_message` | string | Error description | |
|
||||
|
||||
**Important**: Unlike MySQL's `mysql_error_code` (int), `postgresql_error_code` is a
|
||||
**string** because PostgreSQL uses 5-character SQLSTATE codes.
|
||||
|
||||
**Example**: `postgresql && postgresql_query.contains("SELECT") && !postgresql_success`
|
||||
|
||||
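Because SQLSTATE codes are strings, a whole error class can be matched by its two-character prefix: class `23` covers integrity constraint violations such as `"23505"`, and class `42` covers syntax errors and access rule violations such as `"42P01"`. The Python sketch below builds such a filter; `sqlstate_class_filter` is a hypothetical helper name.

```python
def sqlstate_class_filter(sqlstate_class: str) -> str:
    """Build a filter matching every PostgreSQL error in one SQLSTATE class."""
    if len(sqlstate_class) != 2 or not sqlstate_class.isalnum():
        raise ValueError("an SQLSTATE class is exactly two alphanumeric characters")
    return f'postgresql && postgresql_error_code.startsWith("{sqlstate_class}")'


print(sqlstate_class_filter("23"))
```

A prefix match like this is not possible with MySQL's integer `mysql_error_code`, where numeric comparisons are the equivalent tool.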
## gRPC Variables
|
||||
|
||||
gRPC is a sub-protocol of HTTP/2. When `grpc` is true, all HTTP variables are also available.
|
||||
|
||||
| Variable | Type | Description | Example |
|
||||
|----------|------|-------------|---------|
|
||||
| `grpc` | bool | gRPC payload detected | |
|
||||
| `grpc_method` | string | Trailing method name from gRPC :path | `"SayHello"` (from `/helloworld.Greeter/SayHello`) |
|
||||
| `grpc_status` | int | gRPC status code from Grpc-Status trailer | `0`=OK, `5`=NOT_FOUND, `14`=UNAVAILABLE; `-1` on non-gRPC entries |
|
||||
|
||||
**Example**: `grpc && grpc_status != 0 && grpc_method.contains("Create")`
|
||||
|
||||
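The relationship between the HTTP/2 `:path` and `grpc_method` described in the table can be sketched in a few lines of Python. This only models the documented "trailing method name" behavior; `grpc_method_from_path` is a hypothetical name, not a Kubeshark function.

```python
def grpc_method_from_path(path: str) -> str:
    """Return the trailing segment of a gRPC :path (the method name)."""
    return path.rsplit("/", 1)[-1]


print(grpc_method_from_path("/helloworld.Greeter/SayHello"))
```

Since `grpc_method` drops the package and service portion, filtering on a specific service would need the HTTP `path` variable, which remains available on gRPC entries.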
## L4 Connection Tracking Variables

| Variable | Type | Description | Example |
|----------|------|-------------|---------|
| `conn` | bool | Connection tracking entry | |
| `conn_state` | string | Connection state | `"open"`, `"in_progress"`, `"closed"` |
| `conn_local_pkts` | int | Packets from local peer | |
| `conn_local_bytes` | int | Bytes from local peer | |
| `conn_remote_pkts` | int | Packets from remote peer | |
| `conn_remote_bytes` | int | Bytes from remote peer | |
| `conn_l7_detected` | []string | L7 protocols detected on connection | `["HTTP", "TLS"]` |
| `conn_group_id` | int | Connection group identifier | |

**Example**: `conn && conn_state == "open" && conn_local_bytes > 1000000` (high-volume open connections)

## L4 Flow Tracking Variables

Flows extend connections with rate metrics (packets/bytes per second).

| Variable | Type | Description |
|----------|------|-------------|
| `flow` | bool | Flow tracking entry |
| `flow_state` | string | Flow state (`"open"`, `"in_progress"`, `"closed"`) |
| `flow_local_pkts` | int | Packets from local peer |
| `flow_local_bytes` | int | Bytes from local peer |
| `flow_remote_pkts` | int | Packets from remote peer |
| `flow_remote_bytes` | int | Bytes from remote peer |
| `flow_local_pps` | int | Local packets per second |
| `flow_local_bps` | int | Local bytes per second |
| `flow_remote_pps` | int | Remote packets per second |
| `flow_remote_bps` | int | Remote bytes per second |
| `flow_l7_detected` | []string | L7 protocols detected on flow |
| `flow_group_id` | int | Flow group identifier |

**Example**: `tcp_flow && flow_local_bps > 5000000` (high-bandwidth TCP flows)

## Kubernetes Variables

### Pod and Service (Directional)

| Variable | Type | Description |
|----------|------|-------------|
| `src.pod.name` | string | Source pod name |
| `src.pod.namespace` | string | Source pod namespace |
| `dst.pod.name` | string | Destination pod name |
| `dst.pod.namespace` | string | Destination pod namespace |
| `src.service.name` | string | Source service name |
| `src.service.namespace` | string | Source service namespace |
| `dst.service.name` | string | Destination service name |
| `dst.service.namespace` | string | Destination service namespace |

**Fallback behavior**: Pod namespace/name fields automatically fall back to
service data when pod info is unavailable. This means `dst.pod.namespace` works
even when only service-level resolution exists.

**Example**: `src.service.name == "api-gateway" && dst.pod.namespace == "production"`

### Summary Name and Namespace

| Variable | Type | Description |
|----------|------|-------------|
| `src.name` | string | Worker-enriched summary name of source (pod > service > dns > process) |
| `dst.name` | string | Worker-enriched summary name of destination |
| `src.namespace` | string | Source namespace with service fallback |
| `dst.namespace` | string | Destination namespace with service fallback |

### Aggregate Collections (Non-Directional)

| Variable | Type | Description |
|----------|------|-------------|
| `namespaces` | []string | All namespaces (src + dst, pod + service) |
| `pods` | []string | All pod names (src + dst) |
| `services` | []string | All service names (src + dst) |

### Labels and Annotations

| Variable | Type | Description |
|----------|------|-------------|
| `local_labels` | map[string]string | Kubernetes labels of local peer |
| `local_annotations` | map[string]string | Kubernetes annotations of local peer |
| `remote_labels` | map[string]string | Kubernetes labels of remote peer |
| `remote_annotations` | map[string]string | Kubernetes annotations of remote peer |

Use `map_get(local_labels, "key", "default")` for safe access that won't error
on missing keys.

**Example**: `map_get(local_labels, "app", "") == "checkout" && "production" in namespaces`

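For readers more familiar with general-purpose languages, the safe-access pattern above can be modelled in Python. This is only an analogy under the assumption that `map_get` behaves like a dictionary lookup with a default; the Python `map_get` function and the sample label values below are illustrative, not part of KFL.

```python
def map_get(mapping: dict[str, str], key: str, default: str) -> str:
    """Python stand-in for the safe lookup: never raises on a missing key."""
    return mapping.get(key, default)


# Hypothetical sample entry, mirroring the KFL example above.
local_labels = {"app": "checkout", "tier": "backend"}
namespaces = ["production", "kube-system"]

matches = map_get(local_labels, "app", "") == "checkout" and "production" in namespaces
missing = map_get(local_labels, "team", "")  # key absent, so the default is returned

print(matches, repr(missing))
```

Choosing `""` as the default keeps the comparison well-typed: a missing label simply fails the equality test instead of aborting the filter.
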
### Node Information

| Variable | Type | Description |
|----------|------|-------------|
| `node` | map | Nested: `node["name"]`, `node["ip"]` |
| `node_name` | string | Node name (flat alias) |
| `node_ip` | string | Node IP (flat alias) |
| `local_node_name` | string | Node name of local peer |
| `remote_node_name` | string | Node name of remote peer |

### Process Information

| Variable | Type | Description |
|----------|------|-------------|
| `local_process_name` | string | Process name on local peer |
| `remote_process_name` | string | Process name on remote peer |

### DNS Resolution

| Variable | Type | Description |
|----------|------|-------------|
| `src.dns` | string | DNS resolution of source IP |
| `dst.dns` | string | DNS resolution of destination IP |
| `dns_resolutions` | []string | All DNS resolutions (deduplicated) |

### Resolution Status

| Variable | Type | Values |
|----------|------|--------|
| `local_resolution_status` | string | `""` (resolved), `"no_node_mapping"`, `"rpc_error"`, `"rpc_empty"`, `"cache_miss"`, `"queue_full"` |
| `remote_resolution_status` | string | Same as above |

## Default Values

When a variable is not present in an entry, KFL2 uses these defaults:

| Type | Default |
|------|---------|
| string | `""` |
| int | `0` |
| bool | `false` |
| list | `[]` |
| map | `{}` |
| bytes | `[]` |

## Protocol Variable Precedence

For protocols with request/response pairs (Kafka, RADIUS, Diameter), merged
fields prefer the **request** side. If no request exists, the response value
is used. Size totals are always computed as `request_size + response_size`.

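The precedence rule can be sketched in a few lines of Python. The function names `merged_field` and `total_size` are hypothetical and only model the rule as stated above; using `None` to represent a missing side is also an assumption of this sketch.

```python
from typing import Optional


def merged_field(request_value: Optional[str], response_value: Optional[str]) -> Optional[str]:
    """Prefer the request side; fall back to the response when no request exists."""
    return request_value if request_value is not None else response_value


def total_size(request_size: int, response_size: int) -> int:
    """Size totals are the sum of both sides, regardless of precedence."""
    return request_size + response_size


print(merged_field("orders-topic", "orders-topic-resp"))
print(merged_field(None, "orders-topic-resp"))
print(total_size(120, 480))
```
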
## CEL Language Features

KFL2 supports the full CEL specification:

- **Short-circuit evaluation**: `&&` stops on first false, `||` stops on first true
- **Ternary**: `condition ? value_if_true : value_if_false`
- **Regex**: `str.matches("pattern")` uses RE2 syntax
- **Type coercion**: Timestamps require `timestamp()`, durations require `duration()`
- **Null safety**: Use `in` operator or `map_get()` before accessing map keys

For the full CEL specification, see the
[CEL Language Definition](https://github.com/google/cel-spec/blob/master/doc/langdef.md).

484
skills/network-rca/SKILL.md
Normal file
@@ -0,0 +1,484 @@
---
name: network-rca
description: >
  Kubernetes network root cause analysis skill powered by Kubeshark MCP. Use this skill
  whenever the user wants to investigate past incidents, perform retrospective traffic
  analysis, take or manage traffic snapshots, extract PCAPs, dissect L7 API calls from
  historical captures, compare traffic patterns over time, detect drift or anomalies
  between snapshots, or do any kind of forensic network analysis in Kubernetes.
  Also trigger when the user mentions snapshots, raw capture, PCAP extraction,
  traffic replay, postmortem analysis, "what happened yesterday/last week",
  root cause analysis, RCA, cloud snapshot storage, snapshot dissection, or KFL filters
  for historical traffic. Even if the user just says "figure out what went wrong"
  or "compare today's traffic to yesterday" in a Kubernetes context, use this skill.
---

# Network Root Cause Analysis with Kubeshark MCP

You are a Kubernetes network forensics specialist. Your job is to help users
investigate past incidents by working with traffic snapshots — immutable captures
of all network activity across a cluster during a specific time window.

Kubeshark is a search engine for network traffic. Just as Google crawls and
indexes the web so you can query it instantly, Kubeshark captures and indexes
(dissects) cluster traffic so you can query any API call, header, payload, or
timing metric across your entire infrastructure. Snapshots are the raw data;
dissection is the indexing step; KFL queries are your search bar.

Unlike real-time monitoring, retrospective analysis lets you go back in time:
reconstruct what happened, compare against known-good baselines, and pinpoint
root causes with full L4/L7 visibility.

## Timezone Handling

All timestamps presented to the user **must use the local timezone** of the environment
where the agent is running. Users think in local time ("this happened around 3pm"), and
UTC-only output adds friction during incident response when speed matters.

### Rules

1. **Detect the local timezone** at the start of every investigation. Use the system
   clock or environment (e.g., `date +%Z` or equivalent) to determine the timezone.
2. **Present local time as the primary reference** in all output — summaries, event
   correlations, time-range references, and tables.
3. **Show UTC in parentheses** for clarity, e.g., `15:03:22 IST (13:03:22 UTC)`.
4. **Convert tool responses** — Kubeshark MCP tools return timestamps in UTC. Always
   convert these to local time before presenting to the user.
5. **Use local time in natural language** — when describing events, say "the spike at
   3:23 PM" not "the spike at 13:23 UTC".

### Snapshot Creation

When creating snapshots, Kubeshark MCP tools accept UTC timestamps. Convert the user's
local time references to UTC before passing them to tools like `create_snapshot` or
`export_snapshot_pcap`. Confirm the converted window with the user if there's any
ambiguity.

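The conversion rules in this section can be sketched with Python's standard `datetime` module. The helper names `format_for_user` and `to_utc` are hypothetical, and the fixed UTC+2 zone labelled `IST` is only an example value; in a real session the local zone could be obtained with `datetime.now().astimezone().tzinfo`.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def format_for_user(ts_utc: datetime, local_tz: tzinfo) -> str:
    """Render a UTC timestamp as 'HH:MM:SS TZ (HH:MM:SS UTC)', local time first."""
    if ts_utc.tzinfo is None:
        ts_utc = ts_utc.replace(tzinfo=timezone.utc)  # treat naive tool output as UTC
    utc = ts_utc.astimezone(timezone.utc)
    local = utc.astimezone(local_tz)
    return f"{local.strftime('%H:%M:%S %Z')} ({utc.strftime('%H:%M:%S')} UTC)"


def to_utc(local_ts: datetime) -> datetime:
    """Convert a timezone-aware local timestamp to UTC before passing it to a tool."""
    return local_ts.astimezone(timezone.utc)


# Example zone (assumption): a fixed UTC+2 offset labelled "IST".
example_tz = timezone(timedelta(hours=2), "IST")
print(format_for_user(datetime(2026, 3, 14, 16, 12, 34, tzinfo=timezone.utc), example_tz))
print(to_utc(datetime(2026, 3, 14, 18, 12, 34, tzinfo=example_tz)).isoformat())
```

Keeping both directions as separate helpers mirrors the two rules: tool output goes from UTC to local for display, and user input goes from local to UTC for tool calls.
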
## Prerequisites

Before starting any analysis, verify the environment is ready.

### Kubeshark MCP Health Check

Confirm the Kubeshark MCP is accessible and tools are available. Look for tools
like `list_api_calls`, `list_l4_flows`, `create_snapshot`, etc.

**Tool**: `check_kubeshark_status`

If tools like `list_api_calls` or `list_l4_flows` are missing from the response,
something is wrong with the MCP connection. Guide the user through setup
(see Setup Reference at the bottom).

### Raw Capture Must Be Enabled

Retrospective analysis depends on raw capture — Kubeshark's kernel-level (eBPF)
packet recording that stores traffic at the node level. Without it, snapshots
have nothing to work with.

Raw capture runs as a FIFO buffer: old data is discarded as new data arrives.
The buffer size determines how far back you can go. A larger buffer gives a
wider snapshot window.

```yaml
tap:
  capture:
    raw:
      enabled: true
      storageSize: 10Gi  # Per-node FIFO buffer
```

If raw capture isn't enabled, inform the user that retrospective analysis
requires it and share the configuration above.

### Snapshot Storage

Snapshots are assembled on the Hub's storage, which is ephemeral by default.
For serious forensic work, persistent storage is recommended:

```yaml
tap:
  snapshots:
    local:
      storageClass: gp2
      storageSize: 1000Gi
```

## Core Workflow

Every investigation starts with a snapshot. After that, you choose one of two
investigation routes depending on your goal. The steps are:

1. **Determine time window** — When did the issue occur? Use `get_data_boundaries`
   to see what raw capture data (L4) is available.
2. **Check the L7 (dissected) window** — Before any KFL query on *live* data,
   call `get_l7_data_boundaries`. It returns the per-node + cluster-wide range
   of dissected API call data plus a `dissection_enabled` flag. Treat L4
   (`get_data_boundaries`) as the snapshot/PCAP window and L7
   (`get_l7_data_boundaries`) as the KFL-query window — they can differ
   significantly because L7 only starts producing entries once dissection is
   enabled (existing raw capture is **not** retroactively dissected).
3. **Create or locate a snapshot** — Either take a new snapshot covering the
   incident window, or find an existing one with `list_snapshots`.
4. **Choose your investigation route** — PCAP or Dissection (see below).

### Choosing the Right Route

| | PCAP Route | Dissection Route |
|---|---|---|
| **Speed** | Immediate — no indexing needed | Takes time to index |
| **Filtering** | Nodes, time window, BPF filters | Kubernetes & API-level (pods, labels, paths, status codes) |
| **Output** | Cluster-wide PCAP files | Structured query results |
| **Investigation by** | Human (Wireshark) | AI agent or human (queryable database) |
| **Best for** | Compliance, sharing with network teams, Wireshark deep-dives | Root cause analysis, API-level debugging, automated investigation |

Both routes are valid and complementary. Use PCAP when you need raw packets
for human analysis or compliance. Use Dissection when you want an AI agent
to search and analyze traffic programmatically.

**Default to Dissection.** Unless the user explicitly asks for a PCAP file or
Wireshark export, assume Dissection is needed. Any question about workloads,
APIs, services, pods, error rates, latency, or traffic patterns requires
dissected data.

## Snapshot Operations

Both routes start here. A snapshot is an immutable freeze of all cluster traffic
in a time window.

### Check Data Boundaries

**Tool**: `get_data_boundaries`

Check what raw capture data exists across the cluster. You can only create
snapshots within these boundaries — data outside the window has been rotated
out of the FIFO buffer.

**Example response** (raw tool output is in UTC — convert to local time before presenting):
```
Cluster-wide:
  Oldest: 2026-03-14 18:12:34 IST (16:12:34 UTC)
  Newest: 2026-03-14 20:05:20 IST (18:05:20 UTC)

Per node:
┌─────────────────────────────┬───────────────────────────────┬───────────────────────────────┐
│ Node                        │ Oldest                        │ Newest                        │
├─────────────────────────────┼───────────────────────────────┼───────────────────────────────┤
│ ip-10-0-25-170.ec2.internal │ 18:12:34 IST (16:12:34 UTC)   │ 20:03:39 IST (18:03:39 UTC)   │
│ ip-10-0-32-115.ec2.internal │ 18:13:45 IST (16:13:45 UTC)   │ 20:05:20 IST (18:05:20 UTC)   │
└─────────────────────────────┴───────────────────────────────┴───────────────────────────────┘
```

If the incident falls outside the available window, the data has been rotated
out. Suggest increasing `storageSize` for future coverage.

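The boundary check described above amounts to a simple containment test. The sketch below assumes the boundaries and the incident window are already available as timezone-aware `datetime` values; the function name `window_is_covered` is hypothetical and is not part of the Kubeshark MCP tool set.

```python
from datetime import datetime, timezone


def window_is_covered(start: datetime, end: datetime,
                      oldest: datetime, newest: datetime) -> bool:
    """True when the whole incident window lies inside the raw-capture boundaries."""
    return oldest <= start and end <= newest


# Cluster-wide boundaries taken from the example response above (UTC).
oldest = datetime(2026, 3, 14, 16, 12, 34, tzinfo=timezone.utc)
newest = datetime(2026, 3, 14, 18, 5, 20, tzinfo=timezone.utc)

incident_start = datetime(2026, 3, 14, 17, 20, 0, tzinfo=timezone.utc)
incident_end = datetime(2026, 3, 14, 17, 30, 0, tzinfo=timezone.utc)

print(window_is_covered(incident_start, incident_end, oldest, newest))
```

A window that starts before `oldest` fails the check, which corresponds to data that has already been rotated out of the FIFO buffer.
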
### Check L7 (Dissected) Data Boundaries

**Tool**: `get_l7_data_boundaries`

Check what *dissected* L7 entries exist across the cluster. This is the
pre-flight check before any KFL query against live data. The response
contains:

- `dissection_enabled`: if `false`, KFL queries on live data will return
  empty regardless of L4 boundaries. Enabling dissection only captures
  *forward* — raw capture is **not** retroactively dissected.
- `cluster.oldest_ts` / `cluster.newest_ts`: cluster-wide window where KFL
  on live data has any chance of returning results.
- `nodes[].oldest_ts` / `nodes[].newest_ts`: per-node windows for narrowing
  queries.

**Key distinction:**

| | L4 (`get_data_boundaries`) | L7 (`get_l7_data_boundaries`) |
|---|---|---|
| Data | Raw PCAP capture | Dissected API call entries |
| Useful for | Snapshots, PCAP extraction | KFL queries |
| Backfill | Comes from FIFO ring buffer | Only forward from dissection-enable |

If the user is asking an API-level question and `dissection_enabled` is
`false`, enable it first — but tell the user they will only see entries
captured *after* enabling, never the historical window.

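The pre-flight logic above can be expressed as a small predicate. This is a sketch under assumptions: the function name `live_kfl_may_return_results` is hypothetical, and the arguments stand in for the `dissection_enabled`, `cluster.oldest_ts` and `cluster.newest_ts` values of the response after they have been parsed into `datetime` objects (the wire format of those timestamps is not specified here).

```python
from datetime import datetime, timezone
from typing import Optional


def live_kfl_may_return_results(dissection_enabled: bool,
                                l7_oldest: Optional[datetime],
                                l7_newest: Optional[datetime],
                                start: datetime,
                                end: datetime) -> bool:
    """True when a KFL query on live data has any chance of matching the window."""
    if not dissection_enabled:
        return False  # nothing is being dissected, so live queries come back empty
    if l7_oldest is None or l7_newest is None:
        return False  # no dissected entries exist yet
    # Standard interval-overlap test between the query window and the L7 window.
    return start <= l7_newest and end >= l7_oldest


l7_oldest = datetime(2026, 3, 14, 17, 0, 0, tzinfo=timezone.utc)
l7_newest = datetime(2026, 3, 14, 18, 0, 0, tzinfo=timezone.utc)
query_start = datetime(2026, 3, 14, 16, 30, 0, tzinfo=timezone.utc)
query_end = datetime(2026, 3, 14, 17, 15, 0, tzinfo=timezone.utc)

print(live_kfl_may_return_results(True, l7_oldest, l7_newest, query_start, query_end))
```

An overlap only means results are possible; the part of the window before `l7_oldest` still has no dissected entries.
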
### Create a Snapshot

**Tool**: `create_snapshot`

Specify nodes (or cluster-wide) and a time window within the data boundaries.
Snapshots include raw capture files, Kubernetes pod events, and eBPF cgroup events.

Snapshots take time to build. Check status with `get_snapshot` — wait until
`completed` before proceeding with either route.

### List Existing Snapshots

**Tool**: `list_snapshots`

Shows all snapshots on the local Hub, with name, size, status, and node count.

### Cloud Storage

Snapshots on the Hub are ephemeral. Cloud storage (S3, GCS, Azure Blob)
provides long-term retention. Snapshots can be downloaded to any cluster
with Kubeshark — not necessarily the original one.

**Check cloud status**: `get_cloud_storage_status`
**Upload to cloud**: `upload_snapshot_to_cloud`
**Download from cloud**: `download_snapshot_from_cloud`

---

## Route 1: PCAP

The PCAP route does **not** require dissection. It works directly with the raw
snapshot data to produce filtered, cluster-wide PCAP files. Use this route when:

- You need raw packets for Wireshark analysis
- You're sharing captures with network teams
- You need evidence for compliance or audit
- A human will perform the investigation (not an AI agent)

### Filtering a PCAP

**Tool**: `export_snapshot_pcap`

Filter the snapshot down to what matters using:
- **Nodes** — specific cluster nodes only
- **Time** — sub-window within the snapshot
- **BPF filter** — standard Berkeley Packet Filter syntax (e.g., `host 10.0.53.101`,
  `port 8080`, `net 10.0.0.0/16`)

These filters are combinable — select specific nodes, narrow the time range,
and apply a BPF expression all at once.

### Workload-to-BPF Workflow

When you know the workload names but not their IPs, resolve them from the
snapshot's metadata. Snapshots preserve pod-to-IP mappings from capture time,
so resolution is accurate even if pods have been rescheduled since.

**Tool**: `list_workloads`

Use `list_workloads` with `name` + `namespace` for a singular lookup (works
live and against snapshots), or with `snapshot_id` + filters for a broader
scan.

**Example workflow — singular lookup** — extract PCAP for specific workloads:

1. Resolve IPs: `list_workloads` with `name: "orders-594487879c-7ddxf"`, `namespace: "prod"` → IPs: `["10.0.53.101"]`
2. Resolve IPs: `list_workloads` with `name: "payment-service-6b8f9d-x2k4p"`, `namespace: "prod"` → IPs: `["10.0.53.205"]`
3. Build BPF: `host 10.0.53.101 or host 10.0.53.205`
4. Export: `export_snapshot_pcap` with that BPF filter

**Example workflow — filtered scan** — extract PCAP for all workloads
matching a pattern in a snapshot:

1. List workloads: `list_workloads` with `snapshot_id`, `namespaces: ["prod"]`,
   `name_regex: "payment.*"` → returns all matching workloads with their IPs
2. Collect all IPs from the response
3. Build BPF: `host 10.0.53.205 or host 10.0.53.210 or ...`
4. Export: `export_snapshot_pcap` with that BPF filter

This gives you a cluster-wide PCAP filtered to exactly the workloads involved
in the incident — ready for Wireshark or long-term storage.

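The "Build BPF" step in both workflows is plain string assembly. A minimal Python sketch follows, assuming the IPs have already been collected from `list_workloads` responses; the helper name `build_host_bpf` is hypothetical.

```python
import ipaddress


def build_host_bpf(ips: list[str]) -> str:
    """Join workload IPs into a 'host A or host B' BPF expression."""
    unique: list[str] = []
    for ip in ips:
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        if ip not in unique:      # drop duplicates while keeping the original order
            unique.append(ip)
    if not unique:
        raise ValueError("at least one IP address is required")
    return " or ".join(f"host {ip}" for ip in unique)


print(build_host_bpf(["10.0.53.101", "10.0.53.205"]))
```

Validating each address before joining avoids handing a malformed expression to `export_snapshot_pcap`, and de-duplication keeps the filter short when several workloads share an IP.
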
### IP-to-Workload Resolution

When you have an IP address (e.g., from a PCAP or L4 flow) and need to
identify the workload behind it:

**Tool**: `list_ips`

Use `list_ips` with `ip` for a singular lookup (works live and against
snapshots), or with `snapshot_id` + filters for a broader scan.

**Example — singular lookup**: `list_ips` with `ip: "10.0.53.101"`,
`snapshot_id: "snap-abc"` → returns pod/service identity for that IP.

**Example — filtered scan**: `list_ips` with `snapshot_id: "snap-abc"`,
`namespaces: ["prod"]`, `labels: {"app": "payment"}` → returns all IPs
associated with workloads matching those filters.

---

## Route 2: Dissection

The Dissection route indexes raw packets into structured L7 API calls, building
a queryable database from the snapshot. Use this route when:

- An AI agent is performing the investigation
- You need to search by Kubernetes context (pods, namespaces, labels, services)
- You need to search by API elements (paths, status codes, headers, payloads)
- You want structured responses you can analyze programmatically
- You need to drill into the payload of a specific API call

**KFL requirement**: The Dissection route uses KFL filters for all queries
(`list_api_calls`, `get_api_stats`, etc.). Before constructing any KFL filter,
load the KFL skill (`skills/kfl/`). KFL is statically typed — incorrect field
names or syntax will either fail silently or return an error. If the KFL skill
is not available, suggest the user install it:

```bash
ln -s /path/to/kubeshark/skills/kfl ~/.claude/skills/kfl
```

**If the KFL skill cannot be loaded**, only use the exact filter examples shown
in this skill. Do not improvise or guess at field names, operators, or syntax.
KFL field names differ from what you might expect (e.g., `status_code` not
`response.status`, `src.pod.namespace` not `src.namespace`). Using incorrect
fields produces wrong results without warning.

### Dissection Is Required — Do Not Skip This

**Any question about workloads, Kubernetes resources, services, pods, namespaces,
or API calls requires dissection.** Only the PCAP route works without it. If the
user asks anything about traffic content, API behavior, error rates, latency,
or service-to-service communication, you **must** ensure dissection is active
before attempting to answer.

**Do not wait for dissection to complete on its own — it will not start by itself.**

Follow this sequence every time before using `list_api_calls`, `get_api_call`,
or `get_api_stats`:

1. **Check status**: Call `get_snapshot_dissection_status` (or `list_snapshot_dissections`)
   to see if a dissection already exists for this snapshot.
2. **If dissection exists and is completed** — proceed with your query. No further
   action needed.
3. **If dissection is in progress** — wait for it to complete, then proceed.
4. **If no dissection exists** — you **must** call `start_snapshot_dissection` to
   trigger it. Then monitor progress with `get_snapshot_dissection_status` until
   it completes.

Never assume dissection is running. Never wait for a dissection that was not started.
The agent is responsible for triggering dissection when it is missing.

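The four-step sequence above is effectively a small decision table. The Python sketch below models it; the function name `next_dissection_step`, the status strings `"completed"` and `"in_progress"`, and the use of `None` for "no dissection exists" are all assumptions made for illustration, since the exact response shape of `get_snapshot_dissection_status` is not shown here.

```python
from typing import Optional


def next_dissection_step(status: Optional[str]) -> str:
    """Map a dissection status to the next action in the sequence above."""
    if status is None:
        # Step 4: nothing exists yet, so the agent must trigger dissection itself.
        return "call start_snapshot_dissection"
    if status == "completed":
        # Step 2: the index is ready, queries can run.
        return "proceed with query"
    # Step 3 (and any other non-final state): keep polling until it completes.
    return "poll get_snapshot_dissection_status"


for observed in (None, "in_progress", "completed"):
    print(observed, "->", next_dissection_step(observed))
```

Treating every unknown state as "keep polling" is a conservative choice for this sketch; a real agent would likely also want to surface failure states to the user.
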
**Tool**: `start_snapshot_dissection`

Dissection takes time proportional to snapshot size — it parses every packet,
reassembles streams, and builds the index. After completion, these tools
become available:
- `list_api_calls` — Search API transactions with KFL filters
- `get_api_call` — Drill into a specific call (headers, body, timing, payload)
- `get_api_stats` — Aggregated statistics (throughput, error rates, latency)

### Every Question Is a Query

**Every user prompt that involves APIs, workloads, services, pods, namespaces,
or Kubernetes semantics should translate into a `list_api_calls` call with an
appropriate KFL filter.** Do not answer from memory or prior results — always
run a fresh query that matches what the user is asking.

Examples of user prompts and the queries they should trigger:

| User says | Action |
|---|---|
| "Show me all 500 errors" | `list_api_calls` with KFL: `http && status_code == 500` |
| "What's hitting the payment service?" | `list_api_calls` with KFL: `dst.service.name == "payment-service"` |
| "Any DNS failures?" | `list_api_calls` with KFL: `dns && status_code != 0` |
| "Show traffic from namespace prod to staging" | `list_api_calls` with KFL: `src.pod.namespace == "prod" && dst.pod.namespace == "staging"` |
| "What are the slowest API calls?" | `list_api_calls` with KFL: `http && elapsed_time > 5000000` |

The user's natural language maps to KFL. Your job is to translate intent into
the right filter and run the query — don't summarize old results or speculate
without fresh data.

### Investigation Strategy

Start broad, then narrow:

1. `get_api_stats` — Get the overall picture: error rates, latency percentiles,
   throughput. Look for spikes or anomalies.
2. `list_api_calls` filtered by error codes (4xx, 5xx) or high latency — find
   the problematic transactions.
3. `get_api_call` on specific calls — inspect headers, bodies, timing, and
   full payload to understand what went wrong.
4. Use KFL filters to slice by namespace, service, protocol, or any combination.

**Example `list_api_calls` response** (filtered to `http && status_code >= 500`,
timestamps converted from UTC to local):
```
┌──────────────────────────────────────────┬────────┬──────────────────────────┬────────┬───────────┐
│ Timestamp                                │ Method │ URL                      │ Status │ Elapsed   │
├──────────────────────────────────────────┼────────┼──────────────────────────┼────────┼───────────┤
│ 2026-03-14 19:23:45 IST (17:23:45 UTC)   │ POST   │ /api/v1/orders/charge    │ 503    │ 12,340 ms │
│ 2026-03-14 19:23:46 IST (17:23:46 UTC)   │ POST   │ /api/v1/orders/charge    │ 503    │ 11,890 ms │
│ 2026-03-14 19:23:48 IST (17:23:48 UTC)   │ GET    │ /api/v1/inventory/check  │ 500    │  8,210 ms │
│ 2026-03-14 19:24:01 IST (17:24:01 UTC)   │ POST   │ /api/v1/payments/process │ 502    │ 30,000 ms │
└──────────────────────────────────────────┴────────┴──────────────────────────┴────────┴───────────┘
Src: api-gateway (prod) → Dst: payment-service (prod)
```

Use the pattern of repeated failures and high latency to identify the failing
service chain, then drill into individual calls with `get_api_call`.

### KFL Filters for Dissected Traffic

Layer filters progressively when investigating:

```
// Step 1: Protocol + namespace
http && dst.pod.namespace == "production"

// Step 2: Add error condition
http && dst.pod.namespace == "production" && status_code >= 500

// Step 3: Narrow to service
http && dst.pod.namespace == "production" && status_code >= 500 && dst.service.name == "payment-service"

// Step 4: Narrow to endpoint
http && dst.pod.namespace == "production" && status_code >= 500 && dst.service.name == "payment-service" && path.contains("/charge")
```

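When filters are assembled programmatically, the same layering can be sketched as joining clauses with `&&`. The helper name `layer_filters` is hypothetical; the clauses themselves are copied from the four steps above, and no new KFL fields are introduced.

```python
def layer_filters(*clauses: str) -> str:
    """Combine KFL clauses with '&&', skipping empty ones."""
    return " && ".join(clause for clause in clauses if clause)


# Clauses taken verbatim from Steps 1-4 above.
steps = [
    "http",
    'dst.pod.namespace == "production"',
    "status_code >= 500",
    'dst.service.name == "payment-service"',
    'path.contains("/charge")',
]

# Print the filter for each step, starting from protocol + namespace.
for count in range(2, len(steps) + 1):
    print(layer_filters(*steps[:count]))
```

Keeping each condition as a separate clause makes it easy to back off one level when a filter turns out to be too narrow.
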
Other common RCA filters:

```
dns && dns_response && status_code != 0           // Failed DNS lookups
src.service.namespace != dst.service.namespace    // Cross-namespace traffic
http && elapsed_time > 5000000                    // Slow transactions (> 5s)
conn && conn_state == "open" && conn_local_bytes > 1000000  // High-volume connections
```

---

## Combining Both Routes

The two routes are complementary. A common pattern:

1. Start with **Dissection** — let the AI agent search and identify the root cause
2. Once you've pinpointed the problematic workloads, use `list_workloads`
   to get their IPs (singular lookup by name+namespace, or filtered scan
   by namespace/regex/labels against the snapshot)
3. Switch to **PCAP** — export a filtered PCAP of just those workloads for
   Wireshark deep-dive, sharing with the network team, or compliance archival

## Use Cases

### Post-Incident RCA

1. Identify the incident time window from alerts, logs, or user reports
2. Check `get_data_boundaries` — is the window still in raw capture (L4)?
3. Check `get_l7_data_boundaries` — was dissection enabled at that time, and
   does the window overlap with the L7 entry range? If `dissection_enabled`
   is `false` or the window predates the L7 range, the Dissection route is
   limited to whatever entries exist now — falling back to the PCAP route
   is often the right call.
4. `create_snapshot` covering the incident window (add a 15-minute buffer)
5. **Dissection route**: `start_snapshot_dissection` → `get_api_stats` →
   `list_api_calls` → `get_api_call` → follow the dependency chain
6. **PCAP route**: `list_workloads` → `export_snapshot_pcap` with BPF →
   hand off to Wireshark or archive

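The buffer in step 4 can be computed before calling `create_snapshot`. The sketch below assumes the buffer is applied to both ends of the window, which the steps above do not state explicitly; the helper name `padded_window` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def padded_window(start: datetime, end: datetime,
                  buffer_minutes: int = 15) -> tuple[datetime, datetime]:
    """Widen an incident window by a buffer on each side (assumption: both sides)."""
    if end < start:
        raise ValueError("window end precedes window start")
    pad = timedelta(minutes=buffer_minutes)
    return start - pad, end + pad


incident_start = datetime(2026, 3, 14, 17, 20, 0, tzinfo=timezone.utc)
incident_end = datetime(2026, 3, 14, 17, 30, 0, tzinfo=timezone.utc)

snap_start, snap_end = padded_window(incident_start, incident_end)
print(snap_start.isoformat(), snap_end.isoformat())
```

The padded window should still be checked against `get_data_boundaries`, since the extra minutes may fall outside the data that is still available.
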
### Other Use Cases

- **Trend analysis** — Take snapshots at regular intervals and compare
  `get_api_stats` across them to detect latency drift, error rate changes,
  or new service-to-service connections.
- **Forensic preservation** — `create_snapshot` + `upload_snapshot_to_cloud`
  for immutable, long-term evidence. Downloadable to any cluster months later.
- **Production-to-local replay** — Upload a production snapshot to cloud,
  download it on a local KinD cluster, and investigate safely.

## Setup Reference

For CLI installation, MCP configuration, verification, and troubleshooting,
see `references/setup.md`.

70
skills/network-rca/references/setup.md
Normal file
@@ -0,0 +1,70 @@
# Kubeshark MCP Setup Reference

## Installing the CLI

**Homebrew (macOS)**:
```bash
brew install kubeshark
```

**Linux**:
```bash
sh <(curl -Ls https://kubeshark.com/install)
```

**From source**:
```bash
git clone https://github.com/kubeshark/kubeshark
cd kubeshark && make
```

## MCP Configuration

**Claude Desktop / Cowork** (`claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "kubeshark": {
      "command": "kubeshark",
      "args": ["mcp"]
    }
  }
}
```

**Claude Code (CLI)**:
```bash
claude mcp add kubeshark -- kubeshark mcp
```

**Without kubectl access** (direct URL mode):
```json
{
  "mcpServers": {
    "kubeshark": {
      "command": "kubeshark",
      "args": ["mcp", "--url", "https://kubeshark.example.com"]
    }
  }
}
```

```bash
# Claude Code equivalent:
claude mcp add kubeshark -- kubeshark mcp --url https://kubeshark.example.com
```

## Verification

- Claude Code: `/mcp` to check connection status
- Terminal: `kubeshark mcp --list-tools`
- Cluster: `kubectl get pods -l app=kubeshark-hub`

## Troubleshooting

- **Binary not found** → Install via Homebrew or the install script above
- **Connection refused** → Deploy Kubeshark first: `kubeshark tap`
- **No L7 data** → Check `get_dissection_status` and `enable_dissection`
- **Snapshot creation fails** → Verify raw capture is enabled in Kubeshark config
- **Empty snapshot** → Check `get_data_boundaries` — the requested window may
  fall outside available data

724
skills/security-audit/SKILL.md
Normal file
@@ -0,0 +1,724 @@
|
||||
---
|
||||
name: security-audit
|
||||
description: >
|
||||
Kubernetes network security audit skill powered by Kubeshark MCP. Use this skill
|
||||
whenever the user wants to audit a cluster for security threats, detect compromised
|
||||
workloads, find malicious traffic patterns, hunt for indicators of compromise (IOCs),
|
||||
check for data exfiltration, identify C2 (command and control) communication,
|
||||
detect cryptomining, find lateral movement, discover credential theft attempts,
|
||||
assess network security posture, or perform threat hunting in Kubernetes.
|
||||
Also trigger when the user mentions security audit, threat detection, compromise
|
||||
assessment, vulnerability scan, "is my cluster compromised", "find malicious traffic",
|
||||
"check for threats", DNS exfiltration, DNS tunneling, port scanning, IMDS access,
|
||||
reverse shell, crypto miner, MITRE ATT&CK, IOC detection, anomaly detection,
|
||||
suspicious traffic, rogue workloads, unauthorized access, or any request to
|
||||
evaluate cluster security through network traffic analysis.
|
||||
---
|
||||
|
||||
# Kubernetes Network Security Audit with Kubeshark MCP
|
||||
|
||||
You are a Kubernetes network security specialist. Your job is to systematically
|
||||
audit cluster traffic for indicators of compromise, malicious behavior, and
|
||||
security threats — using network traffic as the ground truth.
|
||||
|
||||
Network traffic cannot lie. Logs can be tampered with, metrics can be spoofed,
|
||||
but packets on the wire reveal what workloads actually do — what they connect to,
|
||||
what protocols they speak, what data they send. Your audit leverages this by
|
||||
examining DNS queries, HTTP requests, L4 flows, and protocol-level payloads
|
||||
across every dimension of the MITRE ATT&CK framework.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before starting any audit, verify the environment is ready.
|
||||
|
||||
**Tool**: `check_kubeshark_status`
|
||||
|
||||
Confirm Kubeshark is deployed and tools are available. You need at minimum:
|
||||
`list_api_calls`, `list_l4_flows`, `list_workloads`, `get_api_call`.
|
||||
|
||||
**KFL requirement**: This skill uses KFL filters for all queries. Before
|
||||
constructing any filter, load the KFL skill (`skills/kfl/`). KFL is statically
|
||||
typed, so incorrect field names make the filter fail. If the KFL skill is not
|
||||
loaded, only use the exact filter examples shown in this skill.
|
||||
|
||||
**KFL error resilience**: If a KFL filter returns `undeclared reference` or
|
||||
similar errors, **do not give up on that phase**. Fall back to:
|
||||
1. Port-based filtering: `dst.port == 5432` instead of protocol flags
|
||||
2. Name-based filtering: `dst.name.contains("db")` or `src.name.contains("pod-name")`
|
||||
3. Browsing entries with `get_api_call` on IDs from `list_l4_flows`
|
||||
A KFL error means the filter syntax is wrong, not that the data doesn't exist.
|
||||
|
||||
## Audit Methodology
|
||||
|
||||
A security audit is NOT an incident investigation. You are not responding to
|
||||
a known event — you are proactively searching for threats that may be hiding
|
||||
in normal traffic. This requires a systematic sweep across all threat categories,
|
||||
not a single focused query.
|
||||
|
||||
The audit has **two sections** that run in sequence:
|
||||
|
||||
```
|
||||
SECTION A: Real-Time Analysis → Instant, uses live dissected traffic
|
||||
SECTION B: Snapshot Deep Dive → Immutable evidence, protocol-level inspection
|
||||
```
|
||||
|
||||
### Why Two Sections?
|
||||
|
||||
Kubeshark has two modes of data access:
|
||||
|
||||
1. **Real-time dissection** — traffic is dissected as it flows through the
|
||||
cluster. Provides instant access to L7 data (DNS, HTTP, etc.) that is
|
||||
already captured and indexed. However, real-time dissection is resource-
|
||||
intensive and may not be enabled, or may have gaps in coverage.
|
||||
|
||||
2. **Snapshots** — immutable captures of raw traffic within a time window.
|
||||
Must be created explicitly, then dissected separately. Guarantees complete
|
||||
coverage of all packets in the window, but takes time to create and index.
|
||||
|
||||
Section A uses whatever is already available — fast, immediate, but possibly
|
||||
incomplete. Section B creates snapshots for thorough, evidence-grade analysis.
|
||||
|
||||
### Severity Classification
|
||||
|
||||
Classify every finding using this framework:
|
||||
|
||||
| Severity | Criteria | Examples |
|
||||
|----------|----------|---------|
|
||||
| **CRITICAL** | Active data exfiltration, credential theft in progress, confirmed C2 | DNS tunneling, IMDS credential harvest, mining pool connections |
|
||||
| **HIGH** | Reconnaissance with cluster-wide scope, confirmed unauthorized access | K8s API secret enumeration, port scanning, cluster-admin abuse |
|
||||
| **MEDIUM** | Suspicious patterns requiring investigation, limited-scope recon | Cross-namespace probes, outdated User-Agents, unusual external connections |
|
||||
| **LOW** | Anomalies that may be benign, single-instance events | Unknown workloads, new external destinations, noisy but not malicious |
|
||||
|
||||
### Timezone
|
||||
|
||||
Kubeshark returns timestamps in UTC. Always convert to local time before
|
||||
presenting to the user. Detect the local timezone at the start (e.g.,
|
||||
`date +%Z`). Present local time as primary, with UTC in parentheses:
|
||||
`15:03:22 IDT (12:03:22 UTC)`.
|
||||
|
||||
**Conversion**: Kubeshark timestamps are Unix milliseconds. To convert:
|
||||
`ms / 1000` → Unix seconds → datetime → format with timezone offset.
|
||||
Example: `1778534735974` → `2026-05-11 14:25:35 PDT (21:25:35 UTC)`.
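The conversion can be sketched in Python. This is a minimal illustration, not part of Kubeshark: `format_kubeshark_ts` is a hypothetical helper name, and the fixed UTC-7 zone labelled `PDT` stands in for whatever timezone was detected at the start of the audit.

```python
from datetime import datetime, timedelta, timezone


def format_kubeshark_ts(ms: int, local_tz: timezone) -> str:
    """Render a Unix-millisecond timestamp as 'local time (UTC time)'."""
    # Build an aware UTC datetime first, then convert to the local zone.
    utc_dt = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
    local_dt = utc_dt.astimezone(local_tz)
    return (
        f"{local_dt:%Y-%m-%d %H:%M:%S} {local_dt.tzname()} "
        f"({utc_dt:%H:%M:%S} UTC)"
    )


# Hypothetical local zone: a fixed UTC-7 offset labelled "PDT".
PDT = timezone(timedelta(hours=-7), "PDT")
print(format_kubeshark_ts(1778534735974, PDT))
```

Working from an aware UTC datetime avoids depending on the timezone of the machine that runs the conversion.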
|
||||
|
||||
---
|
||||
|
||||
## SECTION A: Real-Time Analysis
|
||||
|
||||
**Goal**: Fast initial sweep using live data that's already available. No
|
||||
waiting for snapshot creation or dissection.
|
||||
|
||||
### Step 1: Check What's Available
|
||||
|
||||
**Tool**: `check_kubeshark_status`
|
||||
|
||||
Confirm Kubeshark is running and which tools are available.
|
||||
|
||||
**Tool**: `get_data_boundaries`
|
||||
|
||||
Check how far back raw capture data exists. You need this to plan snapshot
|
||||
creation in Step 3 — call it now so the data is ready when you need it.
|
||||
|
||||
**Tool**: `list_workloads` (no snapshot_id — queries live state)
|
||||
|
||||
Get the current workload inventory for the target namespace. This returns
|
||||
pod names, namespaces, and IP addresses. Save the IPs — you'll need them
|
||||
throughout the audit.
|
||||
|
||||
**Note**: `list_workloads` without a `snapshot_id` may fail with some
|
||||
Kubeshark versions (`snapshot_id is required for filtered listing`). If
|
||||
this happens, use individual lookups with `name` + `namespace` parameters,
|
||||
or skip to Step 3 and get the workload inventory from the first snapshot.
|
||||
|
||||
### Step 2: Query Live Traffic
|
||||
|
||||
In parallel, query the real-time dissected traffic across key dimensions.
|
||||
Use `list_api_calls` and `list_l4_flows` **without** a `snapshot_id` to
|
||||
hit the live data.
|
||||
|
||||
Run these queries simultaneously:
|
||||
|
||||
| Query | KFL Filter | What You're Looking For |
|
||||
|-------|-----------|------------------------|
|
||||
| DNS traffic | `dns` | Mining domains, high-entropy subdomains, external resolution, NXDOMAIN flood |
|
||||
| HTTP traffic | `http` | C2 beaconing, suspicious URLs, external destinations, anomalous headers |
|
||||
| L4 flows | (via `list_l4_flows`) | External IPs, suspicious ports (3333, 4444), IMDS (169.254.169.254), fan-out patterns |
|
||||
| PostgreSQL | `postgresql` | SQL injection patterns, sensitive table access |
|
||||
| Redis | `redis` | Dangerous commands (CONFIG, KEYS, CLIENT LIST) |
|
||||
|
||||
Filter by namespace if the user specified one (e.g., `dns && src.pod.namespace == "k8s-mule"`).
|
||||
|
||||
**Important**: Real-time dissection may have incomplete data — traffic that
|
||||
arrived before dissection was enabled, or during gaps in coverage, won't
|
||||
appear. Treat Section A findings as a fast first pass, not the final word.
|
||||
|
||||
### Step 3: Create Snapshots (Sequential — One at a Time)
|
||||
|
||||
While analyzing real-time data, begin creating snapshots for Section B.
|
||||
|
||||
**CRITICAL: Create snapshots ONE AT A TIME, sequentially.** Kubeshark only
|
||||
supports one concurrent snapshot download. Parallel creation will cause
|
||||
failures and data loss. The pattern is:
|
||||
|
||||
1. Create snapshot → wait for completion → start dissection → move to next
|
||||
2. Snapshot creation is fast (seconds). Dissection is slow (minutes).
|
||||
3. You do NOT need to wait for dissection before creating the next snapshot.
|
||||
Create the next snapshot while the previous one dissects.
|
||||
|
||||
Use the data boundaries from Step 1 (`get_data_boundaries`) to calculate
|
||||
how many snapshots are needed:
|
||||
|
||||
```
|
||||
total_range_ms = newest_timestamp - oldest_timestamp
|
||||
window_ms = 240000 # 4 minutes
|
||||
num_snapshots = ceil(total_range_ms / window_ms)
|
||||
```
|
||||
|
||||
Then create snapshots in **4-minute increments**, starting from the most
|
||||
recent:
|
||||
|
||||
```
|
||||
Step 1: create_snapshot (now - 4min → now)
|
||||
→ poll get_snapshot until status == "completed"
|
||||
→ start_snapshot_dissection
|
||||
Step 2: create_snapshot (now - 8min → now - 4min)
|
||||
→ poll get_snapshot until status == "completed"
|
||||
→ start_snapshot_dissection
|
||||
Step 3: create_snapshot (now - 12min → now - 8min)
|
||||
→ poll get_snapshot until status == "completed"
|
||||
→ start_snapshot_dissection
|
||||
```
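The window arithmetic above can be sketched as a small planner. This is an illustrative sketch only: `plan_snapshot_windows` is a hypothetical helper, and it assumes the two boundary values from `get_data_boundaries` are Unix milliseconds.

```python
import math


def plan_snapshot_windows(oldest_ms: int, newest_ms: int,
                          window_ms: int = 240_000) -> list[tuple[int, int]]:
    """Return (start_ms, end_ms) windows, most recent first."""
    if newest_ms <= oldest_ms:
        return []
    count = math.ceil((newest_ms - oldest_ms) / window_ms)
    windows = []
    end = newest_ms
    for _ in range(count):
        # Clamp the oldest window so it never starts before available data.
        start = max(oldest_ms, end - window_ms)
        windows.append((start, end))
        end = start
    return windows
```

Each tuple maps to one `create_snapshot` call, issued one at a time in list order.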
|
||||
|
||||
**Polling pattern**: After `create_snapshot`, call `get_snapshot` with the
|
||||
returned snapshot ID to check status. Repeat until `status == "completed"`.
|
||||
After `start_snapshot_dissection`, call `get_snapshot_dissection_status`
|
||||
and check until `progress == 100`.
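A generic polling loop for this pattern might look like the sketch below. It assumes nothing about the MCP client: `fetch` is any zero-argument callable you supply (for example, a hypothetical wrapper that invokes `get_snapshot`), and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable


def poll_until(fetch: Callable[[], dict], is_done: Callable[[dict], bool],
               interval_s: float = 2.0, timeout_s: float = 300.0) -> dict:
    """Call fetch() repeatedly until is_done(result) is true or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met before timeout")
        time.sleep(interval_s)
```

The same helper covers both waits: pass a predicate such as `lambda s: s["status"] == "completed"` for snapshot creation, or `lambda s: s["progress"] == 100` for dissection.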
|
||||
|
||||
4-minute windows balance snapshot size (fast to create and dissect) against
|
||||
coverage (captures threats with sleep cycles up to ~3 minutes). Most attack
|
||||
patterns in the wild repeat within 30-120 seconds.
|
||||
|
||||
**Do not skip this step.** A single short snapshot will miss threats with
|
||||
longer sleep cycles. The 4-minute windows ensure full coverage.
|
||||
|
||||
**Note**: Small snapshots (under ~15 minutes of traffic) often dissect in
|
||||
seconds rather than minutes. If dissection completes quickly, you can
|
||||
collapse the phased approach (immediate data first, L7 after) into a
|
||||
single pass through all phases.
|
||||
|
||||
### Step 4: Present Intermediate Results
|
||||
|
||||
Present Section A findings to the user as **intermediate results** — clearly
|
||||
labeled as preliminary:
|
||||
|
||||
```
|
||||
## Intermediate Results (Real-Time Analysis)
|
||||
|
||||
⚠️ These findings are based on live dissected traffic, which may have
|
||||
gaps in coverage. Snapshot analysis is in progress and will provide
|
||||
the complete, evidence-grade audit.
|
||||
|
||||
[findings table and details]
|
||||
|
||||
Snapshots are being created and dissected. Full report to follow.
|
||||
```
|
||||
|
||||
This gives the user immediate value while snapshots process. But be explicit:
|
||||
**the audit is not complete until Section B finishes.**
|
||||
|
||||
---
|
||||
|
||||
## SECTION B: Snapshot Deep Dive
|
||||
|
||||
**Goal**: Systematic, thorough analysis against immutable snapshot data.
|
||||
This is the evidence-grade section — complete coverage, reproducible results.
|
||||
|
||||
**The audit is NOT done until this section completes.** Snapshots must be
|
||||
created, dissected, and analyzed at L7 before the final report is generated.
|
||||
Section A may miss traffic that wasn't being dissected in real-time — Section B
|
||||
captures everything in the raw PCAP buffer, including traffic that real-time
|
||||
dissection dropped or never saw. Do not skip this section or treat Section A
|
||||
results as the final word.
|
||||
|
||||
### What a Snapshot Gives You
|
||||
|
||||
A completed snapshot provides **four data sources**. Do not wait for
|
||||
dissection to use the first three:
|
||||
|
||||
| Source | Available | Tool | What It Provides |
|
||||
|--------|-----------|------|-----------------|
|
||||
| **Workloads & IPs** | Immediately | `list_workloads` with `snapshot_id` | Pod names, namespaces, IPs at capture time |
|
||||
| **L4 Flows** | Immediately | `list_l4_flows` with `snapshot_id` | TCP/UDP connections: src/dst IPs, ports, bytes, duration |
|
||||
| **PCAP Export** | Immediately | `export_snapshot_pcap` | Raw packets filtered by BPF expression |
|
||||
| **L7 Dissection** | After indexing | `list_api_calls`, `get_api_call`, `get_api_stats` | DNS queries, HTTP requests, SQL statements, Redis commands, gRPC methods |
|
||||
|
||||
### Audit Flow Per Snapshot
|
||||
|
||||
For each 4-minute snapshot, run the full 7-phase sweep. Start with immediate
|
||||
data while dissection completes:
|
||||
|
||||
```
|
||||
Snapshot ready
|
||||
├── Start dissection (background)
|
||||
├── Phase 1: list_workloads (immediate) — workload inventory + IPs
|
||||
│ export_snapshot_pcap (immediate) — raw packet evidence
|
||||
├── Phase 3: list_l4_flows (immediate) — external flows, port scanning
|
||||
├── Phase 4: list_l4_flows (immediate) — lateral movement, fan-out
|
||||
│
|
||||
├── [dissection completes]
|
||||
│
|
||||
├── Phase 2: list_api_calls — DNS threat analysis
|
||||
├── Phase 5: list_api_calls — protocol abuse (PG, Redis, gRPC)
|
||||
├── Phase 6: list_api_calls — credential access (IMDS, cloud APIs)
|
||||
└── Phase 7: correlate all findings
|
||||
```
|
||||
|
||||
Process snapshots in reverse chronological order (most recent first). If the
|
||||
first snapshot reveals enough threats, you may not need to analyze all of them.
|
||||
|
||||
### PCAP for Deep Inspection
|
||||
|
||||
PCAP export happens in Phase 1b (immediately after snapshot creation). In
|
||||
later phases, if a new finding needs deeper packet-level analysis beyond
|
||||
what `list_api_calls` provides, export additional PCAPs using the workload
|
||||
IPs collected in Phase 1a:
|
||||
|
||||
```
|
||||
export_snapshot_pcap(snapshot_id, bpf_filter="host <workload_ip>")
|
||||
```
|
||||
|
||||
### Merging Findings Across Snapshots
|
||||
|
||||
Threats that appear in multiple snapshots are confirmed persistent. One-time
|
||||
events in a single snapshot may be transient. Note which findings repeat
|
||||
across snapshots — persistence is a strong signal of real compromise vs.
|
||||
a single anomalous event.
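One way to track persistence is to key each finding by a stable string and count the snapshots it appears in. The sketch below is an assumption about how findings might be recorded; the key format and the two-snapshot cutoff are illustrative, not a Kubeshark convention.

```python
from collections import defaultdict


def classify_persistence(findings_by_snapshot: dict[str, set[str]]) -> dict[str, str]:
    """Label each finding 'persistent' (seen in 2+ snapshots) or 'transient'."""
    seen_in = defaultdict(set)
    for snapshot_id, findings in findings_by_snapshot.items():
        for finding in findings:
            seen_in[finding].add(snapshot_id)
    return {
        finding: "persistent" if len(snapshots) >= 2 else "transient"
        for finding, snapshots in seen_in.items()
    }
```

A key such as `"<namespace>/<pod>:<threat>"` keeps the same behavior from the same workload grouped together across windows.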
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Workload Inventory & PCAP Evidence
|
||||
|
||||
**Goal**: Identify all active workloads, collect their IPs, and export raw
|
||||
PCAP evidence — all before dissection completes.
|
||||
**Data source**: Immediate (no dissection needed).
|
||||
|
||||
### 1a: Workload Inventory
|
||||
|
||||
**Tool**: `list_workloads` with `snapshot_id`
|
||||
|
||||
Query with the target namespace (or all namespaces). The response includes
|
||||
pod names, namespaces, and **IP addresses at capture time** — these IPs are
|
||||
critical for building BPF filters in later phases and for correlating L4
|
||||
flows to workload identities.
|
||||
|
||||
For each workload, note:
|
||||
- Pod name and namespace
|
||||
- IP address (save these — you'll need them for PCAP export and L4 analysis)
|
||||
- Whether it's expected (matches known deployments)
|
||||
|
||||
**What to flag**:
|
||||
- Workloads not matching any known Deployment/DaemonSet/StatefulSet
|
||||
- Pods with names that mimic system components (e.g., `kube-proxy-debug`)
|
||||
- Unexpected number of replicas or pods in the namespace
|
||||
|
||||
### 1b: PCAP Export (Immediate — No Dissection Needed)
|
||||
|
||||
**Tool**: `export_snapshot_pcap` with `snapshot_id`
|
||||
|
||||
PCAP export is available immediately after snapshot creation — it reads raw
|
||||
packets, not dissected data. Use it now to preserve evidence and get raw
|
||||
packet-level visibility before L7 dissection completes.
|
||||
|
||||
**Export PCAP for every CRITICAL finding** from Section A's real-time analysis.
|
||||
Use the workload IPs from 1a to build BPF filters:
|
||||
|
||||
```
|
||||
export_snapshot_pcap(snapshot_id, bpf_filter="host <workload_ip>")
|
||||
```
|
||||
|
||||
This is especially useful for:
|
||||
- Verifying encrypted C2 (TLS ClientHello SNI inspection)
|
||||
- Confirming Stratum mining protocol content
|
||||
- Extracting DNS tunnel payloads at packet level
|
||||
- Preserving forensic evidence before cluster changes
|
||||
|
||||
If Section A identified no CRITICAL findings yet, export a broad PCAP for
|
||||
the most suspicious workloads based on L4 flow analysis (Phase 3).
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: DNS Threat Analysis
|
||||
|
||||
**Goal**: DNS is one of the most reliable indicators of compromise. Most attacks
|
||||
that communicate externally resolve a domain first. Sweep DNS traffic for all
|
||||
known threat patterns.
|
||||
|
||||
### 2a: External DNS (Non-Cluster Queries)
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `dns`
|
||||
|
||||
Examine all DNS queries. Flag anything that is NOT `*.cluster.local` or
|
||||
`*.svc.cluster.local` — these are external resolutions that reveal what
|
||||
workloads are reaching out to.
|
||||
|
||||
**What to flag**:
|
||||
|
||||
| Pattern | Threat | KFL Filter |
|
||||
|---------|--------|------------|
|
||||
| Mining pool domains (minexmr, nanopool, mining-pool) | Cryptojacking | `dns && dns_questions.exists(q, q.contains("minexmr"))` |
|
||||
| High-entropy subdomains (base64-like, >30 chars) | DNS tunneling / exfiltration | `dns` — then inspect subdomain length and entropy |
|
||||
| DGA patterns (random .com/.net with NXDOMAIN) | C2 beaconing | `dns && dns_response && size(dns_answers) == 0` |
|
||||
| DoH resolver domains (cloudflare-dns.com, dns.google) | DNS bypass / C2 channel | `dns && dns_questions.exists(q, q.contains("cloudflare-dns"))` |
|
||||
| Cloud API domains (sts.amazonaws.com, s3.amazonaws.com) | Stolen credential usage | `dns && dns_questions.exists(q, q.contains("amazonaws.com"))` |
|
||||
| C2/attacker domains (attacker, c2, darknet, exfil) | Command & Control | `dns && dns_questions.exists(q, q.contains("c2"))` |
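For the high-entropy row, the manual inspection step can be approximated with Shannon entropy over the longest label of each queried name. This is a rough sketch: the 30-character cutoff comes from the table above, while the 3.5 bits-per-character threshold is an assumption that should be tuned against known-good traffic.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the characters in text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunnel(query_name: str, min_len: int = 30,
                      min_entropy: float = 3.5) -> bool:
    """Flag a DNS name whose longest label is both long and high-entropy."""
    labels = query_name.rstrip(".").split(".")
    longest = max(labels, key=len)
    return len(longest) > min_len and shannon_entropy(longest) >= min_entropy
```

Ordinary service names such as `kubernetes.default.svc.cluster.local` fall well under the length cutoff, so the check mainly surfaces encoded payloads.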
|
||||
|
||||
### 2b: DNS Query Volume and Types
|
||||
|
||||
High query volume from a single pod is suspicious. Also check for unusual
|
||||
record types:
|
||||
|
||||
- **TXT queries** to external domains → data exfiltration
|
||||
- **NULL queries** → DNS tunneling (iodine, dnscat2)
|
||||
- **AXFR queries** → zone transfer attempts (reconnaissance)
|
||||
- **SRV queries** to many namespaces → service enumeration
|
||||
|
||||
### 2c: NXDOMAIN Ratio
|
||||
|
||||
A high NXDOMAIN ratio (>20% of queries) from a single source suggests DGA
|
||||
beaconing — the malware tries many generated domains, most of which don't exist.
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `dns && dns_response && size(dns_answers) == 0`
|
||||
|
||||
Compare the count of failed queries to total queries per source pod.
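The per-pod comparison can be sketched as follows. The entry shape is hypothetical (`src_pod` and `answers` are illustrative field names, not the `list_api_calls` response schema), and an empty answer list is used as a proxy for a failed lookup, as in the filter above.

```python
from collections import Counter


def nxdomain_ratios(dns_entries: list[dict], threshold: float = 0.20) -> dict[str, float]:
    """Return {source_pod: failed / total} for pods above the threshold."""
    total = Counter()
    failed = Counter()
    for entry in dns_entries:
        pod = entry["src_pod"]       # hypothetical field name
        total[pod] += 1
        if not entry["answers"]:     # empty answer list counts as a failure
            failed[pod] += 1
    return {
        pod: failed[pod] / total[pod]
        for pod in total
        if failed[pod] / total[pod] > threshold
    }
```

Pods with very few queries can cross the threshold by chance, so a minimum query count is worth adding before acting on the result.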
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: External Communication
|
||||
|
||||
**Goal**: Identify all traffic leaving the cluster. Any pod connecting to
|
||||
external IPs or domains needs justification.
|
||||
**Data source**: Immediate (no dissection needed). Use L4 flows first,
|
||||
then enrich with L7 data from dissection when available.
|
||||
|
||||
### 3a: L4 External Flows
|
||||
|
||||
**Tool**: `list_l4_flows` with `snapshot_id`
|
||||
|
||||
This is available immediately — do not wait for dissection. Use the workload
|
||||
IPs from Phase 1 to map flows to pod identities.
|
||||
|
||||
Look for flows where the destination is NOT a cluster-internal IP (not RFC 1918:
|
||||
10.x.x.x, 172.16-31.x.x, 192.168.x.x). Every external flow is a potential
|
||||
exfiltration or C2 channel.
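A minimal sketch of the RFC 1918 test, using Python's standard `ipaddress` module. It handles IPv4 only and treats everything outside the three ranges as external, which deliberately includes the link-local IMDS address.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918 = [ipaddress.ip_network(n) for n in
           ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_external(ip: str) -> bool:
    """True when ip falls outside the three RFC 1918 ranges."""
    addr = ipaddress.ip_address(ip)
    return not any(addr in net for net in RFC1918)
```

The explicit range list is used instead of `ipaddress`'s `is_private` property because that property also covers link-local space, which would hide `169.254.169.254` from the external-flow sweep.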
|
||||
|
||||
**What to flag**:
|
||||
|
||||
| Pattern | Threat | Severity |
|
||||
|---------|--------|----------|
|
||||
| Destination 169.254.169.254 | IMDS metadata credential theft | CRITICAL |
|
||||
| Destination port 3333, 14433, 45700 | Stratum mining protocol | CRITICAL |
|
||||
| Destination port 4444, 1337 | Reverse shell / backdoor | CRITICAL |
|
||||
| Persistent connections to single external IP | C2 beaconing | HIGH |
|
||||
| Large outbound data volume (>1MB) to external | Data exfiltration | HIGH |
|
||||
| Connections to cloud API endpoints (port 443) | Stolen credential usage | MEDIUM |
|
||||
|
||||
### 3b: HTTP External Requests
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `http && !dst.pod.namespace.startsWith("kube")`
|
||||
|
||||
Inspect outbound HTTP requests for:
|
||||
|
||||
- **Beaconing patterns**: Regular-interval requests to the same external URL
|
||||
- **Suspicious User-Agents**: `Mozilla/4.0`, `curl/`, empty, or malware-like
|
||||
- **Suspicious paths**: `/check?s=`, `/beacon`, `/heartbeat`, `/proxy?coin=`
|
||||
- **Base64 in headers**: Oversized Cookie or custom X-* headers with encoded data
|
||||
- **gRPC to external**: `Content-Type: application/grpc` to non-cluster destinations
|
||||
- **WebSocket upgrades**: `Upgrade: websocket` to external hosts (potential mining)
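Beaconing from the first bullet can be checked numerically by measuring how regular the request intervals are. The sketch below uses the coefficient of variation of inter-request gaps; the five-event minimum and the 0.1 cutoff are assumptions, and real implants often add jitter that requires a looser threshold.

```python
from statistics import mean, pstdev


def is_beaconing(timestamps_ms: list[int], min_events: int = 5,
                 max_cv: float = 0.1) -> bool:
    """True when request intervals are nearly constant (low jitter)."""
    if len(timestamps_ms) < min_events:
        return False
    ordered = sorted(timestamps_ms)
    intervals = [b - a for a, b in zip(ordered, ordered[1:])]
    avg = mean(intervals)
    if avg == 0:
        return False
    # Coefficient of variation: spread of the intervals relative to their mean.
    return pstdev(intervals) / avg <= max_cv
```

Feed it the timestamps of requests from one source pod to one destination URL, not the pod's whole traffic.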
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Lateral Movement
|
||||
|
||||
**Goal**: Identify pods communicating with services they shouldn't — crossing
|
||||
namespace boundaries, probing infrastructure, or scanning the network.
|
||||
**Data source**: L4 flows (immediate) for port scanning detection. L7
|
||||
dissection (after indexing) for cross-namespace HTTP and API server analysis.
|
||||
|
||||
### 4a: Cross-Namespace Traffic
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `src.pod.namespace != dst.pod.namespace`
|
||||
|
||||
Most pods should only talk within their namespace (and to kube-system services).
|
||||
Cross-namespace traffic to unexpected destinations is a lateral movement indicator.
|
||||
|
||||
### 4b: Kubernetes API Server Access
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `http && dst.port == 443 && path.startsWith("/api")`
|
||||
|
||||
Check what pods are querying the K8s API server and what they're requesting:
|
||||
|
||||
| API Path | Threat | Severity |
|
||||
|----------|--------|----------|
|
||||
| `/api/v1/secrets` | Secret enumeration | CRITICAL |
|
||||
| `/api/v1/pods` | Workload discovery | HIGH |
|
||||
| `/apis/rbac.authorization.k8s.io` | RBAC reconnaissance | HIGH |
|
||||
| `/api/v1/configmaps` | Config enumeration | MEDIUM |
|
||||
| `/api/v1/namespaces` | Namespace discovery | MEDIUM |
|
||||
|
||||
A pod hitting **multiple** of these paths is performing systematic enumeration,
|
||||
not legitimate API access. Legitimate workloads typically access 1-2 specific
|
||||
resources, not sweep across resource types.
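The "multiple paths" test can be sketched by reducing each request path to a resource kind and counting distinct kinds per pod. The input shape (a list of `(pod, path)` pairs) and the two-kind cutoff are assumptions for illustration; the path parsing covers both cluster-wide and namespaced core API paths.

```python
from collections import defaultdict
from typing import Optional

SENSITIVE_KINDS = {"secrets", "pods", "configmaps", "namespaces"}


def resource_kind(path: str) -> Optional[str]:
    """Reduce a Kubernetes API path to the sensitive resource kind it targets."""
    if path.startswith("/apis/rbac.authorization.k8s.io"):
        return "rbac"
    if not path.startswith("/api/v1/"):
        return None
    segs = path.split("?")[0][len("/api/v1/"):].strip("/").split("/")
    if segs[0] == "namespaces" and len(segs) >= 3:
        kind = segs[2]   # /api/v1/namespaces/<ns>/<kind>[/<name>]
    else:
        kind = segs[0]   # /api/v1/<kind>
    return kind if kind in SENSITIVE_KINDS else None


def enumerating_pods(calls: list[tuple[str, str]],
                     min_kinds: int = 2) -> dict[str, set[str]]:
    """Map pod -> sensitive kinds requested, for pods touching min_kinds or more."""
    kinds = defaultdict(set)
    for pod, path in calls:
        kind = resource_kind(path)
        if kind is not None:
            kinds[pod].add(kind)
    return {pod: k for pod, k in kinds.items() if len(k) >= min_kinds}
```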
|
||||
|
||||
### 4c: Port Scanning Detection
|
||||
|
||||
**Tool**: `list_l4_flows` with `snapshot_id` (immediate — no dissection needed)
|
||||
|
||||
Use the workload IPs from Phase 1 to identify the source pod.
|
||||
Look for a single source IP with connections to:
|
||||
- Many distinct destination IPs (>10)
|
||||
- Many distinct destination ports (>5)
|
||||
- High connection failure rate (RST/timeout)
|
||||
|
||||
This is a textbook port scan pattern.
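The fan-out thresholds above can be applied to a list of flows as sketched below. The flow fields (`src_ip`, `dst_ip`, `dst_port`) are hypothetical names, not the `list_l4_flows` response schema, and the failure-rate signal is left out because it depends on fields this sketch does not assume.

```python
from collections import defaultdict


def find_port_scanners(flows: list[dict], max_ips: int = 10,
                       max_ports: int = 5) -> dict[str, dict[str, int]]:
    """Return sources whose distinct destination IPs or ports exceed the limits."""
    dst_ips = defaultdict(set)
    dst_ports = defaultdict(set)
    for flow in flows:
        dst_ips[flow["src_ip"]].add(flow["dst_ip"])
        dst_ports[flow["src_ip"]].add(flow["dst_port"])
    return {
        src: {"distinct_ips": len(dst_ips[src]),
              "distinct_ports": len(dst_ports[src])}
        for src in dst_ips
        if len(dst_ips[src]) > max_ips or len(dst_ports[src]) > max_ports
    }
```

Service meshes, ingress controllers, and monitoring agents legitimately fan out to many destinations, so map each flagged source IP back to its workload before rating it.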
|
||||
|
||||
### 4d: Service Fingerprinting
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `http && (path == "/.env" || path == "/actuator/info" || path == "/server-info" || path == "/version")`
|
||||
|
||||
These paths are used for service fingerprinting — mapping what software is
|
||||
running on internal endpoints. A pod probing multiple services with these
|
||||
paths is performing reconnaissance.
|
||||
|
||||
### 4e: Service Account Permission Audit via Traffic
|
||||
|
||||
Cross-reference Phase 4b findings (K8s API traffic) with the source pod's
|
||||
actual service account to determine if permissions are excessive.
|
||||
|
||||
For each pod making API server calls:
|
||||
|
||||
1. **Identify the service account**: From the workload inventory or via
|
||||
`kubectl get pod <name> -n <ns> -o jsonpath='{.spec.serviceAccountName}'`
|
||||
2. **Check what it accessed**: The API paths from Phase 4b reveal what the
|
||||
pod actually queried (secrets, pods, RBAC, configmaps)
|
||||
3. **Compare against expected access**: A `frontend` pod should never hit
|
||||
`/api/v1/secrets`. A `batch-processor` has no reason to query
|
||||
`/apis/rbac.authorization.k8s.io/v1/clusterrolebindings`.
|
||||
|
||||
**What to flag**:
|
||||
|
||||
| Pattern | Threat | Severity |
|
||||
|---------|--------|----------|
|
||||
| Pod queries secrets but its SA only needs pod read | Over-privileged SA or stolen token | HIGH |
|
||||
| Pod hits cluster-wide endpoints (`--all-namespaces` style queries) | Cluster-admin binding | CRITICAL |
|
||||
| Pod's SA is `default` but makes authenticated API calls | Token mounted unnecessarily | MEDIUM |
|
||||
| Multiple pods share the same over-privileged SA | Lateral blast radius | HIGH |
|
||||
|
||||
This converts a network finding (API traffic volume) into an actionable RBAC
|
||||
recommendation — telling the user exactly which ClusterRoleBinding to revoke.
|
||||
|
||||
### 4f: Cross-Namespace Threat Correlation
|
||||
|
||||
When port scanning or lateral movement targets IPs outside the audited
|
||||
namespace (e.g., IPs in the pod CIDR `10.244.x.x` that don't belong to
|
||||
any workload in the target namespace), resolve them to identify the
|
||||
cross-namespace blast radius:
|
||||
|
||||
1. Use `list_workloads` (all namespaces) to map destination IPs to pods
|
||||
2. Identify which namespaces are being probed
|
||||
3. Flag the scope: "port scan from `k8s-mule/network-diagnostics` is
|
||||
targeting pods in `default`, `monitoring`, and `kube-system`"
|
||||
|
||||
This turns a single-namespace finding into a cluster-wide risk assessment.
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Protocol Abuse
|
||||
|
||||
**Goal**: Inspect L7 payload content for attack patterns within supported
|
||||
protocols. This is the phase most often skipped — and where subtle threats hide.
|
||||
|
||||
### 5a: PostgreSQL Wire Protocol
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `postgresql`
|
||||
|
||||
The `postgresql_query` variable contains the full SQL text. Use it to detect:
|
||||
|
||||
| KFL Filter | Threat | Severity |
|
||||
|------------|--------|----------|
|
||||
| `postgresql && postgresql_query.contains("UNION SELECT")` | SQL injection | HIGH |
|
||||
| `postgresql && postgresql_query.contains("pg_shadow")` | Password hash theft | CRITICAL |
|
||||
| `postgresql && postgresql_query.contains("information_schema")` | Schema enumeration | MEDIUM |
|
||||
| `postgresql && postgresql_query.contains("TRUNCATE")` | Data destruction | CRITICAL |
|
||||
| `postgresql && postgresql_query.contains("DROP TABLE")` | Data destruction | CRITICAL |
|
||||
| `postgresql && !postgresql_success` | Failed queries (may indicate probing) | MEDIUM |
|
||||
|
||||
Use `get_api_call` to inspect the full SQL content. Also check `postgresql_user`
|
||||
— queries from unexpected users are suspicious.
|
||||
|
||||
### 5b: Redis Protocol
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `redis`
|
||||
|
||||
Use `redis_type` (command verb) and `redis_command` (full command line) to detect:
|
||||
|
||||
| KFL Filter | Threat | Severity |
|
||||
|------------|--------|----------|
|
||||
| `redis && redis_type == "CONFIG"` | Server config dump/write | HIGH |
|
||||
| `redis && redis_type == "KEYS"` | Full key enumeration | HIGH |
|
||||
| `redis && redis_type == "CLIENT"` | Connection enumeration | MEDIUM |
|
||||
| `redis && redis_type == "DEBUG"` | Debug access | MEDIUM |
|
||||
| `redis && redis_command.contains("CONFIG SET dir")` | Arbitrary file write (RCE) | CRITICAL |
|
||||
| `redis && redis_type == "FLUSHALL"` | Data destruction | CRITICAL |
|
||||
|
||||
### 5c: gRPC Endpoints
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `grpc`
|
||||
|
||||
Use `grpc_method` to inspect method names:
|
||||
|
||||
| KFL Filter | Threat | Severity |
|
||||
|------------|--------|----------|
|
||||
| `grpc && grpc_method.contains("Reflection")` | API surface enumeration | MEDIUM |
|
||||
| `grpc && dst.name.contains("attacker")` | Data exfiltration | HIGH |
|
||||
| `grpc && grpc_status != 0` | Failed gRPC calls (may indicate probing) | LOW |
|
||||
|
||||
### 5d: HTTP Request Anomalies
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `http`
|
||||
|
||||
Check for:
|
||||
- **WebSocket upgrades to external hosts**: `Upgrade: websocket` header — potential
|
||||
mining proxy or persistent C2 channel
|
||||
- **DNS-over-HTTPS requests**: `accept: application/dns-json` header — DNS bypass
|
||||
- **AWS Signature headers**: `Authorization: AWS4-HMAC-SHA256` — stolen cloud creds
|
||||
- **IMDS-specific headers**: `X-aws-ec2-metadata-token-ttl-seconds` — token request
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Credential Access
|
||||
|
||||
**Goal**: Detect active credential theft — IMDS access, service account abuse,
|
||||
cloud API exploitation.
|
||||
|
||||
### 6a: Instance Metadata Service (IMDS)
|
||||
|
||||
**Tool**: `list_api_calls` with KFL: `dst.ip == "169.254.169.254"`
|
||||
|
||||
Or use `list_l4_flows` to find connections to 169.254.169.254.
|
||||
|
||||
Any pod connecting to this IP can obtain the node's cloud credentials. Treat it as credential theft unless the workload has a documented need.
|
||||
Check the HTTP paths:
|
||||
|
||||
| Path | What's Being Stolen |
|
||||
|------|-------------------|
|
||||
| `/latest/meta-data/iam/security-credentials/` | IAM role name |
|
||||
| `/latest/meta-data/iam/security-credentials/<role>` | Actual AWS credentials |
|
||||
| `/latest/dynamic/instance-identity/document` | Instance identity (account ID, region) |
|
||||
| `/latest/user-data` | Instance bootstrap scripts (may contain secrets) |
|
||||
| `/latest/api/token` (PUT) | IMDSv2 session token |
|
||||
|
||||
### 6b: Service Account Token Exfiltration
|
||||
|
||||
Look for HTTP requests where the body or headers contain JWT tokens
|
||||
(strings starting with `eyJ`). These may be service account tokens being
|
||||
sent to external endpoints.
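Because many base64 strings start with `eyJ`, a candidate is more convincing when its first segment decodes to a JSON header with an `alg` field. The sketch below applies that check to any header or body text you have already extracted; it does not verify signatures or inspect claims.

```python
import base64
import json
import re

# Three dot-separated base64url segments, the first starting with "eyJ".
JWT_RE = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def find_jwts(text: str) -> list[str]:
    """Return substrings that look like JWTs and whose header is valid JSON."""
    tokens = []
    for candidate in JWT_RE.findall(text):
        header_b64 = candidate.split(".")[0]
        # base64url in JWTs is unpadded, so restore padding before decoding.
        padded = header_b64 + "=" * (-len(header_b64) % 4)
        try:
            header = json.loads(base64.urlsafe_b64decode(padded))
        except ValueError:
            continue
        if isinstance(header, dict) and "alg" in header:
            tokens.append(candidate)
    return tokens
```

A match only matters when the destination is outside the cluster; tokens sent to the Kubernetes API server are normal.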
|
||||
|
||||
---
|
||||
|
||||
## Phase 7: Attack Chain Correlation
|
||||
|
||||
**Goal**: Connect individual findings into a coherent attack narrative.
|
||||
|
||||
After completing phases 1-6, synthesize findings into an attack chain. Real
|
||||
attacks follow a progression:
|
||||
|
||||
```
|
||||
1. INITIAL ACCESS → How did the attacker get in?
|
||||
2. RECONNAISSANCE → Port scanning, DNS enumeration, API discovery
|
||||
3. CREDENTIAL ACCESS → IMDS theft, secret enumeration, token exfil
|
||||
4. LATERAL MOVEMENT → Cross-namespace probing, SSRF, service scanning
|
||||
5. EXFILTRATION → DNS tunneling, HTTP exfil, gRPC streaming
|
||||
6. PERSISTENCE → C2 beaconing, cryptomining (monetization)
|
||||
```
|
||||
|
||||
Map each finding to a stage. If you see findings across multiple stages from
|
||||
the same namespace or related workloads, you've found a coordinated attack.
|
||||
|
||||
### Output Format

Present the audit results as:

1. **Workload inventory** — table of all observed workloads with threat level
2. **Detailed findings** — one section per finding, ordered by severity
3. **Attack chain summary** — if findings correlate, map the kill chain
4. **Immediate actions** — prioritized remediation steps

---

## Audit Report — Two-Stage Delivery

The audit produces **two outputs**: an intermediate report after Section A,
and a final PDF report after Section B completes.

### Stage 1: Intermediate Report (after Section A)

Present findings from real-time analysis directly in the conversation. Clearly
label them as preliminary. This gives the user immediate value while snapshots
are being created and dissected.

### Stage 2: Final PDF Report (after Section B)

This is the primary deliverable. It is generated **only after all snapshots
have been dissected and analyzed at L7**. Do not generate the final report
based on Section A alone — that would miss protocol-level threats (SQL
injection, Redis abuse, gRPC exfil) that only appear after dissection.

1. **Write** the report as markdown: `security-audit-<namespace>-<date>.md`
   Follow the template in `references/report-template.md` — it defines
   the full structure: executive summary, threat table, detailed findings
   with evidence, attack chain analysis, detection coverage, and remediation.

2. **Convert to PDF** (in preference order; run the first tool that is available):
   ```bash
   npx md-to-pdf security-audit-<namespace>-<date>.md   # Best quality
   pandoc security-audit-<namespace>-<date>.md -o security-audit-<namespace>-<date>.pdf   # Fallback; needs a PDF engine such as LaTeX
   ```
   If neither tool is available, leave the markdown as the deliverable.

3. **The final report must include findings from both sections** — Section A
   (real-time) and Section B (snapshot dissection). Findings confirmed by
   both sections are marked with higher confidence. Findings only in
   Section B (missed by real-time) should be noted — this reveals gaps
   in real-time dissection coverage.

### Key Report Requirements

- **Quote raw evidence** — actual DNS queries, HTTP URLs, SQL statements,
  Redis commands. The reader must be able to verify without re-running.
- **Timestamp every finding** — snapshot ID + local time (UTC in parentheses).
- **Specific recommendations** — not "fix RBAC" but "revoke ClusterRoleBinding
  `mule-recon-cluster-admin`".
- **Include MITRE ATT&CK IDs** for each finding.
- **Evidence preservation** — list snapshot IDs, recommend cloud storage upload.

---

## What Network Auditing Cannot Detect

Be transparent about blind spots. Network traffic analysis **cannot** detect:

- **Configuration vulnerabilities**: Privileged containers, missing resource
  limits, permissive RBAC, hostPath mounts — these are YAML-level issues with
  no traffic signature
- **Secrets in environment variables**: Hardcoded credentials don't generate
  network traffic until used
- **Image vulnerabilities**: CVEs in container images are not visible on the wire
- **Idle threats**: A malicious pod that hasn't started communicating yet

Recommend `kubectl`-based configuration auditing for these gaps. Network
auditing is the complement, not the replacement, for config-level security
scanning.

## Threat Intelligence Reference

For detailed descriptions of all 22 network-observable threat scenarios with
MITRE ATT&CK mappings and detection guidance, see `references/threat-catalog.md`.

64
skills/security-audit/references/kfl-security-filters.md
Normal file
@@ -0,0 +1,64 @@
# KFL Quick Reference: Security Audit Filters

## DNS Threat Hunting

```
dns                                                           // All DNS traffic
dns && dns_response && size(dns_answers) == 0                 // Responses with no answers (NXDOMAIN or NODATA)
dns && dns_questions.exists(q, q.contains("minexmr"))         // Mining pool DNS
dns && dns_questions.exists(q, q.contains("nanopool"))        // Mining pool DNS
dns && dns_questions.exists(q, q.contains("amazonaws"))       // Cloud API resolution
dns && dns_questions.exists(q, q.contains("cloudflare-dns"))  // DoH bypass
dns && dns_questions.exists(q, q.contains("dns.google"))      // DoH bypass
```

## External Communication

```
http && dst.name.contains("attacker")                                       // Known-bad destinations
http && map_get(request.headers, "user-agent", "").contains("Mozilla/4.0")  // Suspicious UA
http && map_get(request.headers, "accept", "").contains("dns-json")         // DoH requests
http && map_get(request.headers, "upgrade", "") == "websocket"              // WebSocket (potential mining)
```

## Lateral Movement

```
src.pod.namespace != dst.pod.namespace        // Cross-namespace traffic
http && path.startsWith("/api/v1/secrets")    // Secret enumeration
http && path == "/.env"                       // Service fingerprinting
http && path == "/actuator/info"              // Spring Boot fingerprinting
http && path == "/version"                    // Version fingerprinting
```

## Protocol Inspection

```
postgresql                                                // PostgreSQL wire protocol
postgresql && postgresql_query.contains("UNION SELECT")   // SQL injection patterns
postgresql && !postgresql_success                         // Failed PostgreSQL queries
redis                                                     // Redis protocol
grpc                                                      // gRPC calls (native detection)
grpc && grpc_method.contains("Reflection")                // gRPC reflection enumeration
```

## Credential Theft

```
dst.ip == "169.254.169.254"                                                               // IMDS access
http && path.contains("/meta-data/iam")                                                   // IAM credential paths
http && map_get(request.headers, "authorization", "").startsWith("AWS4-HMAC-SHA256")      // AWS SigV4-signed request (possibly stolen creds)
http && "x-aws-ec2-metadata-token-ttl-seconds" in request.headers                         // IMDSv2 token request
```

## Resource Hijacking

```
dst.port == 3333     // Stratum mining (standard)
dst.port == 14433    // Stratum mining (alt)
dst.port == 45700    // Stratum mining (alt)
dst.port == 4444     // Reverse shell / backdoor
```

## Per-Namespace Scoping

Add namespace filters to any query above:

```
dns && src.pod.namespace == "k8s-mule"      // DNS from specific namespace
http && src.pod.namespace == "k8s-mule"     // HTTP from specific namespace
redis && src.pod.namespace == "k8s-mule"    // Redis from specific namespace
```

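
When many filters need the same scoping, the suffix can be appended programmatically. The helper below is a hypothetical convenience sketch, not a Kubeshark API; it only builds a string using the `&&`, parentheses, and `src.pod.namespace` syntax that already appears in this reference.

```python
def scope_to_namespace(kfl: str, namespace: str) -> str:
    """Return the KFL filter restricted to traffic sourced from one namespace.

    The original filter is parenthesized so that any `||` inside it keeps
    its meaning once the namespace clause is appended.
    """
    if '"' in namespace or "\\" in namespace:
        # Refuse values that would break out of the quoted string literal.
        raise ValueError("namespace must not contain quotes or backslashes")
    return f'({kfl}) && src.pod.namespace == "{namespace}"'


if __name__ == "__main__":
    print(scope_to_namespace("dst.port == 3333 || dst.port == 4444", "k8s-mule"))
```
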
102
skills/security-audit/references/report-template.md
Normal file
@@ -0,0 +1,102 @@
# Security Audit Report Template

Use this template for the markdown report. Fill in all sections, then convert
to PDF.

```markdown
# Kubernetes Network Security Audit Report

**Cluster**: <cluster name/context>
**Namespace**: <target namespace>
**Date**: <audit date and time, local timezone>
**Audit window**: <start time> — <end time> (<duration>)
**Snapshots analyzed**: <count and IDs>
**Audited by**: Claude Code + Kubeshark MCP

---

## Executive Summary

<2-3 sentence summary: how many threats found, highest severity,
whether an active attack chain was identified, top recommendation>

## Threat Summary

| # | Severity | Workload | Threat | MITRE ATT&CK |
|---|----------|----------|--------|---------------|
| 1 | CRITICAL | log-shipper | DNS Tunneling | T1048.003 |
| 2 | CRITICAL | cloud-health-monitor | IMDS Credential Theft | T1552.005 |
| ... | | | | |

## Detailed Findings

### Finding 1: <Title> (CRITICAL)

**Workload**: <pod name>
**MITRE ATT&CK**: <technique ID and name>
**Snapshot**: <snapshot ID>
**Detection method**: <which phase and tool detected this>

**Evidence**:
<Specific traffic data — DNS queries, HTTP requests, L4 flows,
protocol payloads. Include timestamps, source/dest, and relevant
content. Quote actual query names, URLs, SQL statements, or
Redis commands observed.>

**Impact**:
<What this means — data at risk, credentials exposed, scope of access>

**Recommendation**:
<Specific remediation — NetworkPolicy, RBAC change, pod deletion, credential rotation>

---

(repeat for each finding)

## Attack Chain Analysis

<If findings correlate, map the kill chain:
Initial Access → Reconnaissance → Credential Access → Lateral Movement →
Exfiltration → Persistence. Identify which workloads participate in each stage.>

## Detection Coverage

| Phase | Checked | Findings |
|-------|---------|----------|
| Workload Inventory | Yes | <count> |
| DNS Threat Analysis | Yes | <count> |
| External Communication | Yes | <count> |
| Lateral Movement | Yes | <count> |
| Protocol Abuse | Yes | <count> |
| Credential Access | Yes | <count> |

## Limitations

<What this audit cannot detect — config-level vulnerabilities,
image CVEs, idle threats. Recommend complementary tools.>

## Immediate Actions

1. <Highest priority action>
2. <Second priority>
3. ...

## Evidence Preservation

<List snapshot IDs created during this audit. Recommend uploading
to cloud storage for long-term retention. Include PCAP export
commands for key findings.>
```

## Quality Guidelines

- **Include raw evidence** — quote actual DNS queries, HTTP URLs, SQL
  statements, Redis commands. The reader should be able to verify findings
  without re-running the audit.
- **Timestamp everything** — every finding should reference the snapshot ID
  and timestamp (local time with UTC in parentheses).
- **Be specific in recommendations** — not "fix RBAC" but "revoke
  ClusterRoleBinding `mule-recon-cluster-admin` and replace with a
  namespace-scoped Role granting only `get` on `pods`".
- **Include MITRE ATT&CK IDs** — makes the report actionable for security
  teams that track coverage against the framework.

190
skills/security-audit/references/threat-catalog.md
Normal file
@@ -0,0 +1,190 @@
# Network Threat Catalog

22 network-observable threat patterns organized by MITRE ATT&CK tactic.
Each entry describes the attack, what it looks like on the wire, and how
to detect it with Kubeshark.

## Command & Control (TA0011)

### DGA Beaconing (T1568.002)
- **What**: Malware generates pseudo-random domain names daily and queries DNS
  for each. The C2 operator registers a few; most resolve to NXDOMAIN.
- **Wire signature**: Burst of DNS queries for high-entropy .com/.net domains
  with >80% NXDOMAIN response rate.
- **KFL**: `dns && dns_response && size(dns_answers) == 0` — then check for entropy in queried names.
- **Difficulty**: Medium. NXDOMAIN flood is distinctive but low-rate DGA can
  blend with legitimate DNS failures.

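
The entropy check on queried names can be approximated with Shannon entropy over the leftmost label. This is a rough sketch under stated assumptions: the function names, the 3.5 bits-per-character threshold, and the minimum label length of 10 are illustrative choices, not values from Kubeshark, and they would need tuning against real traffic.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_generated(domain: str, threshold: float = 3.5, min_len: int = 10) -> bool:
    """Heuristic: a long leftmost label with high character entropy."""
    label = domain.rstrip(".").split(".")[0]
    return len(label) >= min_len and shannon_entropy(label) >= threshold
```

Entropy alone produces false positives (hashes in CDN hostnames, for example), so it is best combined with the empty-answer filter rather than used on its own.
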
### HTTP C2 Beaconing (T1071.001)
- **What**: Implant calls home via HTTP GET at regular intervals, receiving
  tasking in the response body. Cobalt Strike, Meterpreter pattern.
- **Wire signature**: Periodic HTTP GET to fixed external URL at suspiciously
  regular intervals (30-60s). Outdated User-Agent (Mozilla/4.0). Session
  identifiers in URL path.
- **KFL**: `http && dst.name.contains("attacker")` or check for User-Agent anomalies.
- **Difficulty**: Medium. Regularity is the key anomaly.

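
Regularity can be quantified from request timestamps. The sketch below uses the coefficient of variation of the gaps between consecutive requests to one destination; values near zero mean near-constant intervals. The function names and the thresholds (`max_cv=0.1`, `min_requests=5`) are assumptions made for illustration, and real implants often add jitter that requires a looser threshold.

```python
from statistics import mean, pstdev


def interval_cv(timestamps):
    """Coefficient of variation of inter-request gaps, or None if undefined.

    timestamps: request times in seconds (any order).
    """
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return None  # need at least three requests to compare intervals
    avg = mean(gaps)
    if avg == 0:
        return None
    return pstdev(gaps) / avg


def looks_like_beacon(timestamps, max_cv=0.1, min_requests=5):
    """Heuristic: enough requests, spaced at near-constant intervals."""
    cv = interval_cv(timestamps)
    return len(timestamps) >= min_requests and cv is not None and cv <= max_cv
```
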
### Encrypted C2 (T1573.002)
- **What**: C2 over HTTPS. Content is encrypted but TLS SNI reveals suspicious
  domain names.
- **Wire signature**: Outbound TLS to non-standard domains (darknet, cdn-mirror).
  DNS queries preceding the connection reveal the target.
- **KFL**: `dns && (dns_questions.exists(q, q.contains("darknet")) || dns_questions.exists(q, q.contains("cdn-mirror")))`.
- **Difficulty**: Hard. Encrypted, uses standard port 443.

### DNS-over-HTTPS C2 (T1572)
- **What**: Bypasses cluster DNS by sending queries as HTTPS to public DoH
  resolvers (cloudflare-dns.com, dns.google). C2 commands embedded in TXT
  responses.
- **Wire signature**: HTTP requests to DoH endpoints with `accept: application/dns-json`
  header. No corresponding queries on port 53.
- **KFL**: `http && (dst.name.contains("cloudflare-dns") || dst.name.contains("dns.google"))`.
- **Difficulty**: Hard. Looks like regular HTTPS to trusted providers.

## Exfiltration (TA0010)

### DNS Tunneling (T1048.003)
- **What**: Full bidirectional data channel over DNS using tools like iodine,
  dnscat2. Data encoded in long subdomain labels.
- **Wire signature**: High-frequency DNS queries (20+/burst) with subdomain
  labels near 63-byte limit. Mix of A, TXT, NULL query types.
- **KFL**: `dns && dns_questions.exists(q, q.contains("data-relay"))` or look for
  high query rates per source.
- **Difficulty**: Medium. Volume and long subdomains are distinctive.

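
The two signals in the wire signature (long labels and high query volume per source) can be combined in a simple post-processing step. This is a sketch under assumptions: the `(source, qname)` tuple layout, the function names, and the thresholds (labels of 50+ bytes, 20+ such queries) are illustrative, not Kubeshark output formats or defaults.

```python
from collections import Counter


def longest_label(qname: str) -> int:
    """Length of the longest dot-separated label in a DNS name."""
    return max(len(label) for label in qname.rstrip(".").split("."))


def tunneling_suspects(queries, min_label_len=50, min_queries=20):
    """Count long-label queries per source and keep the heavy hitters.

    queries: iterable of (source, qname) tuples.
    Returns {source: number of queries with a label >= min_label_len}.
    """
    long_queries = Counter()
    for source, qname in queries:
        if longest_label(qname) >= min_label_len:
            long_queries[source] += 1
    return {src: n for src, n in long_queries.items() if n >= min_queries}
```
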
### HTTP Header Exfiltration (T1048.001)
- **What**: Data exfiltrated in HTTP headers (Cookie, X-Trace-ID) disguised
  as analytics tracking. Low volume to evade detection.
- **Wire signature**: HTTP GET to analytics-looking URL with oversized Cookie
  or custom headers containing base64-encoded data.
- **KFL**: `http && dst.name.contains("cdn-provider")`.
- **Difficulty**: Hard. Low volume, standard HTTP, looks like analytics.

### DNS Credential Exfiltration (T1048.003)
- **What**: Stolen JWT tokens or credentials encoded in DNS TXT queries to
  attacker-controlled authoritative nameserver.
- **Wire signature**: DNS TXT queries with structured multi-label subdomains
  containing base64-like encoded data.
- **KFL**: `dns && dns_questions.exists(q, q.contains("steal-creds"))`.
- **Difficulty**: Medium. Multi-label structure is distinctive.

### gRPC Stream Exfiltration (T1048.001)
- **What**: Data exfiltration via gRPC (HTTP/2) POST to external endpoint.
  Blends with normal microservice traffic.
- **Wire signature**: HTTP/2 POST with `Content-Type: application/grpc` to
  external destination with exfil-related method names.
- **KFL**: `grpc && dst.name.contains("attacker")`.
- **Difficulty**: Hard. gRPC is normal in K8s. External destination is the signal.

## Lateral Movement (TA0008)

### K8s API Enumeration (T1613)
- **What**: Compromised pod uses mounted service account token to enumerate
  secrets, pods, RBAC bindings across all namespaces.
- **Wire signature**: HTTPS to kubernetes.default.svc with broad GET requests
  across /api/v1/secrets, /pods, /configmaps, /clusterrolebindings.
- **KFL**: `http && dst.port == 443 && path.contains("/api/v1/secrets")`.
- **Difficulty**: Medium. The fanout across resource types is the anomaly.

### SSRF to Internal Services (T1090)
- **What**: Pod probes cross-namespace internal services it shouldn't talk to —
  kube-dns metrics, Prometheus, Grafana, dashboards.
- **Wire signature**: HTTP to multiple ClusterIP services across namespaces
  from a single source pod.
- **KFL**: `http && src.pod.namespace == "k8s-mule" && dst.pod.namespace != "k8s-mule"`.
- **Difficulty**: Medium. Cross-namespace breadth is the signal.

### Port Scanning (T1046)
- **What**: Sweep of common ports across pod CIDR after initial access.
- **Wire signature**: Rapid TCP SYN from single source to many IPs on ports
  80, 443, 3306, 5432, 6379, 8080, 9090, 27017. High RST/timeout rate.
- **KFL**: `tcp && src.name == "network-diagnostics"`.
- **Difficulty**: Easy. Classic scan pattern — high fan-out, high failure rate.

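
High fan-out is straightforward to measure once L4 flows are exported. The sketch below counts distinct `(IP, port)` targets per source; the `(source, dst_ip, dst_port)` tuple layout, the function name, and the threshold of 20 targets are assumptions for illustration and do not reflect a Kubeshark data format.

```python
from collections import defaultdict


def scan_suspects(flows, min_targets=20):
    """Return sources that contacted at least min_targets distinct endpoints.

    flows: iterable of (source, dst_ip, dst_port) tuples.
    Returns {source: number of distinct (dst_ip, dst_port) pairs}.
    """
    targets = defaultdict(set)
    for source, dst_ip, dst_port in flows:
        targets[source].add((dst_ip, dst_port))
    return {src: len(t) for src, t in targets.items() if len(t) >= min_targets}
```

Combining this count with the share of flows that ended in RST or timeout would tighten the heuristic, since legitimate service meshes and monitoring agents can also show wide fan-out.
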
### Service Fingerprinting (T1046)
- **What**: HTTP probes to discovery paths across multiple services to identify
  running software.
- **Wire signature**: HTTP GET to /version, /healthz, /.env, /actuator/info,
  /server-info. HEAD and OPTIONS methods. Multiple targets from one source.
- **KFL**: `http && (path == "/.env" || path == "/actuator/info")`.
- **Difficulty**: Medium. Path patterns are distinctive.

## Credential Access (TA0006)

### IMDS Metadata Theft (T1552.005)
- **What**: Query AWS/GCP instance metadata to steal IAM role credentials.
  The Capital One breach vector.
- **Wire signature**: HTTP to 169.254.169.254 with paths /latest/meta-data/iam/,
  /latest/user-data, /latest/api/token (PUT for IMDSv2).
- **KFL**: `dst.ip == "169.254.169.254"`.
- **Difficulty**: Easy. Destination IP is unique and unmistakable.

### Cloud API Abuse (T1078.004)
- **What**: Direct calls to AWS APIs (STS, S3, EC2) with stolen credentials
  from a workload pod.
- **Wire signature**: DNS for sts.amazonaws.com, s3.amazonaws.com. HTTPS
  requests with AWS Signature V4 Authorization headers.
- **KFL**: `dns && dns_questions.exists(q, q.contains("amazonaws.com"))`.
- **Difficulty**: Medium. Cloud API DNS from a non-controller pod is suspicious.

## Impact: Resource Hijacking (TA0040)

### Stratum Mining Protocol (T1496)
- **What**: XMRig/miner connecting to mining pool via Stratum JSON-RPC over TCP.
- **Wire signature**: TCP connection to port 3333/14433/45700 with JSON-RPC
  messages: mining.subscribe, mining.authorize, mining.submit.
- **KFL**: `dst.port == 3333`.
- **Difficulty**: Medium. Port 3333 is a well-known mining indicator.

### Mining Pool DNS (T1496)
- **What**: DNS resolution of known mining pool domains before connecting.
- **Wire signature**: DNS queries for domains containing minexmr, nanopool,
  mining-pool, hashvault, supportxmr.
- **KFL**: `dns && (dns_questions.exists(q, q.contains("minexmr")) || dns_questions.exists(q, q.contains("mining-pool")))`.
- **Difficulty**: Easy. Mining domain names are unmistakable.

### WebSocket Mining (T1496)
- **What**: Browser-based miner communicating via WebSocket on standard ports.
- **Wire signature**: HTTP Upgrade: websocket request to external host with
  mining-related URL path (/proxy?coin=, ?algo=randomx).
- **KFL**: `http && map_get(request.headers, "upgrade", "") == "websocket"`.
- **Difficulty**: Hard. WebSocket on port 80/443 looks normal. Only URL reveals intent.

## Protocol Abuse

### SQL Injection via PG Wire (T1190)
- **What**: SQL injection payloads sent through PostgreSQL wire protocol.
- **Wire signature**: PG protocol carrying UNION SELECT, information_schema,
  pg_shadow queries.
- **KFL**: `postgresql`.
- **Difficulty**: Medium. PG dissection reveals the SQL content directly.

### Redis Unauthorized Access (T1190)
- **What**: Unauthenticated Redis instance probed with dangerous commands.
- **Wire signature**: Redis protocol: CONFIG GET *, KEYS *, CLIENT LIST, DEBUG.
- **KFL**: `redis`.
- **Difficulty**: Easy. Redis command names are directly visible.

### Database Destruction (T1485)
- **What**: Ransomware pattern — SELECT * (data theft) then TRUNCATE/DROP (destruction).
- **Wire signature**: PG protocol showing SELECT followed by TRUNCATE on same table.
- **KFL**: `postgresql`.
- **Difficulty**: Medium. DDL commands in PG protocol are visible with dissection.

## Reconnaissance (TA0043)

### DNS Zone Enumeration (T1018)
- **What**: Brute-force DNS queries across namespaces to discover services.
  Includes SRV lookups and AXFR zone transfer attempts.
- **Wire signature**: High volume of DNS queries for *.svc.cluster.local patterns
  across many namespaces. Many NXDOMAIN responses.
- **KFL**: `dns && src.name == "service-discovery"`.
- **Difficulty**: Easy. Volume and cross-namespace pattern is obvious.

### gRPC Reflection Enumeration (T1046)
- **What**: Probing gRPC server reflection to discover API surfaces without
  needing proto files.
- **Wire signature**: HTTP/2 POST to
  `/grpc.reflection.v1alpha.ServerReflection/ServerReflectionInfo` across multiple services.
- **KFL**: `grpc && grpc_method.contains("Reflection")` or `http && path.contains("grpc.reflection")`.
- **Difficulty**: Medium. Reflection path is a known enumeration vector.