Compare commits


9 Commits

Author SHA1 Message Date
Volodymyr Stoiko
4982bf9e01 Merge branch 'master' into dissection-storage 2026-04-08 20:21:59 +03:00
Alon Girmonsky
fa03da2fd4 Enable MongoDB protocol dissector (#1903)
Add mongodb to the enabled dissectors list and port mapping (27017)
in both Go config defaults and Helm chart values.

Co-authored-by: Alon Girmonsky <alongir@Alons-Mac-Studio.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 08:05:13 -07:00
Volodymyr Stoiko
a005ef8f58 Use snapshot storage config as default for dissection storage config 2026-04-08 14:05:24 +00:00
Volodymyr Stoiko
5c02f79f07 Allow pvc management 2026-04-08 08:12:29 +00:00
Volodymyr Stoiko
dc5b4487df add dissection storage test 2026-04-08 07:58:55 +00:00
Volodymyr Stoiko
a4df20d651 Pass dissection storage configuration 2026-04-08 07:56:50 +00:00
stringsbuilder
4de0ac6abd refactor: replace Split in loops with more efficient SplitSeq and gofmt the code (#1888)
Signed-off-by: stringsbuilder <stringsbuilder@outlook.com>
Co-authored-by: Alon Girmonsky <1990761+alongir@users.noreply.github.com>
2026-04-06 21:07:50 -07:00
Alon Girmonsky
9b5ac2821f Network RCA skill: update resolution tools to list_workloads/list_ips (#1887)
Replace deprecated resolve_workload/resolve_ip references with the new
list_workloads and list_ips tools that support both singular lookup
(name+namespace or IP) and filtered scan (namespace/regex/label filters
against snapshots).

Ref: kubeshark/hub#687

Co-authored-by: Alon Girmonsky <alongir@Alons-Mac-Studio.local>
2026-04-06 12:40:34 -07:00
Alon Girmonsky
1ba6ed94e0 💄 Improve README with AI skills, KFL semantics, and cloud storage (#1892)
* 💄 Improve README with AI skills, KFL semantics image, and cloud storage

- Add AI Skills section with Network RCA and KFL skills, Claude Code plugin install
- Rename "Network Traffic Indexing" to "Query with API, Kubernetes, and Network Semantics" with new KFL semantics image showing how a single query combines all three layers
- Add cloud storage providers (S3, Azure Blob, GCS) and decrypted TLS to Traffic Retention section
- Update Features table: add AI Skills, KFL query language, cloud storage, delayed indexing

* 🔒 Add encrypted traffic visibility to README "What you can do" section

* 🎨 Update snapshots image in README

---------

Co-authored-by: Alon Girmonsky <alongir@Alons-Mac-Studio.local>
2026-04-02 18:38:13 -07:00
11 changed files with 249 additions and 40 deletions

View File

@@ -23,6 +23,7 @@ Kubeshark indexes cluster-wide network traffic at the kernel level using eBPF
- **Download Retrospective PCAPs** — cluster-wide packet captures filtered by nodes, time, workloads, and IPs. Store PCAPs for long-term retention and later investigation.
- **Visualize Network Data** — explore traffic matching queries with API, Kubernetes, or network semantics through a real-time dashboard.
- **See Encrypted Traffic in Plain Text** — automatically decrypt TLS/mTLS traffic using eBPF, with no key management or sidecars required.
- **Integrate with AI** — connect your favorite AI assistant (e.g. Claude, Copilot) to include network data in AI-driven workflows like incident response and root cause analysis.
![Kubeshark](https://github.com/kubeshark/assets/raw/master/png/stream.png)
@@ -67,15 +68,35 @@ Works with Claude Code, Cursor, and any MCP-compatible AI.
[MCP setup guide →](https://docs.kubeshark.com/en/mcp)
### AI Skills
Open-source, reusable skills that teach AI agents domain-specific workflows on top of Kubeshark's MCP tools:
| Skill | Description |
|-------|-------------|
| **[Network RCA](skills/network-rca/)** | Retrospective root cause analysis — snapshots, dissection, PCAP extraction, trend comparison |
| **[KFL](skills/kfl/)** | KFL (Kubeshark Filter Language) expert — writes, debugs, and optimizes traffic filters |
Install as a Claude Code plugin:
```
/plugin marketplace add kubeshark/kubeshark
/plugin install kubeshark
```
Or clone and use directly — skills trigger automatically based on conversation context.
[AI Skills docs →](https://docs.kubeshark.com/en/mcp/skills)
---
### Network Traffic Indexing
### Query with API, Kubernetes, and Network Semantics
Kubeshark indexes cluster-wide network traffic by parsing it according to protocol specifications, with support for HTTP, gRPC, Redis, Kafka, DNS, and more. This enables queries using Kubernetes semantics (e.g. pod, namespace, node), API semantics (e.g. path, headers, status), and network semantics (e.g. IP, port). No code instrumentation required.
Kubeshark indexes cluster-wide network traffic by parsing it according to protocol specifications, with support for HTTP, gRPC, Redis, Kafka, DNS, and more. A single [KFL query](https://docs.kubeshark.com/en/v2/kfl2) can combine all three semantic layers — Kubernetes identity, API context, and network attributes — to pinpoint exactly the traffic you need. No code instrumentation required.
![API context](https://github.com/kubeshark/assets/raw/master/png/api_context.png)
![KFL query combining API, Kubernetes, and network semantics](https://github.com/kubeshark/assets/raw/master/png/kfl-semantics.png)
[Learn more](https://docs.kubeshark.com/en/v2/l7_api_dissection)
[KFL reference →](https://docs.kubeshark.com/en/v2/kfl2) · [Traffic indexing](https://docs.kubeshark.com/en/v2/l7_api_dissection)
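For illustration only — the field names below are assumptions echoing the README's wording (pod, namespace, path, status, port), not confirmed KFL syntax; consult the KFL reference for the actual grammar — a single query touching all three layers might look like:

```
http && src.namespace == "prod" && request.path == "/checkout" && dst.port == 8080
```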
### Workload Dependency Map
@@ -87,11 +108,11 @@ A visual map of how workloads communicate, showing dependencies, traffic volume,
### Traffic Retention & PCAP Export
Capture and retain raw network traffic cluster-wide. Download PCAPs scoped by time range, nodes, workloads, and IPs — ready for Wireshark or any PCAP-compatible tool.
Capture and retain raw network traffic cluster-wide, including decrypted TLS. Download PCAPs scoped by time range, nodes, workloads, and IPs — ready for Wireshark or any PCAP-compatible tool. Store snapshots in cloud storage (S3, Azure Blob, GCS) for long-term retention and cross-cluster sharing.
![Traffic Retention](https://github.com/kubeshark/assets/raw/master/png/snapshots.png)
![Traffic Retention](https://github.com/kubeshark/assets/raw/master/png/snapshots-list.png)
[Snapshots guide →](https://docs.kubeshark.com/en/v2/traffic_snapshots)
[Snapshots guide →](https://docs.kubeshark.com/en/v2/traffic_snapshots) · [Cloud storage →](https://docs.kubeshark.com/en/snapshots_cloud_storage)
---
@@ -99,12 +120,12 @@ Capture and retain raw network traffic cluster-wide. Download PCAPs scoped by ti
| Feature | Description |
|---------|-------------|
| [**Traffic Snapshots**](https://docs.kubeshark.com/en/v2/traffic_snapshots) | Point-in-time snapshots, export as PCAP for Wireshark |
| [**L7 API Dissection**](https://docs.kubeshark.com/en/v2/l7_api_dissection) | Request/response matching with full payloads and protocol parsing |
| [**Traffic Snapshots**](https://docs.kubeshark.com/en/v2/traffic_snapshots) | Point-in-time snapshots with cloud storage (S3, Azure Blob, GCS), PCAP export for Wireshark |
| [**Traffic Indexing**](https://docs.kubeshark.com/en/v2/l7_api_dissection) | Real-time and delayed L7 indexing with request/response matching and full payloads |
| [**Protocol Support**](https://docs.kubeshark.com/en/protocols) | HTTP, gRPC, GraphQL, Redis, Kafka, DNS, and more |
| [**TLS Decryption**](https://docs.kubeshark.com/en/encrypted_traffic) | eBPF-based decryption without key management |
| [**AI-Powered Analysis**](https://docs.kubeshark.com/en/v2/ai_powered_analysis) | Query cluster-wide network data with Claude, Cursor, or any MCP-compatible AI |
| [**Display Filters**](https://docs.kubeshark.com/en/v2/kfl2) | Wireshark-inspired display filters for precise traffic analysis |
| [**TLS Decryption**](https://docs.kubeshark.com/en/encrypted_traffic) | eBPF-based decryption without key management, included in snapshots |
| [**AI Integration**](https://docs.kubeshark.com/en/mcp) | MCP server + open-source AI skills for network RCA and traffic filtering |
| [**KFL Query Language**](https://docs.kubeshark.com/en/v2/kfl2) | CEL-based query language with Kubernetes, API, and network semantics |
| [**100% On-Premises**](https://docs.kubeshark.com/en/air_gapped) | Air-gapped support, no external dependencies |
---

View File

@@ -86,9 +86,9 @@ type mcpContent struct {
}
type mcpPrompt struct {
Name string `json:"name"`
Description string `json:"description,omitempty"`
Arguments []mcpPromptArg `json:"arguments,omitempty"`
Name string `json:"name"`
Description string `json:"description,omitempty"`
Arguments []mcpPromptArg `json:"arguments,omitempty"`
}
type mcpPromptArg struct {
@@ -117,11 +117,11 @@ type mcpGetPromptResult struct {
// Hub MCP API response types
type hubMCPResponse struct {
Name string `json:"name"`
Description string `json:"description"`
Version string `json:"version"`
Tools []hubMCPTool `json:"tools"`
Prompts []hubMCPPrompt `json:"prompts"`
Name string `json:"name"`
Description string `json:"description"`
Version string `json:"version"`
Tools []hubMCPTool `json:"tools"`
Prompts []hubMCPPrompt `json:"prompts"`
}
type hubMCPTool struct {
@@ -131,9 +131,9 @@ type hubMCPTool struct {
}
type hubMCPPrompt struct {
Name string `json:"name"`
Description string `json:"description,omitempty"`
Arguments []hubMCPPromptArg `json:"arguments,omitempty"`
Name string `json:"name"`
Description string `json:"description,omitempty"`
Arguments []hubMCPPromptArg `json:"arguments,omitempty"`
}
type hubMCPPromptArg struct {
@@ -151,10 +151,10 @@ type mcpServer struct {
stdout io.Writer
backendInitialized bool
backendMu sync.Mutex
setFlags []string // --set flags to pass to 'kubeshark tap' when starting
directURL string // If set, connect directly to this URL (no kubectl/proxy)
urlMode bool // True when using direct URL mode
allowDestructive bool // If true, enable start/stop tools
setFlags []string // --set flags to pass to 'kubeshark tap' when starting
directURL string // If set, connect directly to this URL (no kubectl/proxy)
urlMode bool // True when using direct URL mode
allowDestructive bool // If true, enable start/stop tools
cachedHubMCP *hubMCPResponse // Cached tools/prompts from Hub
cachedAt time.Time // When the cache was populated
hubMCPMu sync.Mutex
@@ -772,7 +772,6 @@ func (s *mcpServer) callHubTool(toolName string, args map[string]any) (string, b
return prettyJSON.String(), false
}
func (s *mcpServer) callGetFileURL(args map[string]any) (string, bool) {
filePath, _ := args["path"].(string)
if filePath == "" {
@@ -869,8 +868,8 @@ func (s *mcpServer) callStartKubeshark(args map[string]any) (string, bool) {
// Add namespaces if provided
if v, ok := args["namespaces"].(string); ok && v != "" {
namespaces := strings.Split(v, ",")
for _, ns := range namespaces {
namespaces := strings.SplitSeq(v, ",")
for ns := range namespaces {
ns = strings.TrimSpace(ns)
if ns != "" {
cmdArgs = append(cmdArgs, "-n", ns)

View File

@@ -417,7 +417,7 @@ func TestMCP_CommandArgs(t *testing.T) {
cmdArgs = append(cmdArgs, v)
}
if v, _ := tc.args["namespaces"].(string); v != "" {
for _, ns := range strings.Split(v, ",") {
for ns := range strings.SplitSeq(v, ",") {
cmdArgs = append(cmdArgs, "-n", strings.TrimSpace(ns))
}
}

View File

@@ -128,6 +128,7 @@ func CreateDefaultConfig() ConfigStruct {
"http",
"icmp",
"kafka",
"mongodb",
"redis",
// "sctp",
// "syscall",
@@ -147,6 +148,7 @@ func CreateDefaultConfig() ConfigStruct {
HTTP: []uint16{80, 443, 8080},
AMQP: []uint16{5671, 5672},
KAFKA: []uint16{9092},
MONGODB: []uint16{27017},
REDIS: []uint16{6379},
LDAP: []uint16{389},
DIAMETER: []uint16{3868},

View File

@@ -282,6 +282,7 @@ type PortMapping struct {
HTTP []uint16 `yaml:"http" json:"http"`
AMQP []uint16 `yaml:"amqp" json:"amqp"`
KAFKA []uint16 `yaml:"kafka" json:"kafka"`
MONGODB []uint16 `yaml:"mongodb" json:"mongodb"`
REDIS []uint16 `yaml:"redis" json:"redis"`
LDAP []uint16 `yaml:"ldap" json:"ldap"`
DIAMETER []uint16 `yaml:"diameter" json:"diameter"`
@@ -353,8 +354,10 @@ type SnapshotsConfig struct {
}
type DelayedDissectionConfig struct {
CPU string `yaml:"cpu" json:"cpu" default:"1"`
Memory string `yaml:"memory" json:"memory" default:"4Gi"`
CPU string `yaml:"cpu" json:"cpu" default:"1"`
Memory string `yaml:"memory" json:"memory" default:"4Gi"`
StorageSize string `yaml:"storageSize" json:"storageSize" default:""`
StorageClass string `yaml:"storageClass" json:"storageClass" default:""`
}
type DissectionConfig struct {

View File

@@ -164,6 +164,8 @@ Example for overriding image names:
| `tap.snapshots.cloud.gcs.credentialsJson` | Service account JSON key. When set, the chart auto-creates a Secret with `SNAPSHOT_GCS_CREDENTIALS_JSON`. | `""` |
| `tap.delayedDissection.cpu` | CPU allocation for delayed dissection jobs | `1` |
| `tap.delayedDissection.memory` | Memory allocation for delayed dissection jobs | `4Gi` |
| `tap.delayedDissection.storageSize` | Storage size for dissection job PVC. When empty, falls back to `tap.snapshots.local.storageSize`. When the resolved value is non-empty, a PVC is created; otherwise an `emptyDir` is used. | `""` |
| `tap.delayedDissection.storageClass` | Storage class for dissection job PVC. When empty, falls back to `tap.snapshots.local.storageClass`. | `""` |
| `tap.release.repo` | URL of the Helm chart repository | `https://helm.kubeshark.com` |
| `tap.release.name` | Helm release name | `kubeshark` |
| `tap.release.namespace` | Helm release namespace | `default` |
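The fallback rules in the two `tap.delayedDissection` rows above can be illustrated with a hypothetical values override (the sizes and storage classes are arbitrary examples, not defaults):

```yaml
tap:
  snapshots:
    local:
      storageSize: 50Gi    # dissection jobs use this only as a fallback
      storageClass: gp3
  delayedDissection:
    storageSize: 100Gi     # non-empty: a 100Gi PVC is created for dissection jobs
    storageClass: ""       # empty: falls back to tap.snapshots.local.storageClass (gp3)
```

With both `tap.delayedDissection.storageSize` and `tap.snapshots.local.storageSize` empty, no PVC is created and the job uses an `emptyDir`.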

View File

@@ -86,6 +86,15 @@ rules:
verbs:
- create
- get
- apiGroups:
- ""
resources:
- persistentvolumeclaims
verbs:
- create
- get
- list
- delete
- apiGroups:
- batch
resources:

View File

@@ -56,6 +56,16 @@ spec:
- -dissector-memory
- '{{ .Values.tap.delayedDissection.memory }}'
{{- end }}
{{- $dissectorStorageSize := .Values.tap.delayedDissection.storageSize | default .Values.tap.snapshots.local.storageSize }}
{{- if $dissectorStorageSize }}
- -dissector-storage-size
- '{{ $dissectorStorageSize }}'
{{- end }}
{{- $dissectorStorageClass := .Values.tap.delayedDissection.storageClass | default .Values.tap.snapshots.local.storageClass }}
{{- if $dissectorStorageClass }}
- -dissector-storage-class
- '{{ $dissectorStorageClass }}'
{{- end }}
{{- if .Values.tap.gitops.enabled }}
- -gitops
{{- end }}

View File

@@ -0,0 +1,127 @@
suite: dissection storage configuration
templates:
- templates/04-hub-deployment.yaml
tests:
- it: should fallback to snapshot storageSize when dissection storageSize is empty
asserts:
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-size
- contains:
path: spec.template.spec.containers[0].command
content: "20Gi"
- it: should fallback to snapshot storageClass when dissection storageClass is empty
set:
tap.snapshots.local.storageClass: gp2
asserts:
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-class
- contains:
path: spec.template.spec.containers[0].command
content: gp2
- it: should not render dissector-storage-class when both dissection and snapshot storageClass are empty
asserts:
- notContains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-class
- it: should prefer dissection storageSize over snapshot storageSize
set:
tap.delayedDissection.storageSize: 100Gi
tap.snapshots.local.storageSize: 50Gi
asserts:
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-size
- contains:
path: spec.template.spec.containers[0].command
content: "100Gi"
- it: should prefer dissection storageClass over snapshot storageClass
set:
tap.delayedDissection.storageClass: io2
tap.snapshots.local.storageClass: gp2
asserts:
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-class
- contains:
path: spec.template.spec.containers[0].command
content: io2
- it: should fallback to snapshot config for both storageSize and storageClass
set:
tap.snapshots.local.storageSize: 30Gi
tap.snapshots.local.storageClass: gp3
asserts:
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-size
- contains:
path: spec.template.spec.containers[0].command
content: "30Gi"
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-class
- contains:
path: spec.template.spec.containers[0].command
content: gp3
- it: should not render dissector-storage-size when both dissection and snapshot storageSize are empty
set:
tap.delayedDissection.storageSize: ""
tap.snapshots.local.storageSize: ""
asserts:
- notContains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-size
- it: should render all dissector args together with custom values
set:
tap.delayedDissection.cpu: "4"
tap.delayedDissection.memory: 8Gi
tap.delayedDissection.storageSize: 200Gi
tap.delayedDissection.storageClass: local-path
asserts:
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-cpu
- contains:
path: spec.template.spec.containers[0].command
content: "4"
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-memory
- contains:
path: spec.template.spec.containers[0].command
content: 8Gi
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-size
- contains:
path: spec.template.spec.containers[0].command
content: "200Gi"
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-storage-class
- contains:
path: spec.template.spec.containers[0].command
content: local-path
- it: should still render existing dissector-cpu and dissector-memory args
asserts:
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-cpu
- contains:
path: spec.template.spec.containers[0].command
content: "1"
- contains:
path: spec.template.spec.containers[0].command
content: -dissector-memory
- contains:
path: spec.template.spec.containers[0].command
content: 4Gi

View File

@@ -37,6 +37,8 @@ tap:
delayedDissection:
cpu: "1"
memory: 4Gi
storageSize: ""
storageClass: ""
snapshots:
local:
storageClass: ""
@@ -205,6 +207,7 @@ tap:
- http
- icmp
- kafka
- mongodb
- redis
- ws
- ldap
@@ -224,6 +227,8 @@ tap:
- 5672
kafka:
- 9092
mongodb:
- 27017
redis:
- 6379
ldap:

View File

@@ -221,18 +221,48 @@ When you know the workload names but not their IPs, resolve them from the
snapshot's metadata. Snapshots preserve pod-to-IP mappings from capture time,
so resolution is accurate even if pods have been rescheduled since.
**Tool**: `resolve_workload`
**Tool**: `list_workloads`
**Example workflow** — extract PCAP for specific workloads:
Use `list_workloads` with `name` + `namespace` for a singular lookup (works
live and against snapshots), or with `snapshot_id` + filters for a broader
scan.
1. Resolve IPs: `resolve_workload` for `orders-594487879c-7ddxf` → `10.0.53.101`
2. Resolve IPs: `resolve_workload` for `payment-service-6b8f9d-x2k4p` → `10.0.53.205`
**Example workflow — singular lookup** — extract PCAP for specific workloads:
1. Resolve IPs: `list_workloads` with `name: "orders-594487879c-7ddxf"`, `namespace: "prod"` → IPs: `["10.0.53.101"]`
2. Resolve IPs: `list_workloads` with `name: "payment-service-6b8f9d-x2k4p"`, `namespace: "prod"` → IPs: `["10.0.53.205"]`
3. Build BPF: `host 10.0.53.101 or host 10.0.53.205`
4. Export: `export_snapshot_pcap` with that BPF filter
**Example workflow — filtered scan** — extract PCAP for all workloads
matching a pattern in a snapshot:
1. List workloads: `list_workloads` with `snapshot_id`, `namespaces: ["prod"]`,
`name_regex: "payment.*"` → returns all matching workloads with their IPs
2. Collect all IPs from the response
3. Build BPF: `host 10.0.53.205 or host 10.0.53.210 or ...`
4. Export: `export_snapshot_pcap` with that BPF filter
This gives you a cluster-wide PCAP filtered to exactly the workloads involved
in the incident — ready for Wireshark or long-term storage.
### IP-to-Workload Resolution
When you have an IP address (e.g., from a PCAP or L4 flow) and need to
identify the workload behind it:
**Tool**: `list_ips`
Use `list_ips` with `ip` for a singular lookup (works live and against
snapshots), or with `snapshot_id` + filters for a broader scan.
**Example — singular lookup**: `list_ips` with `ip: "10.0.53.101"`,
`snapshot_id: "snap-abc"` → returns pod/service identity for that IP.
**Example — filtered scan**: `list_ips` with `snapshot_id: "snap-abc"`,
`namespaces: ["prod"]`, `labels: {"app": "payment"}` → returns all IPs
associated with workloads matching those filters.
---
## Route 2: Dissection
@@ -380,8 +410,9 @@ conn && conn_state == "open" && conn_local_bytes > 1000000 // High-volume conne
The two routes are complementary. A common pattern:
1. Start with **Dissection** — let the AI agent search and identify the root cause
2. Once you've pinpointed the problematic workloads, use `resolve_workload`
to get their IPs
2. Once you've pinpointed the problematic workloads, use `list_workloads`
to get their IPs (singular lookup by name+namespace, or filtered scan
by namespace/regex/labels against the snapshot)
3. Switch to **PCAP** — export a filtered PCAP of just those workloads for
Wireshark deep-dive, sharing with the network team, or compliance archival
@@ -394,7 +425,7 @@ The two routes are complementary. A common pattern:
3. `create_snapshot` covering the incident window (add 15 minutes buffer)
4. **Dissection route**: `start_snapshot_dissection` → `get_api_stats` →
`list_api_calls` → `get_api_call` → follow the dependency chain
5. **PCAP route**: `resolve_workload` → `export_snapshot_pcap` with BPF →
5. **PCAP route**: `list_workloads` → `export_snapshot_pcap` with BPF →
hand off to Wireshark or archive
### Other Use Cases