
Cloud Storage for Snapshots

Kubeshark can upload snapshots to cloud object storage and download them back, enabling cross-cluster sharing, backup/restore, and long-term retention.

Supported providers: Amazon S3 (s3), Azure Blob Storage (azblob), and Google Cloud Storage (gcs).

Helm Values

tap:
  snapshots:
    cloud:
      provider: ""      # "s3", "azblob", or "gcs" (empty = disabled)
      prefix: ""        # key prefix in the bucket/container (e.g. "snapshots/")
      configMaps: []    # names of pre-existing ConfigMaps with cloud config env vars
      secrets: []       # names of pre-existing Secrets with cloud credentials
      s3:
        bucket: ""
        region: ""
        accessKey: ""
        secretKey: ""
        roleArn: ""
        externalId: ""
      azblob:
        storageAccount: ""
        container: ""
        storageKey: ""
      gcs:
        bucket: ""
        project: ""
        credentialsJson: ""
  • provider selects which cloud backend to use. Leave empty to disable cloud storage.
  • configMaps and secrets are lists of names of existing ConfigMap/Secret resources. They are referenced via envFrom on the hub pod, which injects all of their keys as environment variables.
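For illustration, the envFrom wiring that ends up on the hub container looks roughly like the following (a sketch; the resource and container names are placeholders, and the chart's actual rendered manifest may differ):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubeshark-hub
spec:
  template:
    spec:
      containers:
        - name: hub
          envFrom:
            # every key in each listed ConfigMap/Secret becomes an env var
            - configMapRef:
                name: kubeshark-s3-config   # from tap.snapshots.cloud.configMaps
            - secretRef:
                name: kubeshark-s3-creds    # from tap.snapshots.cloud.secrets
```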

Inline Values (Alternative to External ConfigMaps/Secrets)

Instead of creating ConfigMap and Secret resources manually, you can set cloud storage configuration directly in values.yaml or via --set flags. The Helm chart will automatically create the necessary ConfigMap and Secret resources.

Both approaches can be used together — inline values are additive to external configMaps/secrets references.
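For example, non-sensitive bucket settings can be set inline while credentials come from a pre-existing Secret (the Secret name here is illustrative):

```yaml
tap:
  snapshots:
    cloud:
      provider: "s3"
      s3:
        bucket: my-kubeshark-snapshots   # inline -> chart-generated ConfigMap
        region: us-east-1
      secrets:
        - my-existing-s3-creds           # external Secret, referenced as-is
```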


Amazon S3

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| SNAPSHOT_AWS_BUCKET | Yes | S3 bucket name |
| SNAPSHOT_AWS_REGION | No | AWS region (uses SDK default if empty) |
| SNAPSHOT_AWS_ACCESS_KEY | No | Static access key ID (empty = use default credential chain) |
| SNAPSHOT_AWS_SECRET_KEY | No | Static secret access key |
| SNAPSHOT_AWS_ROLE_ARN | No | IAM role ARN to assume via STS (for cross-account access) |
| SNAPSHOT_AWS_EXTERNAL_ID | No | External ID for the STS AssumeRole call |
| SNAPSHOT_CLOUD_PREFIX | No | Key prefix in the bucket (e.g. snapshots/) |

Authentication Methods

Credentials are resolved in this order:

  1. Static credentials -- If SNAPSHOT_AWS_ACCESS_KEY is set, static credentials are used directly.
  2. STS AssumeRole -- If SNAPSHOT_AWS_ROLE_ARN is also set, the static (or default) credentials are used to assume the given IAM role. This is useful for cross-account S3 access.
  3. AWS default credential chain -- When no static credentials are provided, the SDK default chain is used:
    • IRSA (EKS service account token) -- recommended for production on EKS
    • EC2 instance profile
    • Standard AWS environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, etc.)
    • Shared credentials file (~/.aws/credentials)

The provider validates bucket access on startup via HeadBucket. If the bucket is inaccessible, the hub will fail to start.

Example: Inline Values (simplest approach)

tap:
  snapshots:
    cloud:
      provider: "s3"
      s3:
        bucket: my-kubeshark-snapshots
        region: us-east-1

Or with static credentials via --set:

helm install kubeshark kubeshark/kubeshark \
  --set tap.snapshots.cloud.provider=s3 \
  --set tap.snapshots.cloud.s3.bucket=my-kubeshark-snapshots \
  --set tap.snapshots.cloud.s3.region=us-east-1 \
  --set tap.snapshots.cloud.s3.accessKey=AKIA... \
  --set tap.snapshots.cloud.s3.secretKey=wJal...

Example: IRSA (Recommended on EKS)

IAM Roles for Service Accounts (IRSA) lets EKS pods assume an IAM role without static credentials. EKS injects a short-lived token into the pod automatically.

Prerequisites:

  1. Your EKS cluster must have an OIDC provider associated with it.
  2. An IAM role with a trust policy that allows the Kubeshark service account to assume it.

Step 1 — Create an IAM policy scoped to your bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:GetObjectVersion",
        "s3:DeleteObjectVersion",
        "s3:ListBucket",
        "s3:ListBucketVersions",
        "s3:GetBucketLocation",
        "s3:GetBucketVersioning"
      ],
      "Resource": [
        "arn:aws:s3:::my-kubeshark-snapshots",
        "arn:aws:s3:::my-kubeshark-snapshots/*"
      ]
    }
  ]
}

For read-only access, remove s3:PutObject, s3:DeleteObject, and s3:DeleteObjectVersion.
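Applying that change, the read-only variant of the policy looks like this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:ListBucket",
        "s3:ListBucketVersions",
        "s3:GetBucketLocation",
        "s3:GetBucketVersioning"
      ],
      "Resource": [
        "arn:aws:s3:::my-kubeshark-snapshots",
        "arn:aws:s3:::my-kubeshark-snapshots/*"
      ]
    }
  ]
}
```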

Step 2 — Create an IAM role with IRSA trust policy:

# Get your cluster's OIDC provider URL
OIDC_PROVIDER=$(aws eks describe-cluster --name CLUSTER_NAME \
  --query "cluster.identity.oidc.issuer" --output text | sed 's|https://||')

# Create a trust policy
# The default K8s SA name is "<release-name>-service-account" (e.g. "kubeshark-service-account")
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:NAMESPACE:kubeshark-service-account",
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF

# Create the role and attach your policy
aws iam create-role \
  --role-name KubesharkS3Role \
  --assume-role-policy-document file://trust-policy.json

aws iam put-role-policy \
  --role-name KubesharkS3Role \
  --policy-name KubesharkSnapshotsBucketAccess \
  --policy-document file://bucket-policy.json

Step 3 — Create a ConfigMap with bucket configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-s3-config
data:
  SNAPSHOT_AWS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_AWS_REGION: us-east-1

Step 4 — Set Helm values with tap.annotations to annotate the service account:

tap:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT_ID:role/KubesharkS3Role
  snapshots:
    cloud:
      provider: "s3"
      configMaps:
        - kubeshark-s3-config

Or via --set:

helm install kubeshark kubeshark/kubeshark \
  --set tap.snapshots.cloud.provider=s3 \
  --set tap.snapshots.cloud.s3.bucket=my-kubeshark-snapshots \
  --set tap.snapshots.cloud.s3.region=us-east-1 \
  --set tap.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::ACCOUNT_ID:role/KubesharkS3Role

No accessKey/secretKey is needed — EKS injects credentials automatically via the IRSA token.

Example: Static Credentials

Create a Secret with credentials:

apiVersion: v1
kind: Secret
metadata:
  name: kubeshark-s3-creds
type: Opaque
stringData:
  SNAPSHOT_AWS_ACCESS_KEY: AKIA...
  SNAPSHOT_AWS_SECRET_KEY: wJal...

Create a ConfigMap with bucket configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-s3-config
data:
  SNAPSHOT_AWS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_AWS_REGION: us-east-1

Set Helm values:

tap:
  snapshots:
    cloud:
      provider: "s3"
      configMaps:
        - kubeshark-s3-config
      secrets:
        - kubeshark-s3-creds

Example: Cross-Account Access via AssumeRole

Add the role ARN to your ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-s3-config
data:
  SNAPSHOT_AWS_BUCKET: other-account-bucket
  SNAPSHOT_AWS_REGION: eu-west-1
  SNAPSHOT_AWS_ROLE_ARN: arn:aws:iam::123456789012:role/KubesharkCrossAccountRole
  SNAPSHOT_AWS_EXTERNAL_ID: my-external-id   # optional, if required by the trust policy

The hub will first authenticate using its own credentials (IRSA, static, or default chain), then assume the specified role to access the bucket.
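For reference, the trust policy on the target-account role typically allows the hub's own identity to assume it, optionally gated on the external ID. A sketch (the account ID and role names below are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111111111111:role/KubesharkS3Role"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "my-external-id" }
      }
    }
  ]
}
```

If the trust policy does not require an external ID, omit both the Condition block and SNAPSHOT_AWS_EXTERNAL_ID.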


Azure Blob Storage

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| SNAPSHOT_AZBLOB_STORAGE_ACCOUNT | Yes | Azure storage account name |
| SNAPSHOT_AZBLOB_CONTAINER | Yes | Blob container name |
| SNAPSHOT_AZBLOB_STORAGE_KEY | No | Storage account access key (empty = use DefaultAzureCredential) |
| SNAPSHOT_CLOUD_PREFIX | No | Key prefix in the container (e.g. snapshots/) |

Authentication Methods

Credentials are resolved in this order:

  1. Shared Key -- If SNAPSHOT_AZBLOB_STORAGE_KEY is set, the storage account key is used directly.
  2. DefaultAzureCredential -- When no storage key is provided, the Azure SDK default credential chain is used:
    • Workload Identity (AKS pod identity) -- recommended for production on AKS
    • Managed Identity (system or user-assigned)
    • Azure CLI credentials
    • Environment variables (AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET)

The provider validates container access on startup via GetProperties. If the container is inaccessible, the hub will fail to start.
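When authenticating with a service principal through DefaultAzureCredential, the standard Azure environment variables can be supplied via an external Secret, since every key is injected as an env var. A sketch (the Secret name and IDs are illustrative):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: kubeshark-azure-sp-creds
type: Opaque
stringData:
  AZURE_TENANT_ID: 00000000-0000-0000-0000-000000000000
  AZURE_CLIENT_ID: 11111111-1111-1111-1111-111111111111
  AZURE_CLIENT_SECRET: "..."
```

Reference it through tap.snapshots.cloud.secrets alongside the azblob ConfigMap, leaving SNAPSHOT_AZBLOB_STORAGE_KEY unset so the default credential chain picks it up.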

Example: Inline Values

tap:
  snapshots:
    cloud:
      provider: "azblob"
      azblob:
        storageAccount: mykubesharksa
        container: snapshots
        storageKey: "base64-encoded-storage-key..."  # optional, omit for DefaultAzureCredential

Example: DefaultAzureCredential (AKS Workload Identity)

Create a ConfigMap with storage configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-azblob-config
data:
  SNAPSHOT_AZBLOB_STORAGE_ACCOUNT: mykubesharksa
  SNAPSHOT_AZBLOB_CONTAINER: snapshots

Set Helm values:

tap:
  snapshots:
    cloud:
      provider: "azblob"
      configMaps:
        - kubeshark-azblob-config

The hub pod's service account must be configured for AKS Workload Identity with a managed identity that has the Storage Blob Data Contributor role on the container.
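Assuming tap.annotations is applied to the service account the same way as in the EKS and GKE examples, the Workload Identity client-ID annotation would be set like this (the client ID is a placeholder; note that AKS Workload Identity also requires the azure.workload.identity/use pod label, which may need to be configured separately):

```yaml
tap:
  annotations:
    azure.workload.identity/client-id: 22222222-2222-2222-2222-222222222222
  snapshots:
    cloud:
      provider: "azblob"
      configMaps:
        - kubeshark-azblob-config
```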

Example: Storage Account Key

Create a Secret with the storage key:

apiVersion: v1
kind: Secret
metadata:
  name: kubeshark-azblob-creds
type: Opaque
stringData:
  SNAPSHOT_AZBLOB_STORAGE_KEY: "base64-encoded-storage-key..."

Create a ConfigMap with storage configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-azblob-config
data:
  SNAPSHOT_AZBLOB_STORAGE_ACCOUNT: mykubesharksa
  SNAPSHOT_AZBLOB_CONTAINER: snapshots

Set Helm values:

tap:
  snapshots:
    cloud:
      provider: "azblob"
      configMaps:
        - kubeshark-azblob-config
      secrets:
        - kubeshark-azblob-creds

Google Cloud Storage

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| SNAPSHOT_GCS_BUCKET | Yes | GCS bucket name |
| SNAPSHOT_GCS_PROJECT | No | GCP project ID |
| SNAPSHOT_GCS_CREDENTIALS_JSON | No | Service account JSON key (empty = use Application Default Credentials) |
| SNAPSHOT_CLOUD_PREFIX | No | Key prefix in the bucket (e.g. snapshots/) |

Authentication Methods

Credentials are resolved in this order:

  1. Service Account JSON Key -- If SNAPSHOT_GCS_CREDENTIALS_JSON is set, the provided JSON key is used directly.
  2. Application Default Credentials -- When no JSON key is provided, the GCP SDK default credential chain is used:
    • Workload Identity (GKE pod identity) -- recommended for production on GKE
    • GCE instance metadata (Compute Engine default service account)
    • Standard GCP environment variables (GOOGLE_APPLICATION_CREDENTIALS)
    • gcloud CLI credentials

The provider validates bucket access on startup via Bucket.Attrs(). If the bucket is inaccessible, the hub will fail to start.

Required IAM Permissions

The service account needs different IAM roles depending on the access level:

Read-only (download, list, and sync snapshots from cloud):

| Role | Permissions provided | Purpose |
| --- | --- | --- |
| roles/storage.legacyBucketReader | storage.buckets.get, storage.objects.list | Hub startup (bucket validation) + listing snapshots |
| roles/storage.objectViewer | storage.objects.get, storage.objects.list | Downloading snapshots, checking existence, reading metadata |

gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.legacyBucketReader"
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.objectViewer"

Read-write (upload and delete snapshots in addition to read):

Grant roles/storage.objectAdmin instead of roles/storage.objectViewer to additionally allow storage.objects.create and storage.objects.delete:

| Role | Permissions provided | Purpose |
| --- | --- | --- |
| roles/storage.legacyBucketReader | storage.buckets.get, storage.objects.list | Hub startup (bucket validation) + listing snapshots |
| roles/storage.objectAdmin | storage.objects.* | Full object CRUD (upload, download, delete, list, metadata) |

gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.legacyBucketReader"
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/storage.objectAdmin"

Example: Inline Values (simplest approach)

tap:
  snapshots:
    cloud:
      provider: "gcs"
      gcs:
        bucket: my-kubeshark-snapshots
        project: my-gcp-project

Or with a service account key via --set:

helm install kubeshark kubeshark/kubeshark \
  --set tap.snapshots.cloud.provider=gcs \
  --set tap.snapshots.cloud.gcs.bucket=my-kubeshark-snapshots \
  --set tap.snapshots.cloud.gcs.project=my-gcp-project \
  --set-file tap.snapshots.cloud.gcs.credentialsJson=service-account.json

Example: GKE Workload Identity

Create a ConfigMap with bucket configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-gcs-config
data:
  SNAPSHOT_GCS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_GCS_PROJECT: my-gcp-project

Set Helm values:

tap:
  snapshots:
    cloud:
      provider: "gcs"
      configMaps:
        - kubeshark-gcs-config

Configure GKE Workload Identity to allow the Kubernetes service account to impersonate the GCP service account:

# Ensure the GKE cluster has Workload Identity enabled
# (--workload-pool=PROJECT_ID.svc.id.goog at cluster creation)

# Create a GCP service account (if not already created)
gcloud iam service-accounts create kubeshark-gcs \
  --display-name="Kubeshark GCS Snapshots"

# Grant bucket access (read-write — see Required IAM Permissions above)
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.legacyBucketReader"
gcloud storage buckets add-iam-policy-binding gs://BUCKET_NAME \
  --member="serviceAccount:kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# Allow the K8s service account to impersonate the GCP service account
# Note: the K8s SA name is "<release-name>-service-account" (default: "kubeshark-service-account")
gcloud iam service-accounts add-iam-policy-binding \
  kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/kubeshark-service-account]"

Set Helm values — the tap.annotations field adds the Workload Identity annotation to the service account:

tap:
  annotations:
    iam.gke.io/gcp-service-account: kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com
  snapshots:
    cloud:
      provider: "gcs"
      configMaps:
        - kubeshark-gcs-config

Or via --set:

helm install kubeshark kubeshark/kubeshark \
  --set tap.snapshots.cloud.provider=gcs \
  --set tap.snapshots.cloud.gcs.bucket=BUCKET_NAME \
  --set tap.snapshots.cloud.gcs.project=PROJECT_ID \
  --set tap.annotations."iam\.gke\.io/gcp-service-account"=kubeshark-gcs@PROJECT_ID.iam.gserviceaccount.com

No credentialsJson secret is needed — GKE injects credentials automatically via the Workload Identity metadata server.

Example: Service Account Key

Create a Secret with the service account JSON key:

apiVersion: v1
kind: Secret
metadata:
  name: kubeshark-gcs-creds
type: Opaque
stringData:
  SNAPSHOT_GCS_CREDENTIALS_JSON: |
    {
      "type": "service_account",
      "project_id": "my-gcp-project",
      "private_key_id": "...",
      "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
      "client_email": "kubeshark@my-gcp-project.iam.gserviceaccount.com",
      ...
    }

Create a ConfigMap with bucket configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeshark-gcs-config
data:
  SNAPSHOT_GCS_BUCKET: my-kubeshark-snapshots
  SNAPSHOT_GCS_PROJECT: my-gcp-project

Set Helm values:

tap:
  snapshots:
    cloud:
      provider: "gcs"
      configMaps:
        - kubeshark-gcs-config
      secrets:
        - kubeshark-gcs-creds