github/kubevela

mirror of https://github.com/kubevela/kubevela.git synced 2026-05-18 07:17:05 +00:00

Brian Kane 32ac0d69c7 Feature: Configurable Application Policies

2026-02-09 10:19:05 +00:00

Proposal: Store Spec Diffs for Policy Transforms

Problem

Currently when a policy modifies the Application spec, we only set specModified: true. This isn't helpful for debugging:

{
  "name": "inject-sidecar",
  "specModified": true  // ❌ What did it change?
}

Users can't tell:

What exactly was added/removed/modified
How multiple policies interact
If a policy is working correctly

Proposed Solution

Store a structured diff when specModified=true:

{
  "name": "inject-sidecar",
  "specModified": true,
  "specDiff": "eyJhZGRlZCI6e...}",  // Base64 JSON patch
  "specDiffSummary": "Added 1 component, modified 2 properties"
}

Implementation Options

Option 1: JSON Patch (RFC 6902)

Format: Industry standard, compact

[
  {"op": "add", "path": "/components/1", "value": {...}},
  {"op": "replace", "path": "/components/0/properties/replicas", "value": 3}
]

Pros:

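Standard format (accepted by kubectl patch --type=json)
Replayable; reversible only if prior values are also recorded (a plain replace or remove does not carry the old value)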
Compact representation
Libraries available

Cons:

Harder to read for humans
Requires JSON marshal/unmarshal

Cost: ~10-15ms overhead
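
To make Option 1 concrete, here is a minimal sketch of how add/remove/replace operations could be derived from two decoded JSON objects using only the standard library. The names `PatchOp`, `DiffObjects` and `escapeToken` are hypothetical and not part of KubeVela. The sketch replaces arrays and scalars wholesale, so it will not produce index-level operations such as `/components/1`; a real implementation would more likely use an existing JSON Patch library.

```go
package main

import (
	"reflect"
	"sort"
	"strings"
)

// PatchOp is one RFC 6902 operation. Only add, remove and replace are emitted.
// Note: with omitempty, a JSON null value would be dropped from the output,
// which this sketch does not handle.
type PatchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// escapeToken applies RFC 6901 escaping: "~" becomes "~0", then "/" becomes "~1".
func escapeToken(s string) string {
	s = strings.ReplaceAll(s, "~", "~0")
	return strings.ReplaceAll(s, "/", "~1")
}

// DiffObjects returns operations that turn before into after. Nested objects
// are walked recursively; any other differing value is replaced as a whole.
// Keys are visited in sorted order so the output is deterministic.
func DiffObjects(path string, before, after map[string]interface{}) []PatchOp {
	seen := make(map[string]struct{})
	for k := range before {
		seen[k] = struct{}{}
	}
	for k := range after {
		seen[k] = struct{}{}
	}
	keys := make([]string, 0, len(seen))
	for k := range seen {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var ops []PatchOp
	for _, k := range keys {
		p := path + "/" + escapeToken(k)
		b, inBefore := before[k]
		a, inAfter := after[k]
		switch {
		case !inBefore:
			ops = append(ops, PatchOp{Op: "add", Path: p, Value: a})
		case !inAfter:
			ops = append(ops, PatchOp{Op: "remove", Path: p})
		default:
			bm, bok := b.(map[string]interface{})
			am, aok := a.(map[string]interface{})
			if bok && aok {
				ops = append(ops, DiffObjects(p, bm, am)...)
			} else if !reflect.DeepEqual(b, a) {
				ops = append(ops, PatchOp{Op: "replace", Path: p, Value: a})
			}
		}
	}
	return ops
}
```

Calling `DiffObjects("/components/0/properties", before, after)` on the decoded properties of one component yields paths in the same style as the example above. The emitted `replace` and `remove` operations do not carry the old value, which is why undo needs extra data.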

Option 2: Structured Summary

Format: Human-readable summary

{
  "componentsAdded": 1,
  "componentsModified": 0,
  "propertiesChanged": ["components[0].properties.replicas"],
  "beforeHash": "abc123",
  "afterHash": "def456"
}

Pros:

Very readable
Lightweight (no full diff)
Fast to compute (~2-5ms)

Cons:

Can't see actual values
Not reversible
Less useful for complex changes

Cost: ~2-5ms overhead
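
As an illustration of Option 2, the sketch below counts added, modified and removed components and hashes the component list before and after a transform. It rests on several assumptions: components are matched by name, the `Component` type is a simplified stand-in for the real Application component type, and the hash is a truncated SHA-256 of the JSON encoding. `Summary`, `Summarize` and `hashComponents` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"reflect"
)

// Component is a simplified stand-in for an Application component.
type Component struct {
	Name       string                 `json:"name"`
	Type       string                 `json:"type"`
	Properties map[string]interface{} `json:"properties,omitempty"`
}

// Summary mirrors the counters and hashes of the structured summary.
type Summary struct {
	ComponentsAdded    int
	ComponentsModified int
	ComponentsRemoved  int
	BeforeHash         string
	AfterHash          string
}

// hashComponents returns the first 12 hex characters of the SHA-256 of the
// JSON encoding. encoding/json sorts map keys, so the result is stable.
func hashComponents(cs []Component) (string, error) {
	raw, err := json.Marshal(cs)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])[:12], nil
}

// Summarize compares two component lists, matching components by name.
func Summarize(before, after []Component) (Summary, error) {
	var s Summary
	var err error
	if s.BeforeHash, err = hashComponents(before); err != nil {
		return s, err
	}
	if s.AfterHash, err = hashComponents(after); err != nil {
		return s, err
	}

	old := make(map[string]Component, len(before))
	for _, c := range before {
		old[c.Name] = c
	}
	seen := make(map[string]bool, len(after))
	for _, c := range after {
		seen[c.Name] = true
		prev, existed := old[c.Name]
		switch {
		case !existed:
			s.ComponentsAdded++
		case !reflect.DeepEqual(prev, c):
			s.ComponentsModified++
		}
	}
	for name := range old {
		if !seen[name] {
			s.ComponentsRemoved++
		}
	}
	return s, nil
}
```

Equal hashes give a cheap way to confirm that a policy reporting a spec modification did not actually change the components.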

Option 3: Hybrid Approach (RECOMMENDED)

Store both summary + diff (only if diff is small)

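// MaxSpecDiffSize is the largest diff stored inline in the Application status.
const MaxSpecDiffSize = 10 * 1024 // 10KB

// SpecChange records what a policy changed in the Application spec.
type SpecChange struct {
    Summary      SpecChangeSummary `json:"summary"`
    FullDiff     string            `json:"fullDiff,omitempty"`     // Base64 JSON patch (only if <10KB)
    DiffTooLarge bool              `json:"diffTooLarge,omitempty"` // Set when the diff exceeded MaxSpecDiffSize and was dropped
}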

type SpecChangeSummary struct {
    ComponentsAdded    int      `json:"componentsAdded,omitempty"`
    ComponentsModified int      `json:"componentsModified,omitempty"`
    ComponentsRemoved  int      `json:"componentsRemoved,omitempty"`
    FieldsChanged      []string `json:"fieldsChanged,omitempty"`
    BeforeHash         string   `json:"beforeHash"`
    AfterHash          string   `json:"afterHash"`
}

Example:

{
  "name": "inject-sidecar",
  "specModified": true,
  "specChange": {
    "summary": {
      "componentsAdded": 0,
      "componentsModified": 2,
      "fieldsChanged": [
        "components[0].properties.env[0]",
        "components[1].properties.env[0]"
      ],
      "beforeHash": "abc123",
      "afterHash": "def456"
    },
    "fullDiff": "W3sib3AiOiJhZGQiLCJ...",  // Only if <10KB
    "diffTooLarge": false
  }
}

Pros:

Human-readable summary for quick diagnosis
Full diff available for detailed debugging (when needed)
Avoids etcd bloat for large changes
Fast path (summary only) is cheap (~2-5ms)

Cons:

More complex implementation
Two code paths to maintain

Cost: ~5-15ms depending on size
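
The size gate in the hybrid approach could look roughly like the sketch below. It assumes the 10KB limit applies to the Base64-encoded size, since that is what ends up in status; the proposal does not say whether the raw or encoded size is meant. `AttachDiff` is a hypothetical helper, and the `Summary` field is left out of the struct to keep the example short.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
)

// MaxSpecDiffSize is the largest encoded diff stored inline (10KB).
const MaxSpecDiffSize = 10 * 1024

// SpecChange is trimmed to the two fields the size gate touches.
type SpecChange struct {
	FullDiff     string `json:"fullDiff,omitempty"`
	DiffTooLarge bool   `json:"diffTooLarge,omitempty"`
}

// AttachDiff JSON-encodes patch and stores it Base64-encoded in change, unless
// the encoded form would reach MaxSpecDiffSize, in which case only the
// DiffTooLarge flag is set.
func AttachDiff(change *SpecChange, patch interface{}) error {
	raw, err := json.Marshal(patch)
	if err != nil {
		return err
	}
	// EncodedLen gives the exact Base64 length without encoding first.
	if base64.StdEncoding.EncodedLen(len(raw)) >= MaxSpecDiffSize {
		change.FullDiff = ""
		change.DiffTooLarge = true
		return nil
	}
	change.FullDiff = base64.StdEncoding.EncodeToString(raw)
	change.DiffTooLarge = false
	return nil
}
```

The patch has to be computed and marshalled before its size is known, so the gate limits what is stored rather than what is computed.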

Scope: Only Diff Spec Changes

Do NOT diff labels/annotations - we already track these explicitly:

"addedLabels": {"team": "platform"},        // ✅ Already clear
"addedAnnotations": {"version": "v1.0"}     // ✅ Already clear

Only diff spec transforms - this is where we need help:

"specModified": true,   // ❌ Not helpful
"specChange": {...}     // ✅ Shows what changed

Computational Impact

Per-Application Overhead:

Option 1 (JSON Patch): ~10-15ms
Option 2 (Summary Only): ~2-5ms
Option 3 (Hybrid): ~5-15ms (average ~8ms)

Context:

Typical reconciliation: 100-500ms
Policy rendering (uncached): 30-100ms
8ms overhead = ~2-8% of a 100-500ms reconciliation ✅ Acceptable

When to Skip:

If no spec transform: 0ms overhead
If diff >10KB: store summary only (the patch is still computed to measure its size, so this saves storage rather than time)
Labels/annotations only: 0ms overhead

Storage Impact

etcd Size:

JSON Patch for typical sidecar injection: ~2-5KB
Base64 encoding: +33% → ~3-7KB
5 policies with spec changes: ~15-35KB
Total Application size increase: ~15-35KB, a large relative increase for a 20-100KB Application but small in absolute terms ✅ Acceptable
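
The Base64 figures above can be checked with the standard library, which reports the exact encoded length for a given input size. The sketch takes 2KB and 5KB as 2048 and 5120 bytes; `EncodedBytes` and `StatusGrowth` are hypothetical helper names.

```go
package main

import "encoding/base64"

// EncodedBytes returns the padded Base64 length for rawBytes of input,
// which is 4 output bytes for every 3 input bytes, rounded up.
func EncodedBytes(rawBytes int) int {
	return base64.StdEncoding.EncodedLen(rawBytes)
}

// StatusGrowth estimates the bytes added to status when each of the given
// number of policies stores one encoded patch of rawPatchBytes.
func StatusGrowth(policies, rawPatchBytes int) int {
	return policies * EncodedBytes(rawPatchBytes)
}
```

`EncodedBytes(2048)` is 2732 and `EncodedBytes(5120)` is 6828, so five policies add between 13660 and 34140 bytes, in line with the ~15-35KB estimate.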

etcd Limits:

Max object size: 1.5MB
Typical Application: 20-100KB
With diffs: 35-135KB
Still well under limit ✅

Implementation Plan

Phase 1: Summary Only (Quick Win)

type PolicyChanges struct {
    AddedLabels       map[string]string
    AddedAnnotations  map[string]string
    AdditionalContext map[string]interface{}
    SpecModified      bool
    SpecChangeSummary *SpecChangeSummary  // NEW
}

Benefits:

Low overhead (~2ms)
Helps with debugging
No storage concerns

Phase 2: Add Full Diff (If Needed)

type PolicyChanges struct {
    // ... existing fields
    SpecChange *SpecChange  // Replaces SpecModified + Summary
}

Benefits:

Complete visibility
Can show diffs in UI
Enables "undo" functionality

Alternative: External Diff Storage

If storage is a concern, store diffs externally:

type AppliedGlobalPolicy struct {
    // ... existing fields
    SpecDiffRef string  // "configmap/my-app-policy-diffs/inject-sidecar"
}

Create a ConfigMap per Application with all policy diffs:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-policy-diffs
data:
  inject-sidecar: |
    [{"op": "add", ...}]
  resource-limits: |
    [{"op": "replace", ...}]
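
A consumer of `SpecDiffRef` would need to split the reference into its parts before fetching the ConfigMap. The sketch below assumes the three-segment `configmap/<name>/<key>` layout shown in the comment above and rejects anything else; `DiffRef` and `ParseSpecDiffRef` are hypothetical names, and the namespace is assumed to be that of the Application.

```go
package main

import (
	"fmt"
	"strings"
)

// DiffRef is the parsed form of a SpecDiffRef string.
type DiffRef struct {
	Kind string // resource kind, only "configmap" in this sketch
	Name string // ConfigMap name, e.g. "my-app-policy-diffs"
	Key  string // data key, e.g. "inject-sidecar"
}

// ParseSpecDiffRef splits "configmap/<name>/<key>" into its parts and
// returns an error for any other shape.
func ParseSpecDiffRef(ref string) (DiffRef, error) {
	parts := strings.Split(ref, "/")
	if len(parts) != 3 {
		return DiffRef{}, fmt.Errorf("spec diff ref %q: want 3 segments, got %d", ref, len(parts))
	}
	for _, p := range parts {
		if p == "" {
			return DiffRef{}, fmt.Errorf("spec diff ref %q: empty segment", ref)
		}
	}
	if parts[0] != "configmap" {
		return DiffRef{}, fmt.Errorf("spec diff ref %q: unsupported kind %q", ref, parts[0])
	}
	return DiffRef{Kind: parts[0], Name: parts[1], Key: parts[2]}, nil
}
```

Validating the reference up front keeps a malformed status entry from turning into a confusing lookup failure later.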

Pros:

Doesn't bloat Application status
Can be cleaned up separately
No etcd concerns

Cons:

Extra API call to view diffs
More objects to manage
Lifecycle management complexity

Decision Criteria

When to Implement Full Diffs:

✅ YES if:

Users frequently ask "what did this policy change?"
Debugging complex spec transforms is common
UI/CLI tools will display diffs
"Undo policy effects" is a requirement

❌ NO if:

Current tracking (labels/annotations/specModified) is sufficient
Performance is critical (every ms counts)
Storage is limited

Recommendation:

Start with Phase 1 (Summary Only):

Low cost (~2ms, ~1KB)
Immediate value for debugging
Easy to implement
Can upgrade to full diffs later if needed

Add Phase 2 (Full Diffs) if:

Users request it after using summaries
UI/CLI tools are built to display diffs
"Policy dry-run" feature is added

Open Questions

Should diffs be human-readable or machine-parseable?
- JSON Patch (machine) vs. kubectl-style diff (human)
Should we store diffs for all policies or just spec changes?
- Current proposal: Only spec changes
Should diffs be compressed?
- Could use gzip before base64 (saves ~60% space)
Retention policy?
- Clear diffs on successful reconciliation?
- Keep last N diffs?
Should we support "reverting" policy changes?
- Would require storing inverse patches
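
For the compression question, a minimal sketch of gzip followed by Base64 using only the standard library is shown below. `CompressDiff` and `DecompressDiff` are hypothetical names, and the actual saving depends on the patch content, so the ~60% figure should be measured rather than assumed.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"io"
)

// CompressDiff gzips raw and returns the result Base64-encoded.
func CompressDiff(raw []byte) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return "", err
	}
	// Close flushes the remaining compressed data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// DecompressDiff reverses CompressDiff.
func DecompressDiff(encoded string) ([]byte, error) {
	compressed, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, err
	}
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}
```

Very small patches can grow under gzip because of its fixed header and footer, so a real implementation might compress only above some size threshold.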

Example Usage

CLI Tool

# Show what a policy changed
kubectl vela policy diff my-app inject-sidecar

# Output:
Spec changes by policy 'inject-sidecar':
  + Added component 'monitoring-sidecar'
  ~ Modified components[0].properties.env
    + Added env var: SIDECAR_ENABLED=true

# Show full JSON patch
kubectl vela policy diff my-app inject-sidecar --format=json-patch

UI Dashboard

Application: my-app
Applied Policies:
  ✅ inject-sidecar (vela-system)
     Spec Changes:
       ├─ Added 1 component
       ├─ Modified 2 properties
       └─ [View Full Diff]
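
The one-line summaries in the dashboard mock-up, and the `specDiffSummary` string in the Proposed Solution, could be derived from the summary counters. The sketch below is one possible rendering; `SummaryLine` and `count` are hypothetical helpers, and the wording of the output is an assumption modelled on the examples in this proposal.

```go
package main

import (
	"strconv"
	"strings"
)

// count renders n with the matching singular or plural noun.
func count(n int, singular, plural string) string {
	if n == 1 {
		return "1 " + singular
	}
	return strconv.Itoa(n) + " " + plural
}

// SummaryLine turns change counters into a short human-readable sentence.
// Zero counters are skipped.
func SummaryLine(added, removed, modified, fields int) string {
	var parts []string
	if added > 0 {
		parts = append(parts, "added "+count(added, "component", "components"))
	}
	if removed > 0 {
		parts = append(parts, "removed "+count(removed, "component", "components"))
	}
	if modified > 0 {
		parts = append(parts, "changed "+count(modified, "component", "components"))
	}
	if fields > 0 {
		parts = append(parts, "modified "+count(fields, "property", "properties"))
	}
	if len(parts) == 0 {
		return "No spec changes"
	}
	line := strings.Join(parts, ", ")
	// Capitalize the first letter; all parts start with an ASCII letter.
	return strings.ToUpper(line[:1]) + line[1:]
}
```

With one added component and two changed fields, `SummaryLine(1, 0, 0, 2)` returns "Added 1 component, modified 2 properties".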

Conclusion

Summary diffs (~2-5ms, ~1KB) provide 80% of the value with 20% of the cost.

Recommend:

✅ Implement Phase 1 (Summary) now
🤔 Evaluate Phase 2 (Full Diff) based on usage
📊 Add metrics to track diff size distribution
🔍 Monitor performance impact in production
