# CLAUDE.md - Krkn Chaos Engineering Framework

## Project Overview
Krkn (Kraken) is a chaos engineering tool for Kubernetes/OpenShift clusters. It injects deliberate failures to validate cluster resilience, using a plugin-based architecture with multi-cloud support (AWS, Azure, GCP, IBM Cloud, VMware, Alibaba, OpenStack).
## Repository Structure

```
krkn/
├── krkn/
│   ├── scenario_plugins/   # Chaos scenario plugins (pod, node, network, hogs, etc.)
│   ├── utils/              # Utility functions
│   ├── rollback/           # Rollback management
│   ├── prometheus/         # Prometheus integration
│   └── cerberus/           # Health monitoring
├── tests/                  # Unit tests (unittest framework)
├── scenarios/              # Example scenario configs (openshift/, kube/, kind/)
├── config/                 # Configuration files
└── CI/                     # CI/CD test scripts
```
## Quick Start

```bash
# Setup (ALWAYS use a virtual environment)
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Run Krkn
python run_kraken.py --config config/config.yaml
# Note: scenarios are specified in config.yaml under kraken.chaos_scenarios.
# There is no --scenario flag; edit config/config.yaml to select scenarios.

# Run tests
python -m unittest discover -s tests -v
python -m coverage run -a -m unittest discover -s tests -v
```
## Critical Requirements

### Python Environment
- Python 3.9+ required
- NEVER install packages globally - always use a virtual environment
- CRITICAL: `docker` must be <7.0 and `requests` must be <2.32 (Unix socket compatibility)
### Key Dependencies
- `krkn-lib` (5.1.13): Core library for Kubernetes/OpenShift operations
- `kubernetes` (34.1.0): Kubernetes Python client
- `docker` (<7.0), `requests` (<2.32): DO NOT upgrade without verifying compatibility
- Cloud SDKs: `boto3` (AWS), `azure-mgmt-*` (Azure), `google-cloud-compute` (GCP), `ibm_vpc` (IBM), `pyVmomi` (VMware)
## Plugin Architecture (CRITICAL)
Naming conventions are strictly enforced.

### Naming Rules
- Module files: Must end with `_scenario_plugin.py` and use snake_case
  - Example: `pod_disruption_scenario_plugin.py`
- Class names: Must be CamelCase and end with `ScenarioPlugin`
  - Example: `PodDisruptionScenarioPlugin`
  - Must match the module filename (snake_case ↔ CamelCase; see the sketch after this list)
- Directory structure: Plugin directories CANNOT contain "scenario" or "plugin"
  - Location: `krkn/scenario_plugins/<plugin_name>/`
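The filename ↔ class name pairing amounts to a snake_case-to-CamelCase conversion. A minimal sketch of that check, with a hypothetical helper name (the actual validation lives in krkn's plugin factory and may differ):

```python
def expected_class_name(module_filename: str) -> str:
    """Hypothetical helper: derive the CamelCase class name expected
    for a snake_case plugin module."""
    stem = module_filename.removesuffix(".py")
    return "".join(part.capitalize() for part in stem.split("_"))


# pod_disruption_scenario_plugin.py must define PodDisruptionScenarioPlugin
assert expected_class_name("pod_disruption_scenario_plugin.py") == "PodDisruptionScenarioPlugin"
```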
### Plugin Implementation
Every plugin MUST:
- Extend `AbstractScenarioPlugin`
- Implement the `run()` method
- Implement the `get_scenario_types()` method
```python
from krkn.scenario_plugins import AbstractScenarioPlugin


class PodDisruptionScenarioPlugin(AbstractScenarioPlugin):
    def run(self, config, scenarios_list, kubeconfig_path, wait_duration):
        pass

    def get_scenario_types(self):
        return ["pod_scenarios", "pod_outage"]
```
### Creating a New Plugin
- Create directory: `krkn/scenario_plugins/<plugin_name>/`
- Create module: `<plugin_name>_scenario_plugin.py`
- Create class: `<PluginName>ScenarioPlugin` extending `AbstractScenarioPlugin`
- Implement `run()` and `get_scenario_types()`
- Create unit test: `tests/test_<plugin_name>_scenario_plugin.py`
- Add example scenario: `scenarios/<platform>/<scenario>.yaml`

DO NOT: violate naming conventions (the factory will reject the plugin), include "scenario"/"plugin" in directory names, or create plugins without tests.
## Testing

### Unit Tests

```bash
# Run all tests
python -m unittest discover -s tests -v

# Specific test
python -m unittest tests.test_pod_disruption_scenario_plugin

# With coverage
python -m coverage run -a -m unittest discover -s tests -v
python -m coverage html
```
Test requirements:
- Naming: `test_<module>_scenario_plugin.py`
- Mock external dependencies (Kubernetes API, cloud providers)
- Test success, failure, and edge cases
- Keep tests isolated and independent
### Functional Tests
Located in `CI/tests/`. They can be run locally on a kind cluster with Prometheus and Elasticsearch set up.

Setup for local testing:
- Deploy Prometheus and Elasticsearch on your kind cluster:
  - Prometheus setup: https://krkn-chaos.dev/docs/developers-guide/testing-changes/#prometheus
  - Elasticsearch setup: https://krkn-chaos.dev/docs/developers-guide/testing-changes/#elasticsearch
- Or disable monitoring features in `config/config.yaml`:

```yaml
performance_monitoring:
  enable_alerts: False
  enable_metrics: False
  check_critical_alerts: False
```

Note: Functional tests run automatically in CI with full monitoring enabled.
## Cloud Provider Implementations
Node chaos scenarios are cloud-specific. Each provider lives in `krkn/scenario_plugins/node_actions/<provider>_node_scenarios.py`:
- AWS, Azure, GCP, IBM Cloud, VMware, Alibaba, OpenStack, Bare Metal

Each implementation covers stopping, starting, rebooting, and terminating instances.

When modifying a provider: maintain consistency with the other providers, handle API errors, add logging, and update tests.
### Adding Cloud Provider Support
- Create: `krkn/scenario_plugins/node_actions/<provider>_node_scenarios.py`
- Extend: `abstract_node_scenarios.AbstractNodeScenarios`
- Implement: `stop_instances`, `start_instances`, `reboot_instances`, `terminate_instances` (see the sketch after this list)
- Add the SDK to `requirements.txt`
- Create a unit test with a mocked SDK
- Add an example scenario: `scenarios/openshift/<provider>_node_scenarios.yml`
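A minimal sketch of the four operations for a hypothetical AWS-style provider, using boto3 EC2 calls purely as an illustration. In krkn the class would extend `abstract_node_scenarios.AbstractNodeScenarios` as listed above; the base class and real method signatures are not reproduced here and may differ:

```python
import logging

import boto3


class ExampleAwsNodeScenarios:  # hypothetical name; real providers extend the abstract base
    def __init__(self, region: str):
        self.ec2 = boto3.client("ec2", region_name=region)

    def stop_instances(self, instance_ids):
        logging.info("Stopping instances %s", instance_ids)
        self.ec2.stop_instances(InstanceIds=instance_ids)

    def start_instances(self, instance_ids):
        logging.info("Starting instances %s", instance_ids)
        self.ec2.start_instances(InstanceIds=instance_ids)

    def reboot_instances(self, instance_ids):
        logging.info("Rebooting instances %s", instance_ids)
        self.ec2.reboot_instances(InstanceIds=instance_ids)

    def terminate_instances(self, instance_ids):
        logging.info("Terminating instances %s", instance_ids)
        self.ec2.terminate_instances(InstanceIds=instance_ids)
```

Whatever the provider, keep error handling and logging consistent with the existing implementations.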
## Configuration
Main config: `config/config.yaml`
- `kraken`: Core settings
- `cerberus`: Health monitoring
- `performance_monitoring`: Prometheus
- `elastic`: Elasticsearch telemetry

Scenario configs live in the `scenarios/` directory:

```yaml
- config:
    scenario_type: <type>  # Must match the plugin's get_scenario_types()
```
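A hedged sketch of how `scenario_type` ties a scenario file to a plugin, assuming only the two-line shape shown above (the actual matching is done inside krkn's plugin factory):

```python
import yaml

scenario_yaml = """
- config:
    scenario_type: pod_scenarios
"""

scenarios = yaml.safe_load(scenario_yaml)
scenario_type = scenarios[0]["config"]["scenario_type"]

# The scenario is routed to a plugin whose get_scenario_types() contains this value
# (compare with the PodDisruptionScenarioPlugin example above).
assert scenario_type in ["pod_scenarios", "pod_outage"]
```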
## Code Style
- Import order: standard library, third-party, local imports
- Naming: snake_case (functions/variables), CamelCase (classes)
- Logging: use Python's `logging` module (illustrated below)
- Error handling: return appropriate exit codes
- Docstrings: required for public functions/classes
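A short illustration of these conventions; the function name and file handling are made up for the example:

```python
# Standard library imports first
import logging
import os

# Third-party imports next
import yaml


def load_scenario_config(path: str):
    """Load a scenario YAML file and return its parsed contents."""
    logging.info("Loading scenario config from %s", path)
    if not os.path.exists(path):
        logging.error("Scenario config not found: %s", path)
        raise FileNotFoundError(path)
    with open(path) as f:
        return yaml.safe_load(f)
```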
## Exit Codes
Krkn uses specific exit codes to communicate execution status:
- `0`: Success - all scenarios passed, no critical alerts
- `1`: Scenario failure - one or more scenarios failed
- `2`: Critical alerts fired during execution
- `3+`: Health check failure (Cerberus monitoring detected issues)

When implementing scenarios:
- Return `0` on success
- Return `1` on scenario-specific failures (see the sketch after this list)
- Propagate health check failures appropriately
- Log exit code reasons clearly
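A hedged sketch of mapping scenario outcomes to these codes; the function name is hypothetical and krkn's runner may handle return values differently:

```python
import logging


def run_example_scenario(scenario_config: dict) -> int:
    """Illustrative only: return 0 on success, 1 on a scenario-specific failure."""
    try:
        # ... inject the failure and verify the cluster recovers here ...
        logging.info("Scenario succeeded, exit code 0")
        return 0
    except Exception as exc:
        logging.error("Scenario failed, exit code 1: %s", exc)
        return 1
```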
## Container Support
Krkn can run inside a container. See the `containers/` directory.

Building a custom image:

```bash
cd containers
./compile_dockerfile.sh  # Generates Dockerfile from template
docker build -t krkn:latest .
```

Running containerized:

```bash
docker run -v ~/.kube:/root/.kube:Z \
  -v $(pwd)/config:/config:Z \
  -v $(pwd)/scenarios:/scenarios:Z \
  krkn:latest
```
## Git Workflow
- NEVER commit directly to main
- NEVER use `--force` without approval
- ALWAYS create feature branches: `git checkout -b feature/description`
- ALWAYS run tests before pushing

Conventional commits: `feat:`, `fix:`, `test:`, `docs:`, `refactor:`

```bash
git checkout main && git pull origin main
git checkout -b feature/your-feature-name
# Make changes, write tests
python -m unittest discover -s tests -v
git add <specific-files>
git commit -m "feat: description"
git push -u origin feature/your-feature-name
```
## Environment Variables
- `KUBECONFIG`: Path to kubeconfig
- `AWS_*`, `AZURE_*`, `GOOGLE_APPLICATION_CREDENTIALS`: Cloud credentials
- `PROMETHEUS_URL`, `ELASTIC_URL`, `ELASTIC_PASSWORD`: Monitoring config

NEVER commit credentials or API keys.
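A minimal sketch of reading these variables with `os.environ`; whether krkn reads them directly or through its config is not specified here, so treat the default values as assumptions:

```python
import os

kubeconfig = os.environ.get("KUBECONFIG", os.path.expanduser("~/.kube/config"))
prometheus_url = os.environ.get("PROMETHEUS_URL")      # monitoring endpoints are optional
elastic_url = os.environ.get("ELASTIC_URL")
elastic_password = os.environ.get("ELASTIC_PASSWORD")  # never hard-code or commit this value
```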
## Common Pitfalls
- Missing virtual environment - always activate venv
- Running functional tests without cluster setup
- Ignoring exit codes
- Modifying krkn-lib directly (it's a separate package)
- Upgrading docker/requests beyond version constraints
## Before Writing Code
- Check for existing implementations
- Review existing plugins as examples
- Maintain consistency with cloud provider patterns
- Plan rollback logic
- Write tests alongside code
- Update documentation
## When Adding Dependencies
- Check if functionality exists in krkn-lib or current dependencies
- Verify compatibility with existing versions
- Pin specific versions in `requirements.txt`
- Check for security vulnerabilities
- Test thoroughly for conflicts
## Common Development Tasks

### Modifying Existing Plugin
- Read the plugin code and its corresponding test
- Make changes
- Update/add unit tests
- Run: `python -m unittest tests.test_<plugin>_scenario_plugin`
### Writing Unit Tests
- Create: `tests/test_<module>_scenario_plugin.py`
- Import `unittest` and the plugin class
- Mock external dependencies (see the sketch after this list)
- Test success, failure, and edge cases
- Run: `python -m unittest tests.test_<module>_scenario_plugin`
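A minimal skeleton of such a test; the plugin import is omitted, and the mocked `kube_client` with its `list_pods` method is hypothetical, standing in for whatever Kubernetes or cloud client the plugin actually uses:

```python
import unittest
from unittest.mock import MagicMock


class TestExampleScenarioPlugin(unittest.TestCase):
    """Skeleton only: replace the mock with the plugin's real dependencies."""

    def setUp(self):
        # In a real test, instantiate the plugin and swap its external
        # dependencies (Kubernetes API, cloud SDKs) for mocks.
        self.kube_client = MagicMock()
        self.kube_client.list_pods.return_value = ["pod-a", "pod-b"]

    def test_success_path(self):
        # Exercise the happy path against the mocked client
        self.assertEqual(self.kube_client.list_pods("default"), ["pod-a", "pod-b"])

    def test_failure_path(self):
        # Simulate an API error to cover the failure path
        self.kube_client.list_pods.side_effect = RuntimeError("API unavailable")
        with self.assertRaises(RuntimeError):
            self.kube_client.list_pods("default")


if __name__ == "__main__":
    unittest.main()
```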