mirror of
https://github.com/nubenetes/awesome-kubernetes.git
synced 2026-05-26 02:55:26 +00:00
Cloud Based Integration & Messaging. Data Processing & Streaming (aka Data Pipeline). Open Data Hub
!!! info "Architectural Context"
    Detailed reference for Cloud Based Integration & Messaging. Data Processing & Streaming (aka Data Pipeline). Open Data Hub in the context of Data & Advanced Analytics.
Standard Reference
- Redpanda is now Free & Source Available [COMMUNITY-TOOL]
- Orchestration Made Easy with Zeebe and Kafka [COMMUNITY-TOOL]
- Banzai Cloud 🌟 [COMMUNITY-TOOL]
- Wikipedia: Message Broker [COMMUNITY-TOOL]
- Wikipedia: Event-driven messaging [COMMUNITY-TOOL]
- Wikipedia: Streaming Data [COMMUNITY-TOOL]
- dzone: Event-Driven Architecture as a Strategy [COMMUNITY-TOOL]
- wikipedia: Enterprise service bus [COMMUNITY-TOOL]
- cncf.io: The need for Kubernetes Native Messaging Platform in Hybrid Cloud Environment [COMMUNITY-TOOL]
- wiprodigital.com: A Guide to Enterprise Event-Driven Architecture [COMMUNITY-TOOL]
- medium: Introduction to Event-Driven Architecture 🌟 [COMMUNITY-TOOL]
- sebalopezz.medium.com: Monolith to Microservices + Event-Driven Architecture 🌟 [COMMUNITY-TOOL]
- medium: Introduction to Message Queues 🌟 [COMMUNITY-TOOL]
- headspring.com: Is Kafka or RabbitMQ the right messaging tool for you? [COMMUNITY-TOOL]
- baeldung.com: Pub-Sub vs. Message Queues 🌟 [COMMUNITY-TOOL]
- medium: Monolithic to Microservices Architecture with Patterns & Best Practices 🌟 [COMMUNITY-TOOL]
- dzone: RESTful Applications in An Event-Driven Architecture [COMMUNITY-TOOL]
- jinwookim928.medium.com: Why Not Event Driven Architecture? [COMMUNITY-TOOL]
- blog.direktiv.io: Event driven orchestration with Knative (part 1) [COMMUNITY-TOOL]
- blog.direktiv.io: Redefining event-driven orchestration for automation & applications [COMMUNITY-TOOL]
- pub.towardsai.net: Deep Dive into Event-Driven architecture | Gul Ershad [COMMUNITY-TOOL]
- developer.com: An Introduction to Event Driven Microservices [COMMUNITY-TOOL]
- dzone.com: What Are Microservices and The Event Aggregator Pattern? 🌟 [COMMUNITY-TOOL]
- irfanyusanif.medium.com: Best practices to communicate between microservices [COMMUNITY-TOOL]
- swapnil-chougule.medium.com: Rapid Feature Engineering through SQL [COMMUNITY-TOOL]
- blog.twitter.com: Processing billions of events in real time at Twitter [COMMUNITY-TOOL]
- medium.com/tinyclues-vision: 4 Design Principles for Robust Data Pipelines [COMMUNITY-TOOL]
- medium.com/fiverr-engineering: How to Share Data Between Microservices on High Scale [COMMUNITY-TOOL]
- medium.com/codex: Microservices Communication — Queues Topics and Streams [COMMUNITY-TOOL]
- emirayhan.medium.com: What is the difference Message Queue and Message Bus? 🌟 [COMMUNITY-TOOL]
- medium.com/event-driven-utopia: Comparing Stateful Stream Processing and Streaming Databases [COMMUNITY-TOOL]
- dzone: Resilient MultiCloud Messaging [COMMUNITY-TOOL]
- juhache.substack.com: From Data Engineer to YAML Engineer [COMMUNITY-TOOL]
- medium.com/dev-jam: TIBCO Business Works vs. Apache Camel — A short Comparison 🌟 [COMMUNITY-TOOL]
- Dzone: Introduction to Message Brokers. Part 1: Apache Kafka vs. RabbitMQ [COMMUNITY-TOOL]
- Dzone: Introduction to Message Brokers. Part 2: ActiveMQ vs. Redis Pub/Sub [COMMUNITY-TOOL]
- medium.com: RabbitMQ vs. Kafka [COMMUNITY-TOOL]
- medium.com/@paolo.gazzola: How to deploy a high available and fault tolerant RabbitMQ service in an on-premise Kubernetes multi-node cluster environment [COMMUNITY-TOOL]
- betterprogramming.pub: The Perfect Message Queue Solution Based on the Redis Stream Type [COMMUNITY-TOOL]
- Apache Camel [COMMUNITY-TOOL]
- Quora.com: What's the difference between Apache Camel and Kafka? [COMMUNITY-TOOL]
- dzone: Hybrid multi-cloud event mesh architectural design [COMMUNITY-TOOL]
- dzone: KubeMQ: A Modern Alternative to Kafka [COMMUNITY-TOOL]
- Wikipedia: Cloud Based Integration (iPaaS) [COMMUNITY-TOOL]
- blog.axway.com: What is iPaaS? [COMMUNITY-TOOL]
- A good explanation of how to avoid distributed transactions using the outbox pattern: Transaction Log Tailing With Debezium [COMMUNITY-TOOL]
- medium.com: Stream Your Database into Kafka with Debezium [COMMUNITY-TOOL]
- medium: Change Data Capture — Using Debezium [COMMUNITY-TOOL]
- pradeepdaniel.medium.com: Creating an ETL data pipeline to sync data to Snowflake using Kafka and Debezium [COMMUNITY-TOOL]
- medium: A Visual Introduction to Debezium 🌟 [COMMUNITY-TOOL]
- satishchandragupta.com: Scalable Efficient Big Data Pipeline Architecture [COMMUNITY-TOOL]
- medium: Logs & Offsets: (Near) Real Time ELT with Apache Kafka + Snowflake [COMMUNITY-TOOL]
- medium: Apache Kafka Startup Guide: System Design Architectures: Notification System, Web Activity Tracker, ELT Pipeline, Storage System 🌟 [COMMUNITY-TOOL]
- medium: Getting Started With Kafka on OpenShift [COMMUNITY-TOOL]
- banzaicloud.com: Kafka Schema Registry on Kubernetes the declarative way [COMMUNITY-TOOL]
- banzaicloud.com: Bulletproof Kafka, and the tale of an Amazon outage [COMMUNITY-TOOL]
- levelup.gitconnected.com: Kafka for Engineers 🌟 [COMMUNITY-TOOL]
- banzaicloud.com: Kafka on Kubernetes - using etcd 🌟 [COMMUNITY-TOOL]
- medium: Processing guarantees in Kafka [COMMUNITY-TOOL]
- medium: How Pinterest runs Kafka at scale [COMMUNITY-TOOL]
- medium: Google Pub/Sub Lite for Kafka Users [COMMUNITY-TOOL]
- medium: 4 Microservices Caching Patterns at Wix [COMMUNITY-TOOL]
- medium: Microservices in Rust with Kafka [COMMUNITY-TOOL]
- medium: Apache Kafka in a Nutshell 🌟 [COMMUNITY-TOOL]
- medium: Solutions to Communication Problems in Microservices using Apache Kafka and Kafka Lens [COMMUNITY-TOOL]
- dzone.com: Microservices, Event-Driven Architecture and Kafka 🌟 [COMMUNITY-TOOL]
- medium: Understanding Kafka Topic Partitions [COMMUNITY-TOOL]
- instaclustr.com: Apache Kafka Architecture: A Complete Guide 🌟 [COMMUNITY-TOOL]
- developers.redhat.com: Getting started with Red Hat OpenShift Streams for Apache Kafka [COMMUNITY-TOOL]
- baeldung.com: List Active Brokers in a Kafka Cluster Using Shell Commands 🌟 [COMMUNITY-TOOL]
- dzone: Next-Gen Data Pipes With Spark, Kafka and k8s 🌟 [COMMUNITY-TOOL]
- cloudhut.dev: Running Apache Kafka on Kubernetes successfully [COMMUNITY-TOOL]
- medium: Running Kafka in Kubernetes, Part 1: Why we migrated our Kafka clusters to Kubernetes [COMMUNITY-TOOL]
- betterprogramming.pub: How to Handle Duplicate Messages and Message Ordering in Kafka [COMMUNITY-TOOL]
- medium: Optimizing Kafka Streams Apps on Kubernetes by Splitting Topologies [COMMUNITY-TOOL]
- inder-devops.medium.com: Kafka- Best practices & Lessons Learned | By Inder [COMMUNITY-TOOL]
- blog.workwell.io: How to manage your Kafka consumers from the producer [COMMUNITY-TOOL]
- adam-kotwasinski.medium.com: Kafka mesh filter in Envoy [COMMUNITY-TOOL]
- medium.com/airwallex-engineering: Kafka Streams: Iterative Development and Blue-Green Deployment [COMMUNITY-TOOL]
- medium.com/udemy-engineering: Introducing Hot and Cold Retries on Apache Kafka [COMMUNITY-TOOL]
- medium.com/dna-technology: Why we dropped event sourcing with Kafka Streams when given a second chance [COMMUNITY-TOOL]
- betterprogramming.pub: Everything You Need To Know About Kafka 🌟 [COMMUNITY-TOOL]
- blog.developer.adobe.com: Exploring Kafka Producer’s Internals 🌟 [COMMUNITY-TOOL]
- medium.com/altitudehq: Kafka retries and maintaining the order of retry events 🌟 [COMMUNITY-TOOL]
- medium.com/cloudnesil: Kafka Streams State Store at Scale [COMMUNITY-TOOL]
- towardsdev.com: Performance Testing Your Kubernetes Kafka Cluster [COMMUNITY-TOOL]
- medium.com/@hardiktaneja_99752: Lessons after running Kafka in production 🌟 [COMMUNITY-TOOL]
- betterprogramming.pub: Monitoring Kafka Applications — Implementing Healthchecks and Tracking Lag [COMMUNITY-TOOL]
- blog.datumo.io: Setting up Kafka on Kubernetes - an easy way [COMMUNITY-TOOL]
- medium.com/wix-engineering: Troubleshooting Kafka for 2000 Microservices at Wix [COMMUNITY-TOOL]
- medium.com/@rramiz.rraza: Kafka metrics monitoring with Prometheus and Grafana 🌟 [COMMUNITY-TOOL]
- dzone: Visualize your Apache Kafka Streams using the Quarkus Dev UI [COMMUNITY-TOOL]
- medium: Mastering Apache Kafka on Kubernetes — Strimzi K8s operator [COMMUNITY-TOOL]
- medium.com/@ahmed.farhan: Kafka Setup in Kubernetes Using Strimzi K8s operator — Part 2 [COMMUNITY-TOOL]
- medium.com/adaltas: Operating Kafka in Kubernetes with Strimzi [COMMUNITY-TOOL]
- The benefits of integrating Apache Kafka with Istio [COMMUNITY-TOOL]
- Hazelcast JET [COMMUNITY-TOOL]
- wikipedia: Workflow Engine [COMMUNITY-TOOL]
- dzone: Apache Airflow Architecture on OpenShift [COMMUNITY-TOOL]
- betterprogramming.pub: Running Airflow Using Kubernetes Executor and Kubernetes Pod Operator with Istio [COMMUNITY-TOOL]
- dataengineeringcentral.substack.com: Why is everyone trying to kill Airflow? 🌟 [COMMUNITY-TOOL]
- blog.devgenius.io: Send information from Databricks to Airflow [COMMUNITY-TOOL]
- medium.com/apache-airflow: Passing Data Between Tasks with the KubernetesPodOperator in Apache Airflow 🌟 [COMMUNITY-TOOL]
- medium.com/@piyush_74867: Apache Airflow on Kubernetes at scale — a peak under the hood [COMMUNITY-TOOL]
- medium.com/@alfahreiza: Building an ELT Pipeline: From CSV to BigQuery using dbt [COMMUNITY-TOOL]
- medium.com/apache-airflow: What we learned after running Airflow on Kubernetes for 2 years [COMMUNITY-TOOL]
- Red Hat AMQ overview [COMMUNITY-TOOL]
Application Integration
Cloud Managed Services
Pub-Sub Pattern
- (2026) ==Google Cloud Platform Pub/Sub== [DOCUMENTATION] 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] [ENTERPRISE-STABLE] — Comprehensive documentation for GCP Pub/Sub, an enterprise-grade, globally distributed, fully-managed asynchronous messaging service. It provides consistent sub-second latencies at arbitrary scale. It features seamless integrations with Google Cloud's data analytics stacks.
Enterprise Integration Patterns
Event-Driven Systems
- developers.redhat.com: Design event-driven integrations with Kamelets and Camel K [EMERGING] [GUIDE] — Introduces "Kamelets" (Camel Route Snippets), which act as reusable cloud-native integration building blocks. It explains how non-developers or low-code frameworks can plug Kamelets into serverless topologies for immediate data flow orchestration on Kubernetes.
Middleware
- (2025) Red Hat Fuse [ADVANCED LEVEL] [DOCUMENTATION] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Red Hat's enterprise-grade, distributed integration platform, heavily utilizing Apache Camel, ActiveMQ, and CXF. It provides a highly stable middleware environment designed to bind heterogeneous enterprise workloads and APIs under unified orchestration rules.
Serverless Integration
- developers.redhat.com: Integrating systems with Apache Camel and Quarkus on Red Hat OpenShift [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] [LEGACY] — Practical walk-through highlighting the integration of legacy enterprise environments using Apache Camel Quarkus extensions (Camel K) on OpenShift. It demonstrates how Quarkus' sub-second startup times enable serverless, cloud-native integration routes.
Event Streaming
Enterprise Integration Patterns (1)
- kai-waehner.de: When to use Apache Camel vs. Apache Kafka? 🌟 [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Examines the complementarity and core differences between Apache Camel (an integration framework implementing Enterprise Integration Patterns) and Apache Kafka (a distributed streaming platform). It outlines architectures where Camel acts as a producer/consumer or edge connector for Kafka pipelines.
Kafka Connectors
- developers.redhat.com: Extending Kafka connectivity with Apache Camel Kafka connectors [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Outlines how the Camel Kafka Connector framework allows developers to utilize Camel's extensive component suite as standard Kafka Connect sources or sinks. This simplifies ingestion and delivery to hundreds of external enterprise systems without custom code.
Local Development
Containerization
- geshan.com.np: How to use RabbitMQ and Node.js with Docker and Docker-compose [COMMUNITY-TOOL] [GUIDE] — A hands-on tutorial outlining the setup of a localized asynchronous worker pipeline using Node.js, RabbitMQ, and Docker Compose. It serves as an accessible entry point to grasp queue-based application decoupling. Includes configuration templates ready for development workflows.
Low-Code Integration
Enterprise Integration Patterns (2)
- Syndesis open source integration platform [DOCUMENTATION] [COMMUNITY-TOOL] [LEGACY] — Syndesis is an open-source, cloud-native low-code integration platform designed to run natively on Kubernetes and OpenShift. It enables drag-and-drop connections between diverse business APIs and internal databases, utilizing Apache Camel under the hood. Note: The project has recently transitioned to legacy status.
Microservices
- developers.redhat.com: Low-code microservices orchestration with Syndesis [COMMUNITY-TOOL] [GUIDE] — Detailed demonstration of leveraging Syndesis for visual, low-code orchestration of enterprise microservices. It highlights quick deployment cycles, declarative configuration models, and integration with Red Hat OpenShift resources.
Message Brokers
Clustering
- developers.redhat.com: Implementing Apache ActiveMQ-style broker meshes with Apache Artemis [ADVANCED LEVEL] [COMMUNITY-TOOL] [GUIDE] — Focuses on establishing distributed, multi-broker network configurations (broker meshes) using Apache Artemis. It highlights migration techniques from classic ActiveMQ network-of-brokers architectures. It explains target configuration profiles to optimize reliability across complex enterprise regions.
Evaluation Frameworks
- developers.redhat.com: Choosing the right asynchronous-messaging infrastructure for the job [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Lays out a decision framework for choosing between broker-based messaging (e.g., AMQP, ActiveMQ), event-streaming (e.g., Apache Kafka), and cloud-native serverless event routing. It evaluates criteria like throughput, ordering guarantees, consumer groups, and message preservation. This is an essential architectural comparative reference.
- kubemq.io: Kafka VS KubeMQ 🌟 [DOCUMENTATION] [COMMUNITY-TOOL] [GUIDE] — Provides a detailed comparison between Apache Kafka and KubeMQ, focusing on memory footprint, container resource demands, and operational complexity. It presents KubeMQ as a highly localized, easy-to-manage container broker, contrasting it with Kafka's robust, distributed cluster topology.
- kai-waehner.de: Comparison: JMS Message Queue vs. Apache Kafka [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Details the technical tradeoffs, design limitations, and complementary features of JMS broker specifications versus Apache Kafka. It assists system engineers in distinguishing transaction-heavy classic queuing requirements from massive event streaming workloads.
Event Streaming (1)
- (2021) blog.rabbitmq.com: First Application With RabbitMQ Streams 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Introduces RabbitMQ Streams, a high-throughput, log-append-only streaming protocol introduced in RabbitMQ 3.9. It compares RabbitMQ Streams' sub-millisecond latencies and message retention directly with traditional AMQP queues and Apache Kafka. The walkthrough showcases a complete consumer-producer application setup.
High-Performance Messaging
- (2026) ==Apache Artemis JMeter== ⭐ 1017 [ADVANCED LEVEL] 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] [ENTERPRISE-STABLE] — The official GitHub mirror of Apache ActiveMQ Artemis, housing the high-performance non-blocking asynchronous message broker. It provides native support for AMQP, MQTT, STOMP, and OpenWire. It delivers ultra-low latency and scalable message distribution under extreme workloads.
JMS
- Apache ActiveMQ [DOCUMENTATION] [ENTERPRISE-STABLE] [LEGACY] — An iconic, mature open-source multi-protocol message broker supporting JMS 1.1 and 2.0, AMQP, MQTT, and STOMP. Known for enterprise-grade reliability and complex message routing patterns. It remains a foundational asset in legacy integration environments globally.
- ActiveMQ 5.x "classic" [DOCUMENTATION] [ENTERPRISE-STABLE] [LEGACY] — The classic implementation of Apache ActiveMQ, continuing to power millions of production enterprise nodes. It offers rich support for JMS client specifications alongside robust clustering and persistence. Ideal for traditional integration architecture, though increasingly superseded by Artemis.
Kubernetes Native
- KubeMQ.io: Kubernetes Native Message Queue Broker [DOCUMENTATION] [ENTERPRISE-STABLE] — KubeMQ is an enterprise-grade, ultra-lightweight message broker engineered specifically for Kubernetes container ecosystems. Delivered in a minimal footprint, it supports pub/sub, queues, and streams with native GRPC and REST support. It avoids external operational dependencies.
- devops.com: Best of 2019: Implementing Message Queue in Kubernetes [COMMUNITY-TOOL] [GUIDE] — Evaluates the operational paradigms, stateful challenges, and strategies when setting up distributed message brokers natively inside Kubernetes environments. Discusses dynamic volume allocations, stateful sets, and persistent cloud networking protocols.
- github.com/kubemq-io/kubemq-community 🌟 ⭐ 668 [ENTERPRISE-STABLE] — The community-driven core repository for KubeMQ. It offers a lightweight, high-performance messaging interface for microservices on Kubernetes. Supports standard asynchronous protocols and integrates natively with Kubernetes patterns.
Pub-Sub Pattern (1)
- (2026) Redis Pub/sub [DOCUMENTATION] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Official developer documentation detailing Redis' built-in Pub/Sub and Streams features. It provides technical blueprints for lightweight, fire-and-forget message passing and log-append streaming. This allows developers to construct fast messaging queues without setting up heavy broker architectures.
Orchestration
Kubernetes Operators
- (2024) Apache Camel K [ADVANCED LEVEL] [DOCUMENTATION] 🌟🌟🌟🌟 [DE FACTO STANDARD] [ENTERPRISE-STABLE] — Core documentation for Apache Camel K, a lightweight cloud-native integration platform built on Kubernetes. It utilizes the Operator Pattern to run integration DSL routes serverlessly. It drastically simplifies deploying complex integration patterns across cloud-native domains.
- thenewstack.io: Camel K Brings Apache Camel to Kubernetes for Event-Driven Architectures [COMMUNITY-TOOL] [GUIDE] — A comprehensive review of Camel K's architecture, analyzing its integration with Knative and Kubernetes-native messaging patterns. It describes how Camel K reduces traditional ESB resource consumption to support high-density container layouts.
Reference Architecture
- github.com/osa-ora/camel-k-samples [COMMUNITY-TOOL] — A curated collection of practical code templates and sample deployment topologies demonstrating Camel K in action. Covers integrations with relational databases, message queues, and cloud endpoints. This repository is a valuable tool for accelerated prototyping.
Serverless Integration (1)
- developers.redhat.com: Six reasons to love Camel K [COMMUNITY-TOOL] [GUIDE] — Outlines six key architectural advantages of Camel K, including fast deployment loops, native Quarkus optimization, low memory footprints, and serverless scale-to-zero capabilities via Knative. Highly useful for architects modernizing traditional ESBs.
Cloud Infrastructure
Kubernetes
Service Mesh
- (2021) Service meshes to the rescue: Load balancing and scaling long-lived connections in Kubernetes 🌟 [ADVANCED LEVEL] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — A deep dive into the engineering challenge of load balancing long-lived connections (gRPC, HTTP/2, WebSockets) within Kubernetes. It explains how standard L4 kube-proxy load balancing fails to distribute traffic evenly and presents L7 proxies and service meshes (like Linkerd or Istio) as the definitive architectural solution.
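
The imbalance described in the entry above can be illustrated with a toy model. This is a minimal sketch, not taken from the linked article: the backend names and both functions are hypothetical, and round-robin stands in for kube-proxy's actual backend selection. It only shows why balancing once per connection leaves a long-lived client pinned to a single backend, while balancing per request spreads the same traffic.

```python
from collections import Counter
from itertools import cycle


def connection_level(backends, clients, requests_per_client):
    """L4-style: a backend is chosen once, when the connection opens."""
    picker = cycle(backends)
    load = Counter({b: 0 for b in backends})
    for _ in range(clients):
        backend = next(picker)                 # decision made per connection
        load[backend] += requests_per_client   # every request reuses it
    return dict(load)


def request_level(backends, clients, requests_per_client):
    """L7-style: a backend is chosen again for every request."""
    picker = cycle(backends)
    load = Counter({b: 0 for b in backends})
    for _ in range(clients * requests_per_client):
        load[next(picker)] += 1                # decision made per request
    return dict(load)


if __name__ == "__main__":
    pods = ["pod-a", "pod-b", "pod-c"]  # hypothetical backend names
    print(connection_level(pods, clients=1, requests_per_client=90))
    print(request_level(pods, clients=1, requests_per_client=90))
```

With one long-lived client, the connection-level function sends all 90 requests to one backend, and the request-level function sends 30 to each.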
Cloud Native Architecture
Domain-Driven Design
Messaging Architectures
- verraes.net: DDD and Messaging Architectures 🌟 [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Synthesizes core concepts of Domain-Driven Design (DDD) with message-oriented middleware patterns. It examines bounded contexts, aggregate boundaries, and the strategic distribution of domain events. It provides deep conceptual clarity on decoupling enterprise service boundaries using asynchronous message paths.
Event-Driven Systems (1)
Foundations
- thenewstack.io: The Rise of Event-Driven Architecture [COMMUNITY-TOOL] [GUIDE] — Traces the industry shift from request-response synchronous APIs to asynchronous event-driven models. It outlines the architectural advantages regarding system resilience, temporal decoupling, and scalability. The analysis evaluates standard broker technologies that enable reactive cloud-native systems.
Patterns
- codeopinion.com: Event Sourcing vs Event Driven Architecture [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Clarifies the critical distinctions and synergies between Event Sourcing (capturing state transitions as events) and Event-Driven Architecture (broadcasting state changes). It uses architectural examples to prevent common integration anti-patterns. This assists architects in deciding when to combine or isolate these patterns.
Standards
- (2022) salaboy.com: Event-Driven applications with CloudEvents on Kubernetes [ADVANCED LEVEL] 🌟🌟🌟🌟 [EMERGING] [ENTERPRISE-STABLE] [GUIDE] — Explains how the CNCF CloudEvents specification standardizes event metadata format across distinct systems. It integrates CloudEvents within Kubernetes architectures using tools like Knative Eventing. This provides an excellent overview of building vendor-neutral, highly reactive event mesh fabrics.
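
As a rough illustration of the metadata the CloudEvents 1.0 specification standardizes, the sketch below builds a structured-mode JSON envelope with the four required context attributes (`specversion`, `id`, `source`, `type`) plus a few optional ones. The helper names, the event type, and the source path are assumptions made for this example; it uses only the standard library rather than an official SDK.

```python
import json
import uuid
from datetime import datetime, timezone

# Context attributes the CloudEvents 1.0 specification marks as required.
REQUIRED = ("specversion", "id", "source", "type")


def make_cloudevent(event_type, source, data):
    """Return a dict shaped like a structured-mode JSON CloudEvent."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),                       # unique per event
        "source": source,                              # URI-reference of the producer
        "type": event_type,                            # reverse-DNS style name
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


def missing_attributes(event):
    """List the required attributes that are absent or empty."""
    return [name for name in REQUIRED if not event.get(name)]


if __name__ == "__main__":
    # Hypothetical event type and source, for illustration only.
    event = make_cloudevent("com.example.order.placed", "/orders", {"order_id": 1})
    print(json.dumps(event, indent=2))
```

Because the envelope is plain JSON, any consumer can check the required attributes before dispatching on `type`, whichever broker carried the event.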
Foundations (1)
Introductory Patterns
- (2024) ibm.com: Event-driven cloud-native applications (microservices) [DOCUMENTATION] 🌟🌟 [COMMUNITY-TOOL] [GUIDE] — Explains core principles of cloud-native architecture, including containerization, microservices, and reactive behaviors. It outlines the foundational tenets necessary to design robust applications optimized for public and private clouds. It serves as a high-level conceptual reference for infrastructure modernization.
Inter-Service Communication
Performance
- particular.net: RPC vs. Messaging – which is faster? [COMMUNITY-TOOL] [GUIDE] — Provides a detailed comparative benchmark of Remote Procedure Call (RPC) protocols versus messaging-based asynchronous protocols. It highlights how latency, queue depths, network overhead, and decoupling impact application performance under high load. It concludes that throughput gains in asynchronous messaging often outweigh synchronous RPC latency benefits.
Microservices (1)
Change Data Capture CDC
- developers.redhat.com: Decoupling microservices with Apache Camel and Debezium [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Explains how to decouple distributed database structures in microservices by employing a combination of Debezium (for Change Data Capture) and Apache Camel (for integration and transformation pipelines). It ensures low latency, resilient state updates.
- developers.redhat.com: Change data capture for microservices without writing any code [ENTERPRISE-STABLE] [GUIDE] — Walkthrough detailing how to set up out-of-the-box Change Data Capture architectures using Debezium without custom application-level code. It demonstrates immediate real-time synchronization from database transactions straight to Kafka-enabled microservices.
Distributed Transactions
- developers.redhat.com: Distributed transaction patterns for microservices compared [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Analyzes and contrasts critical transactional strategies for microservice boundaries, including 2PC (Two-Phase Commit), Sagas, and Outbox patterns. It highlights how asynchronous message-passing mitigates the failure modes of distributed transactions. Practical implementation guidelines focus on maintaining eventual consistency without tight coupling.
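
The outbox pattern mentioned in the entry above can be sketched with the standard-library `sqlite3` module. This is an illustrative sketch under assumptions, not the linked article's implementation: the table layout, the `OrderPlaced` event name, and the polling relay are hypothetical, and a production setup would more likely tail the transaction log with a tool such as Debezium.

```python
import json
import sqlite3


def create_schema(conn):
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "event_type TEXT NOT NULL, payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, order_id, item):
    # One local transaction covers the business row and the event row,
    # so either both are stored or neither is.
    with conn:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"order_id": order_id, "item": item})),
        )


def relay_outbox(conn, publish):
    # Hypothetical relay: reads unpublished rows in order and marks each one
    # only after publish() returns, which gives at-least-once delivery.
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    create_schema(db)
    place_order(db, 1, "book")
    relay_outbox(db, lambda kind, body: print(kind, body))
```

Since the relay marks a row only after publishing it, a crash between the two steps can resend an event, so consumers need to be idempotent.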
Event Sourcing
- blog.bitsrc.io: Why Microservices Should use Event Sourcing 🌟 [COMMUNITY-TOOL] [GUIDE] — Argues the case for event sourcing as a primary mechanism to store state in distributed microservice topologies. It highlights capabilities such as complete audit trails, high-performance writes, and historical state reconstruction. The post warns of common pitfalls including schema evolution complexity and read projection overhead.
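
The audit-trail and historical-reconstruction properties named in the entry above follow from storing events instead of current state. The sketch below is a toy model with hypothetical event names (`Deposited`, `Withdrawn`) and an in-memory list standing in for a durable event store. It rebuilds a balance by folding over the log, optionally stopping at an earlier point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "Deposited" or "Withdrawn" (hypothetical event names)
    amount: int


def apply(balance, event):
    """Pure transition function: previous state plus one event gives new state."""
    if event.kind == "Deposited":
        return balance + event.amount
    if event.kind == "Withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, upto=None):
    """Fold the log into a balance; `upto` limits replay to the first N events."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance


if __name__ == "__main__":
    log = [Event("Deposited", 100), Event("Withdrawn", 30), Event("Deposited", 5)]
    print("current balance:", replay(log))
    print("balance after two events:", replay(log, upto=2))
```

The log is append-only and never rewritten, so replaying a prefix gives the state at that point. The cost is that every read model has to be projected from the events.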
Event-Driven Design
- infoq.com: Turning Microservices Inside-Out [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — This architectural piece by Bilgin Ibryam extends Martin Kleppmann's "turning the database inside-out" idea to microservices, treating database tables as streams of changes rather than static silos. Using event streams (like Kafka), microservices can achieve decentralized state management and projection consistency. It bridges the gap between stream processing and relational storage.
Orchestration (1)
Kubernetes Pod Lifecycle
- K8s prevent queue worker Pod from being killed during deployment [ADVANCED LEVEL] [ENTERPRISE-STABLE] [GUIDE] — Provides concrete technical implementation strategies to prevent abrupt termination of active queue worker Pods during rolling Kubernetes updates. It details the effective utilization of preStop hooks and graceful shutdown signals within Pod specifications. It ensures zero-loss processing of long-running asynchronous messages.
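
On the application side, graceful shutdown usually means reacting to the SIGTERM that Kubernetes sends once any preStop hook has finished. The sketch below is a minimal illustration, not code from the linked post: `QueueWorker` and its in-memory queue are hypothetical stand-ins for a real broker client. The signal handler only sets a flag, so the message in flight completes before the loop exits.

```python
import signal
import threading
from collections import deque


class QueueWorker:
    """Drains a queue one message at a time and stops cleanly on SIGTERM."""

    def __init__(self, messages):
        self.pending = deque(messages)
        self.processed = []
        self.stop_requested = threading.Event()

    def request_stop(self, signum=None, frame=None):
        # Signal handler: only set a flag so the in-flight message can finish.
        self.stop_requested.set()

    def handle(self, message):
        # Hypothetical unit of work; a real worker would ack the broker here.
        self.processed.append(message)

    def run(self):
        # The flag is checked between messages, never in the middle of one.
        while self.pending and not self.stop_requested.is_set():
            self.handle(self.pending.popleft())
        return list(self.pending)  # messages left for another worker


def main():
    worker = QueueWorker(["a", "b", "c"])
    signal.signal(signal.SIGTERM, worker.request_stop)
    leftover = worker.run()
    print("processed:", worker.processed, "left on queue:", leftover)


if __name__ == "__main__":
    main()
```

For this to help, the Pod's `terminationGracePeriodSeconds` has to be longer than the slowest message takes to process. Otherwise the container is killed before the loop can exit.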
Cloud Native Infrastructure
High Availability
Kafka on Kubernetes
- (2022) ==learnk8s.io/kafka-ha-kubernetes: Designing and testing a highly available Kafka cluster on Kubernetes 🌟== [ADVANCED LEVEL] 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] [GUIDE] — Curator Insight: Outstanding reference tutorial for validating HA setups in production. Live Grounding: Outlines designing multi-AZ deployments, tuning pod disruption budgets (PDBs), and performing simulated chaos engineering tests (network partitions, node restarts) to prove zero-data-loss capabilities.
Kafka on Kubernetes (1)
Application Integration (1)
- itnext.io: Sending Messages to Kafka in Kubernetes [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Practical post on establishing low-latency, secure client connections to Kafka brokers inside a Kubernetes network boundary. Live Grounding: Reviews internal DNS routing, ingress endpoints, and SASL authentication configs to safely bridge containerized publishers and consumer workloads.
Deployments
- (2023) thelinuxnotes.com: How to deploy Kafka in Kubernetes with Helm chart + kafdrop [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Deployment tutorial utilizing established community Helm packages. Live Grounding: Guides deployment of a functional Kafka test cluster integrated directly with Kafdrop as a companion visual administration interface.
Guides
- linkedin.com: Kafka Cluster Setup on Kubernetes [ENTERPRISE-STABLE] [GUIDE] — Curator Insight: Step-by-step technical guide for provisioning Kafka on Kubernetes using direct manifests. Live Grounding: Covers statefulsets, headless service definitions, volume claim templates, and environment variables targeting manual multi-broker cluster creation.
Local Development (1)
- dev.to: Running Kafka on kubernetes for local development [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Practical setup workflow for running local Kafka configurations under Minikube or Docker Desktop. Live Grounding: Explains minimal YAML profiles using Helm or lightweight operators to quickly spin up development broker instances for sandboxed microservices validation.
Kubernetes Strategy
Infrastructure Decisions
- thenewstack.io: Kafka on Kubernetes: Should You Adopt a Managed Solution? [ENTERPRISE-STABLE] — Curator Insight: Strategic evaluation of managed SaaS Kafka setups versus DIY operator approaches on Kubernetes. Live Grounding: Compares total cost of ownership (TCO), maintenance scaling, day-2 operations complexity, and custom flexibility demands.
Security
Amazon EKS
- itnext.io: Securely Decoupling Kubernetes-based Applications on Amazon EKS using Kafka with SASL/SCRAM [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Hardening guide detailing connection security between AWS EKS-based consumers and Kafka clusters. Live Grounding: Demonstrates configuring Kafka users with SASL/SCRAM credentials, managing secrets within Kubernetes natively, and establishing encrypted TLS connectivity over VPC boundaries.
Serverless Data Platforms
Elastic Kafka
- confluent.io: Making Apache Kafka Serverless: Lessons From Confluent Cloud [ADVANCED LEVEL] [CASE STUDY] [COMMUNITY-TOOL] — Curator Insight: Architectural retrospective on how Confluent engineered a multi-tenant, elastic serverless Kafka platform. Live Grounding: Explores storage-compute decoupling, automated partition rebalancing, and custom multi-tenant billing-aware resource allocators.
Stateful Workloads
Kafka on Kubernetes (2)
- phoenixnap.com: How to Set Up and Run Kafka on Kubernetes 🌟 [ENTERPRISE-STABLE] [GUIDE] — Curator Insight: Comprehensive guide to running stateful Kafka clusters on Kubernetes platforms. Live Grounding: Outlines deploying Kafka utilizing statefulsets, configuring persistent volumes, and handling network routing. Explores the advantages of operator-managed setups versus standard manual deployments.
Strimzi Operator
- strimzi.io: Kafka upgrade improvements [ADVANCED LEVEL] [COMMUNITY-TOOL] — Curator Insight: Direct technical update from the Strimzi maintainers regarding Kafka upgrade orchestration. Live Grounding: Details the architectural improvements in Strimzi's reconciliation loops, enabling automated, zero-downtime rolling upgrades of stateful Kafka pods with strict schema protection.
- strimzi.io [ADVANCED LEVEL] [DE FACTO STANDARD] — Curator Insight: The leading open-source CNCF sandbox operator platform for running Kafka on Kubernetes. Live Grounding: Orchestrates secure topologies, cluster expansion, user management, and seamless rolling upgrades using fully declarative Kubernetes Custom Resources (CRDs).
- developers.redhat.com: Introduction to Strimzi: Apache Kafka on Kubernetes (KubeCon Europe 2020) 🌟 [ENTERPRISE-STABLE] — Curator Insight: Detailed breakdown of the Strimzi operator architectural internals from KubeCon. Live Grounding: Evaluates how the operator automates bootstrap, health monitoring, protocol configurations, TLS generation, and storage management for Kafka on Kubernetes.
Tooling and UI
- (2024) pepy.tech/project/strimzi-kafka-cli 🌟 🌟🌟🌟 [COMMUNITY-TOOL] — Curator Insight: Download analytics and overview of the Strimzi Kafka CLI. Live Grounding: Provides python-based CLI tools to interactively administer Strimzi-managed custom resources, simplifying manual deployment, topic configuration, and user creation operations.
Cloud-Native Infrastructure
Event Streaming (2)
GitOps Practices
- confluent.io: DevOps for Apache Kafka with Kubernetes and GitOps 🌟 [ADVANCED LEVEL] [ENTERPRISE-STABLE] — Examines GitOps models for coordinating declarative configurations of schemas, partitions, ACLs, and topics across multiple Kubernetes-hosted Kafka environments using automated pipelines.
Infrastructure Operations
- tecmint: How to Install Apache Kafka in CentOS/RHEL 7 [COMMUNITY-TOOL] [GUIDE] — A technical operational guide detailing the installation and configuration of Zookeeper and Apache Kafka services directly on bare-metal or VM instances running CentOS and RHEL 7.
Kafka Connect Operators
- developers.redhat.com: Improve your Kafka Connect builds of Debezium. [COMMUNITY-TOOL] — A practical walk-through of the Strimzi Operator build processes for deploying optimized Kafka Connect container environments on Kubernetes. Illustrates declarative custom resource setups to bundle custom Debezium connector packages safely.
Kubernetes Operators (1)
- (2021) openshift.com: How to Orchestrate Data Pipelines with Applications Deployed on OpenShift [ADVANCED LEVEL] 🌟🌟🌟 [COMMUNITY-TOOL] — Demonstrates deployment patterns for co-locating high-throughput data processing pipelines (Argo, Apache Spark, Strimzi) directly alongside backend application microservices inside Red Hat OpenShift clusters.
- (2020) containerjournal.com: Red Hat Platform Brings Kafka Closer to Kubernetes 🌟🌟🌟 [COMMUNITY-TOOL] — Discusses Strimzi operators on Red Hat OpenShift, evaluating how declaratively defined operators streamline stateful deployments and coordinate safe, automated rolling updates of Kafka nodes on Kubernetes.
- thenewstack.io: Beyond the Quickstart: Running Apache Kafka as a Service on Kubernetes [ADVANCED LEVEL] [COMMUNITY-TOOL] — Provides architectural guidelines on hosting stable, production-grade Apache Kafka clusters on Kubernetes. Explores state persistence requirements, advanced container network setups, and operational recovery pipelines.
Hybrid Cloud Platforms
Anthos Deployments
- (2020) confluent.fr: Infrastructure Modernization with Google Anthos and Apache Kafka [FRENCH CONTENT] [ADVANCED LEVEL] [CASE STUDY] 🌟🌟🌟 [COMMUNITY-TOOL] — Provides architectural guidelines for deploying federated Confluent Kafka setups across local datacenters and public Google Cloud regions using Google Anthos configuration models.
Modernization Strategy
- kai-waehner.de: App Modernization and Hybrid Cloud Architectures with Apache Kafka [LEGACY] — Examines legacy migration models utilizing Apache Kafka as an integration buffer. Discusses streaming events between on-prem mainframe systems and agile cloud-native microservices with zero downtime.
Infrastructure as Code IaC
Event-Driven Provisioning
- daily.dev: Building a fault-tolerant event-driven architecture with Google Cloud, Pulumi and Debezium [COMMUNITY-TOOL] [GUIDE] — An infrastructure guide detailing the automated configuration of a fault-tolerant event-driven backend. Demonstrates the step-by-step deployment of Google Cloud resources and Debezium instances using Pulumi for stateful, declarative management.
Serverless Computing
Knative Eventing
- piotrminkowski.com: Knative Eventing with Kafka and Quarkus [ADVANCED LEVEL] [ENTERPRISE-STABLE] — Illustrates the deployment of Knative serverless application endpoints coordinated with Apache Kafka event feeds. Utilizes Quarkus microservices to demonstrate scale-to-zero configurations that adapt automatically to stream ingestion.
Data Architecture
Data Lakehouse
Iceberg Integration
- debezium.io: Using Debezium to Create a Data Lake with Apache Iceberg [ADVANCED LEVEL] [ENTERPRISE-STABLE] — Explains how to feed streaming transaction logs directly into Apache Iceberg storage using Debezium CDC and Kafka Connect. Outlines strategies for supporting dynamic schema evolution and ensuring transactional ACID-level safety on cheap cloud object stores.
Data Mesh
Cloud-Native Platforms
- mrpaulandrew.com: BUILDING A DATA MESH ARCHITECTURE IN AZURE – PART 2 [ADVANCED LEVEL] [COMMUNITY-TOOL] — A platform implementation guide focusing on assembling a production-ready Data Mesh within Microsoft Azure. Explores multi-workspace configurations utilizing Azure Synapse, Azure Purview, and Data Factory within enterprise environments.
Domain-Driven Design (1)
- towardsdatascience.com: Data Domains and Data Products [COMMUNITY-TOOL] — Focuses on building discrete, discoverable, and governed domain-centric data products. Reviews core responsibilities for product engineering teams and logical boundaries required to achieve seamless interoperability within a Data Mesh.
Foundational Principles
- martinfowler.com: Data Mesh Principles and Logical Architecture [ADVANCED LEVEL] [DE FACTO STANDARD] — The seminal architectural blueprint by Zhamak Dehghani introducing Data Mesh principles. Focuses on the core four pillars: domain-driven decentralized data ownership, data-as-a-product, self-serve data infrastructure platforms, and federated computational governance.
Migration Strategies
- martinfowler.com: How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh [ADVANCED LEVEL] [DE FACTO STANDARD] — The pioneering analysis introducing the Data Mesh framework. Outlines failure modes of traditional centralized databases and enterprise data lakes, presenting a distributed, domain-driven data topology as a scalable alternative.
Strategic Overview
- infoq.com: Data Mesh Principles and Logical Architecture Defined [COMMUNITY-TOOL] — An executive summary analyzing Zhamak Dehghani's foundational Data Mesh concepts. Contemplates the operational and architectural pivot from centralized monolithic data pools to distributed, domain-centric, and governed team landscapes.
Data Science Platform
Real-Time Machine Learning
- confluent.io: How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka [ADVANCED LEVEL] [COMMUNITY-TOOL] — Reviews the architecture designs required for deploying, evaluating, and monitoring analytical machine learning models against fast event-streams. Utilizes Apache Kafka as the backbone for scalable ingestion.
Event Streaming (3)
Architectural Patterns
- davidxiang.com: Kafka As A Database? Yes Or No [ADVANCED LEVEL] [COMMUNITY-TOOL] — Evaluates the controversial 'Kafka as a database' design model. Analyzes the trade-offs of using Kafka for data persistence, explaining limits on random queries and index lookups relative to typical relational/NoSQL setups.
Audio Curation
- softwareengineeringdaily.com: Kafka Applications with Tim Berglund (podcast) 🌟 [ENTERPRISE-STABLE] — A podcast conversation detailing real-world patterns for building highly decoupled event-driven systems. Explores streaming architecture choices, data evolution, and microservice communication patterns with Tim Berglund.
Cluster Management
- AKHQ (previously known as KafkaHQ) 🌟 ⭐ 3808 [DE FACTO STANDARD] [ENTERPRISE-STABLE] — A powerful, feature-rich web console for administering Kafka cluster resources. Supports direct topic data browsing, consumer group rebalancing monitoring, schema registry integrations, and multi-tenant ACL audits.
- confluent.io: Simplifying Apache Kafka Multi-Cluster Management Using Control Center and Cluster Registry [ADVANCED LEVEL] [COMMUNITY-TOOL] — Explains methods for operating federated or geographically-dispersed Kafka clusters. Details patterns for maintaining centralized visibility and configuring multi-cluster pipelines using Confluent Control Center.
Consumer Coordination
- (2021) blog.cloudera.com: Scalability of Kafka Messaging using Consumer Groups 🌟🌟🌟 [COMMUNITY-TOOL] — Analyzes the scaling behaviors of consumer groups in Apache Kafka. Outlines dynamic partition rebalancing algorithms, protocol designs, and strategies to minimize processing lag in active networks.
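The partition rebalancing idea behind consumer groups can be illustrated with a minimal sketch. It assumes a simplified round-robin strategy; the function name `assign_round_robin` and the consumer ids are hypothetical, and Kafka's real assignors (range, round-robin, sticky, cooperative sticky) handle more cases than this.

```python
from typing import Dict, List


def assign_round_robin(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over group members one at a time (illustrative only)."""
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    members = sorted(consumers)  # deterministic order so every member computes the same plan
    assignment: Dict[str, List[int]] = {m: [] for m in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment


# Two consumers share six partitions; a third member joining triggers a "rebalance".
before = assign_round_robin(list(range(6)), ["c1", "c2"])
after = assign_round_robin(list(range(6)), ["c1", "c2", "c3"])
print(before)
print(after)
```

Adding more consumers than partitions leaves the extra members with an empty assignment, which is why the partition count caps the parallelism of a group.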
Data Pipelines
- Single Message Transformations - The Swiss Army Knife of Kafka Connect [ENTERPRISE-STABLE] — A deep-dive breakdown of Single Message Transformations (SMTs) within Kafka Connect. Shows how to filter, modify, anonymize, and restructure record payloads on-the-fly without requiring customized stream computing logic.
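As a rough illustration of the SMT idea (small per-record functions chained together, where a transform may drop a record), here is a plain-Python sketch. It does not use the Kafka Connect API; `mask_field`, `rename_field`, `drop_when`, and `apply_chain` are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Transform = Callable[[Record], Optional[Record]]


def mask_field(field: str, replacement: str = "****") -> Transform:
    """Anonymize one field, leaving the rest of the record untouched."""
    def apply(record: Record) -> Optional[Record]:
        if field not in record:
            return record
        return {**record, field: replacement}
    return apply


def rename_field(old: str, new: str) -> Transform:
    """Restructure the payload by renaming a field."""
    def apply(record: Record) -> Optional[Record]:
        if old not in record:
            return record
        out = {k: v for k, v in record.items() if k != old}
        out[new] = record[old]
        return out
    return apply


def drop_when(predicate: Callable[[Record], bool]) -> Transform:
    """Filter: returning None means the record is dropped."""
    def apply(record: Record) -> Optional[Record]:
        return None if predicate(record) else record
    return apply


def apply_chain(record: Record, chain: List[Transform]) -> Optional[Record]:
    """Run transforms in order; stop as soon as one drops the record."""
    current: Optional[Record] = record
    for transform in chain:
        current = transform(current)
        if current is None:
            return None
    return current


chain = [
    drop_when(lambda r: r.get("type") == "heartbeat"),
    mask_field("ssn"),
    rename_field("ts", "event_time"),
]
print(apply_chain({"type": "order", "ssn": "123-45-6789", "ts": 5}, chain))
print(apply_chain({"type": "heartbeat", "ts": 6}, chain))
```

Each transform sees one record at a time and keeps no state, which is the property that separates this style from full stream processing.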
Development Tutorials
- (2021) ==kafka-tutorials.confluent.io 🌟== 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] [GUIDE] — The premier tutorial index hosted by Confluent. Provides a rich set of runnable recipes demonstrating microservice streaming actions, temporal joins, window operations, and message transformations using ksqlDB and Kafka Streams.
- (2021) kafka-tutorials.confluent.io: How to count messages in a Kafka topic 🌟🌟🌟 [COMMUNITY-TOOL] [GUIDE] — A precise development recipe outlining how to count incoming records inside Apache Kafka topics using ksqlDB. Details the construction of stateful materialized views for monitoring live volumes.
Foundational Principles (1)
- (2021) Confluent.io: Intro to Apache Kafka: How Kafka Works 🌟 [DOCUMENTATION] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — A foundational, highly descriptive reference for Kafka basics. Explains structural layouts of partitions, records, offsets, log retention, and replication, ensuring developers master core broker fundamentals.
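The relationship between keys, partitions, and offsets can be sketched with a toy in-memory topic. This is an assumption-laden model, not Kafka itself: the `Topic` class is hypothetical, and `zlib.crc32` stands in for the murmur2 hash that Kafka's default Java partitioner applies to keyed records.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple


class Topic:
    """Toy topic: each partition is an append-only list, the index is the offset."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        self.partitions: Dict[int, List[Tuple[str, str]]] = defaultdict(list)

    def partition_for(self, key: str) -> int:
        # Stand-in hash (assumption); the point is that equal keys map to one partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def append(self, key: str, value: str) -> Tuple[int, int]:
        """Append a record and return (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1

    def read(self, partition: int, offset: int) -> List[Tuple[str, str]]:
        """Read everything from a given offset onward, as a consumer would."""
        return self.partitions[partition][offset:]


topic = Topic(num_partitions=3)
p1, o1 = topic.append("user-1", "created")
p2, o2 = topic.append("user-1", "updated")
print(p1 == p2, o1, o2)      # same key, same partition, increasing offsets
print(topic.read(p1, o2))    # resume from a committed offset
```

Ordering is only guaranteed within a partition, so records that must stay in order need to share a key.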
IoT Telemetry Integration
- kai-waehner.de: Apache Kafka and MQTT (Part 1 of 5) – Overview and Comparison [ENTERPRISE-STABLE] — A systematic deep dive comparing MQTT brokers to Kafka. Reviews how to build end-to-end telemetry systems, deploying MQTT at the edge for lightweight transport and Kafka at the core for analytical streaming.
Message Brokers (1)
- Apache Kafka [DOCUMENTATION] [DE FACTO STANDARD] — The main portal for Apache Kafka, the industry de facto standard distributed event streaming engine. It outlines critical capabilities including partition clustering, transactional controls, offset management, and high-performance ingestion designs.
Metadata Management KRaft
- (2021) devclass.com: Apache Kafka 2.8.0 previews life without ZooKeeper 🌟🌟🌟 [COMMUNITY-TOOL] — Analyzes the operational and administrative benefits of ZooKeeper removal. Reviews how KRaft architecture improves cluster limits, simplifies administrator overhead, and accelerates recovery speeds during node failures.
Resource Indexes
- Awesome Streaming [ENTERPRISE-STABLE] — A highly curated meta-resource listing frameworks, engine architectures, academic publications, and database connectors within the streaming data ecosystem. Covers key analytical and event-driven technologies.
- Awesome Kafka [COMMUNITY-TOOL] — A rich community collection of operational utilities, libraries, and GUI packages optimized for developers and administrators deploying and scaling Apache Kafka systems.
Video Tutorials
- youtube playlist: Kafka Connect Tutorials | Kafka Connect 101: REST API 🌟 [ENTERPRISE-STABLE] — An educational video series illustrating the programmatic administration of Kafka Connect connectors. Details the layout of the Connect REST API for creating, validating, scaling, and debugging stateful data integrations.
Event-Driven Data
Change Data Capture CDC (1)
- developers.redhat.com: Capture database changes with Debezium Apache Kafka connectors [COMMUNITY-TOOL] — A hands-on manual detailing the implementation of Debezium to safely convert relational database modifications into real-time Kafka event feeds. Outlines event formatting, partition strategy, and recovery procedures.
- vladmihalcea.com: A beginner’s guide to CDC (Change Data Capture) [ENTERPRISE-STABLE] — A comprehensive structural overview of Change Data Capture (CDC) design patterns. It details transaction log parsing, dual-writes mitigation, and the key architectural differences between query-based and log-based CDC solutions. This acts as an essential primer for development groups transitioning from monolith DB schemas to real-time event streaming systems.
- shopify.engineering: Capturing Every Change From Shopify’s Sharded Monolith [ADVANCED LEVEL] [CASE STUDY] [ENTERPRISE-STABLE] — This architectural case study highlights Shopify's high-throughput solution for real-time mutation extraction from sharded MySQL clusters. By combining Debezium with customized Apache Kafka configurations, the system secures sub-second delivery while safely preserving transaction-order invariants at massive scale.
- Build a simple cloud-native change data capture pipeline [COMMUNITY-TOOL] — Illustrates how to engineer a low-latency, cloud-native CDC pipeline utilizing Debezium connectors alongside open database architectures. Explores data serialization optimizations and horizontal scale metrics.
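The difference between query-based and log-based CDC mentioned in this group can be made concrete with a small in-memory model. This is only a sketch under stated assumptions: the `Table` class, its `updated_at` counter, and the append-only `log` list are hypothetical stand-ins for a real database and its transaction log.

```python
from typing import Dict, List, Optional, Tuple

Change = Tuple[str, int, Optional[str]]


class Table:
    """Toy table that keeps current rows plus an append-only change log."""

    def __init__(self) -> None:
        self.rows: Dict[int, Tuple[str, int]] = {}  # id -> (value, updated_at)
        self.log: List[Change] = []                 # stand-in for a transaction log
        self.clock = 0

    def upsert(self, row_id: int, value: str) -> None:
        self.clock += 1
        op = "update" if row_id in self.rows else "insert"
        self.rows[row_id] = (value, self.clock)
        self.log.append((op, row_id, value))

    def delete(self, row_id: int) -> None:
        self.clock += 1
        del self.rows[row_id]
        self.log.append(("delete", row_id, None))


def poll_changes(table: Table, since: int) -> List[Tuple[int, str]]:
    """Query-based CDC: roughly 'SELECT ... WHERE updated_at > since'."""
    return sorted((rid, val) for rid, (val, ts) in table.rows.items() if ts > since)


def read_log(table: Table, position: int) -> List[Change]:
    """Log-based CDC: replay every change from a log position."""
    return table.log[position:]


t = Table()
t.upsert(1, "a")
t.upsert(1, "b")   # intermediate state "a" is overwritten
t.upsert(2, "x")
t.delete(2)        # row 2 disappears before the next poll

print(poll_changes(t, since=0))  # only the latest surviving state
print(read_log(t, position=0))   # every insert, update, and delete
```

Polling sees only the final state of surviving rows, so it misses the delete and the intermediate update, while the log replay preserves all four changes in order.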
Compliance Systems
- infoq.com: Building a SQL Database Audit System using Kafka, MongoDB and Maxwell's Daemon [ADVANCED LEVEL] [COMMUNITY-TOOL] — Examines a robust architectural design for database auditing. Uses Maxwell's Daemon to ingest database binlogs, routing changes through Kafka into MongoDB to form a tamper-resistant historical record.
Data Federation
- Event streaming and data federation: A citizen integrator’s story [LEGACY] — Examines low-code patterns for connecting real-time streaming architectures with legacy enterprise databases. Explains how federation tools bridge information gaps between non-technical users and distributed message networks.
Debezium Connectors
- developers.redhat.com: Db2 and Oracle connectors coming to Debezium 1.4 GA [COMMUNITY-TOOL] — A technical release breakdown detailing the production-grade integration of IBM Db2 and Oracle connectors within the Debezium 1.4 ecosystem. It reviews performance benchmarks, log-mining mechanisms, and setup procedures critical for cloud-native enterprise migrations.
Schema Governance
Event-Driven Governance
- developers.redhat.com: Event-driven APIs and schema governance for Apache Kafka: Get ready for Kafka Summit Europe 2021 [COMMUNITY-TOOL] — Synthesizes key themes of Kafka Summit Europe 2021, detailing patterns in distributed schema governance, central API catalogs, and microservice integration design protocols.
Microservices Design Patterns
- (2021) redhat.com: Using a schema registry to ensure data consistency between microservices 🌟🌟🌟 [COMMUNITY-TOOL] — Analyzes the operational role of schema registries in maintaining system stability. Highlights how decoupled producers and consumers leverage registries for backward and forward schema compatibility, protecting distributed microservices from payload parsing errors.
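Backward and forward compatibility can be sketched as a single "can this reader decode data from that writer" check. The rules below are a deliberate simplification loosely modeled on Avro-style resolution (no type promotion, no aliases); the schema layout and function names are assumptions made for this example, not a registry API.

```python
from typing import Any, Dict

Schema = Dict[str, Dict[str, Any]]  # field name -> {"type": ..., optional "default": ...}


def can_read(reader: Schema, writer: Schema) -> bool:
    """True if data written with `writer` can be decoded using `reader`."""
    for name, spec in reader.items():
        if name in writer:
            if writer[name]["type"] != spec["type"]:
                return False  # simplified: any type change is treated as incompatible
        elif "default" not in spec:
            return False      # reader expects a field the writer never produced
    return True


def is_backward_compatible(new: Schema, old: Schema) -> bool:
    """Consumers on the new schema can still read data produced with the old one."""
    return can_read(reader=new, writer=old)


def is_forward_compatible(new: Schema, old: Schema) -> bool:
    """Consumers still on the old schema can read data produced with the new one."""
    return can_read(reader=old, writer=new)


v1: Schema = {"id": {"type": "long"}, "email": {"type": "string"}}
v2: Schema = {**v1, "plan": {"type": "string", "default": "free"}}
v3: Schema = {**v1, "plan": {"type": "string"}}  # new required field, no default

print(is_backward_compatible(v2, v1), is_forward_compatible(v2, v1))
print(is_backward_compatible(v3, v1), is_forward_compatible(v3, v1))
```

A registry running a check of this kind at publish time is what lets producers and consumers upgrade independently.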
Service Registry
- Red Hat Integration service registry [COMMUNITY-TOOL] — An introductory architecture guide describing the capabilities of the Red Hat Integration Service Registry. Reviews standard patterns for managing API schemas (Avro, JSON Schema, Protobuf) to guarantee strong message-contract enforcement in decoupled broker networks.
- Apicurio Registry ⭐ 806 [ENTERPRISE-STABLE] — Apicurio Registry is a high-performance, open-source centralized schema registry. It enables teams to maintain and store OpenAPI, AsyncAPI, Avro, Protobuf, and JSON schemas, supporting real-time validation layers in high-throughput microservice pipelines.
Stream Processing
Evolutionary Topologies
- thenewstack.io: Part 1: The Evolution of Data Pipeline Architecture [COMMUNITY-TOOL] — Traces the structural progress of big data analytics pipelines. Focuses on the architectural evolution from high-latency batch map-reduce jobs to real-time Kappa and Lambda messaging topologies.
Managed Pipelines
- (2021) cloudblog.withgoogle.com: Turn any Dataflow pipeline into a reusable template 🌟🌟🌟 [COMMUNITY-TOOL] — A technical look at creating and managing GCP Dataflow templates. Details packaging patterns that permit modular, parameter-driven run execution of streaming data pipelines across multi-tenant infrastructures.
Microservices Frameworks
- Build a data streaming pipeline using Kafka Streams and Quarkus [COMMUNITY-TOOL] — Demonstrates the construction of low-latency streaming pipelines using the Kafka Streams API paired with Quarkus. Explores native compilation patterns to reduce memory overhead and startup latency.
Data Engineering
Change Data Capture CDC (2)
Cloud Managed Services (1)
- debezium.io: Lessons Learned from Running Debezium with PostgreSQL on Amazon RDS [ADVANCED LEVEL] [CASE STUDY] [ENTERPRISE-STABLE] — A highly valuable technical case study sharing performance profiles, optimization constraints, and gotchas when operating Debezium alongside Amazon RDS PostgreSQL. It details replication slot configurations, WAL storage management, and handling heavy transaction volumes under AWS limitations.
PostgreSQL
- (2021) info.crunchydata.com: PostgreSQL Change Data Capture With Debezium [ADVANCED LEVEL] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] [GUIDE] — A comprehensive operational manual focused on establishing Debezium CDC connectors specifically for enterprise-grade PostgreSQL deployments. It details WAL level adjustments, logical replication slot configuration, and the extraction of mutation events for consumer engines.
Real-Time Data Streaming
- Debezium: [ADVANCED LEVEL] [DOCUMENTATION] [DE FACTO STANDARD] [ENTERPRISE-STABLE] — The industry-leading, open-source distributed platform for Change Data Capture (CDC). Built on top of Apache Kafka, it taps database transaction logs in real-time, streaming row-level mutations downstream without querying databases. Essential for low-latency event-driven microservices.
Stream Processing (1)
- noti.st: Change Data Capture with Flink SQL and Debezium 🌟 [ADVANCED LEVEL] [EMERGING] [ENTERPRISE-STABLE] [GUIDE] — A visual presentation sharing architectural strategies to integrate Debezium with Apache Flink SQL for high-speed continuous stream processing. Explains patterns for building real-time materialized views, continuous aggregations, and live analytics directly from database mutation logs.
Data Culture
Real-Time Data Streaming (1)
- linkedin.com: How to Move From a “Wait for it...” Batch-Processing Culture to a “Get It Now” Real-Time Data Culture [COMMUNITY-TOOL] [GUIDE] — Discusses the cultural and systemic paradigm shift required for enterprises moving from batch scheduling to real-time event-driven insights. It touches on organizational friction, architectural changes, and immediate business advantages. It serves as a non-technical strategic guide for data transformation.
Data Pipelines (1)
Cloud Native Architectures
- towardsdatascience.com: Architecture for High-Throughput Low-Latency Big Data Pipeline on Cloud 🌟 [ADVANCED LEVEL] [COMMUNITY-TOOL] [GUIDE] — Evaluates design principles for high-throughput, low-latency cloud-native big data architectures. The guide details how to integrate ingestion layers with stream processing engines and distributed analytical databases. It presents structured architectural templates for unified analytical and machine learning workloads.
Data on Kubernetes
Orchestration (2)
- thenewstack.io: The Path to Getting the Full Data Stack on Kubernetes [ADVANCED LEVEL] [EMERGING] [GUIDE] — Explores the evolutionary path of running complex, stateful database and data streaming systems natively on Kubernetes. It addresses the maturity of operators, storage classes, and orchestrators that facilitate the deployment of the complete data pipeline. The article details challenges regarding resource management and high availability.
In-Memory Databases
Caching
- Redis [DOCUMENTATION] [DE FACTO STANDARD] — The definitive open-source, in-memory data store used as a database, cache, message broker, and streaming engine. Offers unmatched low-latency read-write cycles and versatile data structures. Highly valued for real-time applications requiring low overhead.
Real-Time Data Streaming (2)
Data Stack
- thenewstack.io: Streaming Data and the Modern Real-Time Data Stack [COMMUNITY-TOOL] [GUIDE] — Discusses the components constituting the modern real-time data stack, emphasizing continuous streaming over traditional batch ETL. It explores the roles of message logs, stream processors, and real-time OLAP databases. This provides a blueprint for engineering low-latency analytics systems.
Foundations (2)
- thenewstack.io: How to Get Started with Data Streaming [COMMUNITY-TOOL] [GUIDE] — A beginner-to-intermediate guide outlining initial workflows for setting up real-time stream ingestion and processing pipelines. It reviews primary tooling such as Apache Kafka and Apache Flink. It offers guidance on mapping traditional batch datasets into real-time pipelines.
Data Infrastructure
Data Architecture (1)
Data as a Service
Integrations
- (2020) mongodb.com: DaaS with MongoDB and Confluent [ADVANCED LEVEL] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — An architecture case study exploring how to design a modern Data-as-a-Service (DaaS) paradigm using MongoDB and Confluent. Focuses on real-time CDC synchronization mechanisms and state persistence across high-throughput microservices.
Event Streaming (4)
Apache Kafka
Enterprise Distribution
- confluent.io [DE FACTO STANDARD] — The enterprise cloud-native streaming data platform built on top of Apache Kafka. Confluent provides fully managed SaaS offerings, enterprise schema management, cloud-to-local replication, and declarative connectors for data warehouses.
Integrations (1)
- strimzi.io: Using HTTP Bridge as a Kubernetes sidecar [COMMUNITY-TOOL] — An exploration of deploy-time design patterns using the Strimzi HTTP Bridge as a Kubernetes sidecar container. This integration simplifies microservices communications by providing standard HTTP REST endpoints to interact with underlying Kafka event-driven pipelines.
Management Tools
- conduktor.io 🌟 [DE FACTO STANDARD] — An enterprise-grade desktop and cloud management platform for Apache Kafka that simplifies queue monitoring, schema registry auditing, and multi-cluster testing. It features advanced user security, performance monitoring, and message generation tools.
Monitoring
- strimzi/strimzi-canary ⭐ 42 [COMMUNITY-TOOL] — A deployment-ready diagnostic tool that acts as a canary monitor within Kafka clusters. It helps ops teams measure round-trip message latency, validation success, and consumer group responsiveness under realistic workloads.
- confluent.io: Monitoring Your Event Streams: Integrating Confluent with' Prometheus and Grafana [COMMUNITY-TOOL] — A guide showing how to set up robust monitoring patterns for Apache Kafka cluster metrics using Prometheus and Grafana. Details exact exporter configurations and provides ready-to-use visualizations of critical performance telemetry.
Operators
- (2024) Banzai Kafka Operator ⭐ 792 [ADVANCED LEVEL] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — The Banzai Cloud Koperator simplifies Apache Kafka operations on top of Kubernetes clusters. It implements granular auto-scaling, Cruise Control-assisted broker load rebalancing, and self-healing systems directly within the cluster scheduler.
Security (1)
- strimzi.io: Using Open Policy Agent with Strimzi and Apache Kafka [ADVANCED LEVEL] [ENTERPRISE-STABLE] — A detailed security-focused guide illustrating the integration of Open Policy Agent (OPA) with Strimzi and Apache Kafka. Explains how to enforce centralized, declarative, and fine-grained access control policies across streaming clusters.
Strimzi Operators
- strimzi/kafka-kubernetes-config-provider: Kubernetes Configuration Provider for Apache Kafka ⭐ 30 [COMMUNITY-TOOL] — A specialized Kubernetes configuration provider for Apache Kafka that enables Kafka applications to read configuration data dynamically from Kubernetes Secrets and ConfigMaps. It simplifies the secure mounting of TLS certificates and broker credentials directly in client workloads.
- strimzi.io: Using Kubernetes Configuration Provider to load data from Secrets and Config Maps [COMMUNITY-TOOL] — This blog post details the implementation of the Strimzi Kubernetes Config Provider. It demonstrates how to decouple configurations from code by directly mapping Kafka properties to Kubernetes-managed infrastructure configurations.
- blog.jromanmartin.io: How to upgrade Strimzi Operator using the CLI [COMMUNITY-TOOL] — Provides a practical, command-line step-by-step procedure to upgrade Strimzi Operator versions on live Kubernetes/OpenShift clusters while avoiding message pipeline downtime.
Architectural Patterns (1)
Comparisons
- softkraft.co: AWS Kinesis vs Kafka comparison: Which is right for you? 🌟 [COMMUNITY-TOOL] — An objective comparative analysis contrasting Amazon Kinesis and Apache Kafka across parameters like performance, architecture, pricing, and infrastructure overhead. Helps architects select the ideal event engine for specific scaling targets.
- Pulsar vs Kafka – Comparison and Myths Explored [ADVANCED LEVEL] [COMMUNITY-TOOL] — A detailed technical breakdown comparing Apache Kafka and Apache Pulsar. Evaluates performance benchmarks, architecture complexities, replication topologies, and real-world deployment challenges.
Business Ecosystem
Partnerships
- (2021) confluent.io: Confluent and Microsoft Announce Strategic Alliance 🌟🌟 [COMMUNITY-TOOL] — Analysis of the strategic alignment between Microsoft and Confluent. Describes integrations for native resource provisioning, unified billing portals, and security optimizations within the Azure cloud environment.
Cloud-Native Streaming
AWS
- (2026) ==AWS Kinesis== [DOCUMENTATION] 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] — Official AWS documentation for Kinesis Data Streams. Highly resilient, fully managed cloud service for real-time data streaming at scale, designed for seamless integrations within the AWS ecosystem and serverless application designs.
Modern Alternatives
Apache Pulsar
- Apache Pulsar [ADVANCED LEVEL] [DE FACTO STANDARD] — A highly scalable cloud-native event streaming model that separates compute (Apache Pulsar brokers) from state/storage (Apache BookKeeper). Ideal for multi-tenant, geographically distributed messaging workloads that require decoupled horizontal scaling.
Interviews
- softwareengineeringdaily.com: Redpanda: Kafka Alternative with Alexander Gallego 🌟 [ADVANCED LEVEL] [ENTERPRISE-STABLE] — An analytical podcast interview featuring Redpanda founder Alexander Gallego. Delves into the C++ implementation details, thread-per-core architectures, and the structural decisions that differentiate Redpanda from JVM-based event processors.
Redpanda
- (2026) ==Redpanda 🌟== 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] — An ultra-fast, C++ based, Seastar engine implementation of Kafka API protocols. Redpanda acts as a direct, lightweight replacement for Apache Kafka that removes heavy JVM tuning and ZooKeeper/KRaft runtimes, significantly lowering hardware footprints.
Red Hat AMQ Streams
Components
- Understanding Red Hat AMQ Streams components for OpenShift and Kubernetes 🌟 [COMMUNITY-TOOL] — Part 1 of an analytical breakdown detailing the components of Red Hat AMQ Streams (built on Strimzi). Explains operators, ZooKeeper configurations, and Kafka broker deployment patterns within enterprise Kubernetes clusters.
Integrations (2)
- developers.redhat.com: HTTP-based Kafka messaging with Red Hat AMQ Streams [LEGACY] — An architectural guide detailing how to use the HTTP Bridge component inside AMQ Streams. Allows web and legacy application services to publish and consume event data via lightweight REST HTTP requests.
Security (2)
- Set up Red Hat AMQ Streams custom certificates on OpenShift [ADVANCED LEVEL] [COMMUNITY-TOOL] — Demonstrates replacing auto-generated certificates with custom enterprise CA certs to implement secured TLS and mTLS configurations inside Strimzi-managed AMQ Streams.
Slides
- speakerdeck.com: Apache Kafka with Red Hat AMQ Streams 🌟 [COMMUNITY-TOOL] — An informative slide presentation charting Apache Kafka deployments on OpenShift via Red Hat AMQ Streams. Visualizes operator behaviors and declarative infrastructure patterns.
In-Memory Computing
Distributed Compute
Hazelcast
- devops.com: Hazelcast Simplifies Streaming for Extremely Fast Event Processing in IoT, Edge and Cloud Environments [COMMUNITY-TOOL] — This article highlights Hazelcast's integration of stream processing and in-memory data store capabilities. Focuses on low-latency streaming applications for edge environments, high-throughput IoT networks, and real-time analytical portals.
IoT Messaging
Mosquitto
OpenShift
- developers.redhat.com: Deploying the Mosquitto MQTT message broker on Red Hat OpenShift, Part 1 [COMMUNITY-TOOL] — A developer-focused walkthrough on deploying Eclipse Mosquitto, an open-source MQTT message broker, onto Red Hat OpenShift. Ideal for scaling lightweight telemetry components adjacent to containerized enterprise layers.
Protocols
MQTT
- mqtt.org [DOCUMENTATION] [DE FACTO STANDARD] — The main specification portal for MQTT, an ISO standard lightweight publish-subscribe network protocol. Widely adopted for edge environments, remote telemetry, and machine-to-machine integrations requiring minimal memory footprint and network load.
Message Brokers (2)
ActiveMQ
Artemis
- (2026) ==Apache ActiveMQ Artemis broker== 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] — Apache ActiveMQ Artemis provides a non-blocking, multi-protocol, highly performant asynchronous message broker designed for enterprise messaging. It supports advanced queue architectures, JMS/AMQP protocols, and cloud cluster deployments.
High Availability (1)
- developers.redhat.com: JDBC Master-Slave Persistence setup with Activemq using Postgresql database [ADVANCED LEVEL] [COMMUNITY-TOOL] — Guides system administrators through building a JDBC Master-Slave active/passive high-availability messaging layer in ActiveMQ backed by a PostgreSQL cluster.
Enterprise Middleware
Red Hat AMQ
- Red Hat AMQ [ENTERPRISE-STABLE] — Official product home of Red Hat AMQ, an enterprise-grade messaging suite. Delivers highly-available JMS, AMQP, and MQTT engines along with robust Strimzi Kafka integration for complex enterprise data layers.
Red Hat AMQ (1)
OpenShift Routing
- developers.redhat.com: Connecting external clients to Red Hat AMQ Broker on Red Hat OpenShift [COMMUNITY-TOOL] — A walkthrough explaining how to securely configure ingress and network protocols to route client applications external to OpenShift directly into containerized AMQ brokers.
Message Queues
Alternative Architectures
PostgreSQL (1)
- dagster.io: Postgres: a better message queue than Kafka? [COMMUNITY-TOOL] — An architectural exploration evaluating the use of PostgreSQL as a highly concurrent transactional queue using `FOR UPDATE SKIP LOCKED`. Suggests a lightweight operational alternative to Apache Kafka for low-to-medium scale applications.
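A minimal sketch of the `FOR UPDATE SKIP LOCKED` queue pattern follows. It assumes a hypothetical `jobs(id, payload)` table and any DB-API style PostgreSQL connection (for example one created with psycopg); it illustrates the general technique and is not taken from the linked article.

```python
from typing import Any, Optional, Tuple

# Hypothetical table: jobs(id BIGSERIAL PRIMARY KEY, payload TEXT).
# The inner SELECT locks one unclaimed row; rows already locked by other
# workers are skipped instead of waited on, so workers do not block each other.
CLAIM_SQL = """
DELETE FROM jobs
WHERE id = (
    SELECT id FROM jobs
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload
"""


def claim_next_job(conn: Any) -> Optional[Tuple[Any, Any]]:
    """Atomically remove and return one job, or None if the queue looks empty."""
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL)
        row = cur.fetchone()
        conn.commit()  # committing releases the row lock and finalizes the delete
        return row
    except Exception:
        conn.rollback()  # on failure the job stays in the table for another worker
        raise
    finally:
        cur.close()
```

Deleting on claim gives at-most-once handling; a variant that processes the job inside the same transaction before committing trades that for retry on failure.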
Stream Processing (2)
Architectural Patterns (2)
Comparisons (1)
- (2025) Kafka Streams and ksqlDB Compared – How to Choose [DOCUMENTATION] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — An extensive comparison guide from Confluent mapping out when to use the lightweight Kafka Streams Java client library versus ksqlDB database abstraction layers. Analyzes development environments, deployment scales, and infrastructure constraints.
Distributed Processing
Apache Flink
- Apache Flink [ADVANCED LEVEL] [DE FACTO STANDARD] — A high-performance distributed processing framework designed for stateful computations over bounded and unbounded data streams. Features low-latency execution and exactly-once state consistency guarantees.
Kubernetes Native (1)
- flink.apache.org: How to natively deploy Flink on Kubernetes with High-Availability (HA) [ADVANCED LEVEL] [ENTERPRISE-STABLE] — Provides architectural guidelines on natively deploying Apache Flink on Kubernetes clusters with robust High Availability (HA) configurations. Covers resource scheduling, ZooKeeper integrations, and Kubernetes-native active scaling strategies.
SQL Engines
ksqlDB
- (2026) ksqlDB 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Official product home of ksqlDB, an event-streaming database tailored to construct stream-processing platforms on top of Apache Kafka. Translates complex Java/Scala stream pipelines into standard SQL definitions.
Data Platform
Customer Data
iPaaS
- (2026) rudderstack.com iPaaS [ADVANCED LEVEL] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — An enterprise-grade Customer Data Platform (CDP) designed specifically for developers, serving as a specialized iPaaS for telemetry and event streaming.
- Built to run securely on top of existing cloud data warehouses (Snowflake, BigQuery).
- Enables real-time event routing, transformation, and identity resolution with strict privacy controls.
Data Engineering (1)
Event Streaming (5)
- (2018) ==O'Reilly: Streaming data== [ADVANCED LEVEL] 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] — The definitive conceptual companion to the Apache Beam and Google Cloud Dataflow models of stream processing.
- Details critical patterns of out-of-order data handling.
- Explains event-time vs. processing-time, windowing, and triggering paradigms crucial for building resilient stream processing pipelines.
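Event-time windowing with a watermark-based trigger can be sketched in a few lines. This is a toy model under stated assumptions, not Beam or Dataflow code: `run_tumbling_windows` and `max_out_of_orderness` are hypothetical names, windows are fixed-size, and events arriving after their window has fired are simply dropped.

```python
from typing import Dict, List, Tuple

Event = Tuple[int, int]  # (event_time, value)


def run_tumbling_windows(
    events: List[Event], size: int, max_out_of_orderness: int
) -> Tuple[List[Tuple[int, int]], List[Event]]:
    """Sum values per event-time window; fire a window once the watermark passes its end."""
    open_windows: Dict[int, int] = {}
    fired: List[Tuple[int, int]] = []
    dropped: List[Event] = []
    watermark = float("-inf")

    for event_time, value in events:  # iteration order is arrival (processing) order
        start = event_time - event_time % size
        if start + size <= watermark:
            dropped.append((event_time, value))  # too late: window already fired
        else:
            open_windows[start] = open_windows.get(start, 0) + value

        # Watermark heuristic: no event older than (max seen - slack) is expected.
        watermark = max(watermark, event_time - max_out_of_orderness)

        # Trigger: emit every window whose end is at or before the watermark.
        for s in sorted(w for w in open_windows if w + size <= watermark):
            fired.append((s, open_windows.pop(s)))

    for s in sorted(open_windows):  # end of input: flush what is still open
        fired.append((s, open_windows.pop(s)))
    return fired, dropped


arrivals = [(1, 10), (12, 1), (4, 5), (25, 2), (3, 7)]
fired, dropped = run_tumbling_windows(arrivals, size=10, max_out_of_orderness=5)
print(fired)    # (4, 5) arrives out of order but still lands in window 0
print(dropped)  # (3, 7) arrives after window 0 has fired
```

Raising `max_out_of_orderness` admits later data at the cost of delayed results, which is the basic completeness versus latency trade-off of watermark triggers.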
Data and Databases
Stream Processing (3)
Streaming Databases
- thenewstack.io: The Rise of the Event Streaming Database 🌟 [ADVANCED LEVEL] [COMMUNITY-TOOL] — An analytical piece exploring the convergence of databases and stream processing systems to create unified event-streaming databases. It addresses how modern architectures require real-time log computation and traces the evolution toward systems like ksqlDB and Materialize.
Event-Driven Architecture
API Management
Schema Governance (1)
- developers.redhat.com: Managing the API life cycle in an event-driven architecture: A practical approach 🌟 [ADVANCED LEVEL] [ENTERPRISE-STABLE] — Curator Insight: Best practices on governing async APIs and schema definitions inside microservice ecosystems. Live Grounding: Focuses on AsyncAPI specifications, Schema Registry integration, and decoupling publisher/consumer evolution paths to ensure backward compatibility and operational maturity.
Apache Kafka (1)
Architecture Evolution
- (2021) confluent.io: Apache Kafka Made Simple: A First Glimpse of a Kafka Without ZooKeeper [ADVANCED LEVEL] [COMMUNITY-TOOL] — Curator Insight: Announcement and architectural breakdown of Kafka's transition away from ZooKeeper in favor of KRaft (Kafka Raft metadata mode). Live Grounding: Discusses the architectural simplification, metadata scalability improvements, and decreased operational footprint of removing the external ZooKeeper dependency.
Fundamentals
- gentlydownthe.stream [COMMUNITY-TOOL] [GUIDE] — Curator Insight: A highly acclaimed visual, interactive introduction to Apache Kafka and stream processing. Live Grounding: Leverages hand-drawn diagrams and narrative storytelling to explain complex streaming concepts such as replication, consumer offsets, and transaction semantics in an exceptionally digestible manner.
- freecodecamp.org: The Apache Kafka Handbook – How to Get Started Using Kafka 🌟 [ENTERPRISE-STABLE] — Curator Insight: Comprehensive handbook targeting developers getting started with event streams. Live Grounding: Explains underlying storage patterns, consumers, producers, and practical command-line exercises, making it an excellent onboarding guide.
Learning Resources
- ==conduktor.io/kafka: Learn Apache Kafka like never before== 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] [GUIDE] — Curator Insight: Conduktor's comprehensive learning catalog targeting advanced Kafka operations. Live Grounding: Step-by-step guides covering schema evolution, security architectures (SASL/mTLS), custom interceptors, and stream processing with Kafka Streams and ksqlDB.
- developer.confluent.io 🌟🌟 [DE FACTO STANDARD] [GUIDE] — Curator Insight: Confluent's central education portal containing exhaustive learning paths for Apache Kafka. Live Grounding: Houses technical videos, tutorials, sample applications, and comprehensive courses covering stream processing, Kafka Streams, and event-driven architecture patterns.
Local Development (2)
- github.com/lensesio/fast-data-dev (Lenses Box) ⭐ 2079 [ENTERPRISE-STABLE] — Curator Insight: Highly popular, all-in-one Docker image comprising Kafka, ZooKeeper, Schema Registry, and REST Proxy. Live Grounding: Excellent for local developer validation and integration pipelines needing a pre-wired, enterprise-ready playground instance.
Performance Optimization
- (2021) newrelic.com: Effective Strategies for Kafka Topic Partitioning 🌟 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Curator Insight: Deep-dive tutorial on optimizing Kafka throughput via smart partitioning schemes. Live Grounding: Analyzes consumer group balancing, message ordering requirements, and custom partitioning algorithms. Provides architectural guidelines for sizing partition counts to balance throughput and rebalance overhead.
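To make the link between keys, partitions, and ordering concrete, here is a minimal sketch of key-based partition assignment. It is an illustration only: `partition_for` is a hypothetical helper, and it uses CRC32 as a stand-in hash, whereas the Java client's default partitioner uses murmur2, so the partition numbers will not match a real cluster.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index (stand-in hash, not murmur2)."""
    return zlib.crc32(key) % num_partitions


# Records that share a key always land on the same partition, which is what
# preserves per-key ordering for a single consumer of that partition.
keys = [b"order-1", b"order-2", b"order-1", b"order-3", b"order-1"]
assignments = [(key, partition_for(key, 6)) for key in keys]

for key, partition in assignments:
    print(key.decode(), "->", partition)
```

The modulo step is also why changing the partition count of an existing topic can move a key to a different partition, which is one reason partition counts are usually sized up front.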
Performance Testing
- (2023) KLoadGen - Kafka + (Avro/Json Schema) Load Generator 🌟 ⭐ 218 🌟🌟🌟 [COMMUNITY-TOOL] — Curator Insight: Purpose-built CLI and tool to simulate heavy load scenarios utilizing Schemas. Live Grounding: Streamlines load testing of schema-validated topics by generating synthetic Avro or JSON messages at target event rates.
Tooling and UI (1)
- (2026) ==redpanda-data/kowl== 🌟🌟🌟🌟🌟 [DE FACTO STANDARD] — Curator Insight: Excellent web UI (now Redpanda Console) designed for debugging and exploring event streams. Live Grounding: Outstanding user experience presenting topology, schema registry mapping, consumer tracking, and high-performance message search.
- towardsdatascience.com: Overview of UI Tools for Monitoring and Management of Apache Kafka Clusters [COMMUNITY-TOOL] — Curator Insight: Comparative review of leading open-source and commercial administration portals for Kafka. Live Grounding: Compares visual management capabilities, schema registration support, and partition offset visualization across tools like AKHQ, Kafdrop, and Lenses.
- Kafdrop – Kafka Web UI 🌟 ⭐ 6135 [DE FACTO STANDARD] [ENTERPRISE-STABLE] — Curator Insight: Highly popular, lightweight web UI for monitoring and managing Apache Kafka. Live Grounding: Renders cluster info, brokers, topics, partition offsets, consumer group lag, and allows active JSON/protobuf message payload inspection.
- dev.to: Learn how to use Kafkacat – the most versatile Kafka CLI client 🌟 [ENTERPRISE-STABLE] [GUIDE] — Curator Insight: Guide to Kafkacat (now rebranded as kcat), the developer's favorite Swiss Army knife CLI. Live Grounding: Walks through real-world piping, consuming from dynamic offsets, producing raw file contents, and query configurations using the command line.
- github.com/sauljabin/kaskade ⭐ 1013 [ENTERPRISE-STABLE] — Curator Insight: Modern Terminal User Interface (TUI) client for Apache Kafka. Live Grounding: Employs an elegant console layout allowing engineering teams to navigate topics, inspect raw schema properties, and watch streaming events dynamically right from the terminal.
Application Integration (2)
Java Spring Boot
- piotrminkowski.com: Concurrency with Kafka and Spring Boot [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Optimization guide for Spring Boot engineers processing high-throughput event logs. Live Grounding: Examines concurrent message listener configurations, partition distribution strategies, and thread-safe processing to fully maximize JVM resources.
Architectural Evaluation
Anti-Patterns
- kai-waehner.de: When NOT to use Apache Kafka? [ADVANCED LEVEL] [DE FACTO STANDARD] — Curator Insight: Essential architectural review pointing out Kafka anti-patterns. Live Grounding: Evaluates hard constraints of Kafka, comparing it against traditional message queues (RabbitMQ), data warehouses, and API gateways. Ideal for teams auditing if Kafka is the appropriate fit.
Architectural Patterns (3)
Resiliency
- developers.redhat.com: Building resilient event-driven architectures with Apache Kafka [COMMUNITY-TOOL] — Curator Insight: A practical guide from Red Hat engineers on building resilient EDA systems using Kafka. Live Grounding: Explains foundational patterns such as retries, dead-letter queues (DLQ), and stateful stream processing to prevent message loss and maintain system availability during downstream failures.
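The retry and dead-letter queue patterns mentioned above can be sketched without a broker. This is a minimal in-memory illustration, not the article's implementation: `process_with_dlq`, `flaky_handler`, and the three-attempt limit are assumptions, and a real consumer would publish dead letters to a separate topic rather than a list.

```python
def process_with_dlq(messages, handler, max_attempts=3):
    """Retry each message up to max_attempts, then park it in a dead-letter list."""
    processed, dead_letters = [], []
    for message in messages:
        last_error = None
        for _attempt in range(max_attempts):
            try:
                processed.append(handler(message))
                break  # success: stop retrying this message
            except Exception as exc:  # broad on purpose for the sketch
                last_error = exc
        else:
            # The loop finished without a break, so every attempt failed.
            dead_letters.append(
                {"message": message, "error": str(last_error), "attempts": max_attempts}
            )
    return processed, dead_letters


attempts_seen = {}


def flaky_handler(message):
    """Hypothetical handler: 'flaky' fails once, 'poison' always fails."""
    attempts_seen[message] = attempts_seen.get(message, 0) + 1
    if message == "poison":
        raise ValueError("cannot parse payload")
    if message == "flaky" and attempts_seen[message] < 2:
        raise TimeoutError("downstream unavailable")
    return message.upper()


processed, dead_letters = process_with_dlq(["ok", "flaky", "poison"], flaky_handler)
print(processed, dead_letters)
```

The transient failure is absorbed by a retry, while the unparseable message is set aside so it does not block the messages behind it.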
Disaster Recovery
High Availability (2)
- (2021) tech.ebayinc.com: Resiliency and Disaster Recovery with Kafka [ADVANCED LEVEL] [CASE STUDY] [COMMUNITY-TOOL] — Curator Insight: Real-world operational strategies for disaster recovery from eBay's engineering team. Live Grounding: Focuses on multi-region active-passive and active-active Kafka setups, addressing replication lag, MirrorMaker configurations, and failover automation challenges at extreme scale.
Integration Patterns
Transactional Outbox
- developers.redhat.com: The outbox pattern with Apache Kafka and Debezium 🌟 [ADVANCED LEVEL] [ENTERPRISE-STABLE] — Curator Insight: Deep technical analysis of resolving dual-write problems using CDC and the Outbox Pattern. Live Grounding: Uses Debezium and Apache Kafka to stream database transaction events reliably, ensuring eventual consistency across decoupled microservices without 2PC overhead.
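The write side of the outbox pattern can be sketched with the standard-library `sqlite3` module. This is an illustration under assumptions: the `orders` and `outbox` table layouts and the `place_order` helper are hypothetical, and the relay step (Debezium reading the outbox table through CDC and publishing to Kafka) is not shown.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "aggregate_id INTEGER NOT NULL, event_type TEXT NOT NULL, payload TEXT NOT NULL)"
)


def place_order(conn, order_id, item):
    """Insert the order and its outbox event in one local transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        payload = json.dumps({"id": order_id, "item": item})
        conn.execute(
            "INSERT INTO outbox (aggregate_id, event_type, payload) VALUES (?, ?, ?)",
            (order_id, "OrderCreated", payload),
        )


place_order(conn, 1, "book")

try:
    # bytes cannot be JSON-serialized, so building the event fails after the
    # order row was inserted; the transaction rolls the order back as well.
    place_order(conn, 2, b"not-serializable")
except TypeError:
    pass

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
outbox_rows = conn.execute("SELECT aggregate_id, event_type FROM outbox").fetchall()
print(order_count, outbox_rows)
```

Since both rows are written in the same transaction, there is never an order without its event or an event without its order, which is the dual-write problem the pattern is meant to avoid.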
Multi-Cluster Strategy
Governance
- developers.redhat.com: Which is better: A single Kafka cluster to rule them all, or many? [ADVANCED LEVEL] [COMMUNITY-TOOL] — Curator Insight: Comparative design analysis evaluating consolidated vs. decentralized cluster strategies. Live Grounding: Evaluates multi-tenancy, risk blast radius, organizational boundaries, billing allocation, and schema maintenance overhead across both topological models.
Performance Optimization (1)
Architectural Patterns (4)
- (2022) redhat.com: How we use Apache Kafka to improve event-driven architecture performance 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Curator Insight: Analysis of leveraging Kafka's performance characteristics within complex corporate environments. Live Grounding: Covers tuning throughput and reducing processing latency in microservices by adjusting batch sizes, compression parameters, and consumer allocation.
Broker Operations
- strimzi.io: Optimizing Kafka producers [COMMUNITY-TOOL] — Curator Insight: Diagnostic guide to fine-tuning publisher performance. Live Grounding: Explains structural impacts of compression types (lz4, zstd), batch.size configurations, linger.ms, and broker request limits on latency and message pipeline delivery.
- strimzi.io: Optimizing Kafka consumers 🌟 [ENTERPRISE-STABLE] — Curator Insight: In-depth study on maximizing consumer ingestion performance. Live Grounding: Analyzes consumer fetch sizes, commit mechanisms, partition assignments, and session timeout options to prevent unnecessary consumer group rebalances in enterprise settings.
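The tuning knobs named in the two entries above are ordinary client properties. The sketch below collects a few of them as Python dictionaries using the Kafka client property names; the values are illustrative assumptions rather than recommendations, and `check_consumer_timeouts` is a hypothetical helper encoding the commonly documented guideline that `heartbeat.interval.ms` should be no more than one third of `session.timeout.ms` (classic consumer group protocol).

```python
# Producer side: trade a little latency for larger, compressed batches.
producer_config = {
    "compression.type": "lz4",  # or "zstd", as discussed in the producer guide
    "batch.size": 65536,        # bytes per partition batch (illustrative value)
    "linger.ms": 10,            # wait up to 10 ms to fill a batch
    "acks": "all",
}

# Consumer side: fetch sizing, commit behaviour, and liveness timeouts.
consumer_config = {
    "fetch.min.bytes": 1024,
    "fetch.max.wait.ms": 500,
    "max.poll.records": 500,
    "enable.auto.commit": False,  # commit offsets explicitly after processing
    "session.timeout.ms": 45000,
    "heartbeat.interval.ms": 15000,
}


def check_consumer_timeouts(config):
    """Return warnings when the heartbeat interval is too close to the session timeout."""
    warnings = []
    session = config["session.timeout.ms"]
    heartbeat = config["heartbeat.interval.ms"]
    if heartbeat * 3 > session:
        warnings.append(
            f"heartbeat.interval.ms={heartbeat} exceeds one third of "
            f"session.timeout.ms={session}; spurious rebalances are more likely"
        )
    return warnings


print(check_consumer_timeouts(consumer_config))
```

Dictionaries like these can typically be passed to a Python Kafka client, but the exact set of accepted keys depends on the client library, so treat the key list as something to verify against that client's documentation.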
Scale Operations
Automation
- slack.engineering: Building Self-driving Kafka clusters using open source components [ADVANCED LEVEL] [CASE STUDY] [COMMUNITY-TOOL] — Curator Insight: Insightful deep dive from Slack engineers on automating cluster maintenance. Live Grounding: Analyzes their usage of LinkedIn's Cruise Control to automate cluster balancing, partition reassignment, and self-healing under heavy operational scaling pressures.
Case Studies
- analyticsindiamag.com: How Uber is Leveraging Apache Kafka For More Than 300 Micro Services [ADVANCED LEVEL] [CASE STUDY] [COMMUNITY-TOOL] — Curator Insight: High-level overview of Uber's multi-cluster global event bus setup. Live Grounding: Discusses operating trillions of daily messages over 300+ microservices, highlighting custom proxying layers, dead-letter routing structures, and regional backpressure mitigation strategies.
- thenewstack.io: LinkedIn Layered Architecture Minimizes Kafka Scaling Issues [ADVANCED LEVEL] [CASE STUDY] [COMMUNITY-TOOL] — Curator Insight: Case study detailing how LinkedIn redesigned their backend streaming pipeline layers. Live Grounding: Explains how a layered model decouples the client APIs from physical clusters, mitigating client-induced connection bloat and simplifying routing management.
Schema Governance (2)
Security (3)
- developers.redhat.com: How to secure Apache Kafka schemas with Red Hat Integration Service Registry 2.0 [ADVANCED LEVEL] [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Security-focused guide for configuring and shielding the API schema registry. Live Grounding: Details RBAC integration, TLS communication, and schema evolution restrictions using Red Hat Integration Service Registry 2.0 (based on Apicurio Registry) to protect message payloads.
Security (4)
Data Compliance
- developers.redhat.com: End-to-end field-level encryption for Apache Kafka Connect [ADVANCED LEVEL] [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Security implementation guide on field-level crypto for data pipelines. Live Grounding: Addresses PCI/GDPR requirements by demonstrating cryptographic SMTs (Single Message Transforms) within Kafka Connect, ensuring data is encrypted before hitting log segments.
Kafka Connect
- developers.redhat.com: Using secrets in Kafka Connect configuration [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Security patterns for avoiding plain-text secrets inside Kafka Connect configurations. Live Grounding: Outlines setting up Kafka's built-in configuration providers (such as the file- and directory-based providers) inside properties files so that secrets are resolved at runtime instead of being stored in connector configuration.
Zero Trust
- engineering.grab.com: Zero trust with Kafka [ADVANCED LEVEL] [CASE STUDY] [COMMUNITY-TOOL] — Curator Insight: Production study from Grab's engineering team on implementing zero-trust network boundaries for messaging. Live Grounding: Covers mutual TLS (mTLS) for broker-client transport, fine-grained ACL authorization, and automating credential lifecycle rotation.
Observability
Monitoring (1)
Performance Metrics
- datadoghq.com: Monitoring Kafka performance metrics [COMMUNITY-TOOL] — Curator Insight: The gold-standard diagnostic reference for key Kafka metrics. Live Grounding: Breaks down critical under-replicated partition counts, active controller counts, consumer lag, and I/O network thread usage, offering concrete troubleshooting actions for operational stability.
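Consumer lag, one of the metrics listed above, is the gap between a partition's log end offset and the consumer group's committed offset. The sketch below computes it from two plain dictionaries; the `consumer_lag` helper and the offset values are hypothetical, and treating a partition with no committed offset as starting from 0 is an assumption (in practice that depends on the group's offset reset policy).

```python
def consumer_lag(end_offsets, committed_offsets):
    """Return per-partition lag as log end offset minus committed offset."""
    lag = {}
    for partition, end_offset in end_offsets.items():
        # Assumption for the sketch: no committed offset means nothing consumed yet.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


# Hypothetical snapshot for a three-partition topic.
end_offsets = {0: 1500, 1: 980, 2: 2000}
committed_offsets = {0: 1500, 1: 900}

lag = consumer_lag(end_offsets, committed_offsets)
total_lag = sum(lag.values())
print(lag, total_lag)
```

A total that keeps growing across snapshots usually means consumers are not keeping up, while a large lag on a single partition points at a hot key or a stuck consumer.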
Orchestration and Workflow
BPMN Orchestration
Architectural Patterns (5)
Comparisons (2)
- infoq.com: Event Streams and Workflow Engines – Kafka and Zeebe 🌟 [ADVANCED LEVEL] [COMMUNITY-TOOL] — Discusses integrating Apache Kafka's distributed streaming logs with Zeebe's stateful workflow management. Analyzes patterns to maintain reliable, long-running saga transactions across microservices.
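The saga idea referenced above, where each step has a compensating action that undoes it if a later step fails, can be sketched in a few lines. This is a generic in-process illustration, not Zeebe's or Kafka's API: `run_saga`, the step names, and the `state` dictionary are all hypothetical.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Compensate in reverse order of completion.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True, log


state = {"stock_reserved": False, "shipped": False}


def reserve_stock():
    state["stock_reserved"] = True


def release_stock():
    state["stock_reserved"] = False


def charge_payment():
    raise RuntimeError("card declined")  # simulated downstream failure


def refund_payment():
    pass


def ship_order():
    state["shipped"] = True


def cancel_shipment():
    state["shipped"] = False


ok, log = run_saga([
    ("reserve_stock", reserve_stock, release_stock),
    ("charge_payment", charge_payment, refund_payment),
    ("ship_order", ship_order, cancel_shipment),
])
print(ok, log, state)
```

A workflow engine adds what this sketch lacks: durable state, so that a long-running saga survives process restarts, and timers for steps that wait on external events.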
Zeebe
Camunda
- (2026) Zeebe workflow engine 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Zeebe is Camunda's cloud-native workflow engine, built specifically to orchestrate distributed microservices. It parses BPMN 2.0 structures and implements high-throughput, horizontally scalable state machines directly on top of Kubernetes.
Data Pipelines (2)
Apache Airflow
Advanced Patterns
- (2025) docs.astronomer.io: Dynamically generating DAGs in Airflow [ADVANCED LEVEL] [DOCUMENTATION] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Technical documentation illustrating design patterns to construct dynamically generated DAG pipelines in Apache Airflow. Covers generation templates, dynamic parameters, and runtime optimization.
Architecture
- towardsdatascience.com: Apache Airflow Architecture 🌟 [COMMUNITY-TOOL] — A structured architectural deep dive explaining how Apache Airflow schedules and executes pipelines. Outlines relationships between scheduler loops, state synchronization databases, and executors.
Basics
- dev.to: Get started with Apache Airflow [COMMUNITY-TOOL] — A step-by-step introduction to Apache Airflow design patterns. Covers the core orchestration concepts including DAG definitions, basic Python operators, scheduler parameters, and task execution workflows.
Configuration
- airflow.apache.org: Add Owner Links to DAG [DOCUMENTATION] [COMMUNITY-TOOL] — A practical guide showing how to map ownership contacts, support channels, and documentation links to pipeline owners within Airflow dashboards for rapid operations management.
Deployments (1)
- Apache Airflow official helm chart 🌟 [DOCUMENTATION] [DE FACTO STANDARD] — The official Apache Airflow community Helm Chart. Provides pre-configured, modular, and enterprise-hardened templates for deploying schedulers, webservers, worker nodes, and scalable Celery or Kubernetes executors.
- youtube: Airflow Helm Chart : Quick Start For Beginners in 10mins [COMMUNITY-TOOL] — A concise introductory video training illustrating how to deploy Apache Airflow quickly using standard Helm commands. Walks through default configurations, worker provisioning, and web interface verification.
Kubernetes Native (2)
- towardsdatascience.com: Apache Airflow for containerized data-pipelines [COMMUNITY-TOOL] — Explores patterns for running containerized data pipelines with Apache Airflow. Focuses on orchestrating individual pipeline stages, each in its own isolated container, on top of cloud infrastructure.
- airflow.apache.org: KubernetesPodOperator 🌟🌟🌟 [ADVANCED LEVEL] [DOCUMENTATION] [DE FACTO STANDARD] — Official engineering reference for the KubernetesPodOperator. Explains how to spin up isolated, dedicated pods within a target Kubernetes namespace dynamically for each individual Airflow DAG task execution.
Monitoring (2)
- redhat.com: Monitoring Apache Airflow using Prometheus [COMMUNITY-TOOL] — A technical tutorial on integrating Apache Airflow orchestration endpoints with Prometheus. Illustrates how to pull scheduler workloads, active runner pools, and pipeline errors into centralized monitoring systems.
Machine Learning Orchestration
Data Platforms
Open Data Hub
- Open Data Hub [ADVANCED LEVEL] [ENTERPRISE-STABLE] — The main portal for Open Data Hub, an AI/ML platform reference architecture on Red Hat OpenShift. Orchestrates tools like Kubeflow, Spark, and Kafka into a standardized workspace for ML operations.
Releases
- Open Data Hub 0.6 brings component updates and Kubeflow architecture [COMMUNITY-TOOL] — Highlights components within Open Data Hub v0.6 release, evaluating the integrated Kubeflow-aligned architectural updates for containerized machine learning pipelines.
Roadmaps
- A development roadmap for Open Data Hub [COMMUNITY-TOOL] — Outlines the development vision and tooling tracks planned for the Open Data Hub analytics platform.
Python SDKs
Couler
- Couler ⭐ 944 [COMMUNITY-TOOL] — An open-source Python SDK focused on orchestrating workloads on Kubernetes. Simplifies constructing declarative workflows across native schedulers like Argo or Tekton using programmatic expressions.
Serverless
Knative
Declarative Configuration
- itnext.io: Configuring Kafka Sources and Sinks declaratively in Kubernetes using Knative [COMMUNITY-TOOL] [GUIDE] — Curator Insight: Hands-on exploration of declarative serverless ingestion pipelines on Kubernetes. Live Grounding: Focuses on setting up Knative Eventing Kafka sources and sinks, showcasing how to abstract underlying broker complexities into native Kubernetes custom resource definitions (CRDs).
Event-Driven Integration
- piotrminkowski.com: Knative Eventing with Quarkus, Kafka and Camel [ADVANCED LEVEL] [COMMUNITY-TOOL] [GUIDE] — Curator Insight: A step-by-step implementation guide showing serverless integration patterns using Knative, Quarkus, Kafka, and Apache Camel. Live Grounding: Demonstrates how to build efficient, fast-booting containerized JVM microservices that react dynamically to Kafka events routed via Knative's eventing framework.
Python Microservices
- rogulski.it: Consume Kafka events with Knative service and FastAPI on kubernetes 🌟 [ENTERPRISE-STABLE] [GUIDE] — Curator Insight: Practical reference implementation for Python-based serverless consumers. Live Grounding: Illustrates setting up Knative Eventing with a KafkaSource trigger to dynamically scale a FastAPI container from zero to process inbound streaming records.
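Knative Eventing delivers events to a service as CloudEvents over HTTP, so a consumer like the one described above mostly needs to read the `ce-` prefixed headers and the request body. The sketch below shows that parsing step with the standard library only; it is not the article's FastAPI code, `parse_binary_cloudevent` is a hypothetical helper, and the sample header values are illustrative.

```python
import json

REQUIRED_ATTRIBUTES = {"id", "source", "type", "specversion"}


def parse_binary_cloudevent(headers, body):
    """Split an HTTP request in CloudEvents binary mode into attributes and data."""
    normalized = {name.lower(): value for name, value in headers.items()}
    attributes = {
        name[len("ce-"):]: value
        for name, value in normalized.items()
        if name.startswith("ce-")
    }
    missing = REQUIRED_ATTRIBUTES - attributes.keys()
    if missing:
        raise ValueError(f"missing CloudEvents attributes: {sorted(missing)}")
    content_type = normalized.get("content-type", "")
    data = json.loads(body) if content_type.startswith("application/json") else body
    return attributes, data


# Illustrative request as a web framework might expose it to a handler.
sample_headers = {
    "Ce-Id": "partition:0/offset:42",
    "Ce-Source": "/apis/v1/namespaces/demo/kafkasources/orders#orders-topic",
    "Ce-Type": "dev.knative.kafka.event",
    "Ce-Specversion": "1.0",
    "Content-Type": "application/json",
}
attributes, data = parse_binary_cloudevent(sample_headers, b'{"order_id": 7}')
print(attributes["type"], data)
```

In a FastAPI or other web handler, the same function could be called with the incoming request's headers and raw body before the business logic runs.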
Software Architecture
Case Studies (1)
Event Delivery
- (2016) engineering.atspotify.com: Spotify’s Event Delivery – The Road to the Cloud (Part I) [ADVANCED LEVEL] [CASE STUDY] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Part one of Spotify's highly detailed case study documenting the migration of their event delivery system from on-premises infrastructure to Google Cloud Platform. It details scaling to deliver billions of events daily without data loss. It remains a classic, essential read for distributed systems architects.
Event-Driven Architecture (1)
Application Design
- stackoverflow.blog: How event-driven architecture solves modern web app problems 🌟 [COMMUNITY-TOOL] — A clear explanation of how event-driven patterns address high concurrency and cross-system latency challenges. It remains a popular introduction to decoupled publish/subscribe designs.
Infrastructure Design
- (2021) redhat.com: Event-driven architecture: Understanding the essential benefits 🌟 🌟🌟🌟 [COMMUNITY-TOOL] — A deep dive by Red Hat explaining how event-driven designs foster agility, decoupling, and horizontal scalability. It discusses integration paths with Kubernetes, Apache Kafka, and Knative. Useful for platform engineers planning enterprise application modernization.
Integration Patterns (1)
iPaaS (1)
- (2023) quandarycg.com: Everything You Need To Know About System Integration (And IPaaS) 🌟 🌟🌟🌟🌟 [ENTERPRISE-STABLE] [LEGACY] — A comprehensive architectural primer outlining the foundational concepts of Enterprise Application Integration (EAI) and Integration Platform as a Service (iPaaS).
- Details the transition from legacy point-to-point connections to modern hub-and-spoke models.
- Provides evaluation frameworks for cloud-native middleware alternatives.
- blog.hubspot.com: The 22 Best iPaaS Vendors for Any Budget [COMMUNITY-TOOL] — A commercial and technical overview of the top 22 Integration Platform as a Service (iPaaS) vendor solutions. Useful for architectural selection phases to compare enterprise offerings like MuleSoft, Workato, and Zapier across cloud compatibility, throughput limits, and ease of orchestration.
- Mulesoft [ADVANCED LEVEL] [DE FACTO STANDARD] [LEGACY] — The industry-standard enterprise integration platform (Anypoint Platform) providing high-density API management, ESB capabilities, and iPaaS routing. Mulesoft is highly suited for large-scale legacy modernization and hybrid-cloud orchestration, though it introduces significant runtime complexity and enterprise licensing costs.
Java Ecosystem
Microservices (2)
- adambien.blog - 75th airhacks.tv Questions and Answers: Kafka, JAX-RS, MicroProfile, JSON-B, GSON, JWT, VSC, NetBeans, Java Fullstack [COMMUNITY-TOOL] — A technical Q&A session highlighting architectural strategies using Kafka, JAX-RS, MicroProfile, and JSON-B. This resource offers pragmatic patterns for decoupling enterprise Java applications and migrating monolithic structures to cloud-native, microservice-based runtimes.
Microservices (3)
Event-Driven Architecture (2)
- (2022) confluent.io: Event-Driven Microservices Architecture (white paper) 🌟 [ADVANCED LEVEL] [DOCUMENTATION] 🌟🌟🌟🌟 [ENTERPRISE-STABLE] — Confluent's authoritative white paper on designing and scaling event-driven microservices around the Apache Kafka log. It addresses CQRS, Event Sourcing, and transactional schemas. A core reference for large-scale enterprise data mesh topologies.
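Event Sourcing and CQRS, both named above, come down to treating the event log as the source of truth and deriving read models by replaying it. The sketch below is a minimal in-memory illustration and is not taken from the white paper: the `Event` type, the account events, and `project_balances` are hypothetical, and a Python list stands in for a Kafka topic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    account: str
    kind: str    # "deposited" or "withdrawn"
    amount: int


def apply_event(balance: int, event: Event) -> int:
    """Return the new balance after applying one event."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def project_balances(events):
    """Build a read model (account -> balance) by replaying the log in order."""
    balances = {}
    for event in events:
        balances[event.account] = apply_event(balances.get(event.account, 0), event)
    return balances


# The append-only log is the write side; the projection is the query side.
event_log = [
    Event("alice", "deposited", 100),
    Event("alice", "withdrawn", 30),
    Event("bob", "deposited", 50),
]

balances = project_balances(event_log)
print(balances)
```

Because the projection is a pure function of the log, it can be dropped and rebuilt at any time, or replaced by a differently shaped read model, without touching the write path.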
Monolith Migration
Event-Driven Architecture (3)
- infoq.com: From Monolith to Event-Driven: Finding Seams in Your Future Architecture [ADVANCED LEVEL] [ENTERPRISE-STABLE] — An InfoQ guide detailing how to use Domain-Driven Design (DDD) to isolate domain boundaries and discover 'seams' within large-scale monoliths. A primary methodology for refactoring toward decoupled, event-driven pipelines.