# Monitoring and Performance. Prometheus, Grafana, APMs and more
!!! info "Architectural Context"
    Detailed reference for monitoring and performance (Prometheus, Grafana, APMs and more) in the context of Architectural Foundations.
## Standard Reference
- [Monitoring Jenkins using Instana](https://www.ibm.com/think) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [victoriametrics.com: Q2 2024 Round Up: VictoriaMetrics & VictoriaLogs Updates](https://victoriametrics.com/blog/q2-2024-round-up-victoriametrics-and-victorialogs-updates/index.html) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: The Challenges of Monitoring Kubernetes and OpenShift](https://thenewstack.io/kubernetes) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [sysdig.com: Kubernetes Monitoring with Prometheus, the ultimate guide 🌟](https://www.sysdig.com/blog/kubernetes-monitoring-prometheus) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [sysdig.com: How to monitor kube-proxy 🌟](https://www.sysdig.com/blog/monitor-kube-proxy) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [getenroute.io: TSDB, Prometheus, Grafana In Kubernetes: Tracing A Variable Across The OSS Monitoring Stack](https://www.saaras.io/blog/leverage-open-source-oss-derive-insights-grafana-prometheus) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [opsdis.com: Building a custom monitoring solution with Grafana, Prometheus and Loki](https://binero.com/observability) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [harness.io: Metrics to Improve Continuous Integration Performance](https://www.harness.io/blog/continuous-integration-performance-metrics) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [skillfield.com.au: Monitoring Kubernetes and Docker Container Logs](https://skillfield.com.au/blog/monitoring-kubernetes-and-docker-container-logs) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [forbes.com: Who Should Own The Job Of Observability In DevOps?](https://www.forbes.com/councils/forbestechcouncil/2021/09/03/who-should-own-the-job-of-observability-in-devops/?streamIndex=0) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [infoworld.com: The RED method: A new strategy for monitoring microservices](https://www.infoworld.com/article/2270578/the-red-method-a-new-strategy-for-monitoring-microservices.html) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [forbes.com: From Data Collection To Delivering KPIs: A Roadmap To A Mature Observability Strategy](https://www.forbes.com/councils/forbestechcouncil/2024/03/08/from-data-collection-to-delivering-kpis-a-roadmap-to-a-mature-observability-strategy/?streamIndex=0) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
    - The observability space is a multi-billion-dollar market, and for good reason: there are a lot of benefits when you implement a robust observability strategy. But extracting value is not as simple as adopting a tool, throwing your data into a black box and expecting it to spit out business-relevant, contextualized insights and helpful visualizations.
    - As they say, "Nothing good comes easy", but when done right, a mature observability strategy will pay for itself over and over again.
- [KPIs](https://www.kpi.org/KPI-Basics) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Promster: Use Prometheus in huge deployments with dynamic clustering and scrape sharding capabilities based on ETCD service registration](http://github.com/flaviostutz/promster) <span class='md-tag md-tag--info'>⭐ 31</span> <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [timescale.com: Prometheus vs. OpenTelemetry Metrics: A Complete Guide](https://www.tigerdata.com/blog/prometheus-vs-opentelemetry-metrics-a-complete-guide) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [acloudguru.com: Getting started with the Elastic Stack](https://www.pluralsight.com/resources/blog/cloud/getting-started-with-the-elastic-stack) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [aws.amazon.com: Amazon Elasticsearch Service Is Now Amazon OpenSearch Service and Supports OpenSearch 1.0](https://aws.amazon.com/blogs/aws/announcing-amazon-opensearch-service-which-supports-opensearch-10) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Transitive blocks](https://fastthread.io/ft-error.jsp&s=t) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Unresponsive JVM](https://fastthread.io/ft-error.jsp&s=t)
- [Sudden CPU spike](https://fastthread.io/ft-error.jsp&s=t)
- [Thread Leaks](https://fastthread.io/ft-error.jsp&s=t)
- [Remote Debugging of Java Applications on OpenShift](https://www.redhat.com/en/blog/remote-debugging-java-applications-openshift) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Dapper](https://research.google/pubs/dapper-a-large-scale-distributed-systems-tracing-infrastructure) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [newrelic.com: OpenTracing, OpenCensus, OpenTelemetry, and New Relic (Best overview of OpenTelemetry)](https://newrelic.com/blog/dem/opentelemetry-opentracing-opencensus) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [sentry.io](https://sentry.io/welcome) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- APMs like Dynatrace, etc.
- [Elastic APM](https://www.elastic.co/observability/application-performance-monitoring) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Elastic APM Server](https://www.elastic.co/docs/solutions/observability/apm) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [adictosaltrabajo.com: Application performance monitoring and analysis with Dynatrace APM (in Spanish)](https://adictosaltrabajo.com/2016/10/26/monitorizacion-y-analisis-de-rendimiento-de-aplicaciones-con-dynatrace) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: OpenShift monitoring](https://www.dynatrace.com/hub/detail/red-hat-openshift) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: Deploy OneAgent on OpenShift Container Platform](https://docs.dynatrace.com/docs/ingest-from/setup-on-container-platforms/kubernetes/legacy/deploy-oneagent-operator-openshift-legacy) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [My Dynatrace proof of concept 🌟](https://github.com/nubenetes/awesome-kubernetes/blob/master/pdf/dynatrace_demo.pdf) <span class='md-tag md-tag--info'>⭐ 661</span> <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: Monitoring of Kubernetes Infrastructure for day 2 operations](https://www.dynatrace.com/news/blog/what-is-kubernetes-2) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: The Power of OpenShift, The Visibility of Dynatrace](https://www.dynatrace.com/news/blog/what-is-openshift-2) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [A Distributed Tracing Adventure in Apache Beam](http://rion.io/2020/07/04/a-distributed-tracing-adventure-in-apache-beam) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [rancher.com: Monitor Etcd with Prometheus and Grafana using Rancher](https://www.suse.com/c/rancher_blog/monitor-etcd-with-prometheus-and-grafana-using-rancher) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [openshift.com: Monitoring Infrastructure Openshift 4.x Using Zabbix Operator](https://www.redhat.com/en/blog/monitoring-infrastructure-openshift-4.x-using-zabbix-operator) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [openshift.com: How to Monitor Openshift 4.x with Zabbix using Prometheus - Part 2](https://www.redhat.com/en/blog/how-to-monitoring-openshift-4.x-with-zabbix-using-prometheus-part-2) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [replex.io](https://www.splunk.com/en_us/appdynamics-joins-splunk.html?301=appdynamics) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [**Micrometer** Collector](http://micrometer.io) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [en.wikipedia.org/wiki/List_of_performance_analysis_tools](https://en.wikipedia.org/wiki/List_of_performance_analysis_tools) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [InspectIT](https://en.wikipedia.org/wiki/InspectIT) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [VisualVM](https://en.wikipedia.org/wiki/VisualVM) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [OverOps](https://en.wikipedia.org/wiki/OverOps) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Dzone: 14 Best Performance Testing Tools and APM Solutions](https://dzone.com/articles/14-best-performance-testing-tools-and-apm-solution) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [victoriametrics.com](https://victoriametrics.com) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Monitor your Azure cloud estate - Cloud Adoption Framework](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/manage/monitor#reference-for-monitoring-azure-services) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Wikipedia: Application Performance Index](https://en.wikipedia.org/wiki/Apdex) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Observability vs Monitoring](https://middleware.io/blog/observability-vs-monitoring) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone.com: Kubernetes Monitoring: Best Practices, Methods, and Existing Solutions](https://dzone.com/articles/kubernetes-monitoring-best-practices-methods-and-e) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [CNCF End User Technology Radar: Observability, September 2020 🌟](https://www.cncf.io/blog/2020/09/11/cncf-end-user-technology-radar-observability-september-2020) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [magalix.com: Monitoring Kubernetes Clusters Through Prometheus & Grafana 🌟](https://www.magalix.com/blog/monitoring-of-kubernetes-cluster-through-prometheus-and-grafana) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [learnsteps.com: Monitoring Infrastructure System Design](https://www.learnsteps.com/monitoring-infrastructure-system-design) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [bravenewgeek.com: The Observability Pipeline](https://bravenewgeek.com/the-observability-pipeline) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: 3 Key Configuration Challenges for Kubernetes Monitoring with Prometheus](https://thenewstack.io/3-key-configuration-challenges-for-kubernetes-monitoring-with-prometheus) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Monitoring vs. Observability: What's the Difference?](https://thenewstack.io/monitoring-vs-observability-whats-the-difference) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dashbird.io: Monitoring vs Observability: Can you tell the difference? 🌟](https://dashbird.io/blog/monitoring-vs-observability) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Monitoring as Code: What It Is and Why You Need It 🌟](https://thenewstack.io/monitoring-as-code-what-it-is-and-why-you-need-it) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Observability Won't Replace Monitoring (Because It Shouldn't) 🌟](https://thenewstack.io/observability-wont-replace-monitoring-because-it-shouldnt) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [matiasmct.medium.com: Observability at Scale](https://matiasmct.medium.com/observability-at-scale-52d0d9a5fb9b) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [logz.io: Top 11 Open Source Monitoring Tools for Kubernetes 🌟](https://logz.io/blog/open-source-monitoring-tools-for-kubernetes) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Kubernetes Observability Challenges in Cloud Native Architecture 🌟](https://thenewstack.io/kubernetes-observability-challenges-in-cloud-native-architecture) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Best Practices to Optimize Infrastructure Monitoring within DevOps Teams](https://thenewstack.io/best-practices-to-optimize-infrastructure-monitoring-within-devops-teams) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [faun.pub: DevOps Meets Observability 🌟](https://faun.pub/devops-meets-observability-78775c021b0e) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Growing Adoption of Observability Powers Business Transformation](https://thenewstack.io/growing-adoption-of-observability-powers-business-transformation) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [blog.thundra.io: What CI Observability Means for DevOps 🌟](https://blog.thundra.io/what-ci-observability-means-for-devops) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Monitoring API Latencies After Releases: 4 Mistakes to Avoid](https://thenewstack.io/monitoring-api-latencies-after-releases-4-mistakes-to-avoid) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: DevOps Observability from Code to Cloud](https://thenewstack.io/devops-observability-from-code-to-cloud) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: CI Observability for Effective Change Management 🌟](https://thenewstack.io/ci-observability-for-effective-change-management) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Monitor Your Containers with Sysdig](https://thenewstack.io/monitor-your-containers-with-sysdig) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Applying Basic vs. Advanced Monitoring Techniques](https://thenewstack.io/applying-basic-vs-advanced-monitoring-techniques) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [cloudforecast.io: cAdvisor and Kubernetes Monitoring Guide 🌟](https://cloudforecast.io/blog/cadvisor-and-kubernetes-monitoring-guide) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [hmh.engineering: Musings on microservice observability!](https://hmh.engineering/musings-on-microservice-observability-f7052ac42f04) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [stackoverflow.blog: Observability is key to the future of software (and your DevOps career)](https://stackoverflow.blog/2021/09/08/observability-is-key-to-the-future-of-software-and-your-devops-career) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: What is observability? Not just logs, metrics and traces](https://www.dynatrace.com/news/blog/what-is-observability-2) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Observability Is the New Kubernetes 🌟](https://thenewstack.io/observability-is-the-new-kubernetes) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [learnsteps.com: Logging Infrastructure System Design](https://www.learnsteps.com/logging-infrastructure-system-design) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com: Monitoring Microservices - Part 1: Observability | Anderson Carvalho](https://medium.com/geekculture/monitoring-microservices-part-1-observability-b2b44fa3e67e) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [intellipaat.com: Top 10 DevOps Monitoring Tools](https://intellipaat.com/blog/devops-monitoring-tools) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [cncf.io: How to add observability to your application pipeline](https://www.cncf.io/blog/2021/11/23/how-to-add-observability-to-your-application-pipeline) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [storiesfromtheherd.com: Unpacking Observability](https://storiesfromtheherd.com/unpacking-observability-a-beginners-guide-833258a0591f) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [logz.io: A Monitoring Reality Check: More of the Same Won't Work](https://logz.io/blog/monitoring-reality-check) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/buildpiper: Observability for Monitoring Microservices - Top 5 Ways!](https://medium.com/buildpiper/observability-for-monitoring-microservices-top-5-ways-587871e726d0) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/@cbkwgl: Continuous Monitoring in DevOps 🌟](https://medium.com/@cbkwgl/continuous-monitoring-in-devops-8d4db48a0e24) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [logz.io: The Open Source Observability Adoption and Migration Curve](https://logz.io/blog/open-source-observability-adoption-migration-curve) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [devopscube.com: What Is Observability? Comprehensive Beginners Guide](https://devopscube.com/what-is-observability) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [tiagodiasgeneroso.medium.com: Observability Concepts you should know](https://tiagodiasgeneroso.medium.com/observability-concepts-you-should-know-943fc057b208) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [faun.pub: Getting started with Observability](https://faun.pub/getting-started-with-observability-657d57aab1c7) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/@badawekoo: Monitoring in DevOps lifecycle](https://medium.com/@badawekoo/monitoring-in-devops-lifecycle-4d9a2f277eb0) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [laduram.medium.com: The Future of Observability](https://laduram.medium.com/the-future-of-observability-c33cd7ff644a) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [devops.com: Where Does Observability Stand Today, and Where is it Going Next?](https://devops.com/where-does-observability-stand-today-and-where-is-it-going-next) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/kubeshop-i: Top 8 Open-Source Observability & Testing Tools](https://medium.com/kubeshop-i/top-8-open-source-observability-testing-tools-9341a361a634) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone: 11 Observability Tools You Should Know 🌟](https://dzone.com/articles/11-observability-tools-you-should-know-in-2023) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/devops-techable: Setup monitoring with Prometheus and Grafana in Kubernetes - Start monitoring your Kubernetes cluster resources](https://medium.com/devops-techable/setup-monitoring-with-prometheus-and-grafana-in-kubernetes-start-monitoring-your-kubernetes-a3071f083fa6) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: What Is Container Monitoring?](https://thenewstack.io/what-is-container-monitoring) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [devops.com: Why Monitoring-as-Code Will be a Must for DevOps Teams](https://devops.com/why-monitoring-as-code-will-be-a-must-for-devops-teams) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/cloud-native-daily: Why You Shouldn't Fear to Adopt OpenTelemetry for Observability](https://medium.com/cloud-native-daily/why-you-shouldnt-fear-to-adopt-opentelemetry-for-observability-fcb6371ea8fe) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/@bijit211987: Observability Driven Development (ODD)-Enhancing System Reliability](https://medium.com/@bijit211987/observability-driven-development-2bc2cdde8661) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/performance-engineering-for-the-ordinary-barbie: Why profiling should be part of regular software development workflow 🌟](https://medium.com/performance-engineering-for-the-ordinary-barbie/why-profiling-should-be-part-of-regular-software-development-workflow-8b19b7f52b38) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [github.com/prometheus-operator](https://github.com/prometheus-operator) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [redhat.com: **How to gather and display metrics in Red Hat OpenShift** (Prometheus + Grafana)](https://www.redhat.com/en/blog/how-gather-and-display-metrics-red-hat-openshift) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Generally Available today: Red Hat OpenShift Container Platform 3.11 is ready to power enterprise Kubernetes deployments 🌟](https://www.redhat.com/en/blog/generally-available-today-red-hat-openshift-container-platform-311-ready-power-enterprise-kubernetes-deployments) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [OCP 3.11 Metrics and Logging](https://docs.openshift.com/container-platform/3.11/release_notes/ocp_3_11_release_notes.html#ocp-311-metrics-and-logging) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Prometheus Cluster Monitoring 🌟](https://docs.openshift.com/container-platform/3.11/install_config/prometheus_cluster_monitoring.html) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Leveraging Kubernetes and OpenShift for automated performance tests (part 1)](https://developers.redhat.com/blog/2018/11/22/automated-performance-testing-kubernetes-openshift) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Building an observability stack for automated performance tests on Kubernetes and OpenShift (part 2) 🌟](https://developers.redhat.com/blog/2019/01/03/leveraging-openshift-or-kubernetes-for-automated-performance-tests-part-2) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [developers.redhat.com: Monitoring .NET Core applications on Kubernetes](https://developers.redhat.com/blog/2020/08/05/monitoring-net-core-applications-on-kubernetes) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Systems Monitoring with Prometheus and Grafana](https://flightaware.engineering/systems-monitoring-with-prometheus-grafana) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [cncf.io: Monitoring micro-front ends on Kubernetes with NGINX 🌟](https://www.cncf.io/blog/2023/02/01/monitoring-micro-front-ends-on-kubernetes-with-nginx) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone: Getting Started With Kibana Advanced Searches](https://dzone.com/articles/getting-started-with-kibana-advanced-searches) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone: Kibana Hacks: 5 Tips and Tricks](https://dzone.com/articles/kibana-hacks-5-tips-and-tricks) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [juanonlab.com: Kibana Dashboards (in Spanish)](https://www.juanonlab.com/blog/es/dashboards-de-kibana) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dev.to: Beginner's guide to understanding the relevance of your search with Elasticsearch and Kibana](https://dev.to/lisahjung/beginner-s-guide-to-understanding-the-relevance-of-your-search-with-elasticsearch-and-kibana-29n6) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [devops.com: How Centralized Log Management Can Save Your Company](https://devops.com/how-centralized-log-management-can-save-your-company) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [betterprogramming.pub: The Art of Logging](https://betterprogramming.pub/creating-a-human-and-machine-freindly-logging-format-bb6d4bb01dca) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [zdnet.com: AWS, as predicted, is forking Elasticsearch](https://www.zdnet.com/article/aws-as-predicted-is-forking-elasticsearch) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [amazon.com: Stepping up for a truly open source Elasticsearch](https://aws.amazon.com/blogs/opensource/stepping-up-for-a-truly-open-source-elasticsearch) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Store NGINX access logs in Elasticsearch with Logging operator 🌟](https://banzaicloud.com/docs/one-eye/logging-operator/quickstarts/es-nginx) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Rancher Logging Operator 🌟](https://rancher.com/docs/rancher/v2.x/en/logging/v2.5) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [blog.streammonkey.com: How We Serverlessly Migrated 1.58 Billion Elasticsearch Documents](https://blog.streammonkey.com/how-we-serverlessly-migrated-1-58-billion-elasticsearch-documents-33ad3d0d7c4f) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [youtube: ELK for beginners - by XavkiEn 🌟](https://www.youtube.com/playlist?list=PLWZKNB9waqIX00uj5q4nX_TOFiX3if1z3) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [javatechonline.com: How To Monitor Spring Boot Microservices Using ELK Stack?](https://javatechonline.com/how-to-monitor-spring-boot-microservices-using-elk-stack) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone: Running Elasticsearch on Kubernetes](https://dzone.com/articles/running-elasticsearch-on-kubernetes) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium: Which Elasticsearch Provider is Right For You? 🌟](https://medium.com/gigasearch/which-elasticsearch-provider-is-right-for-you-3d596a65e704) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [jertel/elastalert2](https://github.com/jertel/elastalert2) <span class='md-tag md-tag--info'>⭐ 1119</span> <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/hepsiburadatech: Hepsiburada Search Engine on Kubernetes](https://medium.com/hepsiburadatech/hepsiburada-search-engine-on-kubernetes-1fe03a3e71a3) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dev.to/sagary2j: ELK Stack Deployment using MiniKube single node architecture](https://dev.to/sagary2j/elk-stack-deployment-using-minikube-single-node-architecture-16cl) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [search-guard.com/sgctl-elasticsearch: SGCTL - TAKE BACK CONTROL](https://search-guard.com/sgctl-elasticsearch) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium: A detailed guide to deploying Elasticsearch on Elastic Cloud on Kubernetes (ECK)](https://medium.com/99dotco/a-detail-guide-to-deploying-elasticsearch-on-elastic-cloud-on-kubernetes-eck-31808ac60466) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [opensearch.org 🌟](https://opensearch.org) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [amazon.com: Introducing OpenSearch](https://aws.amazon.com/blogs/opensource/introducing-opensearch) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [logz.io: Logz.io Announces Support for OpenSearch; A Community-driven Open Source Fork of Elasticsearch and Kibana](https://logz.io/news-posts/logz-io-announces-support-for-opensearch-a-community-driven-open-source-fork-of-elasticsearch-and-kibana) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [techrepublic.com: OpenSearch: AWS rolls out its open source Elasticsearch fork](https://www.techrepublic.com/article/opensearch-aws-rolls-out-its-open-source-elasticsearch-fork) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: This Week in Programming: AWS Completes Elasticsearch Fork with OpenSearch](https://thenewstack.io/this-week-in-programming-aws-completes-elasticsearch-fork-with-opensearch) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [logz.io: OpenSearch Is Now Generally Available!](https://logz.io/blog/opensearch-1-0-ga-generally-available-elasticsearch-kibana-fork) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: This Week in Programming: The ElasticSearch Saga Continues](https://thenewstack.io/this-week-in-programming-the-elasticsearch-saga-continues) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [aws.amazon.com: Keeping clients of OpenSearch and Elasticsearch compatible with open source](https://aws.amazon.com/blogs/opensource/keeping-clients-of-opensearch-and-elasticsearch-compatible-with-open-source) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium: Logging with EFK - Pratyush Mathur](https://medium.com/@pratyush.mathur/logging-with-efk-1c2e131496d) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/@CuriousLearner: Deploying EFK stack on Kubernetes](https://medium.com/@CuriousLearner/deploying-efk-stack-on-kubernetes-c25ba2682c99) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/@tech_18484: Simplifying Kubernetes Logging with EFK Stack](https://medium.com/@tech_18484/simplifying-kubernetes-logging-with-efk-stack-158da47ce982) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [logz.io: A Beginner's Guide to Logstash Grok](https://logz.io/blog/logstash-grok) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [logz.io: Grok Pattern Examples for Log Parsing](https://logz.io/blog/grok-pattern-examples-for-log-parsing) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [devops.com: The Fallacy of Continuous Integration, Delivery and Testing](https://devops.com/the-fallacy-of-continuous-integration-delivery-and-testing) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone.com: The Keys to Performance Tuning and Testing](https://dzone.com/articles/the-keys-to-performance-tuning-and-testing) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone.com: Performance Patterns in Microservices-Based Integrations](https://dzone.com/articles/performance-patterns-in-microservices-based-integr-1) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [How to read a Thread Dump](https://dzone.com/articles/how-to-read-a-thread-dump) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone: 8 Options for Capturing Thread Dumps](https://dzone.com/articles/how-to-take-thread-dumps-7-options) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [blog.arkey.fr: Using JDK FlightRecorder and JDK Mission Control](https://blog.arkey.fr/2020/06/28/using-jdk-flight-recorder-and-jdk-mission-control) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [developers.redhat.com: Troubleshooting Java applications on OpenShift](https://developers.redhat.com/blog/2017/08/16/troubleshooting-java-applications-on-openshift) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [VisualVM: JVisualVM to an Openshift pod](https://fedidat.com/250-jvisualvm-openshift-pod) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [redhat.com: How do I analyze a Java heap dump?](https://access.redhat.com/solutions/18301) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Microservice Observability with Distributed Tracing: **OpenTelemetry.io** 🌟](https://opentelemetry.io) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [**zipkin.io**](https://zipkin.io) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [**OpenTracing.io**](https://opentracing.io) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [awkwardferny.medium.com: Setting up Distributed Tracing in Kubernetes with OpenTracing, Jaeger, and Ingress-NGINX](https://awkwardferny.medium.com/setting-up-distributed-tracing-with-opentelemetry-jaeger-in-kubernetes-ingress-nginx-cfdda7d9441d) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [ploffay.medium.com: Five years evolution of open-source distributed tracing 🌟](https://ploffay.medium.com/five-years-evolution-of-open-source-distributed-tracing-ec1c5a5dd1ac) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [signadot.com: Sandboxes in Kubernetes using OpenTelemetry](https://www.signadot.com/blog/sandboxes-in-kubernetes-using-opentelemetry) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Medium: Distributed Tracing and Monitoring using OpenCensus](https://medium.com/@rghetia/distributed-tracing-and-monitoring-using-opencensus-fe5f6e9479fb) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Dzone: Zipkin vs. Jaeger: Getting Started With Tracing](https://dzone.com/articles/zipkin-vs-jaeger-getting-started-with-tracing) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [opensource.com: Distributed tracing in a microservices world](https://opensource.com/article/18/9/distributed-tracing-microservices-world) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [opensource.com: 3 open source distributed tracing tools](https://opensource.com/article/18/9/distributed-tracing-tools) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Tracing: Why Logs Aren't Enough to Debug Your Microservices 🌟](https://thenewstack.io/tracing-why-logs-arent-enough-to-debug-your-microservices) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [github.com/open-telemetry/opentelemetry-operator](https://github.com/open-telemetry/opentelemetry-operator) <span class='md-tag md-tag--info'>⭐ 1696</span> <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/@magstherdev: OpenTelemetry Operator](https://medium.com/@magstherdev/opentelemetry-operator-d3d407354cbf) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: OpenTelemetry Gaining Traction from Companies and Vendors](https://thenewstack.io/opentelemetry-gaining-traction-from-companies-and-vendors) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: How OpenTelemetry Works with Kubernetes](https://thenewstack.io/how-opentelemetry-works-with-kubernetes) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium.com/@bijit211987: Grafana with OpenTelemetry, Vendor-neutral and open-source approach](https://medium.com/@bijit211987/grafana-with-opentelemetry-vendor-neutral-and-open-source-approach-ab4bc08f67e9) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium: Jaeger VS OpenTracing VS OpenTelemetry](https://medium.com/jaegertracing/jaeger-and-opentelemetry-1846f701d9f2) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [medium: Using Jaeger and OpenTelemetry SDKs in a mixed environment with W3C Trace-Context](https://medium.com/jaegertracing/jaeger-clients-and-w3c-trace-context-c2ce1b9dc390) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Jaeger vs. Zipkin: Battle of the Open Source Tracing Tools](https://thenewstack.io/jaeger-vs-zipkin-battle-of-the-open-source-tracing-tools) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Grafana Tempo](https://github.com/grafana/tempo) <span class='md-tag md-tag--info'>⭐ 5276</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span>
- [opensource.com: Get started with distributed tracing using Grafana Tempo](https://opensource.com/article/21/2/tempo-distributed-tracing) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Azure App Service Auto-Heal: Capturing Relevant Data During Performance Issues](https://techcommunity.microsoft.com/blog/appsonazureblog/azure-app-service-auto-heal-capturing-relevant-data-during-performance-issues/4390351) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [APM on Wikipedia](https://en.wikipedia.org/wiki/Application_performance_management) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [github.com/antonarhipov/awesome-apm: Awesome APM](https://github.com/antonarhipov/awesome-apm) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone.com: APM Tools Comparison](https://dzone.com/articles/apm-tools-comparison-which-one-should-you-choose) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dzone.com: Java Performance Monitoring: 5 Open Source Tools You Should Know](https://dzone.com/articles/java-performance-monitoring-5-open-source-tools-you-should-know) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [datadoghq.com](https://www.datadoghq.com) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Minimum Elasticsearch requirement is 6.2.x or higher](https://www.elastic.co/support/matrix#matrix_compatibility) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Elastic APM Server Docker image](https://github.com/sls-dev1/openshift-elastic-apm-server) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Monitoring Java applications with Elastic: Getting started with the Elastic APM Java Agent](https://www.elastic.co/blog/monitoring-java-applications-and-getting-started-with-the-elastic-apm-java-agent) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Jenkins pipeline shared library for the project Elastic APM 🌟](https://github.com/elastic/apm-pipeline-library) <span class='md-tag md-tag--info'>⭐ 11</span> <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [bqstack.com: Monitoring Application using Elastic APM](https://bqstack.com/b/detail/109) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Successful Kubernetes Monitoring: Three Pitfalls to Avoid](https://www.dynatrace.com/news/blog/successful-kubernetes-monitoring-3-pitfalls-to-avoid) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Tutorial: Guide to automated SRE-driven performance engineering 🌟](https://www.dynatrace.com/news/blog/guide-to-automated-sre-driven-performance-engineering-analysis) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: 4 steps to modernize your IT service operations with Dynatrace](https://www.dynatrace.com/news/blog/4-steps-to-modernize-your-it-service-operations-with-dynatrace) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: Analyze all AWS data in minutes with Amazon CloudWatch Metric Streams available in Dynatrace](https://www.dynatrace.com/news/blog/amazon-cloudwatch-metric-streams-launch-partnership) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: New Dynatrace Operator elevates cloud-native observability for Kubernetes](https://www.dynatrace.com/news/blog/new-dynatrace-operator-elevates-cloud-native-observability-for-kubernetes) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: How to collect Prometheus metrics in Dynatrace](https://www.dynatrace.com/news/blog/how-to-collect-prometheus-metrics-in-dynatrace) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [dynatrace.com: Automatic connection of logs and traces accelerates AI-driven cloud analytics](https://www.dynatrace.com/news/blog/automatic-connection-of-logs-and-traces-accelerates-ai-driven-cloud-analytics) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [devops.com: Dynatrace Advances Application Environments as Code](https://devops.com/dynatrace-advances-application-environments-as-code) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [thenewstack.io: Serverless Needs More Observability Tools](https://thenewstack.io/serverless-needs-more-observability-tools) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Apache Beam](https://beam.apache.org) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Krossboard](https://krossboard.app) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Krossboard: A centralized usage analytics approach for multiple Kubernetes](https://itnext.io/in-search-of-converged-usage-analytics-for-multiple-managed-kubernetes-c5108cb7f0e1) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [cloudbees.com: Automated Build and Deploy Feedback Using Jenkins and Instana 🌟](https://www.cloudbees.com/blog/automated-build-deploy-feedback-using-instana) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Netdata](https://github.com/netdata/netdata) <span class='md-tag md-tag--info'>⭐ 78906</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span>
- [PM2](https://github.com/Unitech/pm2) <span class='md-tag md-tag--info'>⭐ 43173</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span>
- [Huginn](https://github.com/huginn/huginn) <span class='md-tag md-tag--info'>⭐ 49312</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span>
- [OS Query](https://github.com/osquery/osquery) <span class='md-tag md-tag--info'>⭐ 23262</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span>
- [Glances](https://github.com/nicolargo/glances) <span class='md-tag md-tag--info'>⭐ 32615</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span>
- [TDengine](https://github.com/taosdata/TDengine) <span class='md-tag md-tag--info'>⭐ 24860</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span>
- [stackpulse.com: Automated Kubernetes Pod Restarting Analysis with StackPulse](https://stackpulse.com/blog/automated-kubernetes-pod-restarting-analysis-with-stackpulse) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Checkly](https://www.checklyhq.com) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [hashicorp.com: Monitoring as Code with Terraform Cloud and Checkly](https://www.hashicorp.com/blog/monitoring-as-code-with-terraform-cloud-and-checkly) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [network-king.net: IoT use in healthcare grows but has some pitfalls](https://network-king.net/iot-use-in-healthcare-grows-but-has-its-pitfalls) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Zebrium](https://www.zebrium.com) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [louislam/uptime-kuma](https://github.com/louislam/uptime-kuma) <span class='md-tag md-tag--info'>⭐ 87087</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span>
- [OpenTelemetry (OTel) vs Application Performance Monitoring (APM)](https://medium.com/@rahul.fiem/opentelemetry-otel-vs-application-performance-monitoring-apm-86ae829877cf) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [OOMKilled in Kubernetes: Understanding and Preventing Hidden Memory Leaks](https://unixarena.com/2025/04/oomkilled-in-kubernetes-the-hidden-memory-leaks-youre-missing.html) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [Grafana](https://grafana.com) <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span>
- [SigNoz: Open source Application Performance Monitoring (APM) & Observability tool 🌟](https://github.com/SigNoz/signoz) <span class='md-tag md-tag--info'>⭐ 26999</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span>
- [Prometheus JMX Exporter 🌟](https://github.com/prometheus/jmx_exporter) <span class='md-tag md-tag--info'>⭐ 3305</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span>
- [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) <span class='md-tag md-tag--info'>⭐ 7050</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span>
## Cloud Infrastructure
### Service Mesh
#### Istio Mesh
- [Istio.io](https://istio.io) <span class='md-tag md-tag--warning'>[EN CONTENT]</span> <span class='md-tag md-tag--critical'>[ADVANCED LEVEL]</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span> — The premier open-source service mesh providing advanced traffic management, end-to-end security, and granular observability. Uses Envoy proxies (via sidecars or Ambient mode) to secure and manage microservice fabrics.
## Cloud Native Infrastructure
### Observability
#### Distributed Tracing
##### Jaeger Platform
- [jaegertracing.io](https://www.jaegertracing.io) <span class='md-tag md-tag--primary'>[DOCUMENTATION]</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span> — The official gateway for Jaeger, a CNCF-graduated distributed tracing platform. Essential for microservice architectures to monitor transactions, perform root-cause analysis, optimize performance bottlenecks, and visualize complex request propagation paths.
#### Log Analysis
##### Visualization Tools
- [Kibana](https://www.elastic.co/kibana) <span class='md-tag md-tag--primary'>[DOCUMENTATION]</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span> — The foundational visualization and management interface for the Elastic Stack. Enables operators to search, index, analyze, and construct real-time security dashboards and log analysis patterns for high-throughput microservice applications.
## Cloud Native Languages
### Java
#### Performance Tuning
- [tier1app.com](https://tier1app.com) <span class='md-tag md-tag--warning'>[EN CONTENT]</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span> — A dedicated APM tool for analyzing Java thread dumps and performance. Provides automated diagnostics for thread contention and deadlocks to optimize JVM application responsiveness.
- [fastthread.io](https://fastthread.io) <span class='md-tag md-tag--warning'>[EN CONTENT]</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span> — Industrial-grade online Java thread dump analyzer that uses AI diagnostics to identify CPU spikes, thread leaks, and deadlock patterns. Essential for post-mortem analysis of containerized JVM workloads.
- [gceasy.io](https://gceasy.io) <span class='md-tag md-tag--warning'>[EN CONTENT]</span> <span class='md-tag md-tag--critical'>[ADVANCED LEVEL]</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span> — Machine-learning powered JVM Garbage Collection log analyzer. Automates the detection of memory leaks, GC pauses, and heap sizing misconfigurations, offering actionable recommendations for optimization.
- [heaphero.io](https://heaphero.io) <span class='md-tag md-tag--warning'>[EN CONTENT]</span> <span class='md-tag md-tag--critical'>[ADVANCED LEVEL]</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span> — An automated cloud-based JVM heap dump analyzer built to parse large memory dumps quickly. Detects memory leaks and optimizes data structure footprints to resolve OutOfMemoryError crashes.
## Event-Driven Architecture
### Apache Kafka
#### Tooling and UI
- [Kafdrop Kafka Web UI 🌟](https://github.com/obsidiandynamics/kafdrop) <span class='md-tag md-tag--info'>⭐ 6135</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span> <span class='md-tag md-tag--info'>[ENTERPRISE-STABLE]</span> — Highly popular, lightweight web UI for monitoring and managing Apache Kafka. Renders cluster info, brokers, topics, partition offsets, and consumer group lag, and allows inspection of JSON/protobuf message payloads.
## Infrastructure Operations
### Sysadmin Toolsets
#### Resource Curation
##### Awesome Lists
- [Awesome Sysadmin](https://github.com/awesome-foss/awesome-sysadmin) <span class='md-tag md-tag--info'>⭐ 33981</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span> — An incredibly rich curation containing production-grade open source utilities, control planes, networking layers, and security mechanisms used daily by systems architects and site reliability engineers.
## Observability (1)
### Telemetry Standards
#### OpenTelemetry vs Prometheus
- [Prometheus and OpenTelemetry Compatibility Issues](https://thenewstack.io/prometheus-and-opentelemetry-just-couldnt-get-along) <span class='md-tag md-tag--critical'>[ADVANCED LEVEL]</span> <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span> — An informative look at the historical data model incompatibilities between Prometheus and OpenTelemetry (OTel). It details the industry efforts to reconcile standard Prometheus structures with the broader OTel landscape.
## Observability and Performance
### Kubernetes Internals
#### Resource Management
- [The Hidden CPU Throttling Crisis in Kubernetes Clusters](https://www.kubenatives.com/p/the-hidden-cpu-throttling-crisis) <span class='md-tag md-tag--warning'>[EN CONTENT]</span> <span class='md-tag md-tag--critical'>[ADVANCED LEVEL]</span> <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span> — An in-depth analysis exposing the silent threat of CPU throttling inside Kubernetes clusters caused by rigid CFS quota management. Demonstrates how microservices suffer latency spikes even with low aggregate CPU consumption.
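
To make the CFS quota effect above concrete, here is a minimal Python sketch. It is not taken from the linked article. `throttled_ratio` parses the `nr_periods` and `nr_throttled` fields of a cgroup `cpu.stat` file, and `wall_time_ms` is a deliberately simplified, hypothetical model of a single-threaded burst that starts at a period boundary. The sample numbers are invented for illustration.

```python
import math

# Sample cpu.stat contents in cgroup v2 format; the values are made up.
SAMPLE_CPU_STAT = """\
usage_usec 4200000
user_usec 3900000
system_usec 300000
nr_periods 200
nr_throttled 50
throttled_usec 1750000
"""


def throttled_ratio(cpu_stat_text: str) -> float:
    """Return the fraction of CFS periods in which the cgroup was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


def wall_time_ms(cpu_needed_ms: float, quota_ms: float, period_ms: float = 100.0) -> float:
    """Wall-clock time for one single-threaded burst under a CFS quota.

    Simplifying assumptions (hypothetical model): the burst starts at the
    beginning of a period, nothing else runs in the cgroup, and the thread
    is paused as soon as the quota for the current period is used up.
    """
    if cpu_needed_ms <= 0:
        return 0.0
    if quota_ms >= period_ms:
        # One thread cannot use more than a full period, so it never throttles.
        return float(cpu_needed_ms)
    # Number of periods in which the whole quota is consumed before finishing.
    full_periods = math.ceil(cpu_needed_ms / quota_ms) - 1
    remaining_ms = cpu_needed_ms - full_periods * quota_ms
    return full_periods * period_ms + remaining_ms


if __name__ == "__main__":
    print(f"throttled ratio: {throttled_ratio(SAMPLE_CPU_STAT):.2f}")
    # A 100m CPU limit corresponds to a 10 ms quota per 100 ms period.
    print(f"30 ms of CPU work takes {wall_time_ms(30, 10):.0f} ms wall-clock")
```

In this model a request that needs 30 ms of CPU under a 100m limit finishes after 210 ms, although the container averages only about 14% of one core over that time. That is the latency-spike pattern the entry above describes.
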
### Performance Testing
#### HTTP Benchmarking
- [blog.cloud-mercato.com: New HTTP benchmark tool **pycurlb**](https://blog.cloud-mercato.com/new-http-benchmark-tool-pycurlb) <span class='md-tag md-tag--warning'>[EN CONTENT]</span> <span class='md-tag md-tag--critical'>[ADVANCED LEVEL]</span> <span class='md-tag md-tag--info'>[COMMUNITY-TOOL]</span> — A deep-dive introducing `pycurlb`, a fast performance tool wrapping libcurl for rapid HTTP request benchmarking in Python. Explores real-world performance results and technical comparisons.
## Operations and Reliability
### Observability and Monitoring
#### Foundations
- [Monitoring Distributed Systems - Google SRE Book](https://sre.google/sre-book/monitoring-distributed-systems) <span class='md-tag md-tag--critical'>[ADVANCED LEVEL]</span> <span class='md-tag md-tag--primary'>[DOCUMENTATION]</span> <span class='md-tag md-tag--success'>[DE FACTO STANDARD]</span> — The industry-standard chapter from Google's SRE book detailing the implementation of distributed systems monitoring. It defines the 'Four Golden Signals'—latency, traffic, errors, and saturation—providing practical blueprints to prevent alert fatigue and build actionable dashboard designs.
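
As a rough illustration of the four golden signals named above, the following Python sketch derives them from one window of request records. The `Request` type, the `golden_signals` function, the choice of a nearest-rank 95th percentile for latency, and the worker-pool definition of saturation are all assumptions made for this example; the SRE book does not prescribe them.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    """One observed request (hypothetical record shape)."""
    latency_ms: float
    status: int


def golden_signals(
    requests: List[Request],
    window_s: float,
    busy_workers: int,
    total_workers: int,
) -> Dict[str, float]:
    """Summarize latency, traffic, errors and saturation for one window."""
    count = len(requests)
    latencies = sorted(r.latency_ms for r in requests)
    if count:
        # Nearest-rank percentile: smallest value covering 95% of samples.
        rank = math.ceil(0.95 * count)
        p95 = latencies[rank - 1]
    else:
        p95 = 0.0
    # Treat HTTP 5xx responses as errors (an assumption for this sketch).
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "latency_p95_ms": p95,
        "traffic_rps": count / window_s,
        "error_ratio": errors / count if count else 0.0,
        "saturation": busy_workers / total_workers,
    }


if __name__ == "__main__":
    window = [
        Request(10.0, 200),
        Request(20.0, 200),
        Request(30.0, 500),
        Request(400.0, 200),
    ]
    print(golden_signals(window, window_s=2.0, busy_workers=3, total_workers=4))
```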
---
💡 **Explore Related:** [Mkdocs](./mkdocs.md) | [Cheatsheets](./cheatsheets.md) | [Git](./git.md)