From aa1cd0ce1f654576914775820aa2d6f2ac25b38e Mon Sep 17 00:00:00 2001
From: Nubenetes Bot
Date: Thu, 14 May 2026 23:32:24 +0200
Subject: [PATCH] docs: translate GEMINI.md to English and add sync mandate

---
 GEMINI.md | 54 +++++++++++++++++++++++++++---------------------------
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/GEMINI.md b/GEMINI.md
index ff376aed..9e800ab3 100644
--- a/GEMINI.md
+++ b/GEMINI.md
@@ -1,22 +1,22 @@
 # Nubenetes Intelligent Curation: Meta-Instructions & Learning Roadmap
 
-Este archivo contiene las instrucciones acumuladas y la visión de largo plazo para el mantenimiento autónomo de Nubenetes.com. Los agentes de IA deben consultar este documento en cada iteración para garantizar la continuidad del aprendizaje.
+This file contains the accumulated instructions and long-term vision for the autonomous maintenance of Nubenetes.com. AI agents must consult this document in every iteration to ensure learning continuity.
 
-## 🧠 Core Mandates (Mandatos Principales)
+## 🧠 Core Mandates
 
-1. **Preservación de la Información**: NUNCA elimines resúmenes, comentarios o estrellas (🌟) que acompañan a los enlaces. El bot solo debe actualizar la URL o reorganizar la posición del ítem, nunca borrar el contexto descriptivo.
-2. **Aprendizaje Persistente**: Utiliza `src/memory/health_learning.json` para almacenar el conocimiento sobre dominios (bloqueos anti-bot, estrategias exitosas) y patrones de navegación.
-3. **Minimum Viable Quality (MVQ)**: For GitHub/GitLab repositories, the bot MUST check the last commit date. If the repository has had NO activity (commits) in more than **4 years**, it must receive a significantly lower `impact_score` and be deprioritized, even if the content remains technically relevant. This ensures Nubenetes stays fresh and focuses on maintained projects.
-4. **Style Guide (Descriptive Summaries)**: All injected summaries MUST follow a **Descriptive** style. Avoid generic "clickbait" or action-oriented phrases (e.g., "Check this out"). Instead, provide a clear, neutral description of what the resource contains, its scope, and why it is technically significant for the Kubernetes ecosystem.
-5. **Semantic Interlinking**: The bot should identify related categories for each resource. While the full entry is injected into the primary category, a short reference (*"See also: [Title](URL) in [Category]"*) should be added to up to two related categories to improve site navigation.
-6. **Visual Health Dashboard**: Every curation run MUST generate a local `report.html` (outside the repo) for visual validation of metrics, quality (MVQ), and AI decisions.
-7. **Resiliencia Total**: El workflow debe ser capaz de continuar incluso si hay errores individuales en validaciones de links o archivos. Prioriza generar un resultado (PR) aunque sea parcial.
-8. **Consolidación de Repositorios**: Ante un fallo en un enlace profundo de GitHub/GitLab, intenta siempre validar la raíz del repositorio antes de darlo por muerto. Preferimos enlaces estables a raíces de repositorios que deep-links volátiles.
-9. **Expansión de URLs**: Todos los enlaces acortados (t.co, bit.ly, buff.ly, etc.) DEBEN ser expandidos a su versión larga original antes de ser evaluados o inyectados. Esto garantiza la homogeneidad del inventario y mejora la precisión de la deduplicación global.
-10. **Idioma Oficial (English Only)**: Todo el contenido inyectado (títulos, descripciones, encabezados), los logs de ejecución y las comunicaciones automatizadas (PRs) DEBEN ser exclusivamente en INGLÉS. Nubenetes es un recurso global y la consistencia lingüística es crítica.
-11. **Workflow-Config Synchronization**: The GitHub Actions curation workflow form (`agentic_cron.yml`) MUST remain perfectly synchronized with the curation sources configuration file (`data/curation_sources.yaml`). Any addition, removal, or renaming of topics/categories in the configuration file requires a corresponding update to the workflow's input fields (checkboxes) to ensure users can toggle those sources manually. This maintain consistency between data-driven sources and the UI trigger.
+1. **Information Preservation**: NEVER delete summaries, comments, or stars (🌟) accompanying links. The bot should only update the URL or reorganize the item's position, never delete the descriptive context.
+2. **Persistent Learning**: Use `src/memory/health_learning.json` to store knowledge about domains (anti-bot blocks, successful strategies) and navigation patterns.
+3. **Minimum Viable Quality (MVQ)**: For GitHub/GitLab repositories, the bot MUST check the last commit date. If the repository has had NO activity (commits) in more than **4 years**, it must receive a significantly lower `impact_score` and be deprioritized, even if the content remains technically relevant. This ensures Nubenetes stays fresh and focuses on maintained projects.
+4. **Style Guide (Descriptive Summaries)**: All injected summaries MUST follow a **Descriptive** style. Avoid generic "clickbait" or action-oriented phrases (e.g., "Check this out"). Instead, provide a clear, neutral description of what the resource contains, its scope, and why it is technically significant for the Kubernetes ecosystem.
+5. **Semantic Interlinking**: The bot should identify related categories for each resource. While the full entry is injected into the primary category, a short reference (*"See also: [Title](URL) in [Category]"*) should be added to up to two related categories to improve site navigation.
+6. **Visual Health Dashboard**: Every curation run MUST generate a local `report.html` (outside the repo) for visual validation of metrics, quality (MVQ), and AI decisions.
+7. **Total Resilience**: The workflow must be able to continue even if there are individual errors in link or file validations. Prioritize generating a result (PR) even if it is partial.
+8. **Repository Consolidation**: In case of a failure in a deep GitHub/GitLab link, always try to validate the repository root before considering it dead. We prefer stable links to repository roots over volatile deep-links.
+9. **URL Expansion**: All shortened links (t.co, bit.ly, buff.ly, etc.) MUST be expanded to their original long version before being evaluated or injected. This ensures inventory homogeneity and improves global deduplication precision.
+10. **Official Language (English Only)**: All injected content (titles, descriptions, headers), execution logs, and automated communications (PRs) MUST be exclusively in ENGLISH. Nubenetes is a global resource and linguistic consistency is critical.
+11. **Workflow-Config Synchronization**: The GitHub Actions curation workflow form (`agentic_cron.yml`) MUST remain perfectly synchronized with the curation sources configuration file (`data/curation_sources.yaml`). Any addition, removal, or renaming of topics/categories in the configuration file requires a corresponding update to the workflow's input fields (checkboxes) to ensure users can toggle those sources manually. This maintains consistency between data-driven sources and the UI trigger.
 
-## 🛠️ Structural Evolution & Navigation (Evolución Estructural)
+## 🛠️ Structural Evolution & Navigation
 
 * **No Link Limits**: There are NO hard limits on the number of links per page or per section (##/###). Nubenetes is built to host thousands of references.
 * **TOC Consistency**: Every `.md` page (including the main index `docs/index.md`) MUST maintain an internal Table of Contents (TOC) at the beginning. This TOC must include all sections (##) and subsections (###) nested correctly using a numbered list format with working anchors.
@@ -33,21 +33,21 @@ Este archivo contiene las instrucciones acumuladas y la visión de largo plazo p
     * `mkdocs.yml` (Navigation menu).
     * `docs/index.md` (Main Table of Contents).
     * The internal TOC of the modified page.
-
 * **Orphan Curation**: Periodically audit the `docs/` folder to find unlinked files and integrate them into the navigation based on their topic.
 
-## 🚀 Estrategias de Evasión de Bloqueos
+## 🚀 Block Evasion Strategies
 
-El bot debe rotar entre perfiles para evitar ser detectado:
-1. **Desktop/Google**: Petición estándar de escritorio.
-2. **Mobile/Twitter**: Petición móvil con Referer de Twitter (alta tasa de éxito).
-3. **Playwright/LinkedIn**: Navegación real con JS habilitado.
-4. **Firefox/Reddit**: Perfil alternativo de escritorio.
+The bot must rotate between profiles to avoid detection:
+1. **Desktop/Google**: Standard desktop request.
+2. **Mobile/Twitter**: Mobile request with Twitter Referer (high success rate).
+3. **Playwright/LinkedIn**: Real navigation with JS enabled.
+4. **Firefox/Reddit**: Alternative desktop profile.
 
-## 📈 Diario de Aprendizaje (Historial de Mejoras)
+## 📈 Learning Diary (Improvement History)
 
-* **Mayo 2026**: Implementación inicial del motor autónomo con Playwright y Wayback Machine.
-* **Mayo 2026**: Añadido sistema de Evasión Multidimensional (5 intentos, rotación de perfiles).
-* **Mayo 2026**: Creación del `AgenticCurator` para auditoría de navegación y consolidación de repositorios.
-* **Mayo 2026**: Generación de PRs con analíticas visuales (Mermaid) y Matriz de Salud.
-* **Mayo 2026**: Implementación de Curaduría vía Backup (JSON/MD) para evitar bloqueos de X.com.
+* **May 2026**: Initial implementation of the autonomous engine with Playwright and Wayback Machine.
+* **May 2026**: Added Multidimensional Evasion system (5 attempts, profile rotation).
+* **May 2026**: Creation of `AgenticCurator` for navigation audit and repository consolidation.
+* **May 2026**: Generation of PRs with visual analytics (Mermaid) and Health Matrix.
+* **May 2026**: Implementation of Backup-based Curation (JSON/MD) to avoid X.com blocks.
+* **May 2026**: Implementation of multi-source curation and category-based filtering in GitHub Workflow.
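
Review note on mandate 3 (MVQ) in the patched `GEMINI.md`: the rule that a repository with no commits in more than 4 years receives a significantly lower `impact_score` could be sketched roughly as below. This is only an illustration under stated assumptions, not the bot's actual code: the function name `apply_mvq`, the 0.25 penalty factor, and the approximation of 4 years as 4 × 365 days are all hypothetical, since the document only says "significantly lower".

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# "More than 4 years" from mandate 3, approximated in days (assumption).
STALE_AFTER = timedelta(days=4 * 365)
# Hypothetical penalty; the source only requires a "significantly lower" score.
STALE_FACTOR = 0.25


def apply_mvq(impact_score: float, last_commit: datetime,
              now: Optional[datetime] = None) -> float:
    """Return impact_score, reduced when the last commit is older than STALE_AFTER."""
    if now is None:
        now = datetime.now(timezone.utc)
    if now - last_commit > STALE_AFTER:
        # Stale repository: deprioritize even if the content is still relevant.
        return impact_score * STALE_FACTOR
    return impact_score


if __name__ == "__main__":
    run_date = datetime(2026, 5, 14, tzinfo=timezone.utc)
    stale = datetime(2020, 1, 1, tzinfo=timezone.utc)
    fresh = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(apply_mvq(80.0, stale, run_date))  # stale repository, score is reduced
    print(apply_mvq(80.0, fresh, run_date))  # maintained repository, score unchanged
```

Passing `now` explicitly keeps the check deterministic in tests; the last commit date itself would have to come from the GitHub or GitLab API, which this sketch deliberately leaves out.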