docs: translate GEMINI.md to English and add sync mandate

Nubenetes Bot
2026-05-14 23:32:24 +02:00
parent 0cfbf65d0f
commit aa1cd0ce1f

@@ -1,22 +1,22 @@
# Nubenetes Intelligent Curation: Meta-Instructions & Learning Roadmap
This file contains the accumulated instructions and long-term vision for the autonomous maintenance of Nubenetes.com. AI agents must consult this document in every iteration to ensure learning continuity.
## 🧠 Core Mandates
1. **Information Preservation**: NEVER delete summaries, comments, or stars (🌟) accompanying links. The bot should only update the URL or reorganize the item's position, never delete the descriptive context.
2. **Persistent Learning**: Use `src/memory/health_learning.json` to store knowledge about domains (anti-bot blocks, successful strategies) and navigation patterns.
3. **Minimum Viable Quality (MVQ)**: For GitHub/GitLab repositories, the bot MUST check the last commit date. If the repository has had NO activity (commits) in more than **4 years**, it must receive a significantly lower `impact_score` and be deprioritized, even if the content remains technically relevant. This ensures Nubenetes stays fresh and focuses on maintained projects.
4. **Style Guide (Descriptive Summaries)**: All injected summaries MUST follow a **Descriptive** style. Avoid generic "clickbait" or action-oriented phrases (e.g., "Check this out"). Instead, provide a clear, neutral description of what the resource contains, its scope, and why it is technically significant for the Kubernetes ecosystem.
5. **Semantic Interlinking**: The bot should identify related categories for each resource. While the full entry is injected into the primary category, a short reference (*"See also: [Title](URL) in [Category]"*) should be added to up to two related categories to improve site navigation.
6. **Visual Health Dashboard**: Every curation run MUST generate a local `report.html` (outside the repo) for visual validation of metrics, quality (MVQ), and AI decisions.
7. **Total Resilience**: The workflow must continue even when individual link or file validations fail. Prioritize producing a result (PR), even a partial one.
8. **Repository Consolidation**: If a deep GitHub/GitLab link fails, always try to validate the repository root before declaring the link dead. Stable links to repository roots are preferred over volatile deep links.
9. **URL Expansion**: All shortened links (t.co, bit.ly, buff.ly, etc.) MUST be expanded to their original long version before being evaluated or injected. This ensures inventory homogeneity and improves global deduplication precision.
10. **Official Language (English Only)**: All injected content (titles, descriptions, headers), execution logs, and automated communications (PRs) MUST be exclusively in ENGLISH. Nubenetes is a global resource and linguistic consistency is critical.
11. **Workflow-Config Synchronization**: The GitHub Actions curation workflow form (`agentic_cron.yml`) MUST remain perfectly synchronized with the curation sources configuration file (`data/curation_sources.yaml`). Any addition, removal, or renaming of topics/categories in the configuration file requires a corresponding update to the workflow's input fields (checkboxes) to ensure users can toggle those sources manually. This maintains consistency between data-driven sources and the UI trigger.
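
Mandates 3 and 8 can be illustrated with a minimal Python sketch. The function names, the `STALE_PENALTY` factor, and the 4 × 365 day approximation of "4 years" are assumptions made for illustration and are not taken from the Nubenetes codebase. The two-segment `owner/repo` rule also ignores GitLab nested groups.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption: "more than 4 years" is approximated as 4 * 365 days.
STALE_AFTER = timedelta(days=4 * 365)
# Assumption: the size of the penalty is not specified in this document.
STALE_PENALTY = 0.5


def apply_mvq(impact_score: float, last_commit: datetime,
              now: Optional[datetime] = None) -> float:
    """Return a reduced impact_score when the last commit is too old."""
    now = now or datetime.now(timezone.utc)
    if now - last_commit > STALE_AFTER:
        return impact_score * STALE_PENALTY
    return impact_score


def repo_root(url: str) -> Optional[str]:
    """Map a deep GitHub/GitLab link to https://host/owner/repo, else None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host not in {"github.com", "gitlab.com"}:
        return None
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        return None
    # Keep only the first two path segments (owner and repository).
    return urlunsplit(("https", host, "/" + "/".join(segments[:2]), "", ""))
```

When a deep link fails, the curator could call `repo_root(url)` and validate the returned root before marking the entry dead. The descriptive summary attached to the entry stays untouched, in line with mandate 1.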
## 🛠️ Structural Evolution & Navigation
* **No Link Limits**: There are NO hard limits on the number of links per page or per section (##/###). Nubenetes is built to host thousands of references.
* **TOC Consistency**: Every `.md` page (including the main index `docs/index.md`) MUST maintain an internal Table of Contents (TOC) at the beginning. This TOC must include all sections (##) and subsections (###) nested correctly using a numbered list format with working anchors.
@@ -33,21 +33,21 @@ Este archivo contiene las instrucciones acumuladas y la visión de largo plazo p
* `mkdocs.yml` (Navigation menu).
* `docs/index.md` (Main Table of Contents).
* The internal TOC of the modified page.
* **Orphan Curation**: Periodically audit the `docs/` folder to find unlinked files and integrate them into the navigation based on their topic.
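
The TOC Consistency rule above can be sketched as a small generator that scans `##` and `###` headings and emits a nested numbered list with anchors. This is a sketch only: `build_toc` and `slugify` are hypothetical names, and the anchor rule (lowercase, punctuation dropped, spaces to hyphens) is an assumption that may differ from the slugs MkDocs actually produces.

```python
import re
from typing import List

HEADING = re.compile(r"^(#{2,3})\s+(.+?)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def slugify(title: str) -> str:
    """Assumed anchor rule: lowercase, drop punctuation, spaces to hyphens."""
    cleaned = re.sub(r"[^\w\s-]", "", title.lower())
    return re.sub(r"\s+", "-", cleaned.strip())


def build_toc(markdown: str) -> List[str]:
    """Return TOC lines for every ## and ### heading outside code fences."""
    toc: List[str] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if not match:
            continue
        level, title = len(match.group(1)), match.group(2)
        indent = "" if level == 2 else "    "
        # "1." on every item lets the Markdown renderer do the numbering.
        toc.append(f"{indent}1. [{title}](#{slugify(title)})")
    return toc
```

Comparing the output of `build_toc` with the TOC already present at the top of a page is one way to detect drift after a section is added or renamed.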
## 🚀 Block Evasion Strategies
The bot must rotate between profiles to avoid detection:
1. **Desktop/Google**: Standard desktop request.
2. **Mobile/Twitter**: Mobile request with Twitter Referer (high success rate).
3. **Playwright/LinkedIn**: Real navigation with JS enabled.
4. **Firefox/Reddit**: Alternative desktop profile.
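
The rotation above could be sketched as a round-robin over the four profiles, capped at the 5 attempts mentioned in the Learning Diary. The profile identifiers and the `fetch` callback are hypothetical placeholders; the real request logic (headers, Playwright session) is not shown in this document.

```python
from itertools import cycle, islice
from typing import Callable, List, Optional

# Hypothetical identifiers for the four profiles listed above.
PROFILES = ["desktop_google", "mobile_twitter",
            "playwright_linkedin", "firefox_reddit"]
MAX_ATTEMPTS = 5


def attempt_order(profiles: List[str],
                  max_attempts: int = MAX_ATTEMPTS) -> List[str]:
    """Cycle through the profiles in order until max_attempts is reached."""
    return list(islice(cycle(profiles), max_attempts))


def fetch_with_rotation(url: str, fetch: Callable[[str, str], bool],
                        profiles: List[str] = PROFILES,
                        max_attempts: int = MAX_ATTEMPTS) -> Optional[str]:
    """Return the first profile for which fetch(url, profile) succeeds."""
    for profile in attempt_order(profiles, max_attempts):
        if fetch(url, profile):
            return profile
    return None
```

The name of the winning profile is the kind of per-domain knowledge that could be recorded in `src/memory/health_learning.json`, as the Persistent Learning mandate describes.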
## 📈 Learning Diary (Improvement History)
* **May 2026**: Initial implementation of the autonomous engine with Playwright and Wayback Machine.
* **May 2026**: Added Multidimensional Evasion system (5 attempts, profile rotation).
* **May 2026**: Creation of `AgenticCurator` for navigation audit and repository consolidation.
* **May 2026**: Generation of PRs with visual analytics (Mermaid) and Health Matrix.
* **May 2026**: Implementation of Backup-based Curation (JSON/MD) to avoid X.com blocks.
* **May 2026**: Implementation of multi-source curation and category-based filtering in the GitHub Actions workflow.