diff --git a/README.md b/README.md
index 5f624b0a..d3905f59 100644
--- a/README.md
+++ b/README.md
@@ -307,7 +307,8 @@ To maximize economic efficiency, all AI agents follow a **Database-First** appro
 ### 6.3. Database Lifecycle and Hygiene
 To maintain a high-performance "Single Source of Truth", Nubenetes implements automated hygiene protocols:
 
-- **Semantic Drift Detection**: Using **SHA256 Content Fingerprinting**, the system monitors for silent updates in technical resources. If the content of a link changes significantly (e.g., a version update or blog rewrite), it is automatically flagged for AI re-evaluation to refresh its summary and impact score.
+- **Physical File Synchronization**: During the health check cycle, the engine performs **surgical line-by-line updates** on the V1 Markdown files. Dead links are physically removed, and permanent redirections (301/302) are updated to their **Canonical URLs**, ensuring the repository remains clean and low-latency.
+- **Semantic Drift Detection**: Using **SHA256 Content Fingerprinting**, the system monitors for silent updates. If resource content changes significantly, it is automatically flagged for AI re-evaluation to refresh its summary and impact score.
 - **GitHub Branch Auto-Heal**: If a deep link returns a 404, the engine automatically attempts to rescue it by migrating the path from `master` to `main`. Verified revivals are automatically updated in the V1 archive.
 - **Parked Domain Detection**: Using AI-driven content inspection, the engine identifies expired domains displaying "Buy this domain" parking pages, marking them as `DEAD` even if they return an HTTP 200 status.
 - **Auto-Redirect Fix (Canonical Updates)**: During health checks, if a permanent redirection (301/302) is detected, the engine automatically updates the Markdown files with the final **Canonical URL**. This reduces latency and prevents future link rot.
@@ -475,10 +476,9 @@ graph TD
 The heart of the new Nubenetes is a suite of AI Agents that operate on our `develop` branch:
 
 1. **AgenticCurator (`src/agentic_curator.py`)**:
-    - **Discovery:** Scans multiple high-trust X.com accounts (defined in [`data/curation_sources.yaml`](data/curation_sources.yaml)) and other curation sources.
-    - **Evaluation:** Uses Gemini to score resources based on technical significance, impact, and **publication year**.
-    - **Classification:** Automatically maps new resources to the correct `.md` page using semantic matching and generates professional technical descriptions.
-    - **Primary Curation Sources (X.com):**
+    - **Discovery:** Scans multiple high-trust X.com accounts and RSS feeds.
+    - **Quality Hardening (Mandate 2 & 3):** Systematically filters known blacklisted domains and applies technical impact penalties to stale GitHub repositories (>4 years without activity) to protect V2 Elite standards.
+    - **Classification:** Automatically maps new resources using the **Recursive technical hierarchy** and generates multi-language descriptions (Native for V1, English for V2).
         * **K8s & Cloud Native:** `@nubenetes`, `@kubernetesio`, `@cncf`, `@kelseyhightower`, `@memenetes`.
         * **Hyperscalers:** `@awscloud`, `@Azure`, `@GoogleCloud`, `@0GiS0`, `@NTFAQGuy`, `@cantrillio`, `@pvergadia`, `@QuinnyPig`.
         * **AI & Agents:** `@OpenAI`, `@AnthropicAI`, `@GoogleDeepMind`, `@GoogleAI`, `@LoganK`, `@NotebookLM`, `@LangChainAI`, `@llama_index`.
@@ -564,7 +564,7 @@ graph LR
 ```
 
 ### 9.5. Automated Mandate Auditing
-Every Pull Request generated by the Agentic engine includes a non-blocking **Safety and Mandate Audit** report. This report cross-references the changes against the foundations defined in [`GEMINI.md`](GEMINI.md):
+Every Pull Request generated by the system (covering both **Curation** and **Health Check** cycles) includes a non-blocking **Safety and Mandate Audit** report. This report cross-references the changes against the foundations defined in [`GEMINI.md`](GEMINI.md):
 - **Data Integrity**: Checks for accidental star degradation or V1 description loss.
 - **Architecture Compliance**: Verifies the recursive O'Reilly hierarchy and TOC presence in V2.
 - **MVQ Validation**: Audits the Elite portal for stale or low-impact repositories.
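
Reviewer note: the **Physical File Synchronization** bullet added in the first hunk describes line-by-line updates that remove dead links and rewrite redirected ones to their canonical URLs. The patch does not show the engine's code, so the following is only a minimal sketch of that idea. The function name `sync_markdown_lines`, its arguments, and the URL pattern are all hypothetical. It also assumes one link entry per Markdown line.

```python
import re

# Hypothetical URL pattern: stops at whitespace and at the ")", ">" or "]"
# that closes a Markdown link. URLs containing parentheses are not handled.
URL_RE = re.compile(r"https?://[^\s)>\]]+")


def sync_markdown_lines(lines, dead_urls, redirects):
    """Return a new list of lines with dead links removed and redirects rewritten.

    lines      -- Markdown lines, one link entry per line (an assumption)
    dead_urls  -- set of URLs the health check marked as dead
    redirects  -- dict mapping an old URL to its canonical replacement
    """
    kept = []
    for line in lines:
        urls = URL_RE.findall(line)
        # Simplification: drop the whole line if any of its URLs is dead.
        if any(url in dead_urls for url in urls):
            continue
        # Swap each redirected URL for its canonical target; leave others as-is.
        kept.append(
            URL_RE.sub(lambda m: redirects.get(m.group(0), m.group(0)), line)
        )
    return kept


if __name__ == "__main__":
    page = [
        "- [Old](http://example.com/old) tool",
        "- [Dead](https://dead.example/x)",
        "plain text",
    ]
    for out in sync_markdown_lines(
        page,
        dead_urls={"https://dead.example/x"},
        redirects={"http://example.com/old": "https://example.com/new"},
    ):
        print(out)
```

Returning a new list instead of editing files in place keeps the sketch easy to test. A real engine would also need to decide what to do with lines that mix live and dead links, which this version simply drops.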