# Go vs Node.js Cache Comparison

Systematic comparison of Go pipeline output (`_cache/`) vs Node.js production cache (`LIVE_cache/`). Generated by `cmd/comparecache`.

Latest run: 2026-03-10. `-no-fetch` rebuild from existing raw data, with variant tagging and legacy export filter applied.

## Summary (latest version only)

| Category | Count | Meaning |
|----------|-------|---------|
| match | 21 | Identical asset filenames at latest version |
| go-missing | 4 | Go produces no output (alias, meta-package, no config) |
| live-missing | 16 | Package exists in Go but not in live cache |
| go-extra-versions | 44 | Go has more version history (deeper fetch) |
| live-extra-versions | 11 | Live has newer data (rate-limited Go fetch) |
| go-extra-assets | 19 | Go includes assets that Node.js filters out |
| live-extra-assets | 41 | Node.js includes assets that Go filters out |
| live-has-meta | 41 | Node.js includes meta-assets (checksums, sigs) |

Changes since last comparison: variant tagging (bun, pwsh, ollama, git, node, lsd, fish, xcaddy) and legacy export filter (strips Variants-tagged assets and non-legacy formats like .deb, .rpm, .tar.bz2, .tar.zst, .7z) are now active. Match count dropped from 28→21 because .deb files that Node.js keeps are now correctly categorized as live-extra-assets (Go filters them from legacy output).

## Key Observations

### 1. Classification Timing

The Node.js cache stores assets with **empty** os/arch/ext fields — `normalize.js` fills those at serve time. The Go pipeline classifies at write time. The Go cache has richer data per-asset. Comparison is done at the **filename level**.

### 2. Meta-Asset Filtering

Go's `isMetaAsset()` filters out checksums, signatures, SBOMs, etc. Node.js keeps them. This accounts for 41 packages showing `live-has-meta` differences. **Correct behavior** — Go filters non-installable files at cache time.

### 3. Version Depth

Go has deeper version history for most GitHub-sourced packages (fetches all pages unless `-shallow`). Node.js limits to 30 releases per API call. This is a **feature** — Go provides complete histories when doing a full fetch.

### 4. Build Variants (IMPLEMENTED)

Variant tagging is now active. Per-package taggers in `internal/releases/{pkg}/`. Assets with Variants are stored but excluded from legacy export.

- **bun**: `-profile` → Variants: ["profile"]; non-baseline → Arch: amd64v3
- **ollama**: `-rocm`, `-jetpack5`, `-jetpack6` → Variants
- **pwsh**: `-fxdependent`, `-fxdependentWinDesktop` → Variants
- **git**: `.exe` and PortableGit → Variants: ["installer"]; `-pdb` → ["pdb"]
- **node**: `.msi` → Variants: ["installer"]
- **lsd**: `.deb` → ["deb"]; `-msvc` → ["msvc"]
- **fish**: `.pkg` → Variants: ["installer"]
- **xcaddy**: `.deb` → Variants: ["deb"]

### 5. Legacy Export Filter (IMPLEMENTED)

`ExportLegacy` now strips:

- Assets with non-empty `Variants`
- Formats Node.js doesn't handle: `.deb`, `.rpm`, `.snap`, `.appx`, `.tar.zst`, `.tar.bz2`, `.7z`

This means the `_cache/` JSON files only contain assets the Node.js server can actually serve.

### 6. Format Handling

All formats are stored in the internal Go model — nothing is dropped at classification time. The legacy filter applies only at export time.

- `.pkg` — `pkgutil --expand-full`
- `.deb` — `ar x` + `tar xf data.tar.*`
- `.dmg` — `hdiutil attach`
- `.msi` — `msiexec /a`

Only `.exe` is ambiguous (binary vs installer). Installer `.exe` files get `Variants: ["installer"]`.

### 7. Node Multi-Source

The `node` package merges official + unofficial builds via `unofficial_url` in releases.conf. Down to 4 differences at latest version:

- Live has `.7z` (filtered from Go legacy export) and `.msi` (Go tags as installer)
- Go has `.exe` bare binary that live doesn't (naming diff)

## Per-Package Checklist

Status: `[x]` reviewed, `[-]` known acceptable, `[ ]` needs work

### Exact Matches at Latest Version (21)

- [x] atomicparsley
- [x] awless
- [x] chromedriver (chromedist)
- [x] comrak
- [x] dotenv-linter
- [x] gpg (gpgdist)
- [x] iterm2 (iterm2dist)
- [x] julia (juliadist)
- [x] koji
- [x] lsd — .deb and msvc variants now correctly filtered
- [x] mariadb (mariadbdist)
- [x] pathman
- [x] sass
- [x] sd
- [x] shellcheck
- [x] shfmt
- [x] sqlc
- [x] terraform (hashicorp)
- [x] xcaddy — .deb variants now correctly filtered
- [x] xsv
- [x] zig (zigdist)

### Go Missing (4)

- [-] dashd — alias_of=dashcore (correct)
- [-] macos — no releases.conf
- [-] pg-essentials — meta-package
- [-] zig.vim — gittag source, 0 raw data

### Live Missing — Go-Only (14)

- [-] pg — Go alias, live uses postgres
- [-] ripgrep — Go alias, live uses rg
- [-] rust.vim — symlink to vim-rust
- [-] vim-* (11 packages) — gittag packages not in live cache

### Meta-Only Diffs (Go correctly filters, Node.js keeps)

- [-] caddy, cilium, cmake, curlie, dashmsg, deno, dotenv, ffuf, fzf, gh, gitdeploy, goreleaser, gprox, grype, hugo, k9s, keypairs, kind, kubectx, kubens, monorel, mutagen, ots, rclone, rg, runzip, sclient, sqlpkg, sttr, syncthing, terramate, watchexec, xz, yq (41 packages)

### Live has .deb/.rpm that Go correctly filters from legacy export

- [-] bat — 8 .deb files
- [-] caddy — 9 .deb files
- [-] delta — 5 .deb files
- [-] fd — 8 .deb files
- [-] gh — 8 .deb/.rpm files
- [-] goreleaser — 18 .deb/.rpm files
- [-] grype — 8 .deb/.rpm files
- [-] hexyl — 7 .deb files
- [-] k9s — 10 .deb/.rpm files
- [-] pandoc — 2 .deb files
- [-] pwsh — 4 .deb/.rpm files
- [-] rclone — 16 .deb/.rpm files
- [-] sttr — 9 .deb/.rpm files
- [-] syncthing — 5 .deb/.rpm files
- [-] terramate — 6 .deb/.rpm files
- [-] tinygo — 3 .deb files
- [-] trip — 3 .deb files
- [-] watchexec — 16 .deb files
- [-] zoxide — 4 .deb files

### Remaining Go-Extra-Assets (need review)

- [x] bun — baseline builds now serve as legacy amd64 (filename stripped, download URL kept); non-baseline tagged as v3 variant (excluded).
- [x] fish — source tarball tagged as variant; linux .tar.xz binaries are correct (Node.js just doesn't have them yet)
- [x] git — busybox and pdbs-for-git tagged as variants
- [-] hugo — 1 extra: `Linux-64bit.tar.gz` (old naming); keep as-is for now
- [-] hugo-extended — 14 extras: non-extended assets leaking in; keep as-is for now
- [x] kubectx — asset_filter splits shared release
- [x] kubens — asset_filter splits shared release
- [x] node — .exe bare binary stored with "bare-exe" variant (Go can serve, legacy excludes); .msi tagged as installer
- [x] ollama — Ollama-darwin.zip tagged as "app" variant (Go can install, legacy excludes); .tgz normalized to .tar.gz in filename
- [-] uuidv7 — exotic arches correctly classified; resolver filters by request
- [x] yq — man_page_only excluded via releases.conf
- [x] ffmpeg — asset_filter=ffmpeg excludes ffprobe/ffplay; .LICENSE/.README now caught by isMetaAsset

### Source/Naming Diffs

- [-] aliasman — source tarball naming differences (GitHub archive format)
- [-] duckdns.sh — source tarball naming differences
- [-] serviceman — source naming + version differences

### Stale Data (rate-limited, need re-fetch with token)

- [-] go — live has 98 extra versions (Go didn't fetch golang.org)
- [-] lf — live has 30 extra versions
- [-] postgres, psql — Go has v17, live has v18
- [-] ffmpeg — Go has older, live has newer

### Cross-Package Issues

- [x] kubectx/kubens — resolved via asset_filter in releases.conf

## Remaining Action Items

1. ~~**hugo-extended exclude**~~: Deferred — keep matching Node.js behavior for now
2. ~~**kubectx/kubens split**~~: Resolved — asset_filter in releases.conf
3. ~~**bun baseline in legacy**~~: Resolved — baseline is legacy amd64, non-baseline tagged as v3 variant
4. **Re-fetch with GITHUB_TOKEN**: Fix rate-limited/stale packages
5. **Unknown asset notifications**: Log new/unrecognized assets to `_notices/`

## Deferred Decisions

1. **Consolidate cmd/classify and cmd/webicached duplication**: Both have their own `classifyPackage` switch, `isMetaAsset`, `detectFormat`, and GitHub API types (`ghRelease`, `ghAsset`, etc.). `cmd/classify` is a diagnostic tool (CSV output), `cmd/webicached` is the production pipeline (`[]storage.Asset`). Shared pieces could move to `internal/` packages. Keep separate dispatchers since they return different types.
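
As a rough illustration of the consolidation described above, the sketch below shows what shared helpers might look like if the duplicated meta-asset check, format detection, and legacy export rule lived in one place. This is not the repository's code: the exported names `IsMetaAsset`, `DetectFormat`, and `KeepInLegacy`, the `Asset` struct, and the suffix lists are all hypothetical, and the meta-asset patterns are an incomplete guess at "checksums, signatures, SBOMs, etc." Only the non-legacy format list and the "non-empty `Variants`" rule are taken from the Legacy Export Filter section. It uses `package main` so it runs standalone; in the repo the helpers would presumably sit under `internal/`.

```go
package main

import (
	"fmt"
	"strings"
)

// Asset is a hypothetical, trimmed-down stand-in for storage.Asset.
type Asset struct {
	Filename string
	Format   string
	Variants []string
}

// metaSuffixes is an assumed, incomplete list of non-installable file suffixes.
var metaSuffixes = []string{".sha256", ".sha512", ".asc", ".sig", ".sbom", ".license", ".readme"}

// knownFormats lists multi-part extensions first so ".tar.gz" is matched
// before any shorter suffix could be.
var knownFormats = []string{
	".tar.gz", ".tar.xz", ".tar.bz2", ".tar.zst", ".tgz",
	".zip", ".7z", ".deb", ".rpm", ".snap", ".appx",
	".pkg", ".dmg", ".msi", ".exe",
}

// nonLegacyFormats mirrors the formats the document says ExportLegacy strips.
var nonLegacyFormats = map[string]bool{
	".deb": true, ".rpm": true, ".snap": true, ".appx": true,
	".tar.zst": true, ".tar.bz2": true, ".7z": true,
}

// IsMetaAsset reports whether a filename looks like a checksum, signature,
// SBOM, or similar non-installable file (case-insensitive).
func IsMetaAsset(name string) bool {
	lower := strings.ToLower(name)
	for _, suf := range metaSuffixes {
		if strings.HasSuffix(lower, suf) {
			return true
		}
	}
	return strings.Contains(lower, "checksums")
}

// DetectFormat returns the first known extension the filename ends with,
// or "" if none matches.
func DetectFormat(name string) string {
	lower := strings.ToLower(name)
	for _, ext := range knownFormats {
		if strings.HasSuffix(lower, ext) {
			return ext
		}
	}
	return ""
}

// KeepInLegacy applies the two legacy export rules: no variants, and no
// format outside what the Node.js server handles.
func KeepInLegacy(a Asset) bool {
	return len(a.Variants) == 0 && !nonLegacyFormats[a.Format]
}

func main() {
	for _, name := range []string{
		"tool-linux-amd64.tar.gz",
		"tool_1.2.3_amd64.deb",
		"tool_1.2.3_checksums.txt",
	} {
		a := Asset{Filename: name, Format: DetectFormat(name)}
		legacy := !IsMetaAsset(name) && KeepInLegacy(a)
		fmt.Printf("%-28s meta=%-5v format=%-10q legacy=%v\n",
			name, IsMetaAsset(name), a.Format, legacy)
	}
}
```

With helpers like these shared, `cmd/classify` and `cmd/webicached` could each keep their own `classifyPackage` dispatcher and differ only in the output type they build from the same predicates.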