AJ ONeal cdec995183 ref(node): remove node-official/node-unofficial split packages
The node package already merges both sources via unofficial_url
in releases.conf. The split packages were a workaround that
produced cache files not present in the live Node.js cache.
2026-03-10 16:58:26 -06:00


Go vs Node.js Cache Comparison

Systematic comparison of Go pipeline output (_cache/) vs Node.js production cache (LIVE_cache/). Generated by cmd/comparecache.

Latest run: 2026-03-10, a -no-fetch rebuild from existing raw data with variant tagging and the legacy export filter applied.

Summary (latest version only)

| Category | Count | Meaning |
| --- | --- | --- |
| match | 21 | Identical asset filenames at latest version |
| go-missing | 4 | Go produces no output (alias, meta-package, no config) |
| live-missing | 16 | Package exists in Go but not in live cache |
| go-extra-versions | 44 | Go has more version history (deeper fetch) |
| live-extra-versions | 11 | Live has newer data (rate-limited Go fetch) |
| go-extra-assets | 19 | Go includes assets that Node.js filters out |
| live-extra-assets | 41 | Node.js includes assets that Go filters out |
| live-has-meta | 41 | Node.js includes meta-assets (checksums, sigs) |

Changes since last comparison: variant tagging (bun, pwsh, ollama, git, node, lsd, fish, xcaddy) and legacy export filter (strips Variants-tagged assets and non-legacy formats like .deb, .rpm, .tar.bz2, .tar.zst, .7z) are now active. Match count dropped from 28→21 because .deb files that Node.js keeps are now correctly categorized as live-extra-assets (Go filters them from legacy output).

Key Observations

1. Classification Timing

The Node.js cache stores assets with empty os/arch/ext fields; normalize.js fills them in at serve time. The Go pipeline classifies at write time, so the Go cache carries richer per-asset data. Because the two caches cannot be compared field by field, the comparison is done at the filename level.
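A filename-level comparison amounts to a set difference in both directions. The following is a minimal sketch under that assumption; `diffFilenames` and the sample filenames are hypothetical and are not taken from cmd/comparecache.

```go
package main

import (
	"fmt"
	"sort"
)

// diffFilenames returns the names that appear only in goNames and the names
// that appear only in liveNames. Both results are sorted for stable output.
// Hypothetical helper: the real comparison tool may work differently.
func diffFilenames(goNames, liveNames []string) (goOnly, liveOnly []string) {
	inGo := make(map[string]bool, len(goNames))
	for _, n := range goNames {
		inGo[n] = true
	}
	inLive := make(map[string]bool, len(liveNames))
	for _, n := range liveNames {
		inLive[n] = true
	}
	for n := range inGo {
		if !inLive[n] {
			goOnly = append(goOnly, n)
		}
	}
	for n := range inLive {
		if !inGo[n] {
			liveOnly = append(liveOnly, n)
		}
	}
	sort.Strings(goOnly)
	sort.Strings(liveOnly)
	return goOnly, liveOnly
}

func main() {
	// Made-up filenames, for illustration only.
	goNames := []string{"tool-linux-amd64.tar.gz", "tool-darwin-arm64.tar.gz"}
	liveNames := []string{"tool-linux-amd64.tar.gz", "tool_1.0_amd64.deb"}

	goOnly, liveOnly := diffFilenames(goNames, liveNames)
	fmt.Println("go-extra-assets:", goOnly)
	fmt.Println("live-extra-assets:", liveOnly)
}
```

Comparing names only keeps the check independent of when each side fills in os/arch/ext.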

2. Meta-Asset Filtering

Go's isMetaAsset() filters out checksums, signatures, SBOMs, etc. Node.js keeps them. This accounts for the 41 packages showing live-has-meta differences. This is the intended behavior: Go filters non-installable files at cache time.
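To illustrate the kind of check involved, here is a minimal sketch of a filename-based meta-asset filter. The function name `looksLikeMetaAsset` and the suffix list are assumptions made for this example; the actual rules in isMetaAsset() are not shown in this document and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// metaSuffixes is an illustrative list only, not the project's actual list.
var metaSuffixes = []string{
	".sha256", ".sha512", ".sig", ".asc", ".sbom.json", ".license", ".readme",
}

// looksLikeMetaAsset reports whether a release filename looks like a
// non-installable companion file (checksum, signature, SBOM, and so on).
// Hypothetical stand-in for the pipeline's isMetaAsset().
func looksLikeMetaAsset(name string) bool {
	lower := strings.ToLower(name)
	if strings.Contains(lower, "checksums") {
		return true
	}
	for _, suffix := range metaSuffixes {
		if strings.HasSuffix(lower, suffix) {
			return true
		}
	}
	return false
}

func main() {
	// Made-up filenames, for illustration only.
	names := []string{
		"tool_1.2.3_checksums.txt",
		"tool-linux-amd64.tar.gz.sha256",
		"tool-linux-amd64.tar.gz",
	}
	for _, n := range names {
		fmt.Printf("%-32s meta=%v\n", n, looksLikeMetaAsset(n))
	}
}
```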

3. Version Depth

Go has deeper version history for most GitHub-sourced packages because it fetches all pages unless -shallow is set. Node.js limits to 30 releases per API call. The extra depth is intentional: a full fetch gives Go complete histories.

4. Build Variants (IMPLEMENTED)

Variant tagging is now active. Per-package taggers live in internal/releases/{pkg}/. Assets with Variants are stored but excluded from legacy export.

  • bun: -profile → Variants: ["profile"]; non-baseline → Arch: amd64v3
  • ollama: -rocm, -jetpack5, -jetpack6 → Variants
  • pwsh: -fxdependent, -fxdependentWinDesktop → Variants
  • git: .exe and PortableGit → Variants: ["installer"]; -pdb → ["pdb"]
  • node: .msi → Variants: ["installer"]
  • lsd: .deb → ["deb"]; -msvc → ["msvc"]
  • fish: .pkg → Variants: ["installer"]
  • xcaddy: .deb → Variants: ["deb"]
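A few of the suffix rules above can be sketched as a small tagger. This is an illustration under stated assumptions: the `Asset` struct, the `rule` table, and `tagVariants` are hypothetical names, and the sketch uses one shared table where the project uses per-package taggers in internal/releases/{pkg}/.

```go
package main

import (
	"fmt"
	"strings"
)

// Asset is a hypothetical, pared-down stand-in for the pipeline's asset type.
type Asset struct {
	Filename string
	Variants []string
}

// rule maps a filename suffix to a variant tag for one package.
type rule struct {
	pkg     string
	suffix  string
	variant string
}

// rules mirrors a few entries from the list above (node, fish, xcaddy, lsd).
var rules = []rule{
	{pkg: "node", suffix: ".msi", variant: "installer"},
	{pkg: "fish", suffix: ".pkg", variant: "installer"},
	{pkg: "xcaddy", suffix: ".deb", variant: "deb"},
	{pkg: "lsd", suffix: ".deb", variant: "deb"},
}

// tagVariants appends a variant tag for every rule of pkg whose suffix
// matches the asset filename. Assets that match no rule are left untouched.
func tagVariants(pkg string, a *Asset) {
	for _, r := range rules {
		if r.pkg == pkg && strings.HasSuffix(a.Filename, r.suffix) {
			a.Variants = append(a.Variants, r.variant)
		}
	}
}

func main() {
	// Made-up filenames, for illustration only.
	msi := Asset{Filename: "node-v22.0.0-x64.msi"}
	tarball := Asset{Filename: "node-v22.0.0-linux-x64.tar.gz"}
	tagVariants("node", &msi)
	tagVariants("node", &tarball)
	fmt.Println(msi.Filename, msi.Variants)
	fmt.Println(tarball.Filename, tarball.Variants)
}
```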

5. Legacy Export Filter (IMPLEMENTED)

ExportLegacy now strips:

  • Assets with non-empty Variants
  • Formats Node.js doesn't handle: .deb, .rpm, .snap, .appx, .tar.zst, .tar.bz2, .7z

This means the _cache/ JSON files only contain assets the Node.js server can actually serve.
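The two rules above can be sketched as a single pass over the asset list. This is a minimal illustration, not the real ExportLegacy: the `Asset` struct and `exportLegacySketch` are hypothetical, and only the extension list is taken from the text.

```go
package main

import (
	"fmt"
	"strings"
)

// Asset is a hypothetical, pared-down stand-in for the pipeline's asset type.
type Asset struct {
	Filename string
	Variants []string
}

// nonLegacyExts lists the formats the text says Node.js does not handle.
var nonLegacyExts = []string{
	".deb", ".rpm", ".snap", ".appx", ".tar.zst", ".tar.bz2", ".7z",
}

// hasNonLegacyExt reports whether the filename ends in a non-legacy format.
func hasNonLegacyExt(name string) bool {
	lower := strings.ToLower(name)
	for _, ext := range nonLegacyExts {
		if strings.HasSuffix(lower, ext) {
			return true
		}
	}
	return false
}

// exportLegacySketch keeps only assets with no Variants and a legacy format.
func exportLegacySketch(assets []Asset) []Asset {
	var out []Asset
	for _, a := range assets {
		if len(a.Variants) > 0 || hasNonLegacyExt(a.Filename) {
			continue
		}
		out = append(out, a)
	}
	return out
}

func main() {
	// Made-up assets, for illustration only.
	assets := []Asset{
		{Filename: "tool-linux-amd64.tar.gz"},
		{Filename: "tool_1.0_amd64.deb"},
		{Filename: "tool-x64.msi", Variants: []string{"installer"}},
		{Filename: "tool-win-x64.7z"},
	}
	for _, a := range exportLegacySketch(assets) {
		fmt.Println("legacy:", a.Filename)
	}
}
```

Filtering at export time keeps the full asset list available in the internal model.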

6. Format Handling

All formats are stored in the internal Go model — nothing is dropped at classification time. The legacy filter applies only at export time.

  • .pkg → pkgutil --expand-full
  • .deb → ar x + tar xf data.tar.*
  • .dmg → hdiutil attach
  • .msi → msiexec /a

Only .exe is ambiguous (binary vs installer). Installer .exe files get Variants: ["installer"].
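The format-to-tool pairs listed in this section fit naturally in a lookup table keyed by file extension. The sketch below is illustrative only: `extractHints` and `extractHint` are hypothetical names, and the command strings are copied from the list rather than executed.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// extractHints maps a file extension to the extraction command named in the
// list above. The strings are descriptive hints, not ready-to-run commands.
var extractHints = map[string]string{
	".pkg": "pkgutil --expand-full",
	".deb": "ar x + tar xf data.tar.*",
	".dmg": "hdiutil attach",
	".msi": "msiexec /a",
}

// extractHint returns the hint for a filename based on its final extension.
// The boolean is false when no hint is known (for example, .exe).
func extractHint(filename string) (string, bool) {
	ext := strings.ToLower(filepath.Ext(filename))
	hint, ok := extractHints[ext]
	return hint, ok
}

func main() {
	// Made-up filenames, for illustration only.
	for _, name := range []string{"tool_1.0_amd64.deb", "Tool.dmg", "setup.exe"} {
		if hint, ok := extractHint(name); ok {
			fmt.Printf("%s: %s\n", name, hint)
		} else {
			fmt.Printf("%s: no extraction hint\n", name)
		}
	}
}
```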

7. Node Multi-Source

The node package merges official and unofficial builds via unofficial_url in releases.conf. The comparison is down to 4 differences at the latest version:

  • Live has .7z (filtered from Go legacy export) and .msi (Go tags as installer)
  • Go has .exe bare binary that live doesn't (naming diff)

Per-Package Checklist

Status: [x] reviewed, [-] known acceptable, [ ] needs work

Exact Matches at Latest Version (21)

  • atomicparsley
  • awless
  • chromedriver (chromedist)
  • comrak
  • dotenv-linter
  • gpg (gpgdist)
  • iterm2 (iterm2dist)
  • julia (juliadist)
  • koji
  • lsd — .deb and msvc variants now correctly filtered
  • mariadb (mariadbdist)
  • pathman
  • sass
  • sd
  • shellcheck
  • shfmt
  • sqlc
  • terraform (hashicorp)
  • xcaddy — .deb variants now correctly filtered
  • xsv
  • zig (zigdist)

Go Missing (4)

  • [-] dashd — alias_of=dashcore (correct)
  • [-] macos — no releases.conf
  • [-] pg-essentials — meta-package
  • [-] zig.vim — gittag source, 0 raw data

Live Missing — Go-Only (14)

  • [-] pg — Go alias, live uses postgres
  • [-] ripgrep — Go alias, live uses rg
  • [-] rust.vim — symlink to vim-rust
  • [-] vim-* (11 packages) — gittag packages not in live cache

Meta-Only Diffs (Go correctly filters, Node.js keeps)

  • [-] caddy, cilium, cmake, curlie, dashmsg, deno, dotenv, ffuf, fzf, gh, gitdeploy, goreleaser, gprox, grype, hugo, k9s, keypairs, kind, kubectx, kubens, monorel, mutagen, ots, rclone, rg, runzip, sclient, sqlpkg, sttr, syncthing, terramate, watchexec, xz, yq (34 listed here; the summary counts 41 live-has-meta packages)

Live has .deb/.rpm that Go correctly filters from legacy export

  • [-] bat — 8 .deb files
  • [-] caddy — 9 .deb files
  • [-] delta — 5 .deb files
  • [-] fd — 8 .deb files
  • [-] gh — 8 .deb/.rpm files
  • [-] goreleaser — 18 .deb/.rpm files
  • [-] grype — 8 .deb/.rpm files
  • [-] hexyl — 7 .deb files
  • [-] k9s — 10 .deb/.rpm files
  • [-] pandoc — 2 .deb files
  • [-] pwsh — 4 .deb/.rpm files
  • [-] rclone — 16 .deb/.rpm files
  • [-] sttr — 9 .deb/.rpm files
  • [-] syncthing — 5 .deb/.rpm files
  • [-] terramate — 6 .deb/.rpm files
  • [-] tinygo — 3 .deb files
  • [-] trip — 3 .deb files
  • [-] watchexec — 16 .deb files
  • [-] zoxide — 4 .deb files

Remaining Go-Extra-Assets (need review)

  • bun — baseline builds now serve as legacy amd64 (filename stripped, download URL kept); non-baseline tagged as v3 variant (excluded).
  • fish — source tarball tagged as variant; linux .tar.xz binaries are correct (Node.js just doesn't have them yet)
  • git — busybox and pdbs-for-git tagged as variants
  • [-] hugo — 1 extra: Linux-64bit.tar.gz (old naming); keep as-is for now
  • [-] hugo-extended — 14 extras: non-extended assets leaking in; keep as-is for now
  • kubectx — asset_filter splits shared release
  • kubens — asset_filter splits shared release
  • node — .exe bare binary stored with "bare-exe" variant (Go can serve, legacy excludes); .msi tagged as installer
  • ollama — Ollama-darwin.zip tagged as "app" variant (Go can install, legacy excludes); .tgz normalized to .tar.gz in filename
  • [-] uuidv7 — exotic arches correctly classified; resolver filters by request
  • yq — man_page_only excluded via releases.conf
  • ffmpeg — asset_filter=ffmpeg excludes ffprobe/ffplay; .LICENSE/.README now caught by isMetaAsset

Source/Naming Diffs

  • [-] aliasman — source tarball naming differences (GitHub archive format)
  • [-] duckdns.sh — source tarball naming differences
  • [-] serviceman — source naming + version differences

Stale Data (rate-limited, need re-fetch with token)

  • [-] go — live has 98 extra versions (Go didn't fetch golang.org)
  • [-] lf — live has 30 extra versions
  • [-] postgres, psql — Go has v17, live has v18
  • [-] ffmpeg — Go has older, live has newer

Cross-Package Issues

  • kubectx/kubens — resolved via asset_filter in releases.conf

Remaining Action Items

  1. hugo-extended exclude: Deferred — keep matching Node.js behavior for now
  2. kubectx/kubens split: Resolved — asset_filter in releases.conf
  3. bun baseline in legacy: Resolved — baseline is legacy amd64, non-baseline tagged as v3 variant
  4. Re-fetch with GITHUB_TOKEN: Fix rate-limited/stale packages
  5. Unknown asset notifications: Log new/unrecognized assets to _notices/

Deferred Decisions

  1. Consolidate cmd/classify and cmd/webicached duplication: Both have their own classifyPackage switch, isMetaAsset, detectFormat, and GitHub API types (ghRelease, ghAsset, etc.). cmd/classify is a diagnostic tool (CSV output), cmd/webicached is the production pipeline ([]storage.Asset). Shared pieces could move to internal/ packages. Keep separate dispatchers since they return different types.