update COMPARISON.md with fresh shallow-fetch results

28 exact matches at latest version (up from 12). Reorganize by
difference category. Update action items to reflect current design
(Variants field, format handling, node multi-source fix).
AJ ONeal
2026-03-10 13:02:22 -06:00
parent f4e816606f
commit d229eb618d


@@ -3,237 +3,160 @@
Systematic comparison of Go pipeline output (`_cache/`) vs Node.js production
cache (`LIVE_cache/`). Generated by `cmd/comparecache`.
Latest run: `-shallow -once` fetch, fresh LIVE_cache, latest-version comparison.
## Summary (latest version only)
| Category | Count | Meaning |
|----------|-------|---------|
| match | 28 | Identical asset filenames at latest version |
| go-missing | 4 | Go produces no output (alias, meta-package, no config) |
| live-missing | 16 | Package exists in Go but not in live cache |
| go-extra-versions | 40 | Go has more version history (deeper fetch) |
| live-extra-versions | 11 | Live has newer data (rate-limited Go fetch) |
| go-extra-assets | 24 | Go includes assets that Node.js filters out |
| live-extra-assets | 15 | Node.js includes assets that Go filters out |
| live-has-meta | 41 | Node.js includes meta-assets (checksums, sigs) |
## Key Observations
### 1. Classification Timing
The Node.js cache stores assets with **empty** os/arch/ext fields — `normalize.js`
fills those at serve time. The Go pipeline classifies at write time. This means
the Go cache has richer data per-asset, but the comparison must be done at the
**filename level**, not the classified fields.
fills those at serve time. The Go pipeline classifies at write time. The Go cache
has richer data per-asset. Comparison is done at the **filename level**.
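
A minimal sketch of what a filename-level comparison could look like. This is not the code in `cmd/comparecache`; the function name `diffFilenames` and the sample filenames are hypothetical and only illustrate comparing two sets of asset names without looking at os/arch/ext fields.

```go
package main

import (
	"fmt"
	"sort"
)

// diffFilenames (hypothetical helper) compares two caches purely by asset
// filename. It returns the names present only in the Go cache and the names
// present only in the live cache, each sorted for stable output.
func diffFilenames(goNames, liveNames []string) (goOnly, liveOnly []string) {
	inGo := make(map[string]bool, len(goNames))
	for _, n := range goNames {
		inGo[n] = true
	}
	inLive := make(map[string]bool, len(liveNames))
	for _, n := range liveNames {
		inLive[n] = true
	}
	for n := range inGo {
		if !inLive[n] {
			goOnly = append(goOnly, n)
		}
	}
	for n := range inLive {
		if !inGo[n] {
			liveOnly = append(liveOnly, n)
		}
	}
	sort.Strings(goOnly)
	sort.Strings(liveOnly)
	return goOnly, liveOnly
}

func main() {
	// Illustrative filenames, not taken from either cache.
	goNames := []string{"tool-1.0.0-linux-amd64.tar.gz", "tool-1.0.0-linux-amd64.deb"}
	liveNames := []string{"tool-1.0.0-linux-amd64.tar.gz", "tool-1.0.0-checksums.txt"}

	goOnly, liveOnly := diffFilenames(goNames, liveNames)
	fmt.Println("go-extra-assets:", goOnly)
	fmt.Println("live-extra-assets:", liveOnly)
}
```

An empty result on both sides corresponds to the `match` category in the summary table.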
### 2. Meta-Asset Filtering
Go's `isMetaAsset()` filters out checksums, signatures, SBOMs, etc. Node.js
includes them. This accounts for 43 packages showing `live-has-meta` differences.
**This is correct behavior** — Go is more aggressive about filtering non-installable
files.
keeps them. This accounts for 41 packages showing `live-has-meta` differences.
**Correct behavior** — Go filters non-installable files at cache time.
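
As a rough illustration of cache-time filtering, a meta-asset check might look like the sketch below. The real `isMetaAsset()` is not shown in this diff, so the name `looksLikeMetaAsset` and the suffix list are assumptions chosen to cover the checksum, signature, and SBOM cases mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// metaSuffixes is an assumed list of non-installable file suffixes
// (checksums, signatures, SBOMs). The real list may differ.
var metaSuffixes = []string{".sha256", ".sha512", ".sig", ".asc", ".pem", ".sbom"}

// looksLikeMetaAsset (hypothetical stand-in for isMetaAsset) reports whether
// a release asset filename looks like a checksum, signature, or SBOM file.
func looksLikeMetaAsset(name string) bool {
	lower := strings.ToLower(name)
	for _, suffix := range metaSuffixes {
		if strings.HasSuffix(lower, suffix) {
			return true
		}
	}
	// Aggregate checksum files such as "checksums.txt".
	return strings.Contains(lower, "checksums")
}

func main() {
	for _, name := range []string{
		"tool-1.0.0-linux-amd64.tar.gz",
		"tool-1.0.0-linux-amd64.tar.gz.sha256",
		"checksums.txt",
	} {
		fmt.Printf("%-40s meta=%v\n", name, looksLikeMetaAsset(name))
	}
}
```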
### 3. Source Tarballs
Go includes GitHub source tarballs (tarball_url/zipball_url) for releases with no
binary assets. Node.js does not. Affects 15 packages. **Decision needed**: should
these be included?
### 3. Version Depth
Go has deeper version history for most GitHub-sourced packages (fetches all pages
unless `-shallow`). Node.js limits to 30 releases per API call. This is a
**feature** — Go provides complete histories when doing a full fetch.
### 4. Previously Unsupported Sources (NOW FIXED)
All custom source types are now wired into webicached. The only remaining
go-missing packages are: dashd (alias), macos (no releases.conf),
pg-essentials (meta-package), zig.vim (gittag with no raw data).
### 4. Build Variants → `Variants` field
Some assets are build variants. These are stored with `Variants []string` on
`storage.Asset` and deprioritized by the resolver unless explicitly requested.
### 5. Version Depth
Go has deeper version history for most GitHub-sourced packages because it fetches
all pages. Node.js limits to 30 releases per API call. This is a **feature** — Go
provides complete histories.
- **bun**: `-profile` (debug), `-baseline` (actually amd64 vs amd64v3)
- **ollama**: `-rocm`, `-jetpack5`, `-jetpack6` (GPU accelerator)
- **pwsh**: `-fxdependent`, `-fxdependentWinDesktop` (.NET framework-dependent)
- **hugo**: `extended`, `extended_withdeploy` (separate packages in Go)
### 6. Unsupported Formats (NEW — from version-level review)
Go includes installer-format assets that webi can't extract:
- `.pkg` (macOS installer) — 756 in node, 54 in hugo
- `.msi` (Windows installer) — 1,277 in node, 706 in cmake
- `.deb` (Debian package) — 1,672 in hugo
- `.dmg` (macOS disk image) — 510 in cmake
- `.sh` (self-extracting installer) — 596 in cmake
- `.msixbundle` — in pwsh
- `.exe` (bare installer) — in node
For bun: non-baseline=amd64v3, baseline=amd64. Use `Arch` field, not `Variants`.
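
The variant tagging and the bun arch rule could be sketched as follows. This is an assumption-laden illustration, not the classifier in this repository: `tagVariants`, `bunArch`, the marker list, and the sample filenames are all hypothetical, and `bunArch` assumes the caller has already determined that the asset is an x64 build.

```go
package main

import (
	"fmt"
	"strings"
)

// variantMarkers lists filename markers treated as build variants.
// Longer markers come first so "-fxdependentWinDesktop" is not also
// tagged as "-fxdependent".
var variantMarkers = []string{
	"-fxdependentWinDesktop",
	"-fxdependent",
	"-jetpack5",
	"-jetpack6",
	"-profile",
	"-rocm",
}

// tagVariants (hypothetical) returns the variant tags found in a filename,
// suitable for a Variants []string field.
func tagVariants(filename string) []string {
	var variants []string
	rest := filename
	for _, marker := range variantMarkers {
		if strings.Contains(rest, marker) {
			variants = append(variants, strings.TrimPrefix(marker, "-"))
			// Remove the matched marker so shorter markers cannot re-match it.
			rest = strings.ReplaceAll(rest, marker, "")
		}
	}
	return variants
}

// bunArch (hypothetical) maps a bun x64 asset to an arch value:
// baseline builds are plain amd64, non-baseline builds are amd64v3.
// "-baseline" is deliberately not a variant tag.
func bunArch(filename string) string {
	if strings.Contains(filename, "-baseline") {
		return "amd64"
	}
	return "amd64v3"
}

func main() {
	for _, name := range []string{
		"bun-linux-x64.zip",
		"bun-linux-x64-baseline.zip",
		"bun-linux-x64-baseline-profile.zip",
	} {
		fmt.Printf("%s arch=%s variants=%v\n", name, bunArch(name), tagVariants(name))
	}
}
```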
**Action**: add these to `isMetaAsset()` or a new `isUnsupportedFormat()`.
### 5. Format Handling
All formats are stored — nothing is dropped. Most are extractable:
- `.pkg` → `pkgutil --expand-full`
- `.deb` → `ar x` + `tar xf data.tar.*`
- `.dmg` → `hdiutil attach`
- `.msi` → `msiexec /a`
### 7. Build Variants (NEW — from version-level review)
Some assets are build variants that need an `Extra`/`Variant` field — they should
be stored but excluded by the resolver unless explicitly requested:
- **ollama**: `-rocm`, `-jetpack5`, `-jetpack6` (GPU accelerator variants)
- **bun**: `-profile` (debug symbols), `-baseline` (actually amd64v1 vs amd64v3)
- **pwsh**: `-fxdependent`, `-fxdependentWinDesktop` (framework-dependent)
- **hugo**: `-extended` (already a separate package, may not need variant tag)
Only `.exe` is ambiguous (binary vs installer). Installer `.exe` files get
`Variants: ["installer"]`.
For bun's baseline: the non-baseline build is actually amd64v3, and baseline is
plain amd64. The arch field should reflect this — no variant tag needed.
For legacy cache export: filter out assets with non-empty `Extra` so the Node.js
server doesn't encounter unknown variants.
### 8. Node Multi-Source (NEW — from version-level review)
The `node` package should merge official + unofficial builds (musl, riscv64, 7z).
Currently Go has separate `node`, `node-official`, `node-unofficial` packages.
Node.js `releases.js` merges both into one cache. This is a concrete case for
multi-source config (task #7).
Missing from Go's node classifier: riscv64 arch, loong64 arch, 7z format.
**Fixed** in this session.
## Categories
| Category | Count | Meaning |
|----------|-------|---------|
| match | 12 | Identical asset filenames |
| go-missing | 4 | Go produces no output (alias, meta-package, or no config) |
| live-missing | 16 | Package exists in Go but not in live cache |
| go-extra-versions | 52 | Go has more version history (deeper fetch) |
| live-extra-versions | 14 | Live has newer data or uses a different source |
| go-extra-assets | 59 | Go includes assets that Node.js filters out |
| live-extra-assets | 17 | Node.js includes assets that Go filters out |
| live-has-meta | 43 | Node.js includes meta-assets (checksums, sigs) |
| go-has-source-tarballs | 15 | Go includes source tarballs for no-binary releases |
### 6. Node Multi-Source (FIXED)
The `node` package now merges official + unofficial builds via `unofficial_url`
in releases.conf. This adds musl, riscv64, loong64, and 7z builds. Down to
**1 difference** at latest version (a `.exe` installer).
## Per-Package Checklist
Status: `[x]` reviewed, `[-]` known difference (acceptable), `[ ]` needs review
Status: `[x]` reviewed, `[-]` known acceptable, `[ ]` needs work
### Exact Matches (12)
- [x] atomicparsley — match
- [x] awless — match
- [x] chromedriver — match (chromedist)
- [x] dotenv-linter — match
- [x] gpg — match (gpgdist)
- [x] hexyl — match
- [x] julia — match (juliadist)
- [x] koji — match
- [x] lf — match
- [x] sd — match
- [x] terraform — match (hashicorp, cleanest package — all .zip with os/arch)
- [x] zoxide — match
### Exact Matches at Latest Version (28)
- [x] atomicparsley
- [x] awless
- [x] bat
- [x] chromedriver (chromedist)
- [x] cmake — meta-only diff
- [x] comrak
- [x] crabz
- [x] delta
- [x] dotenv-linter
- [x] fd
- [x] gpg (gpgdist)
- [x] hexyl
- [x] iterm2 (iterm2dist)
- [x] julia (juliadist)
- [x] koji
- [x] lf
- [x] mariadb (mariadbdist)
- [x] pandoc
- [x] pathman
- [x] sass
- [x] sd
- [x] shellcheck
- [x] shfmt
- [x] sqlc
- [x] terraform (hashicorp)
- [x] tinygo
- [x] trip
- [x] xsv
- [x] zig (zigdist)
- [x] zoxide
### Go Missing — Unsupported Source (4)
- [-] dashd — alias_of=dashcore, skipped (correct)
### Go Missing (4)
- [-] dashd — alias_of=dashcore (correct)
- [-] macos — no releases.conf
- [-] pg-essentials — meta-package
- [-] zig.vim — gittag source, 0 raw data?
- [-] zig.vim — gittag source, 0 raw data
### Live Missing — Go-Only Packages (16)
### Live Missing — Go-Only (16)
- [-] node-official — Go split, not in live cache
- [-] node-unofficial — Go split, not in live cache
- [-] pg — Go name, live uses postgres
- [-] ripgrep — Go name, live uses rg
- [-] pg — Go alias, live uses postgres
- [-] ripgrep — Go alias, live uses rg
- [-] rust.vim — symlink to vim-rust
- [-] vim-airline — gittag packages not in live cache
- [-] vim-airline-themes — gittag packages not in live cache
- [-] vim-ale — gittag packages not in live cache
- [-] vim-devicons — gittag packages not in live cache
- [-] vim-go — gittag packages not in live cache
- [-] vim-nerdtree — gittag packages not in live cache
- [-] vim-prettier — gittag packages not in live cache
- [-] vim-rust — gittag packages not in live cache
- [-] vim-sensible — gittag packages not in live cache
- [-] vim-shfmt — gittag packages not in live cache
- [-] vim-syntastic — gittag packages not in live cache
- [-] vim-* (11 packages) — gittag packages not in live cache
### Meta-Asset Only Differences (Go filters, Node.js keeps)
These packages differ only because Go strips checksums/sigs/SBOMs:
- [-] curlie — live-has-meta(21)
- [-] dashmsg — live-has-meta(1)
- [-] dotenv — live-has-meta(1)
- [-] ffuf — live-has-meta(50)
- [-] gitdeploy — live-has-meta(1)
- [-] gprox — live-has-meta(7)
- [-] keypairs — live-has-meta(1)
- [-] monorel — live-has-meta(3)
- [-] ots — live-has-meta(28)
- [-] runzip — live-has-meta(1)
- [-] sclient — live-has-meta(1)
- [-] sqlpkg — live-has-meta(7)
- [-] xz — live-has-meta(1)
### Meta-Only Diffs (Go correctly filters, Node.js keeps)
- [-] caddy, cilium, curlie, dashmsg, deno, dotenv, ffuf, fzf, gh, gitdeploy,
goreleaser, gprox, grype, hugo, k9s, keypairs, kind, kubectx, kubens,
monorel, ots, rclone, rg, runzip, sclient, sqlpkg, sttr, terramate,
watchexec, xz, yq (41 packages)
### Custom Source Types (version-level reviewed)
- [x] chromedriver — chromedist, exact match at latest version
- [-] flutter — flutterdist, live-extra-assets(217); 90% of assets have empty arch (expected — Flutter is host-arch-agnostic); mix of dev/beta/stable channels
- [-] go — golang, live-extra-versions(98), go-extra-versions(98); different version sets; live has 1 extra source tarball (go1.9rc2.src.tar.gz)
- [x] gpg — gpgdist, match
- [-] iterm2 — iterm2dist, live-extra-versions(20), live-extra-assets(20)
- [x] julia — juliadist, match
- [-] mariadb — mariadbdist, go-extra-assets(11)
- [x] terraform — hashicorp, match (cleanest package in the entire comparison)
- [-] zig — zigdist, version differences
### Build Variant Diffs (need Variants tagging)
- [ ] bun — 16 extras: `-profile` and `-baseline` variants
- [ ] pwsh — 4 extras: `fxdependent` variants
- [-] ollama — different release assets (.tgz/.zip vs .tar.gz/.tar.zst);
live includes install.sh/install.ps1 (meta, should be filtered)
### Version-Level Reviewed (latest version comparison)
These were compared at the latest-version level with semver-aware sorting.
### Format Diffs (need Variants: ["installer"] or format-aware filtering)
- [ ] fish — 4 extras: .pkg, .tar.xz source, linux binaries
- [ ] git — 10 extras: .exe installers, .tar.bz2, MinGit, PortableGit, pdbs
- [ ] hugo — 15 extras: .deb, .pkg, `extended_withdeploy` builds
- [ ] hugo-extended — 16 extras: same pattern
- [ ] lsd — 14 extras: .deb packages, windows-msvc builds
- [ ] node — 1 extra: .exe installer
- [ ] xcaddy — 8 extras: .deb packages
#### Clean at latest version (diff is only version depth + meta)
- [x] bat — match at latest; 85 go-extra from deeper history only
- [x] caddy — latest match except meta (.pem/.sig/.sbom); 1,180 go-extra from history
- [x] deno — latest match; 4 go-extra from very old naming format
- [x] fd — match at latest; old releases had bare `fd` binary with no os/arch
- [x] flutter — match at latest version
### Exotic Arch Diffs
- [-] uuidv7 — 16 extras: thumbeb, armeb, loong64, riscv32 (Node.js can't classify)
#### Build variant differences (needs Variant/Extra field)
- [-] bun — 16 go-extra at latest: `-profile` and `-baseline` variants; baseline=amd64, non-baseline=amd64v3
- [-] ollama — live has `.tar.gz`/`.tar.zst`, Go has `.tgz`/`.zip` (different release assets); both have rocm/jetpack variants; live includes install.sh/install.ps1 (should be filtered as meta)
- [-] pwsh — 4 go-extra at latest: all fxdependent variants; should be tagged with Extra, filtered for legacy export
#### Multi-source issue
- [-] node — live merges official+unofficial (musl, riscv64, 7z); Go has separate packages; **fixed**: added riscv64/loong64 arch and 7z format to classifier
### Format Filtering Needed
These packages have high go-extra-assets counts primarily from non-extractable formats:
- [ ] cmake — 4,352 extras: .msi(706), .dmg(510), .sh(596), source tarballs, checksum files
- [ ] hugo — 8,176 extras: .deb(1672), .pkg(54), source tarballs
- [ ] hugo-extended — 8,206 extras (same pattern as hugo)
- [ ] node — 15,208 extras: .pkg(756), .msi(1277), .exe(installer)
- [ ] git — 3,724 extras: likely .exe installers, .dmg
- [ ] pandoc — 698 extras: likely .deb, .pkg, .msi
- [ ] pwsh — 3,407 extras: .msixbundle, .AppImage, + fxdependent variants
- [ ] rclone — 2,548 extras: likely .deb, .rpm
- [ ] sass — 2,194 extras
- [ ] syncthing — 11,983 extras
### Version Depth Only (no action needed)
These differ only because Go fetches deeper history:
- [-] arc — go-extra-versions(9) + source tarballs
- [-] cilium — go-extra-versions(97)
- [-] comrak — go-extra-versions(60)
- [-] delta — go-extra-versions(29)
- [-] fish — go-extra-versions(35)
- [-] fzf — go-extra-versions(46)
- [-] gh — go-extra-versions(159)
- [-] goreleaser — go-extra-versions(556)
- [-] grype — go-extra-versions(161)
- [-] k9s — go-extra-versions(227)
- [-] kind — go-extra-versions(7)
- [-] kubectx — go-extra-versions(15)
- [-] kubens — go-extra-versions(15)
- [-] lsd — go-extra-versions(2)
- [-] mutagen — go-extra-versions(82)
- [-] rg — go-extra-versions(44)
- [-] shellcheck — go-extra-versions(17)
- [-] shfmt — go-extra-versions(22)
- [-] sttr — go-extra-versions(4)
- [-] terramate — go-extra-versions(193)
- [-] tinygo — go-extra-versions(19)
- [-] trip — go-extra-versions(6)
- [-] watchexec — go-extra-versions(83)
- [-] xcaddy — go-extra-assets(123) (likely deep history)
- [-] xsv — go-extra-versions(35)
- [-] yq — go-extra-versions(134)
### Remaining (minor differences, low priority)
### Source/Naming Diffs
- [-] aliasman — source tarball naming differences
- [-] crabz — go-extra-assets(4)
- [-] dashcore — go-extra-versions(101) + meta
- [-] duckdns.sh — source tarball differences
- [-] ffmpeg — go-extra-versions(11)
- [-] gitea — go-extra-versions(194) + meta
- [-] jq — go-extra-versions(15)
- [-] pathman — go-extra-assets(1)
- [-] postgres — version overlap differences
- [-] psql — version overlap differences
- [-] serviceman — source differences
- [-] sqlc — go-extra-versions(7)
- [-] uuidv7 — go-extra-assets(16)
- [-] vim-commentary — version differences
- [-] vim-zig — version differences
- [-] vim-commentary, vim-zig — gittag version differences
### Stale Data (rate-limited, need re-fetch with token)
- [-] postgres, psql — Go has v17, live has v18
- [-] ffmpeg — Go has older, live has newer
### Cross-Package Issues
- [-] kubectx — includes kubens assets from shared GitHub release
## Action Items
1. **Filter unsupported formats**: Add .pkg, .msi, .deb, .dmg, .sh (installer),
.msixbundle, .rpm, .exe (bare installer), .AppImage to isMetaAsset or new filter
2. **Tag build variants**: Populate Asset.Extra for rocm, jetpack5/6, fxdependent,
profile; filter these in ExportLegacy for Node.js compat
3. **Bun arch classification**: Map baseline → amd64, non-baseline → amd64v3
4. **Node multi-source**: Merge official+unofficial into single `node` cache
(blocked on multi-source config redesign, task #7)
5. **Ollama meta filtering**: Filter install.sh/install.ps1 from release assets
1. **Tag build variants in classifier**: Populate `Asset.Variants` for profile,
baseline, fxdependent, rocm, jetpack5/6, installer (.exe), extended
2. **Legacy export filter**: Strip assets with non-empty `Variants` in
`ExportLegacy` so Node.js server doesn't encounter unknown assets
3. **Bun arch mapping**: non-baseline=amd64v3, baseline=amd64 in `Arch` field
4. **Ollama meta**: Filter install.sh/install.ps1 from release assets
5. **kubectx/kubens**: Split shared release by asset name prefix
6. **Re-fetch with GITHUB_TOKEN**: Fix rate-limited packages (sd, serviceman,
shellcheck, shfmt, sqlc, sqlpkg, sttr, syncthing, terramate, tinygo, trip,
uuidv7, watchexec, xcaddy, xsv, xz, yq, zoxide)
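
The legacy export filter from action item 2 might be sketched like this. The `Asset` type here is a cut-down, hypothetical stand-in for `storage.Asset`, and `filterForLegacy` is an illustrative name; how `ExportLegacy` is actually structured is not shown in this diff.

```go
package main

import "fmt"

// Asset is a minimal stand-in for storage.Asset, keeping only the fields
// this sketch needs. The real type has more fields.
type Asset struct {
	Filename string
	Variants []string
}

// filterForLegacy (hypothetical) drops every asset that carries at least one
// variant tag, so a legacy consumer only sees untagged assets.
func filterForLegacy(assets []Asset) []Asset {
	kept := make([]Asset, 0, len(assets))
	for _, a := range assets {
		if len(a.Variants) == 0 {
			kept = append(kept, a)
		}
	}
	return kept
}

func main() {
	// Illustrative assets, not taken from either cache.
	assets := []Asset{
		{Filename: "tool-1.0.0-linux-amd64.tar.gz"},
		{Filename: "tool-1.0.0-linux-amd64-rocm.tar.gz", Variants: []string{"rocm"}},
		{Filename: "tool-1.0.0-windows-amd64-setup.exe", Variants: []string{"installer"}},
	}
	for _, a := range filterForLegacy(assets) {
		fmt.Println(a.Filename)
	}
}
```

Filtering at export time keeps the Go cache complete while the legacy consumer only receives assets it already knows how to classify.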