Go vs Node.js Cache Comparison
Systematic comparison of Go pipeline output (_cache/) vs Node.js production
cache (LIVE_cache/). Generated by cmd/comparecache.
Latest run: 2026-03-10. -no-fetch rebuild from existing raw data, with
variant tagging and legacy export filter applied.
Summary (latest version only)
| Category | Count | Meaning |
|---|---|---|
| match | 21 | Identical asset filenames at latest version |
| go-missing | 4 | Go produces no output (alias, meta-package, no config) |
| live-missing | 16 | Package exists in Go but not in live cache |
| go-extra-versions | 44 | Go has more version history (deeper fetch) |
| live-extra-versions | 11 | Live has newer data (rate-limited Go fetch) |
| go-extra-assets | 19 | Go includes assets that Node.js filters out |
| live-extra-assets | 41 | Node.js includes assets that Go filters out |
| live-has-meta | 41 | Node.js includes meta-assets (checksums, sigs) |
Changes since last comparison: variant tagging (bun, pwsh, ollama, git, node, lsd, fish, xcaddy) and legacy export filter (strips Variants-tagged assets and non-legacy formats like .deb, .rpm, .tar.bz2, .tar.zst, .7z) are now active. Match count dropped from 28→21 because .deb files that Node.js keeps are now correctly categorized as live-extra-assets (Go filters them from legacy output).
Key Observations
1. Classification Timing
The Node.js cache stores assets with empty os/arch/ext fields; normalize.js
fills those in at serve time. The Go pipeline classifies at write time, so the Go
cache carries richer per-asset data. Because the two caches classify at different
stages, the comparison is done at the filename level.
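A filename-level comparison can be sketched as a set difference. This is a minimal illustration, not the code in cmd/comparecache; the function name `diffFilenames` and the sample filenames are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// diffFilenames is a hypothetical helper: it reports which asset filenames
// appear only in the Go cache and which appear only in the live cache.
func diffFilenames(goNames, liveNames []string) (goOnly, liveOnly []string) {
	goSet := make(map[string]bool, len(goNames))
	for _, n := range goNames {
		goSet[n] = true
	}
	liveSet := make(map[string]bool, len(liveNames))
	for _, n := range liveNames {
		liveSet[n] = true
	}
	// Iterate over the sets so duplicate filenames are reported once.
	for n := range goSet {
		if !liveSet[n] {
			goOnly = append(goOnly, n)
		}
	}
	for n := range liveSet {
		if !goSet[n] {
			liveOnly = append(liveOnly, n)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(goOnly)
	sort.Strings(liveOnly)
	return goOnly, liveOnly
}

func main() {
	// Made-up filenames, for illustration only.
	goNames := []string{"tool-linux-amd64.tar.gz", "tool-darwin-arm64.tar.gz"}
	liveNames := []string{"tool-linux-amd64.tar.gz", "tool_amd64.deb"}
	goOnly, liveOnly := diffFilenames(goNames, liveNames)
	fmt.Println("go only:", goOnly)
	fmt.Println("live only:", liveOnly)
}
```

An empty result on both sides corresponds to the "match" category in the summary table.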
2. Meta-Asset Filtering
Go's isMetaAsset() filters out checksums, signatures, SBOMs, etc. Node.js
keeps them. This accounts for 41 packages showing live-has-meta differences.
This is the correct behavior: Go filters non-installable files at cache time.
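As a rough sketch of what such a filter can look like, the following matches on filename suffixes. It is not the project's isMetaAsset(); the name `looksLikeMetaAsset` and the suffix list are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// metaSuffixes is an illustrative list, not the project's actual list.
var metaSuffixes = []string{
	".sha256", ".sha512", ".sha256sum", ".md5", // checksums
	".sig", ".asc", // signatures
	".sbom", ".sbom.json", ".spdx.json", // SBOMs
}

// looksLikeMetaAsset is a hypothetical stand-in for a meta-asset check:
// it returns true for files that describe a release but cannot be installed.
func looksLikeMetaAsset(name string) bool {
	lower := strings.ToLower(name)
	for _, suffix := range metaSuffixes {
		if strings.HasSuffix(lower, suffix) {
			return true
		}
	}
	// Checksum manifests such as "checksums.txt" or "SHA256SUMS".
	return strings.Contains(lower, "checksums") || strings.Contains(lower, "sha256sums")
}

func main() {
	// Made-up filenames, for illustration only.
	for _, name := range []string{
		"tool-linux-amd64.tar.gz",
		"tool-linux-amd64.tar.gz.sha256",
		"tool_1.0.0_checksums.txt",
	} {
		fmt.Printf("%-32s meta=%v\n", name, looksLikeMetaAsset(name))
	}
}
```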
3. Version Depth
Go has deeper version history for most GitHub-sourced packages (fetches all pages
unless -shallow). Node.js limits to 30 releases per API call. This is a
feature — Go provides complete histories when doing a full fetch.
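The difference between a full and a shallow fetch can be sketched as a pagination loop. This is an illustration under assumptions: `fetchPageFunc`, `fetchAllReleases`, and `fakeFetch` are hypothetical names, and the page fetcher is injected so the sketch makes no real API calls.

```go
package main

import "fmt"

// fetchPageFunc is a hypothetical signature: it returns the release tags on
// one page, and an empty slice once there are no more pages.
type fetchPageFunc func(page int) ([]string, error)

// fetchAllReleases walks pages until an empty page is returned.
// With shallow set, it stops after the first page.
func fetchAllReleases(fetch fetchPageFunc, shallow bool) ([]string, error) {
	var all []string
	for page := 1; ; page++ {
		tags, err := fetch(page)
		if err != nil {
			return all, err
		}
		if len(tags) == 0 {
			break
		}
		all = append(all, tags...)
		if shallow {
			break
		}
	}
	return all, nil
}

// fakeFetch serves canned pages so the loop can be exercised offline.
func fakeFetch(pages [][]string) fetchPageFunc {
	return func(page int) ([]string, error) {
		if page < 1 || page > len(pages) {
			return nil, nil
		}
		return pages[page-1], nil
	}
}

func main() {
	pages := [][]string{{"v3", "v2"}, {"v1"}}
	full, _ := fetchAllReleases(fakeFetch(pages), false)
	first, _ := fetchAllReleases(fakeFetch(pages), true)
	fmt.Println("full:", full)
	fmt.Println("shallow:", first)
}
```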
4. Build Variants (IMPLEMENTED)
Variant tagging is now active. Per-package taggers live in internal/releases/{pkg}/.
Assets with Variants are stored but excluded from legacy export.
- bun: -profile → Variants: ["profile"]; non-baseline → Arch: amd64v3
- ollama: -rocm, -jetpack5, -jetpack6 → Variants
- pwsh: -fxdependent, -fxdependentWinDesktop → Variants
- git: .exe and PortableGit → Variants: ["installer"]; -pdb → ["pdb"]
- node: .msi → Variants: ["installer"]
- lsd: .deb → ["deb"]; -msvc → ["msvc"]
- fish: .pkg → Variants: ["installer"]
- xcaddy: .deb → Variants: ["deb"]
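The bun rule above can be sketched as a small tagger. This is a minimal illustration, not the code in internal/releases; the `Asset` struct, the `tagBunAsset` name, and the sample filenames are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Asset is a hypothetical, reduced asset record for this sketch.
type Asset struct {
	Filename string
	Arch     string
	Variants []string
}

// tagBunAsset applies the two bun rules described above:
// "-profile" builds get the "profile" variant, and amd64 builds without
// "-baseline" in the name are reclassified as amd64v3.
func tagBunAsset(a Asset) Asset {
	name := strings.ToLower(a.Filename)
	if strings.Contains(name, "-profile") {
		a.Variants = append(a.Variants, "profile")
	}
	if a.Arch == "amd64" && !strings.Contains(name, "-baseline") {
		a.Arch = "amd64v3"
	}
	return a
}

func main() {
	// Example filenames, assumed for illustration.
	for _, a := range []Asset{
		{Filename: "bun-linux-x64.zip", Arch: "amd64"},
		{Filename: "bun-linux-x64-baseline.zip", Arch: "amd64"},
		{Filename: "bun-linux-x64-profile.zip", Arch: "amd64"},
	} {
		t := tagBunAsset(a)
		fmt.Printf("%-28s arch=%-8s variants=%v\n", t.Filename, t.Arch, t.Variants)
	}
}
```

Keeping each package's rules in its own function means a naming quirk in one project cannot affect how another project's assets are classified.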
5. Legacy Export Filter (IMPLEMENTED)
ExportLegacy now strips:
- Assets with non-empty Variants
- Formats Node.js doesn't handle: .deb, .rpm, .snap, .appx, .tar.zst, .tar.bz2, .7z
This means the _cache/ JSON files only contain assets the Node.js server
can actually serve.
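The two stripping rules can be sketched as a single filter pass. This is an illustration, not the ExportLegacy implementation; the `Asset` struct and the `filterLegacy` name are assumptions, while the extension list is the one given above.

```go
package main

import (
	"fmt"
	"strings"
)

// Asset is a hypothetical, reduced asset record for this sketch.
type Asset struct {
	Filename string
	Variants []string
}

// nonLegacyExts lists the formats the Node.js server does not handle,
// as stated in this section.
var nonLegacyExts = []string{".deb", ".rpm", ".snap", ".appx", ".tar.zst", ".tar.bz2", ".7z"}

// hasNonLegacyExt reports whether the filename ends in a non-legacy format.
func hasNonLegacyExt(filename string) bool {
	lower := strings.ToLower(filename)
	for _, ext := range nonLegacyExts {
		if strings.HasSuffix(lower, ext) {
			return true
		}
	}
	return false
}

// filterLegacy keeps only assets with no variants and a legacy-safe format.
func filterLegacy(assets []Asset) []Asset {
	kept := make([]Asset, 0, len(assets))
	for _, a := range assets {
		if len(a.Variants) > 0 || hasNonLegacyExt(a.Filename) {
			continue
		}
		kept = append(kept, a)
	}
	return kept
}

func main() {
	// Made-up assets, for illustration only.
	assets := []Asset{
		{Filename: "tool-linux-amd64.tar.gz"},
		{Filename: "tool_amd64.deb"},
		{Filename: "tool-linux-amd64.tar.zst"},
		{Filename: "tool-setup.exe", Variants: []string{"installer"}},
	}
	for _, a := range filterLegacy(assets) {
		fmt.Println(a.Filename)
	}
}
```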
6. Format Handling
All formats are stored in the internal Go model — nothing is dropped at classification time. The legacy filter applies only at export time.
- .pkg: pkgutil --expand-full
- .deb: ar x + tar xf data.tar.*
- .dmg: hdiutil attach
- .msi: msiexec /a
Only .exe is ambiguous (binary vs installer). Installer .exe files get
Variants: ["installer"].
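Format detection by extension has one subtlety worth showing: compound extensions such as .tar.gz must be checked before single ones. The sketch below is not the project's detectFormat; the `guessFormat` name and the extension list are assumptions, and the .tgz normalization mirrors the ollama note later in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// knownExts is an illustrative list. Compound extensions come first so that
// "x.tar.gz" is reported as ".tar.gz" rather than ".gz".
var knownExts = []string{
	".tar.gz", ".tar.xz", ".tar.zst", ".tar.bz2",
	".tgz", ".zip", ".7z",
	".pkg", ".deb", ".rpm", ".dmg", ".msi", ".exe",
}

// guessFormat is a hypothetical helper: it returns the normalized format
// extension for a filename, or "" when no known extension matches.
func guessFormat(filename string) string {
	lower := strings.ToLower(filename)
	for _, ext := range knownExts {
		if strings.HasSuffix(lower, ext) {
			if ext == ".tgz" {
				return ".tar.gz" // treat .tgz as an alias of .tar.gz
			}
			return ext
		}
	}
	return ""
}

func main() {
	// Made-up filenames, for illustration only.
	for _, name := range []string{
		"tool-linux-amd64.tgz",
		"tool-linux-amd64.tar.zst",
		"tool-setup.msi",
		"tool-1.2.3",
	} {
		fmt.Printf("%-28s format=%q\n", name, guessFormat(name))
	}
}
```

Matching against a fixed list avoids misreading a version suffix such as "-1.2.3" as an extension. An .exe result still says nothing about binary versus installer, which is why that distinction is left to the per-package taggers.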
7. Node Multi-Source
The node package merges official + unofficial builds via unofficial_url
in releases.conf. Down to 4 differences at latest version:
- Live has .7z (filtered from Go legacy export) and .msi (Go tags as installer)
- Go has .exe bare binary that live doesn't (naming diff)
Per-Package Checklist
Status: [x] reviewed, [-] known acceptable, [ ] needs work
Exact Matches at Latest Version (21)
- atomicparsley
- awless
- chromedriver (chromedist)
- comrak
- dotenv-linter
- gpg (gpgdist)
- iterm2 (iterm2dist)
- julia (juliadist)
- koji
- lsd — .deb and msvc variants now correctly filtered
- mariadb (mariadbdist)
- pathman
- sass
- sd
- shellcheck
- shfmt
- sqlc
- terraform (hashicorp)
- xcaddy — .deb variants now correctly filtered
- xsv
- zig (zigdist)
Go Missing (4)
- [-] dashd — alias_of=dashcore (correct)
- [-] macos — no releases.conf
- [-] pg-essentials — meta-package
- [-] zig.vim — gittag source, 0 raw data
Live Missing — Go-Only (16)
- [-] node-official — Go split, not in live cache
- [-] node-unofficial — Go split, not in live cache
- [-] pg — Go alias, live uses postgres
- [-] ripgrep — Go alias, live uses rg
- [-] rust.vim — symlink to vim-rust
- [-] vim-* (11 packages) — gittag packages not in live cache
Meta-Only Diffs (Go correctly filters, Node.js keeps)
- [-] caddy, cilium, cmake, curlie, dashmsg, deno, dotenv, ffuf, fzf, gh, gitdeploy, goreleaser, gprox, grype, hugo, k9s, keypairs, kind, kubectx, kubens, monorel, mutagen, ots, rclone, rg, runzip, sclient, sqlpkg, sttr, syncthing, terramate, watchexec, xz, yq (41 packages)
Live has .deb/.rpm that Go correctly filters from legacy export
- [-] bat — 8 .deb files
- [-] caddy — 9 .deb files
- [-] delta — 5 .deb files
- [-] fd — 8 .deb files
- [-] gh — 8 .deb/.rpm files
- [-] goreleaser — 18 .deb/.rpm files
- [-] grype — 8 .deb/.rpm files
- [-] hexyl — 7 .deb files
- [-] k9s — 10 .deb/.rpm files
- [-] pandoc — 2 .deb files
- [-] pwsh — 4 .deb/.rpm files
- [-] rclone — 16 .deb/.rpm files
- [-] sttr — 9 .deb/.rpm files
- [-] syncthing — 5 .deb/.rpm files
- [-] terramate — 6 .deb/.rpm files
- [-] tinygo — 3 .deb files
- [-] trip — 3 .deb files
- [-] watchexec — 16 .deb files
- [-] zoxide — 4 .deb files
Remaining Go-Extra-Assets (need review)
- bun — baseline builds now serve as legacy amd64 (filename stripped, download URL kept); non-baseline tagged as v3 variant (excluded).
- fish — source tarball tagged as variant; linux .tar.xz binaries are correct (Node.js just doesn't have them yet)
- git — busybox and pdbs-for-git tagged as variants
- [-] hugo: 1 extra, Linux-64bit.tar.gz (old naming); keep as-is for now
- [-] hugo-extended: 14 extras, non-extended assets leaking in; keep as-is for now
- kubectx — asset_filter splits shared release
- kubens — asset_filter splits shared release
- node — .exe bare binary stored with "bare-exe" variant (Go can serve, legacy excludes); .msi tagged as installer
- ollama — Ollama-darwin.zip tagged as "app" variant (Go can install, legacy excludes); .tgz normalized to .tar.gz in filename
- [-] uuidv7 — exotic arches correctly classified; resolver filters by request
- yq — man_page_only tagged as "man-pages" variant in yq-specific tagger
- ffmpeg — asset_filter=ffmpeg excludes ffprobe/ffplay; .LICENSE/.README now caught by isMetaAsset
Source/Naming Diffs
- [-] aliasman — source tarball naming differences (GitHub archive format)
- [-] duckdns.sh — source tarball naming differences
- [-] serviceman — source naming + version differences
Stale Data (rate-limited, need re-fetch with token)
- [-] go — live has 98 extra versions (Go didn't fetch golang.org)
- [-] lf — live has 30 extra versions
- [-] postgres, psql — Go has v17, live has v18
- [-] ffmpeg — Go has older, live has newer
Cross-Package Issues
- kubectx/kubens — resolved via asset_filter in releases.conf
Remaining Action Items
- hugo-extended exclude: Deferred; keep matching Node.js behavior for now
- kubectx/kubens split: Resolved via asset_filter in releases.conf
- bun baseline in legacy: Resolved; baseline is legacy amd64, non-baseline tagged as v3 variant
- Re-fetch with GITHUB_TOKEN: Fix rate-limited/stale packages
- Unknown asset notifications: Log new/unrecognized assets to _notices/
Deferred Decisions
- Consolidate cmd/classify and cmd/webicached duplication: Both have their own classifyPackage switch, isMetaAsset, detectFormat, and GitHub API types (ghRelease, ghAsset, etc.). cmd/classify is a diagnostic tool (CSV output), cmd/webicached is the production pipeline ([]storage.Asset). Shared pieces could move to internal/packages. Keep separate dispatchers since they return different types.