The node package already merges both sources via unofficial_url
in releases.conf. The split packages were a workaround that
produced cache files not present in the live Node.js cache.
Strips known noise from the live cache before comparison: .deb, .rpm,
.asc, .sig, .gpg, .sbom, .sha256, checksums, install.sh, install.ps1,
.txt, and other non-installable files. Matches went from 16 to 50.
Uses the Node.js version range (2nd to 2nd-to-last version) as the
comparison window.
All Node.js versions in the window are included so missing Go
versions/assets are visible. Go-only versions are hidden since
those are just deeper fetch history, not real gaps.
shellcheck has no Windows builds and xz has no arm64 builds. These are
real upstream gaps that the test suite now surfaces as failures rather
than silently excluding them. 891 pass, 2 known upstream gaps.
The resolver now handles:
- ANYOS assets match any query OS
- posix_2017/posix_2024 assets match any non-Windows OS
- ANYARCH assets match any query architecture (ranked below specific)
14 tests covering: exact match, version constraints, arch fallback
(Rosetta 2, Windows ARM64, micro-arch), format preference, libc
filtering, base-over-variant preference, POSIX/ANYOS/ANYARCH fallback,
Survey catalog, and no-match.
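
The wildcard rules above can be sketched as two small helpers. This is
an illustration only, not the resolver's actual code: the function
names, the numeric ranks, and the lowercase OS strings ("windows",
"linux") are assumptions.

```go
package main

// osMatches reports whether an asset tagged with assetOS may be served
// for queryOS. The OS spellings used here are assumed, not confirmed.
func osMatches(assetOS, queryOS string) bool {
	switch assetOS {
	case "ANYOS":
		// ANYOS assets match any query OS.
		return true
	case "posix_2017", "posix_2024":
		// POSIX assets match any non-Windows OS.
		return queryOS != "windows"
	default:
		return assetOS == queryOS
	}
}

// archRank returns 0 for no match, 1 for an ANYARCH asset, and 2 for an
// exact match, so ANYARCH sorts below an arch-specific build.
func archRank(assetArch, queryArch string) int {
	switch assetArch {
	case queryArch:
		return 2
	case "ANYARCH":
		return 1
	default:
		return 0
	}
}
```

A caller would drop assets where osMatches is false or archRank is 0,
then prefer the highest rank. Arch fallbacks such as Rosetta 2 are left
out of this sketch.
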
- yq: move man_page_only from general isMetaAsset to yq-specific tagger
- node: restore .exe as stored asset with "bare-exe" variant (installable
by Go, excluded from legacy)
- ollama: rename Ollama-darwin.zip variant from "installer" to "app"
(.app bundle is installable by Go, just not by legacy Node.js)
The distinction: the general classification filter (isMetaAsset)
handles truly non-installable files. Installer-specific taggers handle
assets that are installable but need variant tagging. The legacy filter
strips variants and unsupported formats for Node.js compat.
Node.js normalizes .tgz extensions to .tar.gz in the cache name field
while keeping the real .tgz URL in download. Match this behavior so
legacy export filenames are consistent. Affects ollama-darwin.tgz and
any other packages using .tgz.
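
A minimal sketch of the name normalization, assuming a helper called
legacyName (a hypothetical name). Only the cache name changes; the
download URL keeps its real .tgz extension.

```go
package main

import "strings"

// legacyName returns the filename to write into the legacy cache "name"
// field: a trailing .tgz is rewritten to .tar.gz, anything else is
// returned unchanged.
func legacyName(filename string) string {
	if strings.HasSuffix(filename, ".tgz") {
		return strings.TrimSuffix(filename, ".tgz") + ".tar.gz"
	}
	return filename
}
```

For example, legacyName("ollama-darwin.tgz") yields
"ollama-darwin.tar.gz".
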
The Node.js index lists "win-x64-exe" but there's no .exe file on the
download server. The MSI installer (separate "msi" entry) is the actual
Windows installer. The "exe" entry was generating a phantom filename.
asset_filter is a substring that asset filenames must contain. Used when
multiple packages share a GitHub release (kubectx/kubens both come from
ahmetb/kubectx). Added as a first-class Conf field and applied in
webicached's applyConfig.
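
As an illustration of the substring rule, a sketch under assumptions:
filterAssets is a hypothetical helper, not the real applyConfig, and the
filenames in the usage note are made up.

```go
package main

import "strings"

// filterAssets keeps only the filenames that contain assetFilter.
// An empty filter keeps everything.
func filterAssets(filenames []string, assetFilter string) []string {
	if assetFilter == "" {
		return filenames
	}
	var kept []string
	for _, name := range filenames {
		if strings.Contains(name, assetFilter) {
			kept = append(kept, name)
		}
	}
	return kept
}
```

With asset_filter set to "kubens", a shared release holding
kubectx_linux_x86_64.tar.gz and kubens_linux_x86_64.tar.gz would yield
only the kubens archive.
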
MinGit-busybox is a stripped-down MinGit using busybox instead of MSYS2.
pdbs-for-git-* filenames weren't caught by the existing "-pdb" check.
Both are now tagged as variants and excluded from legacy export.
fish-{version}.tar.xz is an uploaded source tarball with no OS/arch in
the filename. The GitHub API doesn't distinguish it from binaries. Tag
assets with no OS and no arch as the "source" variant so they're
filtered from legacy export. The Linux .tar.xz binaries classify
correctly and are kept; Node.js just doesn't have them yet.
Baseline builds (-baseline suffix) are plain x86_64 and match what Node.js
serves. Strip -baseline from Filename (keep in Download URL) so legacy
export sees a clean name. Non-baseline builds get Arch: x86_64_v3 and
Variants: ["v3"], excluding them from legacy output.
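
The baseline rule might look roughly like this. It is a sketch, not the
actual tagger: the Asset struct is trimmed to the fields mentioned
above, tagBaseline is a hypothetical name, and limiting the v3 tag to
x86_64 assets is an assumption.

```go
package main

import "strings"

// Asset is a reduced stand-in for the stored asset type.
type Asset struct {
	Filename string
	Download string
	Arch     string
	Variants []string
}

// tagBaseline strips "-baseline" from the filename of baseline builds
// and leaves the download URL alone. Other x86_64 builds are marked as
// x86_64_v3 with a "v3" variant so legacy export drops them.
func tagBaseline(a Asset) Asset {
	if strings.Contains(a.Filename, "-baseline") {
		a.Filename = strings.Replace(a.Filename, "-baseline", "", 1)
		return a
	}
	if a.Arch == "x86_64" {
		a.Arch = "x86_64_v3"
		// Copy before appending so the caller's slice is not modified.
		a.Variants = append(append([]string(nil), a.Variants...), "v3")
	}
	return a
}
```
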
Add .tar.bz2 to classifier format detection (was slipping through
as empty format). Update COMPARISON.md with fresh results: 21 exact
matches, .deb/.rpm/.tar.zst/.tar.bz2 now correctly filtered from
legacy export. Document remaining items for review.
ExportLegacy now skips assets with non-empty Variants (installer,
rocm, fxdependent, etc.) and formats Node.js doesn't handle (.deb,
.rpm, .snap, .appx, .tar.zst, .tar.bz2, .7z). This ensures the
_cache/ JSON files are compatible with the legacy Node.js server.
Also fix test data to use dotted format strings (.tar.gz) matching
what the classifier actually produces.
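
The export rule can be sketched as a single filter pass. The format list
is taken from the message above; the Asset struct and the legacyAssets
name are assumptions made for the example.

```go
package main

// Asset is a reduced stand-in for the stored asset type.
type Asset struct {
	Filename string
	Format   string // dotted, e.g. ".tar.gz"
	Variants []string
}

// legacyUnsupported lists formats the legacy Node.js server does not
// handle.
var legacyUnsupported = map[string]bool{
	".deb":     true,
	".rpm":     true,
	".snap":    true,
	".appx":    true,
	".tar.zst": true,
	".tar.bz2": true,
	".7z":      true,
}

// legacyAssets drops every asset that carries a variant tag or uses an
// unsupported format, and keeps the rest in their original order.
func legacyAssets(assets []Asset) []Asset {
	var out []Asset
	for _, a := range assets {
		if len(a.Variants) > 0 || legacyUnsupported[a.Format] {
			continue
		}
		out = append(out, a)
	}
	return out
}
```
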
Drop VariantTagger interface and map-based lookup. Each per-installer
package now exports a plain TagVariants function. webicached dispatches
via a switch on package name, consistent with fetchRaw and
classifyPackage.
Move variant detection logic from inline functions in webicached to
per-installer packages (internal/releases/{bun,fish,git,lsd,node,
ollama,pwsh,xcaddy}). Each exports a Tagger implementing the new
storage.VariantTagger interface. webicached uses an explicit map
of package name → tagger, no magic registration.
28 exact matches at latest version (up from 12). Reorganize by
difference category. Update action items to reflect current design
(Variants field, format handling, node multi-source fix).
Extract shared state (store, client, auth, rawDir, config flags) into
a WebiCache struct. Convert refreshPackage, fetchRaw, and paginated
fetchers (github, gitea, gittag, nodedist) to methods.
Add -shallow flag: fetches only the first page of releases from
paginated sources. Single-index sources (nodedist, chromedist, etc.)
are always complete in one request.
Extra is for version-related sort metadata (build numbers, etc.).
Variants captures build qualifiers like "rocm", "jetpack5",
"fxdependent", "installer" — things the resolver should skip by
default unless explicitly requested.
Also update format classification docs: most formats (.pkg, .deb,
.dmg, .msi) are extractable — only .exe is ambiguous and needs
the "installer" variant tag when it's not the actual binary.
Installer formats (.pkg, .msi, .deb, etc.) get Extra="installer"
rather than being filtered at classification time. The resolver
skips them by default but the full API can still serve them.
Add unofficial_url to node/releases.conf and update the nodedist
fetcher/classifier to fetch from both URLs. Raw entries are stored
with "official/" or "unofficial/" tag prefixes so they don't overwrite
each other. The classifier picks the correct base URL from the prefix.
This matches the Node.js releases.js behavior which merges both sources,
adding musl, riscv64, loong64, and 7z builds from unofficial.
Document decisions made during the comparison review:
- Package configuration (releases.conf format, source types, Extra map)
- Asset.Extra field semantics (build variants, resolver deprioritization)
- Format filtering (non-extractable installer formats)
- Legacy export filtering (strip variants for Node.js compat)
- Resolve open questions (no node shelling, Extra is sufficient)
- Add new open questions (multi-source config, variant selection API)
- comparecache: use lexver.Compare for version sorting instead of
lexicographic sort (v9.9.0 was incorrectly ranked above v25.8.0)
- webicached/expandNodeFile: add riscv64, loong64 arch mappings and
7z format support for unofficial Node.js builds
- COMPARISON.md: rewrite with version-level review findings including
format filtering gaps (.pkg/.msi/.deb/.dmg), build variant design
(Extra field for rocm/jetpack/fxdependent), and node multi-source issue
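
To illustrate the sorting bug: as plain strings, "v9.9.0" sorts after
"v25.8.0" because '9' > '2'. The sketch below compares numeric parts
instead. It is a simplified stand-in, not lexver.Compare, and it ignores
pre-release suffixes.

```go
package main

import (
	"strconv"
	"strings"
)

// compareVersions compares dotted numeric versions and returns -1, 0,
// or 1. A leading "v" is ignored, and missing or non-numeric parts
// count as 0.
func compareVersions(a, b string) int {
	as := strings.Split(strings.TrimPrefix(a, "v"), ".")
	bs := strings.Split(strings.TrimPrefix(b, "v"), ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		x, y := part(as, i), part(bs, i)
		if x < y {
			return -1
		}
		if x > y {
			return 1
		}
	}
	return 0
}

// part returns the i-th component as an integer, or 0 if it is absent
// or not a number.
func part(parts []string, i int) int {
	if i >= len(parts) {
		return 0
	}
	n, err := strconv.Atoi(parts[i])
	if err != nil {
		return 0
	}
	return n
}
```
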
Node.js cache entries from custom sources (flutter, go, terraform, etc.)
use _filename (a path) instead of name. Add effectiveName() that falls
back to _filename basename, then download URL basename.
Eliminates phantom "empty name" diffs. Matches went from 8 to 12.
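
The fallback order can be sketched like this. Only the name
effectiveName comes from the message above; the struct, its field names,
and the function body are assumptions.

```go
package main

import "path"

// legacyEntry holds the three fields the fallback looks at.
type legacyEntry struct {
	Name     string // "name"
	Filename string // "_filename", a path for custom sources
	Download string // download URL
}

// effectiveName returns the name, else the basename of _filename, else
// the basename of the download URL, else an empty string.
func effectiveName(e legacyEntry) string {
	if e.Name != "" {
		return e.Name
	}
	if e.Filename != "" {
		return path.Base(e.Filename)
	}
	if e.Download != "" {
		// Query strings are not stripped in this sketch.
		return path.Base(e.Download)
	}
	return ""
}
```
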
GitHub has two archive formats:
- legacy: codeload.github.com/.../legacy.tar.gz/... → Owner-Repo-Hash/
- current: github.com/.../archive/refs/tags/TAG.tar.gz → repo-version/
The API's tarball_url redirects to the legacy format. Node.js follows
this redirect. The current format is cleaner: predictable filenames
(repo-version.tar.gz), consistent directory names (repo-version/),
and standard github.com URLs.
Verified: aliasman-1.1.2.tar.gz extracts to aliasman-1.1.2/ which
matches the install script glob (mv ./*aliasman*/aliasman ...).
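
A sketch of building the current-format URL and filename from a tag.
sourceArchive is a hypothetical helper, and stripping a leading "v" from
the tag to get the version is an assumption based on the aliasman
example (tag v1.1.2, archive aliasman-1.1.2.tar.gz).

```go
package main

import "strings"

// sourceArchive returns the stored filename and the download URL for a
// tagged source archive in GitHub's current archive format.
func sourceArchive(owner, repo, tag string) (filename, url string) {
	version := strings.TrimPrefix(tag, "v")
	filename = repo + "-" + version + ".tar.gz"
	url = "https://github.com/" + owner + "/" + repo +
		"/archive/refs/tags/" + tag + ".tar.gz"
	return filename, url
}
```
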
Use Owner-Repo-Tag naming (e.g. BeyondCodeBootcamp-aliasman-v1.1.2.tar.gz)
and direct codeload.github.com URLs instead of api.github.com tarball_url.
This matches the Node.js behavior for source-only packages (aliasman,
duckdns.sh, serviceman) where the extracted directory name matters for
install script globbing (mv ./*aliasman*/ ...).
Remaining diff: Node.js follows the redirect to get the git short hash
suffix (-0-g{hash}) from Content-Disposition. Go uses the tag name
directly. Both resolve to the same archive content.
- Add -src.{tar.gz,tar.xz,zip} pattern to isMetaAsset (alongside _src.)
- Set os=posix_2017, arch=* on source archives (no-binary-asset releases)
instead of leaving them empty. These are shell scripts/vim plugins that
work on any POSIX system.
- Remove "source" Extra tag from source archives (os/arch tells the story)
Add fetch + classify functions for all custom source types:
- chromedist (chromedriver): Chrome for Testing JSON index
- flutterdist (flutter): Google Storage per-OS release indexes
- golang (go): golang.org/dl JSON API
- gpgdist (gpg): SourceForge RSS scraping
- hashicorp (terraform): releases.hashicorp.com product index
- iterm2dist (iterm2): HTML scraping of downloads page
- juliadist (julia): S3 versions.json with platform files
- mariadbdist (mariadb): two-step REST API (majors → releases)
- zigdist (zig): mixed-schema JSON with platform keys
All 9 fetcher packages already existed in internal/releases/ but
were not wired into webicached's fetchRaw/classifyPackage switches.
Now all 103 packages produce classified cache output.
- cmd/comparecache: compares Go cache vs Node.js LIVE_cache at filename
level, categorizes differences (meta-filtering, version depth, source
tarballs, unsupported sources, real asset differences)
- COMPARISON.md: per-package checklist with 91 live packages categorized
- webicached: add -no-fetch flag to classify from existing raw data only
- GO_WEBI.md: update Phase 1 checkboxes for completed items
Combines fetch + classify + write into one pipeline:
1. Reads releases.conf to discover packages
2. Fetches raw upstream data to rawcache
3. Classifies assets (OS, arch, libc, format)
4. Applies config transforms (exclude, version prefix strip)
5. Writes to fsstore in Node.js-compatible _cache/ format
Supports github, nodedist, gittag, and gitea sources. Other sources
(golang, zigdist, flutter, etc.) are skipped with a log message and
will be added as needed.
Can run as a one-shot (-once) or periodic daemon (-interval 15m).
storage.Store is the read/write interface for release asset storage.
storage.Asset uses correct terminology (Filename, Format) internally.
storage.LegacyAsset / LegacyCache preserve the Node.js wire format
("releases", "name", "ext") for backward compatibility.
fsstore writes to _cache/YYYY-MM/{pkg}.json with atomic rename,
matching the existing Node.js layout. The Node.js server can read
files written by Go and vice versa.
Tag conventions can change across versions of the same project
(e.g. "jq-1.7.1" → bare "1.8.0"). A comma-separated list lets
the config express all historical prefixes. The parser tries each
in order and strips the first match.
Back-compat: singular "version_prefix" still works (parsed as a
single-element list).
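
A minimal sketch of the first-match rule, assuming a helper named
stripVersionPrefix (hypothetical). A tag that matches no prefix is
returned unchanged, which covers the bare "1.8.0" case.

```go
package main

import "strings"

// stripVersionPrefix tries each comma-separated prefix in order and
// strips the first one that matches the tag.
func stripVersionPrefix(tag, prefixes string) string {
	for _, p := range strings.Split(prefixes, ",") {
		p = strings.TrimSpace(p)
		if p != "" && strings.HasPrefix(tag, p) {
			return strings.TrimPrefix(tag, p)
		}
	}
	return tag
}
```

A single-element value such as "jq-" behaves the same as the old
singular setting.
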
Replace conf.Get("key") and conf.Source() calls with direct struct
field access (conf.Owner, conf.Repo, conf.TagPrefix, conf.BaseURL,
conf.Source) and conf.Extra["key"] for non-standard keys.
Conf is now a plain struct with typed fields (Source, Owner, Repo,
TagPrefix, VersionPrefix, Exclude, BaseURL) instead of a generic
map[string]string with accessor methods. Unrecognized keys go into
an Extra map for forward compatibility.
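
A sketch of what the typed parse could look like. The field names and
the "version_prefix" key come from these messages; the other key
spellings, the "#" comment syntax, and the parseConf name are
assumptions. VersionPrefix is kept as a plain string here for brevity.

```go
package main

import "strings"

// Conf mirrors the typed fields named in the commit message.
type Conf struct {
	Source        string
	Owner         string
	Repo          string
	TagPrefix     string
	VersionPrefix string
	Exclude       string
	BaseURL       string
	Extra         map[string]string // unrecognized keys
}

// parseConf reads flat key=value lines. Blank lines and lines starting
// with "#" are skipped; unknown keys are collected in Extra.
func parseConf(text string) Conf {
	c := Conf{Extra: map[string]string{}}
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		key, value = strings.TrimSpace(key), strings.TrimSpace(value)
		switch key {
		case "source":
			c.Source = value
		case "owner":
			c.Owner = value
		case "repo":
			c.Repo = value
		case "tag_prefix":
			c.TagPrefix = value
		case "version_prefix":
			c.VersionPrefix = value
		case "exclude":
			c.Exclude = value
		case "base_url":
			c.BaseURL = value
		default:
			c.Extra[key] = value
		}
	}
	return c
}
```
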
Config stays flat key=value — covers the common patterns (simple
github, version prefix stripping, monorepo tag prefix, filename
exclusions). Complex cases belong in Go code, not config.
Add cmd/uaparse — analyzes User-Agent strings from webi.sh logs,
deduplicates by (os, arch, libc), extracts platform hints (cloud
provider, container runtime, distro), and flags malformed UAs.
Fix uadetect issues discovered by running against 2,186 live UAs:
- Msys/MINGW/Cygwin now correctly detected as Windows (was Linux)
- FreeBSD detection added
- s390x and riscv64 arch detection added
- WSL libc no longer falsely detected as MSVC ("microsoft" in kernel
version string was triggering the MSVC check)
.tgz is a legitimate archive format (used by ollama darwin releases).
Remove it from the meta-asset filter and add a .tgz → .tar.gz mapping
in detectFormat.
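
A sketch of suffix-based format detection with the .tgz mapping. Only
the detectFormat name comes from the message above; the body and the
format list are assumptions and are not exhaustive.

```go
package main

import "strings"

// knownFormats lists dotted formats, with the compound .tar.* suffixes
// spelled out in full.
var knownFormats = []string{
	".tar.gz", ".tar.xz", ".tar.bz2", ".tar.zst",
	".zip", ".7z", ".msi", ".pkg", ".dmg", ".deb", ".rpm", ".exe",
}

// detectFormat returns the dotted format for a filename, mapping .tgz
// to .tar.gz, or "" when no known suffix matches.
func detectFormat(filename string) string {
	lower := strings.ToLower(filename)
	if strings.HasSuffix(lower, ".tgz") {
		return ".tar.gz"
	}
	for _, ext := range knownFormats {
		if strings.HasSuffix(lower, ext) {
			return ext
		}
	}
	return ""
}
```
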
internal/resolve: picks the best release for a platform query.
Handles arch compatibility fallbacks (Rosetta 2, Windows ARM64
emulation, amd64 micro-arch levels), format preferences, variant
filtering (prefers base over rocm/jetpack GPU variants), and
universal (arch-less) binaries.
cmd/e2etest: fetches releases for goreleaser, ollama, and node,
classifies them, resolves for 9 test queries across linux/darwin/
windows x86_64/arm64, then compares against the live webi.sh API.
Results: 8/9 exact match, 1 warn where the Go resolver is more
correct than the live API (ollama arm64 base vs jetpack variant).
Edge cases fixed during development:
- .tgz is a valid archive format (not npm metadata)
- Empty arch in filename = universal binary (ranked below native)
- GPU variants (rocm, jetpack) ranked below base binaries
All 116+ packages audited. Documents which scripts were updated
(completions, man pages, archive handling fixes) and which were
verified correct with no changes needed.
ollama: handle all 3 distribution eras (linux tar.zst, macOS .app
bundle, bare binary), check lib/ with test -d instead of test -f since
it is a directory, and extract GPU libs from the .app bundle.
yq: install man page to versioned $pkg_src_dir instead of global
~/.local/share/man/man1/.
Updated install.sh for bat, fd, gh, goreleaser, lsd, rg, sd, watchexec,
and zoxide to extract and install shell completions (bash, fish, zsh) and
man pages from their release archives. Completions go to standard XDG
locations under the versioned opt directory. All moves use 2>/dev/null
fallbacks for older versions that don't include completions.