1314 Commits

Author SHA1 Message Date
AJ ONeal
14f588f4d9 version-level comparison: fix lexver sorting, add riscv64/7z, update findings
- comparecache: use lexver.Compare for version sorting instead of
  lexicographic sort (v9.9.0 was incorrectly ranked above v25.8.0)
- webicached/expandNodeFile: add riscv64, loong64 arch mappings and
  7z format support for unofficial Node.js builds
- COMPARISON.md: rewrite with version-level review findings including
  format filtering gaps (.pkg/.msi/.deb/.dmg), build variant design
  (Extra field for rocm/jetpack/fxdependent), and node multi-source issue
2026-03-10 12:27:16 -06:00
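The sorting bug described above can be reproduced in a few lines. This is a minimal sketch, not the project's lexver package: `compareVersions` and `sortNewestFirst` are hypothetical names, and the comparison only handles plain dotted numeric versions with an optional leading "v".

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// compareVersions is a hypothetical stand-in for lexver.Compare. It compares
// dotted versions numerically, segment by segment, ignoring a leading "v".
// It returns -1, 0, or 1.
func compareVersions(a, b string) int {
	as := strings.Split(strings.TrimPrefix(a, "v"), ".")
	bs := strings.Split(strings.TrimPrefix(b, "v"), ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		var x, y int
		if i < len(as) {
			x, _ = strconv.Atoi(as[i]) // non-numeric segments count as 0 in this sketch
		}
		if i < len(bs) {
			y, _ = strconv.Atoi(bs[i])
		}
		if x < y {
			return -1
		}
		if x > y {
			return 1
		}
	}
	return 0
}

// sortNewestFirst sorts versions in place, newest first.
func sortNewestFirst(versions []string) {
	sort.Slice(versions, func(i, j int) bool {
		return compareVersions(versions[i], versions[j]) > 0
	})
}

func main() {
	versions := []string{"v9.9.0", "v25.8.0", "v10.0.1"}

	// A plain string sort compares "9" against "2" and ranks v9.9.0 first.
	lexical := append([]string(nil), versions...)
	sort.Sort(sort.Reverse(sort.StringSlice(lexical)))
	fmt.Println("string sort: ", lexical)

	sortNewestFirst(versions)
	fmt.Println("numeric sort:", versions)
}
```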
AJ ONeal
9dcf10c996 fix comparecache: use _filename fallback for Node.js entries
Node.js cache entries from custom sources (flutter, go, terraform, etc.)
use _filename (a path) instead of name. Add effectiveName() that falls
back to _filename basename, then download URL basename.

Eliminates phantom "empty name" diffs. Matches went from 8 to 12.
2026-03-10 12:16:31 -06:00
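The fallback chain in this commit (name, then the `_filename` basename, then the download URL basename) might look roughly like the following. The `entry` struct and its field names are assumptions made for illustration. Only `effectiveName` is named in the commit message, and its real signature is not shown there.

```go
package main

import (
	"fmt"
	"path"
)

// entry holds only the three values the fallback needs. The struct and its
// field names are assumptions, not the project's cache type.
type entry struct {
	Name     string // "name" in the cache entry
	Filename string // "_filename", a path rather than a bare filename
	Download string // download URL
}

// effectiveName returns the first non-empty candidate: the name, then the
// basename of _filename, then the basename of the download URL.
// The empty-string guards matter because path.Base("") returns ".".
func effectiveName(e entry) string {
	if e.Name != "" {
		return e.Name
	}
	if e.Filename != "" {
		return path.Base(e.Filename)
	}
	if e.Download != "" {
		return path.Base(e.Download)
	}
	return ""
}

func main() {
	fmt.Println(effectiveName(entry{Name: "tool-linux-amd64.tar.gz"}))
	fmt.Println(effectiveName(entry{Filename: "stable/linux/tool_3.0.0.tar.xz"}))
	fmt.Println(effectiveName(entry{Download: "https://example.com/dl/tool-1.2.3.zip"}))
}
```

This sketch does not strip URL query strings. A real implementation would probably parse the URL first.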
AJ ONeal
83748185bd use current (non-legacy) GitHub archive format for source archives
GitHub has two archive formats:
- legacy: codeload.github.com/.../legacy.tar.gz/... → Owner-Repo-Hash/
- current: github.com/.../archive/refs/tags/TAG.tar.gz → repo-version/

The API's tarball_url redirects to the legacy format. Node.js follows
this redirect. The current format is cleaner: predictable filenames
(repo-version.tar.gz), consistent directory names (repo-version/),
and standard github.com URLs.

Verified: aliasman-1.1.2.tar.gz extracts to aliasman-1.1.2/ which
matches the install script glob (mv ./*aliasman*/aliasman ...).
2026-03-10 11:46:36 -06:00
AJ ONeal
47f0f7bbb6 fix source archive filenames and download URLs
Use Owner-Repo-Tag naming (e.g. BeyondCodeBootcamp-aliasman-v1.1.2.tar.gz)
and direct codeload.github.com URLs instead of api.github.com tarball_url.

This matches the Node.js behavior for source-only packages (aliasman,
duckdns.sh, serviceman) where the extracted directory name matters for
install script globbing (mv ./*aliasman*/ ...).

Remaining diff: Node.js follows the redirect to get the git short hash
suffix (-0-g{hash}) from Content-Disposition. Go uses the tag name
directly. Both resolve to the same archive content.
2026-03-10 11:44:12 -06:00
AJ ONeal
d22be16a69 fix isMetaAsset and source archive classification
- Add -src.{tar.gz,tar.xz,zip} pattern to isMetaAsset (alongside _src.)
- Set os=posix_2017, arch=* on source archives (no-binary-asset releases)
  instead of leaving them empty. These are shell scripts/vim plugins that
  work on any POSIX system.
- Remove "source" Extra tag from source archives (os/arch tells the story)
2026-03-10 11:31:53 -06:00
AJ ONeal
6f6046afef update COMPARISON.md: custom sources now wired, refresh counts
2026-03-10 11:24:18 -06:00
AJ ONeal
2e052fa553 wire 9 custom source fetchers into webicached
Add fetch + classify functions for all custom source types:
- chromedist (chromedriver): Chrome for Testing JSON index
- flutterdist (flutter): Google Storage per-OS release indexes
- golang (go): golang.org/dl JSON API
- gpgdist (gpg): SourceForge RSS scraping
- hashicorp (terraform): releases.hashicorp.com product index
- iterm2dist (iterm2): HTML scraping of downloads page
- juliadist (julia): S3 versions.json with platform files
- mariadbdist (mariadb): two-step REST API (majors → releases)
- zigdist (zig): mixed-schema JSON with platform keys

All 9 fetcher packages already existed in internal/releases/ but
were not wired into webicached's fetchRaw/classifyPackage switches.
Now all 103 packages produce classified cache output.
2026-03-10 11:23:41 -06:00
AJ ONeal
b51e9e2998 add comparecache tool and LIVE_cache comparison checklist
- cmd/comparecache: compares Go cache vs Node.js LIVE_cache at filename
  level, categorizes differences (meta-filtering, version depth, source
  tarballs, unsupported sources, real asset differences)
- COMPARISON.md: per-package checklist with 91 live packages categorized
- webicached: add -no-fetch flag to classify from existing raw data only
- GO_WEBI.md: update Phase 1 checkboxes for completed items
2026-03-10 11:17:37 -06:00
AJ ONeal
0e6d90e011 feat: add webicached — release cache daemon
Combines fetch + classify + write into one pipeline:
1. Reads releases.conf to discover packages
2. Fetches raw upstream data to rawcache
3. Classifies assets (OS, arch, libc, format)
4. Applies config transforms (exclude, version prefix strip)
5. Writes to fsstore in Node.js-compatible _cache/ format

Supports github, nodedist, gittag, and gitea sources. Other sources
(golang, zigdist, flutter, etc.) are skipped with a log message —
they'll be added as needed.

Can run as a one-shot (-once) or periodic daemon (-interval 15m).
2026-03-10 10:58:17 -06:00
AJ ONeal
a553b0f407 feat: add storage interface and fsstore implementation
storage.Store is the read/write interface for release asset storage.
storage.Asset uses correct terminology (Filename, Format) internally.
storage.LegacyAsset / LegacyCache preserve the Node.js wire format
("releases", "name", "ext") for backward compatibility.

fsstore writes to _cache/YYYY-MM/{pkg}.json with atomic rename,
matching the existing Node.js layout. The Node.js server can read
files written by Go and vice versa.
2026-03-10 10:53:19 -06:00
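The write-then-rename approach mentioned for fsstore can be sketched as below. `cachePath`, `writeAtomic`, and `roundTrip` are hypothetical helpers, and using the current month for the YYYY-MM directory is an assumption. The commit only states the path layout and that the write uses an atomic rename.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cachePath returns root/_cache/YYYY-MM/{pkg}.json for the given time.
func cachePath(root, pkg string, now time.Time) string {
	return filepath.Join(root, "_cache", now.Format("2006-01"), pkg+".json")
}

// writeAtomic writes data to a temp file in the destination directory and
// renames it into place, so a reader sees the old file or the new one,
// never a partial write.
func writeAtomic(path string, data []byte) error {
	dir := filepath.Dir(path)
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	tmp, err := os.CreateTemp(dir, ".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; nothing is left to remove after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// roundTrip writes a sample document under a temp directory and reads it back.
func roundTrip() (string, error) {
	root, err := os.MkdirTemp("", "fsstore-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	p := cachePath(root, "example", time.Now())
	if err := writeAtomic(p, []byte(`{"releases":[]}`)); err != nil {
		return "", err
	}
	data, err := os.ReadFile(p)
	return string(data), err
}

func main() {
	got, err := roundTrip()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes it atomic on POSIX systems.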
AJ ONeal
e6dc349b83 fix: remove accidentally committed build artifacts, update .gitignore
2026-03-10 10:47:14 -06:00
AJ ONeal
8b9d101132 ref(installerconf): make VersionPrefixes a list, not a single string
Tag conventions can change across versions of the same project
(e.g. "jq-1.7.1" → bare "1.8.0"). A comma-separated list lets
the config express all historical prefixes. The parser tries each
in order and strips the first match.

Back-compat: singular "version_prefix" still works (parsed as a
single-element list).
2026-03-10 10:46:47 -06:00
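A rough sketch of the prefix handling described in this commit: split the comma-separated value, try each prefix in order, and strip the first match. `parsePrefixes` and `stripVersionPrefix` are hypothetical names, not the installerconf API.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePrefixes splits a comma-separated config value into a list,
// trimming whitespace and dropping empty items. A single value with no
// comma yields a single-element list, which covers the back-compat case.
func parsePrefixes(value string) []string {
	var out []string
	for _, p := range strings.Split(value, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// stripVersionPrefix tries each prefix in order and strips the first one
// that matches. Tags that match no prefix are returned unchanged.
func stripVersionPrefix(tag string, prefixes []string) string {
	for _, p := range prefixes {
		if rest, ok := strings.CutPrefix(tag, p); ok {
			return rest
		}
	}
	return tag
}

func main() {
	prefixes := parsePrefixes("jq-, v")
	for _, tag := range []string{"jq-1.7.1", "1.8.0", "v2.0.0"} {
		fmt.Printf("%s -> %s\n", tag, stripVersionPrefix(tag, prefixes))
	}
}
```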
AJ ONeal
0f7e0f3286 ref(cmd): update callers for typed installerconf.Conf fields
Replace conf.Get("key") and conf.Source() calls with direct struct
field access (conf.Owner, conf.Repo, conf.TagPrefix, conf.BaseURL,
conf.Source) and conf.Extra["key"] for non-standard keys.
2026-03-10 10:45:14 -06:00
AJ ONeal
8cdc00b2d8 ref(installerconf): use typed struct instead of string map
Conf is now a plain struct with typed fields (Source, Owner, Repo,
TagPrefix, VersionPrefix, Exclude, BaseURL) instead of a generic
map[string]string with accessor methods. Unrecognized keys go into
an Extra map for forward compatibility.

Config stays flat key=value — covers the common patterns (simple
github, version prefix stripping, monorepo tag prefix, filename
exclusions). Complex cases belong in Go code, not config.
2026-03-10 10:42:37 -06:00
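To illustrate the typed-struct design, here is a minimal parser sketch for a flat key=value file. It covers only a subset of the fields listed in the commit. The `repo` key name, the `#` comment handling, and the `parseConf` function are assumptions rather than the actual installerconf code.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Conf is a reduced version of the struct described above: a few typed
// fields plus an Extra map for keys the parser does not recognize.
type Conf struct {
	Source    string
	Owner     string
	Repo      string
	TagPrefix string
	Extra     map[string]string
}

// parseConf reads flat "key = value" lines. Blank lines and lines starting
// with "#" are skipped (comment syntax is an assumption of this sketch).
func parseConf(text string) Conf {
	conf := Conf{Extra: map[string]string{}}
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a key=value line
		}
		key, value = strings.TrimSpace(key), strings.TrimSpace(value)
		switch key {
		case "source":
			conf.Source = value
		case "owner":
			conf.Owner = value
		case "repo":
			conf.Repo = value
		case "tag_prefix":
			conf.TagPrefix = value
		default:
			conf.Extra[key] = value // kept for forward compatibility
		}
	}
	return conf
}

func main() {
	conf := parseConf("source = github\nowner = therootcompany\nrepo = example\nasset_filter = extended\n")
	fmt.Println(conf.Source, conf.Owner, conf.Repo, conf.Extra["asset_filter"])
}
```

Callers then read `conf.Owner` directly and fall back to `conf.Extra["key"]` for anything non-standard, which matches the call-site change described in the follow-up commit.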
AJ ONeal
090fb9e242 simplify(uaparse): remove copy-paste typo detection from malformed check
2026-03-10 10:27:09 -06:00
AJ ONeal
3626a04a48 feat: add UA analysis tool and fix uadetect gaps from live data
Add cmd/uaparse — analyzes User-Agent strings from webi.sh logs,
deduplicates by (os, arch, libc), extracts platform hints (cloud
provider, container runtime, distro), and flags malformed UAs.

Fix uadetect issues discovered by running against 2,186 live UAs:
- Msys/MINGW/Cygwin now correctly detected as Windows (was Linux)
- FreeBSD detection added
- s390x and riscv64 arch detection added
- WSL libc no longer falsely detected as MSVC ("microsoft" in kernel
  version string was triggering the MSVC check)
2026-03-10 10:24:26 -06:00
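The two detection fixes above (Msys/MINGW/Cygwin as Windows, and "microsoft" in a WSL kernel string not implying MSVC) can be sketched as follows. `detectOS` and `detectLibc` are hypothetical functions, and the sample User-Agent strings in `main` are invented for illustration. The actual uadetect rules and UA format are not shown in this log.

```go
package main

import (
	"fmt"
	"strings"
)

// detectOS checks the Windows-hosted POSIX layers before the generic
// "linux" match, so Msys/MINGW/Cygwin strings are classified as Windows.
func detectOS(ua string) string {
	lower := strings.ToLower(ua)
	switch {
	case strings.Contains(lower, "msys"),
		strings.Contains(lower, "mingw"),
		strings.Contains(lower, "cygwin"):
		return "windows"
	case strings.Contains(lower, "freebsd"):
		return "freebsd"
	case strings.Contains(lower, "darwin"):
		return "darwin"
	case strings.Contains(lower, "linux"):
		return "linux"
	}
	return ""
}

// detectLibc only considers MSVC when the OS is Windows. A WSL kernel
// version containing "microsoft" therefore cannot trigger the MSVC check.
func detectLibc(ua string) string {
	lower := strings.ToLower(ua)
	if detectOS(ua) == "windows" {
		if strings.Contains(lower, "msvc") {
			return "msvc"
		}
		return ""
	}
	switch {
	case strings.Contains(lower, "musl"):
		return "musl"
	case strings.Contains(lower, "gnu"):
		return "gnu"
	}
	return ""
}

func main() {
	// Invented sample strings, not taken from the live logs.
	samples := []string{
		"curl/8.5.0 Linux/5.15.153.1-microsoft-standard-WSL2 x86_64 gnu",
		"curl/8.5.0 MINGW64_NT-10.0-19045/3.4.10 x86_64",
		"curl/8.5.0 FreeBSD/14.0-RELEASE amd64",
	}
	for _, ua := range samples {
		fmt.Printf("os=%q libc=%q\n", detectOS(ua), detectLibc(ua))
	}
}
```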
AJ ONeal
3965e993f5 fix(classify): treat .tgz as .tar.gz, not as meta asset
.tgz is a legitimate archive format (used by ollama darwin releases).
Remove it from the meta-asset filter and add a .tgz → .tar.gz mapping
in detectFormat.
2026-03-10 10:10:07 -06:00
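A suffix table is one simple way to express the `.tgz` → `.tar.gz` mapping. The sketch below is an assumption about shape only: the real `detectFormat` signature, return values, and supported formats may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// formatSuffixes maps filename suffixes to a canonical format name.
// .tgz is listed as an alias of .tar.gz rather than filtered out.
var formatSuffixes = []struct{ suffix, format string }{
	{".tar.gz", ".tar.gz"},
	{".tgz", ".tar.gz"},
	{".tar.xz", ".tar.xz"},
	{".tar.zst", ".tar.zst"},
	{".zip", ".zip"},
}

// detectFormat returns the canonical format for a filename, or "" when
// no known suffix matches. Matching is case-insensitive.
func detectFormat(filename string) string {
	lower := strings.ToLower(filename)
	for _, s := range formatSuffixes {
		if strings.HasSuffix(lower, s.suffix) {
			return s.format
		}
	}
	return ""
}

func main() {
	for _, name := range []string{"ollama-darwin.tgz", "tool-linux-amd64.tar.gz", "notes.txt"} {
		fmt.Printf("%s -> %q\n", name, detectFormat(name))
	}
}
```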
AJ ONeal
8aeda55e3b feat: add resolve package and end-to-end test
internal/resolve: picks the best release for a platform query.
Handles arch compatibility fallbacks (Rosetta 2, Windows ARM64
emulation, amd64 micro-arch levels), format preferences, variant
filtering (prefers base over rocm/jetpack GPU variants), and
universal (arch-less) binaries.

cmd/e2etest: fetches releases for goreleaser, ollama, and node,
classifies them, resolves for 9 test queries across linux/darwin/
windows x86_64/arm64, then compares against the live webi.sh API.

Results: 8/9 exact match, 1 warn where the Go resolver is more
correct than the live API (ollama arm64 base vs jetpack variant).

Edge cases fixed during development:
- .tgz is a valid archive format (not npm metadata)
- Empty arch in filename = universal binary (ranked below native)
- GPU variants (rocm, jetpack) ranked below base binaries
2026-03-10 10:09:32 -06:00
AJ ONeal
d2c8bec80d doc: add install script audit results to INSTALLER-NOTES.md
All 116+ packages audited. Documents which scripts were updated
(completions, man pages, archive handling fixes) and which were
verified correct with no changes needed.
2026-03-10 09:24:11 -06:00
AJ ONeal
a7b3a5726a feat(pandoc): install man pages from release archive
Pandoc archives include share/man/man1/pandoc*.1.gz alongside the
binary. Move them into the versioned opt directory so they're
available via MANPATH.
2026-03-10 09:22:28 -06:00
AJ ONeal
b08e678a83 fix(ollama,yq): improve archive handling and man page location
ollama: handle all 3 distribution eras (linux tar.zst, macOS .app
bundle, bare binary), fix test -f to test -d for lib/, extract GPU
libs from .app bundle.

yq: install man page to versioned $pkg_src_dir instead of global
~/.local/share/man/man1/.
2026-03-10 09:18:17 -06:00
AJ ONeal
1803c208c3 feat: install shell completions and man pages from archives
Updated install.sh for bat, fd, gh, goreleaser, lsd, rg, sd, watchexec,
and zoxide to extract and install shell completions (bash, fish, zsh) and
man pages from their release archives. Completions go to standard XDG
locations under the versioned opt directory. All moves use 2>/dev/null
fallbacks for older versions that don't include completions.
2026-03-10 09:15:23 -06:00
AJ ONeal
e1529f5949 doc: complete inspection of all remaining large packages
cmake (Pattern G: SDK), pwsh (Pattern H: .NET bundle), dashcore (Pattern I:
multi-binary), mutagen (Pattern I: binary + embedded agents), tinygo (Pattern G).
All 116 packages now categorized into patterns A through I.
2026-03-10 09:10:16 -06:00
AJ ONeal
47e843640b feat: add cmd/inspect for package structure inspection
Downloads release archives, unpacks them, and reports internal structure.
Uses httpclient for downloads with content-disposition awareness.
Supports tar.gz, tar.xz, tar.zst, zip, and DMG formats.
Caches downloads in _cache/downloads/{pkg}/{version}/.
2026-03-10 01:24:25 -06:00
AJ ONeal
7abf15e1ef doc: add inspection results for terraform, deno, k9s, pandoc
terraform/deno/k9s confirmed as Pattern A (flat single binary).
pandoc confirmed as Pattern E (FHS-like bin/ + share/man/).
2026-03-10 01:24:07 -06:00
AJ ONeal
ca1b121b24 doc: add format change analysis across all package versions
Track year-by-year format changes for all packages. Identify structurally
significant changes (sd, ollama, caddy, deno, gh, hugo) vs cosmetic ones.
Most packages have stable formats — only ~11 have changes requiring
different install script eras.
2026-03-10 01:22:48 -06:00
AJ ONeal
36a1df2791 doc: add batch inspection results for all packages
Inspected archive contents of 60+ packages and categorized into patterns:
A) bare binary in archive (most common, ~28 packages)
B) subdirectory with binary only
C) subdirectory with completions/man pages (Rust tools)
D) complex with shared libraries (ollama, psql, sass, syncthing)
E) FHS-like layout with bin/ (gh, ollama)
F) renamed binary needing install-time rename (pathman, yq)
2026-03-10 01:21:33 -06:00
AJ ONeal
e3db0899a0 doc: add INSTALLER-NOTES.md with sd and ollama package inspection results
Document format evolution and install structure for sd (simple single-binary
with completions) and ollama (complex multi-lib GPU-accelerated server).
Track archive layouts, format changes across versions, and draft install
scripts targeting ~/.local/opt/<pkg>-<ver>/bin/.
2026-03-10 01:16:29 -06:00
AJ ONeal
f9f0045259 fix: handle GitHub source-tarball packages (serviceman, aliasman, duckdns.sh)
When a GitHub release has no binary assets, fall back to tarball_url and
zipball_url. These are source distributions (platform-independent), marked
with extra=source.

- serviceman: 12 distributables (6 releases × tar.gz + zip)
- aliasman: 8 distributables (4 releases × tar.gz + zip)
- duckdns.sh: 6 distributables (3 releases × tar.gz + zip)

Total: 170,213 rows across 116 packages (no more zeros).
2026-03-10 00:33:05 -06:00
AJ ONeal
28dab7dade feat: complete classification of all 116 packages (169,867 rows)
- Add asset_filter/asset_exclude conf keys for shared-repo packages
- Split hugo/hugo-extended: exclude/require "extended" in asset name
- Add macosx, ia32, .snap, .appx classifier patterns
- Fix zig Platform.Size JSON string type (was int64, upstream sends string)
- Filter install scripts, cosign keys, compat.json as meta-assets
- Add riscv64, loong64, armv5, mipsle, mips64le to buildmeta

Full classification produces 169,867 distributable rows across 116 packages.
2026-03-10 00:27:57 -06:00
AJ ONeal
e78a721b51 fix: infer macOS from .app.zip/.dmg, filter npm tarballs and .d.ts
- .app.zip and .dmg formats now infer darwin OS when absent
- Filter .tgz (npm packages) and .d.ts (TypeScript defs) as meta-assets
- Reduces bun false positives by 64, deno by 294
2026-03-10 00:24:15 -06:00
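Inferring the OS from a package format, as this commit does for `.app.zip` and `.dmg` (and as the neighboring classifier commit does for `.deb` and `.rpm`), might look like this. `inferOSFromFormat` is a hypothetical helper, meant to run only when the filename itself names no OS.

```go
package main

import (
	"fmt"
	"strings"
)

// inferOSFromFormat guesses the OS from a format that only exists on one
// platform. It returns "" when the format gives no hint.
func inferOSFromFormat(filename string) string {
	lower := strings.ToLower(filename)
	switch {
	case strings.HasSuffix(lower, ".app.zip"), strings.HasSuffix(lower, ".dmg"):
		return "darwin"
	case strings.HasSuffix(lower, ".deb"), strings.HasSuffix(lower, ".rpm"):
		return "linux"
	}
	return ""
}

func main() {
	for _, name := range []string{"Tool.app.zip", "Tool-1.2.3.dmg", "tool_1.2.3_amd64.deb", "tool.zip"} {
		fmt.Printf("%s -> %q\n", name, inferOSFromFormat(name))
	}
}
```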
AJ ONeal
f7a6db53b3 fix: zig platform data lost in cache, expand classifier coverage
- Fix zig Platform.Size type: string in upstream JSON (json.Number)
- Fix zig Platforms json tag: was "-" (dropped in cache), now serializes
- Add riscv64, loong64, armv5 archs to buildmeta and classifier
- Add mipsle, mips64le arch detection patterns
- Add plan9 OS detection
- Add "mac" (word boundary) → darwin OS detection
- Add armhf → armv7, arm7 → armv7 patterns
- Infer Linux from .deb/.rpm format when OS absent
- Filter source archives and buildable-artifact meta-assets

Batch 2 tested: zig (246), flutter (2082), chromedriver (10300),
terraform (5550), julia (1783), iterm2 (262), mariadb (207), gpg (45)
serviceman/aliasman: 0 (source-only, no binary assets)
2026-03-10 00:22:33 -06:00
AJ ONeal
d398625f5d feat: add cmd/classify and improve classifier coverage
- Add cmd/classify: reads raw cached releases and produces a CSV of all
  distributables with sortable version columns (ver_major/minor/patch/pre)
- Export rawcache.ActivePath() for use by cmd/classify
- Add OS detection: openbsd, netbsd, dragonflybsd, plan9, mac→darwin
- Add arch detection: armv5, armhf→armv7, arm7→armv7, 386→x86,
  32bit/64bit (no hyphen), universal→universal2, riscv64, loong64,
  mipsle, mips64le
- Infer Linux from .deb/.rpm format when OS not in filename
- Add .deb and .rpm as recognized formats
- Normalize all per-source values to buildmeta vocabulary (x86_64, aarch64)
- Filter source archives and buildable-artifact meta-assets
- Add CAT-RULES.md tracking classifier learnings
- Add CATEGORIZED.md and LINKS.md for reference

Batch 1 tested: go, node, hugo, caddy, pathman (35,919 rows)
2026-03-10 00:17:17 -06:00
AJ ONeal
efda7c60aa add gittag conf for vim plugins, alias confs, fix psql as own package
Vim plugins with gittag source:
- vim-airline, vim-airline-themes, vim-ale, vim-devicons, vim-go
- vim-nerdtree, vim-prettier, vim-rust, vim-sensible, vim-shfmt
- vim-syntastic

rust.vim is a directory symlink to vim-rust, so it shares the same
releases.conf automatically.

Alias confs (alias_of):
- postgresql → postgres
- postgresql-client, postgres-client → psql
- mariadb-server, mariadbd → mariadb
- gnupg → gpg, iterm → iterm2, ziglang → zig
- trippy → trip, powershell → pwsh

Fix: psql is its own package (postgres client), not an alias of
postgres (server). Both use the same GitHub repo
(bnnanet/postgresql-releases) but install different binaries.
2026-03-09 23:24:35 -06:00
AJ ONeal
7f0c92e262 add releases.conf for all remaining packages and wire new fetchers
New fetcher packages:
- chromedist: Chrome for Testing API (googlechromelabs.github.io)
- gpgdist: SourceForge RSS for GPG macOS
- mariadbdist: MariaDB downloads REST API

New releases.conf files for:
- GitHub: aliasman, awless, duckdns.sh, hugo-extended, kubens, rg, postgres
- gittag: vim-commentary, vim-zig
- gitea: pathman
- chromedist: chromedriver
- gpgdist: gpg
- mariadbdist: mariadb
- nodedist: node

Alias support (alias_of key):
- golang → go, dashd → dashcore, psql → postgres, zig.vim → vim-zig
- Aliases skip fetching and share cache with their target

Every package with a releases.js now has a releases.conf (except the
dead macos package). fetchraw dispatches to all 13 source types.
2026-03-09 22:48:11 -06:00
AJ ONeal
990221454e add fetchers for non-GitHub release sources
New fetcher packages:
- golang: golang.org/dl/?mode=json&include=all
- zigdist: ziglang.org/download/index.json
- flutterdist: Google Storage per-OS release indexes
- iterm2dist: scrapes iterm2.com/downloads.html
- hashicorp: releases.hashicorp.com/{product}/index.json
- juliadist: julialang-s3.julialang.org/bin/versions.json

Each follows the same iter.Seq2 pattern as the existing nodedist/github
fetchers. Added releases.conf files for all six packages and wired them
into cmd/fetchraw.

Fixed latest-version detection for sources that return unordered data
(hashicorp, zigdist, juliadist) by comparing all versions with lexver
instead of taking the first stable one found.
2026-03-09 22:39:16 -06:00
AJ ONeal
c4a358e5a5 add example releases.conf and skip _-prefixed dirs in fetchraw
The discover() function now skips directories starting with _ (like
_example, _webi, _common) so infrastructure dirs aren't treated as
packages to fetch.
2026-03-09 22:33:15 -06:00
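Package discovery by globbing for `releases.conf` while skipping `_`-prefixed directories can be sketched as below. The `discover` signature here is an assumption, and `demoDiscover` is a hypothetical helper that builds a throwaway directory tree to exercise it.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// discover returns the names of directories under confDir that contain a
// releases.conf, skipping infrastructure directories that start with "_".
func discover(confDir string) ([]string, error) {
	matches, err := filepath.Glob(filepath.Join(confDir, "*", "releases.conf"))
	if err != nil {
		return nil, err
	}
	var pkgs []string
	for _, m := range matches {
		name := filepath.Base(filepath.Dir(m))
		if strings.HasPrefix(name, "_") {
			continue // e.g. _example, _webi, _common
		}
		pkgs = append(pkgs, name)
	}
	return pkgs, nil
}

// demoDiscover creates a temp tree with two packages and one _example
// directory, then runs discover on it.
func demoDiscover() ([]string, error) {
	confDir, err := os.MkdirTemp("", "confdir-sketch-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(confDir)

	for _, name := range []string{"go", "node", "_example"} {
		dir := filepath.Join(confDir, name)
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return nil, err
		}
		conf := filepath.Join(dir, "releases.conf")
		if err := os.WriteFile(conf, []byte("source = github\n"), 0o644); err != nil {
			return nil, err
		}
	}
	return discover(confDir)
}

func main() {
	pkgs, err := demoDiscover()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(pkgs)
}
```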
AJ ONeal
caae40df27 ref(fetchraw): read from releases.conf instead of hardcoded list
Discovers packages by globbing {confDir}/*/releases.conf. Adding a
new package is now just creating a conf file — no Go code changes.
Dispatches to the right fetcher based on source= (github, nodedist).
2026-03-09 22:29:03 -06:00
AJ ONeal
b98cbc975c feat: add releases.conf files and installerconf parser
Simple key=value config per package declaring the fetch source and
its parameters. Greppable, no dependencies needed to parse.

  grep 'source = github' */releases.conf
  grep 'owner = therootcompany' */releases.conf

70 packages configured. installerconf package provides the reader.
fetchraw will be updated to read these instead of a hardcoded list.
2026-03-09 22:27:26 -06:00
AJ ONeal
2c3b21a5ae add releases.conf for all GitHub and nodedist packages
Declarative key=value config files that specify the release source
(github or nodedist), owner/repo, and optional tag_prefix for
monorepo packages. These replace the per-package releases.js logic
for the Go rewrite.
2026-03-09 22:27:02 -06:00
AJ ONeal
69a23f3592 feat: add audit log, merge strategy, and all GitHub packages
- rawcache: add Merge() that skips unchanged releases, logs added/
  changed events to an append-only JSONL audit log with SHA-256
- rawcache: drop .json extension from filenames — raw cache stores
  opaque bytes (upstream may be JSON, CSV, XML, or bespoke)
- fetchraw: add all 68 GitHub packages, use Merge instead of Put
- fetchraw: log format shows +added ~changed =skipped
2026-03-09 22:19:11 -06:00
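The merge behavior (skip unchanged releases, log added/changed events as JSONL with a SHA-256) could be modeled as follows. This is an in-memory sketch under stated assumptions: the `cache` map, the `auditEvent` fields, and the `merge` method are illustrative and do not reflect the real rawcache types or log schema.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"io"
)

// auditEvent is one line of the append-only JSONL log (field names assumed).
type auditEvent struct {
	Event  string `json:"event"` // "added" or "changed"
	Tag    string `json:"tag"`
	SHA256 string `json:"sha256"`
}

// cache maps a release tag to the digest of its stored bytes. The real
// cache stores the bytes on disk; a map keeps the sketch self-contained.
type cache map[string]string

// memLog is an in-memory io.Writer that records each audit line.
type memLog struct{ lines []string }

func (m *memLog) Write(p []byte) (int, error) {
	m.lines = append(m.lines, string(p))
	return len(p), nil
}

// merge stores raw under tag unless identical bytes are already present.
// It returns "added", "changed", or "skipped", and writes one JSON line
// to audit for the first two outcomes.
func (c cache) merge(tag string, raw []byte, audit io.Writer) (string, error) {
	sum := sha256.Sum256(raw)
	digest := hex.EncodeToString(sum[:])

	old, exists := c[tag]
	if exists && old == digest {
		return "skipped", nil
	}
	event := "added"
	if exists {
		event = "changed"
	}
	c[tag] = digest

	line, err := json.Marshal(auditEvent{Event: event, Tag: tag, SHA256: digest})
	if err != nil {
		return "", err
	}
	_, err = audit.Write(append(line, '\n'))
	return event, err
}

func main() {
	c := cache{}
	audit := &memLog{}
	for _, body := range []string{`{"a":1}`, `{"a":1}`, `{"a":2}`} {
		result, _ := c.merge("v1.0.0", []byte(body), audit)
		fmt.Println(result)
	}
	fmt.Println("audit lines:", len(audit.lines))
}
```

Because unchanged releases are skipped, running the fetch repeatedly is safe and only real upstream changes reach the audit log.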
AJ ONeal
215c996eb7 fix(fetchraw): use merge strategy instead of full replace
Put directly into the active slot instead of BeginRefresh. Existing
releases are skipped (Has check), new ones are added, _latest is
only updated if the candidate is newer. Safe to run repeatedly —
backports and delayed releases accumulate without losing history.
2026-03-09 22:12:31 -06:00
AJ ONeal
c8e5a007f5 feat: add fetchraw tool for populating raw release cache
Fetches complete release histories from upstream APIs and stores
them in rawcache. Supports GitHub (with pagination, auth, monorepo
tag prefix filtering) and Node.js dist API (official + unofficial
as separate caches to avoid version collisions).

Tested with: node-official (834), node-unofficial (387),
hugo (365), caddy (134), monorel (3).
2026-03-09 22:11:05 -06:00
AJ ONeal
bdf7ad4a56 docs: update GO_WEBI.md with current progress and design decisions
Reflect completed work (all fetchers, rawcache, classify, platlatest,
CompatArches), update repo layout to match actual packages, document
the fallback/compatibility design (classifier is 80/20 default,
per-installer config is the authority), add open questions for CPU
micro-arch detection and installer config format.
2026-03-09 22:07:26 -06:00
AJ ONeal
5dba2de20b feat(buildmeta): add CompatArches and universal binary arch types
CompatArches returns what a given OS+arch can execute — OS-level
facts like Rosetta 2 (darwin arm64 runs x86_64), Windows ARM
emulation, and x86-64 micro-arch backward compat. Also adds
ArchUniversal1 (PPC+x86) and ArchUniversal2 (x86_64+ARM64).

Per-package/per-version overrides (libc compat, nonstandard naming)
remain the installer config's responsibility.
2026-03-09 21:57:43 -06:00
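A compact sketch of what a CompatArches-style lookup could return. The function shape, the ordering (native first), and the use of `x86_64` as the baseline below `amd64v2` are assumptions. The commit only states which OS-level facts are covered.

```go
package main

import "fmt"

// microLevels lists x86-64 micro-architecture levels from most to least
// specific. A CPU at one level can run binaries built for any later entry.
var microLevels = []string{"amd64v4", "amd64v3", "amd64v2", "x86_64"}

// compatArches returns the arches a given OS+arch can execute, native
// first. It encodes OS-level facts only, not per-package overrides.
func compatArches(osName, arch string) []string {
	// x86-64 micro-arch backward compatibility applies on every OS.
	for i, level := range microLevels {
		if arch == level {
			return append([]string(nil), microLevels[i:]...)
		}
	}
	// Rosetta 2 on macOS and emulation on Windows let ARM64 run x86_64.
	// Linux has no equivalent built in.
	if arch == "aarch64" && (osName == "darwin" || osName == "windows") {
		return []string{"aarch64", "x86_64"}
	}
	return []string{arch}
}

func main() {
	fmt.Println(compatArches("darwin", "aarch64"))
	fmt.Println(compatArches("linux", "aarch64"))
	fmt.Println(compatArches("linux", "amd64v3"))
}
```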
AJ ONeal
1253fcd671 ref: remove universal fallback chains from buildmeta and platlatest
Arch and libc fallbacks are not universal — they depend on the OS,
the package, and even the version. ARM64 can run x64 binaries on
macOS/Windows (Rosetta/emulation) but not on Linux. Musl builds can be
statically or dynamically linked depending on the package version. Windows GNU
may or may not need mingw.

These rules belong in per-installer config, not in shared types.
platlatest stays as a simple fact store (triplet → version).
Resolution with fallbacks will be the caller's job.
2026-03-09 21:50:10 -06:00
AJ ONeal
34cfe32492 feat: add arch/libc fallback chains and version waterfall resolution
Prefer latest version over best CPU match. An amd64v4 machine gets
v2.0.0 (baseline only) instead of v1.0.0 (which had a v4 build)
because recency beats specificity.

- buildmeta: add amd64v2/v3/v4 micro-levels, ArchFallbacks, LibcFallbacks
- classify: detect micro-arch levels, treat Windows "arm" as ARM64
- platlatest: add Resolve() that walks fallback chains picking newest
2026-03-09 21:44:06 -06:00
AJ ONeal
1e26a3e5ec feat: add classify and platlatest packages
classify extracts OS, arch, libc, and format from release asset
filenames using regex pattern matching with priority ordering
(x86_64 before x86, arm64 before armv7, etc.).

platlatest tracks the newest release version per build target
(OS+arch+libc triplet) to handle the common case where Windows
or macOS releases lag behind Linux by several versions.
2026-03-09 21:33:59 -06:00
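The priority ordering matters because a pattern for `x86` also matches inside `x86_64`. The sketch below shows the idea with an ordered pattern list. The patterns and the `detectArch` name are illustrative assumptions, not the classify package's actual rules.

```go
package main

import (
	"fmt"
	"regexp"
)

// archPattern pairs a regex with the normalized arch it implies.
type archPattern struct {
	re   *regexp.Regexp
	arch string
}

// archPatterns is ordered by priority: more specific names come first so
// that "x86_64" is not misread as "x86", nor "arm64" as a 32-bit ARM.
var archPatterns = []archPattern{
	{regexp.MustCompile(`(?i)x86[_-]64|amd64|x64`), "x86_64"},
	{regexp.MustCompile(`(?i)aarch64|arm64`), "aarch64"},
	{regexp.MustCompile(`(?i)armv7|armhf|arm7`), "armv7"},
	{regexp.MustCompile(`(?i)x86|i?386|ia32`), "x86"},
}

// detectArch returns the arch of the first matching pattern, or "" when
// the filename carries no recognizable arch.
func detectArch(filename string) string {
	for _, p := range archPatterns {
		if p.re.MatchString(filename) {
			return p.arch
		}
	}
	return ""
}

func main() {
	for _, name := range []string{
		"tool-linux-x86_64.tar.gz",
		"tool-windows-386.zip",
		"tool-darwin-arm64.tar.gz",
		"tool_1.0_armhf.deb",
	} {
		fmt.Printf("%s -> %q\n", name, detectArch(name))
	}
}
```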
AJ ONeal
ae39837145 feat(rawcache): add double-buffered raw release cache
Stores one JSON file per release, named by tag. Supports:
- Incremental updates: atomic writes to the active slot
- Full refresh: write to standby slot, atomic symlink swap
- O(1) existence check and latest-tag lookup
2026-03-09 21:28:03 -06:00
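The full-refresh path (write to the standby slot, then swap a symlink atomically) can be illustrated on a POSIX filesystem as follows. The slot names `a` and `b`, the `active` link name, and the helpers `swapActive` and `demoSwap` are assumptions for this sketch. Symlink creation may require extra privileges on Windows.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// swapActive points root/active at slot. It creates a temporary symlink and
// renames it over the old one. On POSIX systems the rename replaces the
// link in a single step, so readers never see a missing "active".
func swapActive(root, slot string) error {
	tmp := filepath.Join(root, "active.tmp")
	_ = os.Remove(tmp) // clear a leftover from an interrupted swap
	if err := os.Symlink(slot, tmp); err != nil {
		return err
	}
	return os.Rename(tmp, filepath.Join(root, "active"))
}

// demoSwap fills two slots, swaps between them, and returns what a reader
// sees through the "active" link after each swap.
func demoSwap() (string, string, error) {
	root, err := os.MkdirTemp("", "rawcache-sketch-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(root)

	for slot, tag := range map[string]string{"a": "v1.0.0", "b": "v2.0.0"} {
		dir := filepath.Join(root, slot)
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return "", "", err
		}
		if err := os.WriteFile(filepath.Join(dir, "_latest"), []byte(tag), 0o644); err != nil {
			return "", "", err
		}
	}

	read := func() (string, error) {
		b, err := os.ReadFile(filepath.Join(root, "active", "_latest"))
		return string(b), err
	}

	if err := swapActive(root, "a"); err != nil {
		return "", "", err
	}
	first, err := read()
	if err != nil {
		return "", "", err
	}
	if err := swapActive(root, "b"); err != nil {
		return "", "", err
	}
	second, err := read()
	return first, second, err
}

func main() {
	first, second, err := demoSwap()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("before refresh:", first)
	fmt.Println("after refresh: ", second)
}
```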
AJ ONeal
574e5be929 feat(releases): add source archive fetchers for GitHub, Gitea, GitLab
For packages installed from auto-generated source tarballs rather
than uploaded binary assets (shell scripts, vim plugins, etc.).
Each delegates to its respective forge fetcher — the distinction
is organizational, signaling which fields the consumer should use.
2026-03-09 21:10:18 -06:00