Commit Graph

1408 Commits

Author SHA1 Message Date
AJ ONeal
bd3bd85e43 feat(installerconf): github_source packages include git_url for clone fallback
git_url is now a standalone field that can appear alongside any source
type. For github_source packages, it adds a git clone entry per release
in addition to the tarball and zipball. Updated aliasman, duckdns.sh,
and serviceman configs.
2026-03-11 11:46:59 -06:00
AJ ONeal
bbcaa0f464 docs: update answers with three-strategy fix details 2026-03-11 11:42:59 -06:00
AJ ONeal
0ae4d01d75 fix(classifypkg): separate github, githubsource, and gittag strategies
Three distinct fetch/classify strategies:
- github: binary assets only, no source entries
- githubsource: tarball + zipball from GitHub releases API
- gittag: git clone + tag enumeration (existing)

GitHub binary packages (caddy, jq, shellcheck, etc.) no longer get
spurious .git and source tarball entries for old releases that had
no binary uploads. Source-installable packages (aliasman, duckdns.sh,
serviceman) now use github_source in releases.conf.
2026-03-11 11:42:35 -06:00
AJ ONeal
5858a9fefd docs: confirm .git resolution is a Node.js resolver issue, not cache data 2026-03-11 11:34:49 -06:00
AJ ONeal
a5f2dc87cf fix(comparecache): -sample picks random assets, not packages
-sample N now randomly samples N assets from each package's diff list,
giving a representative view of classification differences instead of
showing only the first alphabetical entries. Implies -windowed -diffs
to filter out version-depth noise and focus on real bugs.
2026-03-11 11:31:58 -06:00
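The per-package sampling described above can be sketched as follows. This is a minimal illustration, not the comparecache code: sampleAssets and its seed parameter are hypothetical names, and the diff list is modeled as a plain slice of filenames.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampleAssets (hypothetical helper) returns up to n entries picked at random
// from one package's diff list. The input slice is not modified.
func sampleAssets(diffs []string, n int, seed int64) []string {
	if n <= 0 {
		return nil
	}
	if n >= len(diffs) {
		return append([]string(nil), diffs...)
	}
	rng := rand.New(rand.NewSource(seed))
	picked := make([]string, 0, n)
	// Perm yields a random ordering of the indexes; the first n are the sample.
	for _, i := range rng.Perm(len(diffs))[:n] {
		picked = append(picked, diffs[i])
	}
	return picked
}

func main() {
	diffs := []string{"a.tar.gz", "b.zip", "c.tar.xz", "d.exe", "e.pkg"}
	// A fixed seed makes a sampled run repeatable.
	fmt.Println(sampleAssets(diffs, 2, 1))
}
```

Sampling indexes through a permutation avoids picking the same asset twice, which a loop of independent random draws would not guarantee.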
AJ ONeal
47081c6e17 fix(installerconf): align tests with actual config format
Tests were using separate source/owner/repo keys but the parser expects
github_repo=owner/repo, gitea_repo=owner/repo, etc. Fixed all test
configs to match. Also answered Issue 4 (darwin-universal) for other agent.
2026-03-11 11:29:30 -06:00
AJ ONeal
2b488693b0 feat(comparecache): add -sample flag to pick random extra packages
Usage: go run ./cmd/comparecache -sample 8 -diffs
Picks 8 random packages beyond any explicitly named ones, logs which
ones were sampled for reproducibility.
2026-03-11 11:20:43 -06:00
AJ ONeal
5606773945 fix(webid): add missing imports to bootstrap_test.go
The getWithUA helper needs io, net/http, and net/http/httptest imports.
All 4 bootstrap/installer tests pass.
2026-03-11 11:18:43 -06:00
AJ ONeal
c1a5f2485d feat(webid): split bootstrap and installer routes
Production has two separate flows:
1. /{pkg} (curl-pipe bootstrap) — minimal script that sets WEBI_PKG,
   WEBI_HOST, WEBI_CHECKSUM and downloads+runs webi
2. /api/installers/{pkg}.sh — full installer with resolved release
   and embedded install.sh

Previously handleBootstrap served the full installer. Now:
- handleBootstrap: curl-pipe bootstrap (reads curl-pipe-bootstrap.tpl.sh)
- handleInstaller: full installer (/api/installers/{pkg}.sh)

Also:
- Export render.InjectVar for use by bootstrap handler
- Add webi.sh checksum calculation (SHA-1 first 8 chars)
- Add /api/installers/ route to mux and test server
2026-03-11 02:42:46 -06:00
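A short checksum of the kind mentioned above (SHA-1, first 8 characters) might be computed like this. The sketch assumes the 8 characters are hex digits of the digest; shortChecksum is a hypothetical name and the script body is a stand-in.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// shortChecksum (hypothetical helper) returns the first 8 hex characters
// of the SHA-1 digest of data.
func shortChecksum(data []byte) string {
	sum := sha1.Sum(data)
	return hex.EncodeToString(sum[:])[:8]
}

func main() {
	script := []byte("#!/bin/sh\necho hello\n") // stand-in for the webi.sh contents
	fmt.Println(shortChecksum(script))
}
```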
AJ ONeal
d46cb313cb fix(v1api): use proper csv.Writer with tab delimiter instead of commaToTab
The commaToTab byte replacement was fragile — URLs containing commas
would break. Now uses csv.Writer with Comma='\t' as the backend for
csvutil.Encoder, producing correct TSV output regardless of field content.
2026-03-11 02:39:19 -06:00
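The difference from byte replacement can be seen with the standard library alone. The sketch below uses only encoding/csv (the csvutil layer is left out), and toTSV is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// toTSV (hypothetical helper) encodes rows with a tab as the field delimiter.
// Commas inside a field are ordinary characters and pass through unchanged.
func toTSV(rows [][]string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	w.Comma = '\t'
	if err := w.WriteAll(rows); err != nil { // WriteAll also flushes
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := toTSV([][]string{
		{"version", "download"},
		{"1.0.0", "https://example.com/a,b.tar.gz"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Because the delimiter is set on the writer, no post-processing of the output is needed, so a comma in a URL never turns into a spurious column break.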
AJ ONeal
5eab504c3c test(webid): add jq resolve test, skip upstream gaps in resolve tests
- Added TestV1ResolveJQ to verify jq resolves to binary, not git
- Changed upstream gap detection in resolve_cache_test to t.Skipf
  (shellcheck/windows and xz/linux-arm64 don't have upstream builds)
- Updated ANSWERS.md with git assets investigation results
2026-03-11 02:38:08 -06:00
AJ ONeal
ac6b74a5d8 docs: answer inter-agent questions about libc and git assets 2026-03-11 02:35:57 -06:00
AJ ONeal
dd5f941eca feat(webid): add v1 API with TSV-first format and resolver endpoint
New API routes:
- GET /v1/releases/{pkg}.tab — list releases as TSV (with header)
- GET /v1/releases/{pkg}.json — list releases as JSON array
- GET /v1/resolve/{pkg}.tab — resolve best asset for platform (TSV)
- GET /v1/resolve/{pkg}.json — resolve best asset for platform (JSON)

Key design decisions:
- TSV as primary format via csvutil (easy for cut/grep/sort/agents)
- Go-native naming: darwin, x86_64, aarch64 (no legacy mapping)
- No quoted fields — spaces for lists within fields
- Always includes header row in TSV output
- Resolve endpoint returns single best match with triplet info

Query params: os, arch, libc, channel, version, lts, format, variant, limit
2026-03-11 02:34:32 -06:00
AJ ONeal
9269c32b9c fix(webid): match production API format for legacy endpoints
- JSON response returns bare array (not wrapped in {"releases": [...]})
- OS names mapped to Node.js conventions: darwin → macos
- Arch names mapped: x86_64 → amd64, aarch64 → arm64
- Version strings stripped of "v" prefix
- Extension stripped of "." prefix
- Empty libc defaults to "none"
- Tab format uses actual TSV (not comma-separated)
- Tab LTS field uses "lts" / "-" (not "true" / "false")
- Tab shows header row only with ?pretty=true
- Releases sorted newest-first by version (using lexver)
- Added comprehensive format tests and production comparison test
2026-03-11 02:31:04 -06:00
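The field mappings listed above can be sketched as a single conversion step. The Release struct and toLegacy function are hypothetical; only the mappings themselves come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// Release is a hypothetical, trimmed-down release record.
type Release struct {
	Version string
	OS      string
	Arch    string
	Libc    string
	Ext     string
}

var legacyOS = map[string]string{
	"darwin": "macos",
}

var legacyArch = map[string]string{
	"x86_64":  "amd64",
	"aarch64": "arm64",
}

// toLegacy rewrites Go-native field values into the legacy naming.
func toLegacy(r Release) Release {
	if name, ok := legacyOS[r.OS]; ok {
		r.OS = name
	}
	if name, ok := legacyArch[r.Arch]; ok {
		r.Arch = name
	}
	r.Version = strings.TrimPrefix(r.Version, "v")
	r.Ext = strings.TrimPrefix(r.Ext, ".")
	if r.Libc == "" {
		r.Libc = "none"
	}
	return r
}

func main() {
	r := Release{Version: "v1.7.1", OS: "darwin", Arch: "aarch64", Ext: ".tar.gz"}
	fmt.Printf("%+v\n", toLegacy(r))
}
```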
AJ ONeal
a24d361289 feat(render): add installer script renderer and bootstrap route
Renders package-install.tpl.sh with WEBI_* variable injection and
install.sh splicing. Bootstrap route at /{package}@{version} detects
UA, resolves best release, and returns rendered installer script.
2026-03-11 02:03:58 -06:00
AJ ONeal
9d3d28704e feat(webid): add HTTP API server with legacy release routes
Serves /api/releases/{pkg}@{version}.json and .tab matching the
Node.js format. Supports query params for os, arch, libc, channel,
formats, lts, limit. Handles selfhosted packages (install.sh only).

Pre-loads all cached packages on startup. Includes /api/debug for
UA detection and /api/health endpoint.
2026-03-11 02:00:46 -06:00
AJ ONeal
f02b38255b feat(resolver): add new resolver for new API routes
Triplet-based resolution with indexed lookup for fast matching.
Supports channel hierarchy (alpha > beta > rc > stable), LTS filtering,
variant selection, format preferences, and arch fallback via CompatArches.

All 13 unit tests and cache integration tests pass against real data
for 100+ packages.
2026-03-11 01:51:12 -06:00
AJ ONeal
ed38c63e91 docs: add HANDOFF.md for Node.js cache-only migration
Detailed instructions for the next step: making the Node.js server
read only from Go-generated _cache/ files, removing all upstream
API fetching from the Node.js code path.
2026-03-11 01:22:05 -06:00
AJ ONeal
f167f32aa2 docs: update GO_WEBI.md to reflect current state
- releases.conf format updated (source inferred from key)
- Phase 1 checklist complete except resolver
- All release fetchers listed (18 source packages)
- Per-package releases packages documented
- Legacy export filtering description corrected (Variants not Extra)
- Resolved questions updated (rate limiting, config format, normalization)
- Stale open question removed (rate limiting solved via round-robin)
2026-03-11 01:11:55 -06:00
AJ ONeal
86e73937cd ref(installerconf): remove old source/owner/repo fallback
The default branch now only handles one-off dist sources that use
source= with url=. No config file uses owner=/repo= anymore.
2026-03-11 01:07:44 -06:00
AJ ONeal
0861ebc8b8 ref(releases.conf): collapse source/owner/repo into single keys
Source type is now inferred from the primary key:
  github_repo = owner/repo   (was source=github + owner + repo)
  git_url = https://...      (was source=gittag + url)
  gitea_repo = owner/repo    (was source=gitea + owner + repo)
  hashicorp_product = name   (was source=hashicorp + product)

One-off dist sources (nodedist, zigdist, etc.) keep the explicit
source= key since they're already one-liners.

Parser still accepts the old format via the default fallback branch.
2026-03-11 01:05:08 -06:00
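Inferring the source type from the primary key could look roughly like this. The sketch models a parsed releases.conf as a flat map; inferSource and the lookup order are assumptions, not the installerconf parser.

```go
package main

import "fmt"

// primaryKeys pairs each primary config key with the source type it implies.
// The order in which keys are checked is an assumption of this sketch.
var primaryKeys = []struct{ key, source string }{
	{"github_repo", "github"},
	{"gitea_repo", "gitea"},
	{"hashicorp_product", "hashicorp"},
	{"git_url", "gittag"},
}

// inferSource (hypothetical helper) returns the source type implied by the
// primary key, falling back to an explicit source= entry for one-off sources.
func inferSource(conf map[string]string) string {
	for _, p := range primaryKeys {
		if conf[p.key] != "" {
			return p.source
		}
	}
	return conf["source"]
}

func main() {
	fmt.Println(inferSource(map[string]string{"github_repo": "owner/repo"}))
	fmt.Println(inferSource(map[string]string{"source": "nodedist", "url": "https://example.com/dist"}))
}
```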
AJ ONeal
d0801d0952 fix(classifypkg): handle gittag HEAD entries for legacy cache
Tagless repos (only HEAD, no real version tags): rewrite HEAD version
to Node.js-compatible format (v2023.10.10-18.42.21) with full UTC
datetime.

Repos with real tags + HEAD: tag HEAD entries with "head" variant so
ExportLegacy filters them out (they shouldn't appear in legacy cache).
2026-03-11 00:57:16 -06:00
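The date-based version shape shown above (v2023.10.10-18.42.21) maps directly onto a Go time layout. A minimal sketch, assuming the commit time is available as Unix seconds (for example from git log --format=%ct); headVersion is a hypothetical name.

```go
package main

import (
	"fmt"
	"time"
)

// headVersion (hypothetical helper) formats a commit time, given as Unix
// seconds, as vYYYY.MM.DD-hh.mm.ss in UTC.
func headVersion(unixSeconds int64) string {
	return time.Unix(unixSeconds, 0).UTC().Format("v2006.01.02-15.04.05")
}

func main() {
	fmt.Println(headVersion(time.Now().Unix()))
}
```

Converting to UTC before formatting keeps the version string independent of the machine's local time zone.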
AJ ONeal
695df60a9d fix(postgres): add legacy EnterpriseDB releases and appendLegacy pipeline step
Hardcode the old 10.12, 10.13, 11.8, 12.3 releases from EnterpriseDB
that predate the bnnanet/postgresql-releases GitHub repo. Both postgres
and psql now match the live cache exactly.
2026-03-11 00:43:18 -06:00
AJ ONeal
44721b9aa8 fix(postgres/psql): normalize REL_17_0 tag format to 17.0
Strip REL_ prefix and convert underscores to dots in a per-package
normalizer rather than config, matching the convention for watchexec.
2026-03-11 00:41:41 -06:00
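The tag rewrite described above is small enough to show in full. This is a sketch of the idea rather than the repository's normalizer; normalizePostgresTag is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePostgresTag (hypothetical helper) turns a tag such as REL_17_0
// into 17.0 by dropping the REL_ prefix and replacing underscores with dots.
func normalizePostgresTag(tag string) string {
	v := strings.TrimPrefix(tag, "REL_")
	return strings.ReplaceAll(v, "_", ".")
}

func main() {
	fmt.Println(normalizePostgresTag("REL_17_0"))
}
```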
AJ ONeal
d53f4ee16f fix(ripgrep): make alias of rg instead of duplicate package
ripgrep and rg had identical releases.conf pointing to the same
GitHub repo. The canonical name is rg (matches live cache).
2026-03-11 00:37:26 -06:00
AJ ONeal
90149ac945 ref(webicached): round-robin refresh, skip aliases, rate limit API
- Default mode: classify all from rawcache on startup, then
  fetch+refresh one package per tick (round-robin).
- --eager flag for the old behavior (fetch all on startup).
- Skip aliases and symlinked dirs — legacy cache doesn't create
  entries for them (resolved at request time by the server).
- Add --page-delay (default 2s) to rate-limit paginated API requests.
- Add delayTransport wrapper on http.Client.
2026-03-11 00:29:40 -06:00
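The one-package-per-tick refresh can be sketched with a small cursor over the package list. The roundRobin type is hypothetical, and the loop in main stands in for a real ticker.

```go
package main

import "fmt"

// roundRobin (hypothetical type) hands out package names one at a time and
// wraps around to the start after the last one.
type roundRobin struct {
	names []string
	next  int
}

// Next returns the package to refresh on this tick, or "" if there are none.
func (r *roundRobin) Next() string {
	if len(r.names) == 0 {
		return ""
	}
	name := r.names[r.next]
	r.next = (r.next + 1) % len(r.names)
	return name
}

func main() {
	rr := &roundRobin{names: []string{"caddy", "jq", "shellcheck"}}
	// Each iteration stands in for one tick of the refresh timer.
	for tick := 0; tick < 4; tick++ {
		fmt.Println("refresh", rr.Next())
	}
}
```

Spreading fetches across ticks keeps the request rate roughly constant no matter how many packages are configured, which is what makes it useful against upstream API rate limits.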
AJ ONeal
413ec722f2 fix(webicached): detect symlinked package dirs as aliases
Symlinked directories (e.g. rust.vim → vim-rust) are now treated as
aliases instead of being independently fetched and classified. Creates
cache symlinks just like alias_of config entries.
2026-03-11 00:24:51 -06:00
AJ ONeal
c173873bac fix(pwsh): tag win-version-specific and AppImage builds as variants
Early PowerShell releases (pre-6.1) used Windows-version-specific
filenames (win10-win2016, win81-win2012r2) that the legacy cache
can't resolve. Tag them as variants so they're filtered from legacy
export but preserved for future Go resolver use.
2026-03-11 00:13:29 -06:00
AJ ONeal
ec30b34241 fix(gittag): use HEAD-{date} format for tagless repos
Avoids HEAD date-versions (2024.06.08) sorting ahead of real semver
tags (v1.2) since they measure different things.
2026-03-11 00:10:30 -06:00
AJ ONeal
dfb76794be add(pg-essentials): add releases.conf with gittag source
Shell script collection installed via git clone. Filename format
differs from Node.js cache (which uses GitHub source archive naming
with owner prefix and git describe suffix).
2026-03-11 00:04:55 -06:00
AJ ONeal
795fff1bb4 fix(iterm2dist): fix version extraction for preview releases and deduplicate URLs
The regex captured the beta/preview number but not the keyword itself,
so "3.0.0-preview" collapsed to "3.0.0". Also deduplicate by version
since the downloads page has duplicate links with different URL formats
(e.g. iTerm2-3_5_1beta1.zip and iTerm2-3_5_1_beta1.zip).
2026-03-10 23:59:11 -06:00
AJ ONeal
402cd6d6c2 fix(flutter): include arch in rawcache tag to prevent collisions
Flutter's API returns separate entries for universal (x64) and arm64
macOS builds under the same version/channel/os. The rawcache tag
was version-channel-os, so arm64 overwrote universal. Now extracts
arch from the archive path and appends it to the tag.

Re-fetched flutter: +218 entries recovered.
2026-03-10 23:33:04 -06:00
AJ ONeal
f53c508303 style: one entry per line in map/slice literals
Put each entry on its own line for readability — no staggering
multiple entries per line.
2026-03-10 23:29:22 -06:00
AJ ONeal
b8c67491fe feat: resolve alias_of in cache pipeline
Packages with alias_of in releases.conf (e.g. dashd → dashcore,
golang → go) now get symlinked cache files so they resolve to the
same JSON as their target. 13 aliases total.

Added AliasOf as a proper field in installerconf.Conf, LinkAlias
method to fsstore, and alias handling in webicached's Run loop.
2026-03-10 23:28:36 -06:00
AJ ONeal
f36e734539 fix: infer release channel from version string
GitHub's prerelease boolean is often not set for rc/beta/alpha/dev/pre
releases. Add channelFromVersion() to detect these from the version
string as a fallback. Applied to github, gitea, gittag, and hashicorp
classifiers. Hashicorp's inline checks replaced with the shared helper.

-pre maps to beta (prerelease), -preview stays preview.
2026-03-10 23:18:11 -06:00
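A fallback of this kind might look like the sketch below. It is not the channelFromVersion helper itself: guessChannel is a hypothetical name, the substring matching is a simplification, and the results for alpha and dev are assumptions (only the -pre and -preview mappings are stated above).

```go
package main

import (
	"fmt"
	"strings"
)

// guessChannel (hypothetical helper) infers a release channel from the
// version string. It returns "" when nothing is detected, so the caller can
// keep whatever channel the upstream API reported.
func guessChannel(version string) string {
	v := strings.ToLower(version)
	switch {
	case strings.Contains(v, "-preview"): // checked before -pre, which it contains
		return "preview"
	case strings.Contains(v, "-alpha"):
		return "alpha" // assumption
	case strings.Contains(v, "-beta"), strings.Contains(v, "-pre"):
		return "beta"
	case strings.Contains(v, "-rc"):
		return "rc"
	case strings.Contains(v, "-dev"):
		return "dev" // assumption
	}
	return ""
}

func main() {
	for _, v := range []string{"1.2.3-rc1", "2.0.0-pre", "3.0.0-preview", "1.0.0"} {
		fmt.Printf("%s => %q\n", v, guessChannel(v))
	}
}
```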
AJ ONeal
f963b35e01 ref(watchexec): move cli- prefix stripping from config to code
The cli- prefix is a watchexec-specific monorepo artifact, not a generic
config concern. Move it to internal/releases/watchexec/versions.go
alongside other per-package normalizers (git, lf).
2026-03-10 23:11:14 -06:00
AJ ONeal
07d5f36ed4 fix: postgres/psql cross-contamination, watchexec tag filter, meta assets
- postgres/psql: add asset_filter to separate assets from shared repo
  (bnnanet/postgresql-releases contains postgres-*, postgresql-*, psql-*)
- watchexec: change tag_prefix to version_prefixes so old plain-tagged
  releases (v1.20.6+) aren't filtered out — only strip the cli- prefix
- classify: add .minisig, b3sums, dist-manifest.json to IsMetaAsset
  filter to prevent checksum/signature files from leaking into cache
2026-03-10 18:56:19 -06:00
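A filter for the metadata files named above could be sketched as follows. How the real IsMetaAsset matches names is not shown in this log, so the suffix and exact-name rules here are assumptions, and isMetaAsset is a hypothetical stand-in.

```go
package main

import (
	"fmt"
	"strings"
)

// isMetaAsset (hypothetical stand-in) reports whether a release asset is a
// checksum, signature, or manifest file rather than an installable download.
func isMetaAsset(name string) bool {
	lower := strings.ToLower(name)
	if strings.HasSuffix(lower, ".minisig") {
		return true
	}
	switch lower {
	case "b3sums", "dist-manifest.json":
		return true
	}
	return false
}

func main() {
	for _, name := range []string{"tool-linux.tar.gz", "tool-linux.tar.gz.minisig", "B3SUMS"} {
		fmt.Println(name, isMetaAsset(name))
	}
}
```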
AJ ONeal
dbe3632df4 fix(bun): add tag_prefix to strip bun- from version tags
Bun releases use tags like bun-v1.2.3. Without tag_prefix, the version
included the bun- prefix, causing mismatches. Also update comparecache
with bun version normalizer for accurate comparison.
2026-03-10 18:39:17 -06:00
AJ ONeal
7e22ba01a0 fix: ffmpeg version prefix, .gz legacy format, iterm2 regex
- ffmpeg: add version_prefix = b to strip 'b' from tags (b6.0 → 6.0)
- legacy.go: add .gz to legacyFormats for bare gzipped binaries
- iterm2: broaden regex to handle preview/beta variants, skip empty
  versions

Match count: 75/106
2026-03-10 18:35:51 -06:00
AJ ONeal
2d01a1cf54 fix: jq version prefix, watchexec monorepo tag filter
- jq: add version_prefixes = jq- to strip jq- from version strings
- watchexec: add tag_prefix = cli- to filter monorepo tags correctly
- classifyGitHub: skip tags not matching tag_prefix in monorepos
- comparecache: add watchexec version normalization

Match count: 74/106
2026-03-10 18:33:26 -06:00
AJ ONeal
a4e9f875cd fix(go): pad versions to 3 parts, filter -arm6. oddity
Node.js pads Go versions like "1.10" to "1.10.0". Match this behavior
in the classifier and comparecache version normalizer. Also filter the
malformed -arm6. arch and .src. source tarballs out as comparison noise.

Match count: 73/106
2026-03-10 18:30:57 -06:00
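Padding to three parts can be sketched in a few lines. This hypothetical padVersion only handles plain dotted numeric versions; pre-release strings would need extra care that is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// padVersion (hypothetical helper) appends ".0" until the version has three
// dot-separated parts. Plain numeric versions only.
func padVersion(v string) string {
	for strings.Count(v, ".") < 2 {
		v += ".0"
	}
	return v
}

func main() {
	fmt.Println(padVersion("1.10"))
}
```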
AJ ONeal
56a8a8ea71 fix(fish): add .app.zip to legacy formats, exclude noise assets
- Add .app.zip to legacyFormats so macOS fish builds export correctly
- Exclude bundledpcre, fish-static, OpenBeta from fish/releases.conf
- Add fish Linux binaries to comparecache noise (Go improvement)

Match count: 72/106
2026-03-10 18:29:35 -06:00
AJ ONeal
13798de1b0 fix(lf): normalize rN version tags to 0.N.0
lf uses tags like "r21", "r33". Node.js converts these to "0.21.0".
Add version normalization in both classifier and comparecache.

Match count: 71/106
2026-03-10 18:28:13 -06:00
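The rN to 0.N.0 rewrite can be expressed with one anchored regular expression. A minimal sketch; normalizeLfTag is a hypothetical name and tags that do not match are returned untouched.

```go
package main

import (
	"fmt"
	"regexp"
)

// lfTag matches release tags of the form r<digits>, such as r21.
var lfTag = regexp.MustCompile(`^r(\d+)$`)

// normalizeLfTag (hypothetical helper) rewrites rN to 0.N.0.
func normalizeLfTag(tag string) string {
	if m := lfTag.FindStringSubmatch(tag); m != nil {
		return "0." + m[1] + ".0"
	}
	return tag
}

func main() {
	fmt.Println(normalizeLfTag("r21"))
}
```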
AJ ONeal
05abb1ffd2 fix(git): normalize .windows.N version suffix
Git for Windows uses tags like v2.53.0.windows.1. Node.js strips
".windows.1" and replaces ".windows.N" (N>1) with ".N".

Add NormalizeVersions to the git package and wire it into the classify
pipeline. Also add version normalization to comparecache so the
comparison uses canonical versions for both caches.

Remaining git diffs: data freshness (.windows.2 releases Go hasn't
fetched) and RC versions in Go that the live cache doesn't have.
2026-03-10 18:26:41 -06:00
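The suffix rule described above (drop ".windows.1", turn ".windows.N" into ".N" for N>1) can be sketched like this. normalizeGitWindows is a hypothetical name, not the NormalizeVersions function, and the leading "v" is deliberately left alone.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// windowsSuffix matches a trailing ".windows.N" on a Git for Windows tag.
var windowsSuffix = regexp.MustCompile(`\.windows\.(\d+)$`)

// normalizeGitWindows (hypothetical helper) removes ".windows.1" and rewrites
// ".windows.N" to ".N" for any other N.
func normalizeGitWindows(version string) string {
	m := windowsSuffix.FindStringSubmatch(version)
	if m == nil {
		return version
	}
	base := strings.TrimSuffix(version, m[0])
	if m[1] == "1" {
		return base
	}
	return base + "." + m[1]
}

func main() {
	fmt.Println(normalizeGitWindows("v2.53.0.windows.1"))
	fmt.Println(normalizeGitWindows("v2.53.0.windows.2"))
}
```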
AJ ONeal
ada10ed43a fix(comparecache): filter GPU variant assets as known noise
rocm and jetpack variants are tagged by Go's variant system but kept
by Node.js with special arch names. Filter them out as comparison noise
to avoid false positives.

Match count: 70/106
2026-03-10 18:22:37 -06:00
AJ ONeal
e1bfad0bb8 fix(git): filter to MinGit assets only, exclude busybox
Node.js releases.js only keeps MinGit assets and excludes busybox.
Add asset_filter and exclude to releases.conf to match.

Remaining diff: version normalization (.windows.N suffix stripping)
and data freshness (Go missing .windows.2 releases).
2026-03-10 18:21:19 -06:00
AJ ONeal
8f9cf8e487 fix: exclude known noise from cache comparison and configs
- Hugo: exclude Linux-64bit legacy filename alias
- Hugo-extended: exclude Linux-64bit legacy filename alias
- Gitea: exclude -src- and -docs- tarballs
- Pathman: exclude armv8 legacy alias
- UUID v7: exclude exotic architectures (thumb, armeb, loong, gnux32, risc)
- comparecache: filter bare executables and docs tarballs as noise,
  apply noise filter to both live and Go sides
- legacy.go: add .tar.bz2 to legacyFormats

Match count: 69/106 (up from 58)
2026-03-10 18:18:38 -06:00
AJ ONeal
2ebecb644e feat(gitea): add gogit variant tagger
Tag assets with "-gogit-" in the filename as the "gogit" variant.
These use a pure-Go Git backend instead of the default C Git library.
2026-03-10 18:08:19 -06:00
AJ ONeal
86e3d8f969 ref: extract classification pipeline into internal/classifypkg
Move all source-specific classifiers, variant tagging, config filtering,
and readAllRaw out of cmd/webicached into internal/classifypkg. The new
Package() function runs the full classify pipeline: source dispatch →
tag variants → apply config.

webicached now only handles fetching raw data and writing to fsstore.
The classification logic is reusable by comparecache and future tools.
2026-03-10 18:06:02 -06:00
AJ ONeal
c1b81157dc fix(gittag): produce correct filenames, versions, and format for git assets
- gittag classifier: use "{repo}-{tag}" filenames (matching Node.js),
  strip "v" prefix from version, synthesize date-based version for
  tagless repos (HEAD of master/main)
- GitHub source-only: use "git" format (no dot) and "{repo}-{tag}"
  filename for clone assets
- Legacy export: add "git" to recognized formats so gittag packages
  appear in the legacy cache
- Derives repo name from the git URL in releases.conf

vim-commentary now matches. vim-zig matches on format but has newer
data (expected — Go fetched more recently than Node.js).
2026-03-10 18:00:43 -06:00