Go Webi — Rewrite Plan
This is the planning and tracking document for rewriting the Webi server in Go. This is not a straight port — we're redesigning internals while preserving the public API surface.
Guiding Principles
- Incremental migration. Rewrites fail when they try to replace everything at once. We integrate piece by piece, endpoint by endpoint, into the live system.
- Library over framework. The Go code should be composable pieces the caller controls — not a framework that calls your code.
- stdlib + pgx, nothing else. No third-party SDKs. Dependencies: stdlib, golang.org/x, github.com/jackc/pgx, github.com/therootcompany/golib.
- Resilient by default. The HTTP client, caching, and storage layers are built for failure: timeouts, retries, circuit breaking, graceful fallback.
- Simpler classification. Standard toolchains (goreleaser, cargo-dist, etc.) produce predictable filenames. Match those patterns directly; push esoteric naming into release-fetcher tagging/filtering rather than classifier heuristics.
Repository Layout
cmd/
webid/ # main HTTP server
webicached/ # release cache daemon (fetches + stores releases)
internal/
buildmeta/ # OS, arch, libc, format constants and enums + CompatArches
classify/ # build artifact classification (filename/URL → target)
httpclient/ # resilient net/http client with best-practice defaults
lexver/ # lexicographic version parsing and sorting
platlatest/ # per-platform latest version index (triplet → version)
rawcache/ # double-buffered raw upstream API response storage
releases/ # release fetching — one package per source type
github/ # GitHub (thin wrapper over githubish)
githubish/ # generic GitHub-compatible API with Link header pagination
githubsrc/ # GitHub source archives (tarball/zipball URLs)
gitea/ # Gitea/Forgejo (own types, limit param, Link header)
giteasrc/ # Gitea source archives
gitlab/ # GitLab (own types, X-Total-Pages pagination)
gitlabsrc/ # GitLab source archives
gittag/ # bare git clone + tag listing
node/ # Node.js (official + unofficial builds)
nodedist/ # generic Node.js-style dist/index.json API
render/ # installer script template rendering
storage/ # release storage interface + implementations
storage.go # interface definition
fsstore/ # filesystem (JSON cache, like current _cache/)
pgstore/ # PostgreSQL (via sqlc + pgx)
uadetect/ # User-Agent → OS/arch/libc detection (regex-based)
Public API Surface (Must Remain Stable)
These are the endpoints that clients depend on. The URLs, query parameters, and response formats must not change.
Bootstrap (curl-pipe entry point)
GET /{package} # User-Agent dispatch:
GET /{package}@{version} # curl/wget/POSIX → bash bootstrap script
# PowerShell → ps1 bootstrap script
# Browser → HTML cheat sheet (separate app)
Installer Scripts
GET /api/installers/{package}.sh # POSIX installer
GET /api/installers/{package}@{version}.sh
GET /api/installers/{package}.ps1 # PowerShell installer
GET /api/installers/{package}@{version}.ps1
Query: ?formats=tar,zip,xz,git,dmg,pkg
&libc=msvc (ps1 only)
Release Metadata
GET /api/releases/{package}.json
GET /api/releases/{package}@{version}.json
GET /api/releases/{package}.tab
GET /api/releases/{package}@{version}.tab
Query: ?os=linux&arch=amd64&libc=musl
&channel=stable&limit=10&formats=tar,xz
&pretty=true
Package Assets
GET /packages/{package}/README.md
GET /packages/{package}/{filename}
Debug
GET /api/debug # returns detected OS/arch from User-Agent
Query: ?os=...&arch=... # overrides
Response Formats
JSON — { oses, arches, libcs, formats, releases: [{ version, date, os, arch, libc, ext, download, channel, lts, name }] }
TSV (.tab) — version \t lts \t channel \t date \t os \t arch \t ext \t - \t download \t name \t comment
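As a rough illustration of the .tab column order, the sketch below joins one release into a tab-separated row. The Release struct, its field names, and the true/false rendering of lts are assumptions made for the example; the real encoding of each column should be taken from the current Node.js output.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Release mirrors the fields named in the JSON format above.
// The Go names and types are assumptions for illustration.
type Release struct {
	Version, Date, OS, Arch, Libc, Ext, Download, Channel, Name string
	LTS                                                         bool
}

// tabRow renders one release in the .tab column order:
// version, lts, channel, date, os, arch, ext, "-", download, name, comment.
func tabRow(r Release, comment string) string {
	cols := []string{
		r.Version, strconv.FormatBool(r.LTS), r.Channel, r.Date,
		r.OS, r.Arch, r.Ext, "-", r.Download, r.Name, comment,
	}
	return strings.Join(cols, "\t")
}

func main() {
	r := Release{
		Version: "1.2.3", Channel: "stable", Date: "2024-01-02",
		OS: "linux", Arch: "amd64", Ext: "tar.gz",
		Download: "https://example.com/f.tar.gz", Name: "f.tar.gz",
	}
	fmt.Println(tabRow(r, ""))
}
```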
Architecture
Two Servers
- webid: the HTTP API server. Renders templates and serves responses. On each request, looks up releases by package name in storage (filesystem and/or Postgres, configurable). No package registry: if releases exist in storage for that name, it's a valid package. No restart needed when packages are added.
- webicached: the cache daemon. Built with its package set compiled in. Periodically fetches releases from upstream sources, classifies builds, and writes to both Postgres and the filesystem. Adding a new package means rebuilding and redeploying webicached.
Adding a new installer requires rebuilding webicached, but not webid. The
API server discovers packages from storage — when the new webicached writes a
package's releases to Postgres or the filesystem, webid sees it on the next
read. No restart, no config reload.
This means webid never blocks on upstream API calls. It serves from whatever is
in storage — always fast, always available.
Double-Buffer Storage
The storage layer uses a double-buffer strategy so that a full release-history rewrite never disrupts active downloads:
Slot A: [current — being read by webid]
Slot B: [next — being written by webicached]
On completion: atomic swap A ↔ B
For fsstore: two directories per package, swap via atomic rename. For pgstore: two sets of rows per package (keyed by generation), swap via updating an active-generation pointer in a single transaction.
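One common way to get an atomic swap on a POSIX filesystem is to keep a "current" symlink that points at the active slot directory and replace it with rename(2). The sketch below shows that technique only; whether fsstore uses a symlink at all, and the names activate, current, a, and b, are assumptions for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// activate points <pkgDir>/current at slot by creating a new symlink next
// to it and renaming it over the old one. rename(2) replaces the
// destination atomically, so readers see the old slot or the new one.
func activate(pkgDir, slot string) error {
	tmp := filepath.Join(pkgDir, "current.tmp")
	_ = os.Remove(tmp) // clear a leftover from an interrupted swap
	if err := os.Symlink(slot, tmp); err != nil {
		return err
	}
	return os.Rename(tmp, filepath.Join(pkgDir, "current"))
}

// demo creates two slots in a temp dir, activates a and then b, and
// returns the final symlink target.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "fsstore-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	for _, slot := range []string{"a", "b"} {
		if err := os.Mkdir(filepath.Join(dir, slot), 0o755); err != nil {
			return "", err
		}
		if err := activate(dir, slot); err != nil {
			return "", err
		}
	}
	return os.Readlink(filepath.Join(dir, "current"))
}

func main() {
	fmt.Println(demo())
}
```

A reader that opens files through the current symlink keeps working during a refresh, because the inactive slot is the only directory being rewritten.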
Storage Interface
type Store interface {
// Read path (used by webid)
GetPackageMeta(ctx context.Context, name string) (*PackageMeta, error)
GetReleases(ctx context.Context, name string, filter ReleaseFilter) ([]Release, error)
// Write path (used by webicached)
BeginRefresh(ctx context.Context, name string) (RefreshTx, error)
}
type RefreshTx interface {
PutReleases(ctx context.Context, releases []Release) error
Commit(ctx context.Context) error // atomic swap
Rollback(ctx context.Context) error
}
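To make the write-path contract concrete, here is a minimal in-memory stand-in: releases written through a refresh stay invisible to readers until Commit publishes them. The memStore and memTx types, the trimmed-down Release, and the filter-less GetReleases are all simplifications invented for this sketch, not the planned implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

type Release struct{ Version string } // trimmed-down stand-in

type RefreshTx interface {
	PutReleases(ctx context.Context, releases []Release) error
	Commit(ctx context.Context) error
	Rollback(ctx context.Context) error
}

// memStore keeps one active slice per package.
type memStore struct {
	mu     sync.RWMutex
	active map[string][]Release
}

// memTx builds the next slice privately until Commit.
type memTx struct {
	store *memStore
	name  string
	next  []Release
}

func (s *memStore) BeginRefresh(ctx context.Context, name string) (RefreshTx, error) {
	return &memTx{store: s, name: name}, nil
}

func (s *memStore) GetReleases(ctx context.Context, name string) ([]Release, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.active[name], nil
}

func (tx *memTx) PutReleases(ctx context.Context, rs []Release) error {
	tx.next = append(tx.next, rs...)
	return nil
}

// Commit swaps the finished slice in under the write lock.
func (tx *memTx) Commit(ctx context.Context) error {
	tx.store.mu.Lock()
	defer tx.store.mu.Unlock()
	tx.store.active[tx.name] = tx.next
	return nil
}

func (tx *memTx) Rollback(ctx context.Context) error {
	tx.next = nil
	return nil
}

// demo returns how many releases a reader sees before and after Commit.
func demo() (before, after int) {
	ctx := context.Background()
	s := &memStore{active: map[string][]Release{}}
	tx, _ := s.BeginRefresh(ctx, "node")
	_ = tx.PutReleases(ctx, []Release{{Version: "v22.0.0"}})
	b, _ := s.GetReleases(ctx, "node")
	_ = tx.Commit(ctx)
	a, _ := s.GetReleases(ctx, "node")
	return len(b), len(a)
}

func main() {
	fmt.Println(demo()) // prints "0 1"
}
```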
Resilient HTTP Client (internal/httpclient)
A net/http client with best-practice defaults, used as the base for all
upstream API calls:
- Timeouts: connect, TLS handshake, response header, overall request
- Connection pooling: sensible MaxIdleConns, IdleConnTimeout
- TLS: MinVersion: tls.VersionTLS12, system cert pool
- Redirects: limited redirect depth, no cross-scheme downgrades
- User-Agent: identifies as Webi with contact info
- Retries: exponential backoff with jitter for transient errors (429, 502, 503, 504), respects Retry-After headers
- Context: all calls take context.Context for cancellation
- No global state: created as instances, not http.DefaultClient
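The defaults listed above map onto standard net/http knobs. The sketch below shows one possible shape: a client constructor with explicit timeouts, plus a retry-delay helper using full jitter. All concrete durations and the names newClient, retryable, and retryDelay are assumptions; the helper only understands the seconds form of Retry-After, not the HTTP-date form.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"math/rand"
	"net"
	"net/http"
	"strconv"
	"time"
)

// newClient returns an instance with explicit timeouts instead of relying
// on http.DefaultClient. The durations are placeholder values.
func newClient() *http.Client {
	return &http.Client{
		Timeout: 60 * time.Second, // overall request
		Transport: &http.Transport{
			DialContext:           (&net.Dialer{Timeout: 5 * time.Second}).DialContext,
			TLSHandshakeTimeout:   10 * time.Second,
			ResponseHeaderTimeout: 15 * time.Second,
			MaxIdleConns:          100,
			IdleConnTimeout:       90 * time.Second,
			TLSClientConfig:       &tls.Config{MinVersion: tls.VersionTLS12},
			ForceAttemptHTTP2:     true, // keep HTTP/2 with a custom TLS config
		},
	}
}

// retryable reports whether status is one of the transient errors above.
func retryable(status int) bool {
	switch status {
	case http.StatusTooManyRequests, http.StatusBadGateway,
		http.StatusServiceUnavailable, http.StatusGatewayTimeout:
		return true
	}
	return false
}

// retryDelay returns the wait before retry number attempt (0-based).
// A Retry-After value in seconds wins; otherwise the delay is a random
// duration in [0, min(base*2^attempt, max)].
func retryDelay(attempt int, retryAfter string, base, max time.Duration) time.Duration {
	if secs, err := strconv.Atoi(retryAfter); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second
	}
	ceiling := base
	for i := 0; i < attempt && ceiling < max; i++ {
		ceiling *= 2
	}
	if ceiling > max {
		ceiling = max
	}
	return time.Duration(rand.Int63n(int64(ceiling) + 1))
}

func main() {
	fmt.Println(newClient().Timeout, retryable(503))
	fmt.Println(retryDelay(3, "7", time.Second, time.Minute))
}
```

Full jitter spreads retries from many callers across the whole backoff window, which matters when several fetchers hit the same rate limit at once.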
Release Fetchers (internal/releases/)
Each upstream source (GitHub, Gitea/Forgejo, GitLab, git tags, Node.js-style dist indexes) is a small package that uses
httpclient and returns a common []Release slice. No SDK dependencies.
// internal/releases/github/github.go
func FetchReleases(ctx context.Context, client *httpclient.Client,
owner, repo string, opts ...Option) ([]Release, error)
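GitHub-compatible APIs paginate with the Link response header, where the entry marked rel="next" carries the URL of the following page. The helper below is a simplified sketch of pulling that URL out; nextLink is a hypothetical name, and the parser assumes URLs without commas or semicolons rather than implementing RFC 8288 in full.

```go
package main

import (
	"fmt"
	"strings"
)

// nextLink returns the URL tagged rel="next" in a Link header value,
// or "" when there is no next page.
func nextLink(header string) string {
	for _, part := range strings.Split(header, ",") {
		segs := strings.Split(part, ";")
		target := strings.TrimSpace(segs[0])
		if !strings.HasPrefix(target, "<") || !strings.HasSuffix(target, ">") {
			continue
		}
		for _, param := range segs[1:] {
			if strings.TrimSpace(param) == `rel="next"` {
				return target[1 : len(target)-1]
			}
		}
	}
	return ""
}

func main() {
	h := `<https://api.example.com/releases?page=2>; rel="next", ` +
		`<https://api.example.com/releases?page=5>; rel="last"`
	fmt.Println(nextLink(h))
}
```

A fetcher would call this after each response and stop when it returns an empty string.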
Build Classification (internal/classify)
The classifier is the 80/20 default — it handles the happy path where standard toolchains (goreleaser, cargo-dist, Zig, Rust) produce predictable filenames. It is not the authority; the per-installer config can override anything it detects.
- Regex-based detection with priority ordering (x86_64 before x86, arm64 before armv7, amd64v4/v3/v2 before baseline).
- OS-aware fixups: bare "arm" on Windows → ARM64.
- Accepts filenames or full download URLs (signal may be in path segments).
- Undetected fields are empty, not guessed.
Target triplet format: {os}-{arch}-{libc}.
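The priority ordering described above can be sketched as ordered rule lists where the first matching pattern wins. The patterns, the Target type, and the canonical names used here (x86_64, arm64, and so on) are illustrative assumptions, not the actual classify or buildmeta vocabulary.

```go
package main

import (
	"fmt"
	"regexp"
)

type rule struct {
	name string
	re   *regexp.Regexp
}

// rules builds an ordered list from name/pattern pairs (case-insensitive).
func rules(pairs ...string) []rule {
	var out []rule
	for i := 0; i+1 < len(pairs); i += 2 {
		out = append(out, rule{pairs[i], regexp.MustCompile("(?i)(?:" + pairs[i+1] + ")")})
	}
	return out
}

// Order matters: specific patterns come before the ones they contain.
var (
	osRules   = rules("darwin", `darwin|macos|apple`, "windows", `windows|win32|win64|\.exe$`, "linux", `linux`)
	archRules = rules("x86_64", `x86[_-]64|amd64|x64`, "x86", `x86|i[36]86`,
		"arm64", `arm64|aarch64`, "armv7", `armv7`, "arm", `arm`)
	libcRules = rules("musl", `musl`, "gnu", `gnu|glibc`, "msvc", `msvc`)
)

// first returns the name of the first matching rule, or "" if none match.
func first(rs []rule, s string) string {
	for _, r := range rs {
		if r.re.MatchString(s) {
			return r.name
		}
	}
	return ""
}

type Target struct{ OS, Arch, Libc string }

// Classify accepts a filename or a full download URL. Undetected fields
// stay empty rather than being guessed.
func Classify(nameOrURL string) Target {
	t := Target{first(osRules, nameOrURL), first(archRules, nameOrURL), first(libcRules, nameOrURL)}
	if t.OS == "windows" && t.Arch == "arm" {
		t.Arch = "arm64" // OS-aware fixup: bare "arm" on Windows means ARM64
	}
	return t
}

func (t Target) Triplet() string { return t.OS + "-" + t.Arch + "-" + t.Libc }

func main() {
	fmt.Println(Classify("ripgrep-14.1.0-x86_64-unknown-linux-musl.tar.gz").Triplet())
	fmt.Println(Classify("tool-windows-arm.zip"))
}
```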
Fallback & Compatibility
Arch and libc fallbacks are not universal rules. They vary by OS, package, and even package version:
- OS-level arch compat (buildmeta.CompatArches): universal facts like "darwin arm64 runs x86_64 via Rosetta 2", "windows arm64 emulates x86_64". Includes macOS Universal1 (PPC+x86) and Universal2 (x86_64+ARM64).
- Libc compat: per-package, per-version. Musl can be static (runs anywhere) or dynamically linked (needs polyfill). Windows GNU can be dependency-free or need mingw. This changes between versions of the same package.
- Arch micro-levels: amd64v4→v3→v2→v1 fallback is universal, but a package may drop specific micro-arch builds between versions.
Per-installer config declares the package-specific rules. The resolver combines installer config + platlatest + CompatArches to pick the right binary.
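A minimal sketch of the universal part of that resolution: expand the host arch into a preference-ordered list (micro-arch levels first, then OS-level emulation), and pick the first arch a release actually ships. The lowercase compatArches, the arch names (amd64 meaning baseline x86_64), and the available-set shape are placeholders; the real buildmeta.CompatArches signature may differ, and per-installer overrides are left out.

```go
package main

import "fmt"

// amd64Levels lists micro-arch levels from most to least demanding.
var amd64Levels = []string{"amd64v4", "amd64v3", "amd64v2", "amd64"}

// microFallback returns arch followed by every lower amd64 level.
func microFallback(arch string) []string {
	for i, lvl := range amd64Levels {
		if lvl == arch {
			return append([]string(nil), amd64Levels[i:]...)
		}
	}
	return []string{arch}
}

// compatArches returns the arches a host can run, best first.
func compatArches(os, arch string) []string {
	out := microFallback(arch)
	if arch == "arm64" && (os == "darwin" || os == "windows") {
		out = append(out, "amd64") // Rosetta 2 / Windows emulation, baseline only
	}
	return out
}

// pick returns the first compatible arch the release ships, or "".
func pick(os, arch string, available map[string]bool) string {
	for _, a := range compatArches(os, arch) {
		if available[a] {
			return a
		}
	}
	return ""
}

func main() {
	fmt.Println(compatArches("linux", "amd64v3"))
	fmt.Println(pick("darwin", "arm64", map[string]bool{"amd64": true}))
}
```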
Installer Rendering (internal/render)
Replaces installers.js. Reads template files, substitutes variables, injects
the per-package install.sh / install.ps1.
The current template variable set (30+ env vars) is the contract with the
client-side scripts. We must produce identical output for package-install.tpl.sh
and package-install.tpl.ps1.
Reworking install.sh / install.ps1
Long-term, the per-package install scripts should feel like library users, not framework callbacks:
- Current (framework): define pkg_install(), pkg_get_current_version(), etc. and the framework calls them.
- Goal (library): source a helpers file, call functions like webi_download, webi_extract, webi_link explicitly from a linear script.
This is a separate migration from the Go rewrite — it changes the client-side contract. Plan it but don't block the server rewrite on it.
Migration Strategy
Each phase produces something that works in production alongside the existing Node.js server.
Phase 0: Foundation
- internal/buildmeta: shared vocabulary (OS, arch, libc, format, channel)
- internal/buildmeta: CompatArches(os, arch), OS-level arch compat facts
- internal/buildmeta: amd64 micro-arch levels (v1–v4), universal binary types
- internal/lexver: version strings → comparable strings
- internal/httpclient: resilient HTTP client for upstream API calls
- internal/uadetect: User-Agent → OS/arch/libc (regex-based)
- Go module init (go 1.26.1, stdlib only)
- CI setup
- CPU micro-arch detection in bootstrap scripts (POSIX + PowerShell)
Phase 1: Release Fetching & Caching
- internal/releases/githubish: generic GitHub-compatible API fetcher
- internal/releases/github: GitHub releases (thin wrapper)
- internal/releases/githubsrc: GitHub source archives
- internal/releases/gitea: Gitea/Forgejo releases (own types)
- internal/releases/giteasrc: Gitea source archives
- internal/releases/gitlab: GitLab releases (own types, X-Total-Pages)
- internal/releases/gitlabsrc: GitLab source archives
- internal/releases/gittag: git tag listing (bare clone)
- internal/releases/nodedist: Node.js-style dist/index.json API
- internal/releases/node: Node.js (official + unofficial builds)
- internal/rawcache: double-buffered raw upstream response storage
- internal/classify: build artifact classifier (80/20, filename→target)
- internal/platlatest: per-platform latest version index
- End-to-end: fetch complete histories for a few real packages
- Per-installer config format (fallback rules, version-ranged overrides)
- Resolver (platlatest + installer config + CompatArches → pick binary)
- internal/storage: interface definition
- internal/storage/fsstore: filesystem implementation
- cmd/webicached: cache daemon that can replace the Node.js caching
Integration point: webicached writes the same _cache/ JSON format. The
Node.js server can read from it. Zero-risk cutover for release fetching.
Phase 2: Release API
- cmd/webid: HTTP server skeleton with middleware
- GET /api/releases/{package}.json endpoint
- GET /api/releases/{package}.tab endpoint
- GET /api/debug endpoint
Integration point: reverse proxy specific /api/releases/ paths to the Go
server. Node.js handles everything else.
Phase 3: Installer Rendering
- internal/render: template engine
- GET /api/installers/{package}.sh endpoint
- GET /api/installers/{package}.ps1 endpoint
- Bootstrap endpoint (GET /{package})
Integration point: reverse proxy installer paths to Go. Node.js only serves the website/cheat sheets (if it ever did — that may be a separate app).
Phase 4: PostgreSQL Storage
- internal/storage/pgstore: sqlc-generated queries, double-buffer
- Schema design and migrations
- webicached writes to Postgres
- webid reads from Postgres
Phase 5: Client-Side Rework
- Design new library-style install.sh helpers
- Migrate existing packages one at a time
- Update package-install.tpl.sh to support both old and new styles
Key Design Decisions
Version: Go 1.26+
Using http.ServeMux with PathValue for routing (available since Go 1.22).
Middleware via github.com/therootcompany/golib/http/middleware/v2.
No ORM
PostgreSQL access via pgx + sqlc. Queries are hand-written SQL, type-safe
Go code is generated.
Template Rendering
Use text/template or simple string replacement (matching current behavior).
The templates are shell scripts — they need literal $ and {} — so
text/template may be the wrong tool. Likely better to stick with the current
regex-replacement approach, ported to Go.
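A sketch of what the regex-replacement approach could look like in Go, assuming the templates carry placeholder assignment lines such as #WEBI_VERSION= (that placeholder form, the variable name, and the setVar helper are assumptions here, not taken from the current templates). Using ReplaceAllStringFunc matters: the replacement is inserted literally, whereas ReplaceAllString would expand $1-style references inside the substituted value.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// setVar replaces the first line of the form NAME=... or #NAME=... with
// NAME='value'. Every other $ and {} in the script is left untouched.
func setVar(script, name, value string) string {
	re := regexp.MustCompile(`(?m)^[ \t]*#?` + regexp.QuoteMeta(name) + `=.*$`)
	// Single-quote the value for POSIX shells; ' becomes '\''.
	quoted := "'" + strings.ReplaceAll(value, "'", `'\''`) + "'"
	done := false
	return re.ReplaceAllStringFunc(script, func(line string) string {
		if done {
			return line // only the first placeholder line is rewritten
		}
		done = true
		return name + "=" + quoted
	})
}

func main() {
	tpl := "#!/bin/sh\n#WEBI_VERSION=\necho \"${WEBI_VERSION}\" $HOME\n"
	fmt.Print(setVar(tpl, "WEBI_VERSION", "v1.2.3"))
}
```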
Error Handling
The current system returns a synthetic "error release" (version: 0.0.0,
channel: error) when no match is found, rather than an HTTP error. This
behavior must be preserved for backward compatibility.
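In Go this could be a small fallback at the end of version matching, as in the sketch below. Only the version and channel values come from the description above; the struct shape, the function names, and whatever other fields the real error release carries are assumptions.

```go
package main

import "fmt"

// Release is a trimmed-down stand-in with only the fields this sketch needs.
type Release struct {
	Version string
	Channel string
}

// errorRelease is the synthetic entry returned when nothing matches.
func errorRelease() Release {
	return Release{Version: "0.0.0", Channel: "error"}
}

// pickOrError returns the best match, or the synthetic error release
// instead of signalling an HTTP error.
func pickOrError(matches []Release) Release {
	if len(matches) == 0 {
		return errorRelease()
	}
	return matches[0]
}

func main() {
	fmt.Println(pickOrError(nil))
	fmt.Println(pickOrError([]Release{{Version: "1.2.3", Channel: "stable"}}))
}
```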
Open Questions
- Should webicached shell out to node releases.js during migration, or do we rewrite every releases.js as Go config/code from the start? (Shelling out preserves hot-add compatibility during the transition: a new releases.js just works without any Go changes.)
- What's the deployment topology? Single binary serving both roles? Separate processes? Kubernetes pods?
- Rate limiting for GitHub API calls in webicached: how to coordinate across multiple instances?
- Per-installer config format: what structure best expresses version-ranged libc overrides, arch fallback overrides, and nonstandard asset naming? Go struct + TOML/YAML? Go code (compiled into webicached)?
- CPU micro-arch detection: how should POSIX and PowerShell bootstrap scripts detect amd64v1/v2/v3/v4? Check /proc/cpuinfo flags (Linux), sysctl hw.optional (macOS), .NET intrinsics (Windows)?
Current Node.js Architecture (Reference)
For context, the current system's key files:
| File | Role |
|---|---|
| _webi/serve-installer.js | Main request handler: dispatches to builds + rendering |
| _webi/builds.js | Thin wrapper around builds-cacher |
| _webi/builds-cacher.js | Release fetching, caching, classification, version matching |
| _webi/transform-releases.js | Legacy release API (filter + cache + serve) |
| _webi/normalize.js | OS/arch/libc/ext regex detection from filenames |
| _webi/installers.js | Template rendering (bash + powershell) |
| _webi/ua-detect.js | User-Agent → OS/arch/libc |
| _webi/projects.js | Package metadata from README frontmatter |
| _webi/frontmarker.js | YAML frontmatter parser |
| _common/github.js | GitHub releases fetcher |
| _common/gitea.js | Gitea releases fetcher |
| _common/git-tag.js | Git tag listing |
| {pkg}/releases.js | Per-package release config (fetcher + filters + transforms) |
| {pkg}/install.sh | Per-package POSIX installer |
| {pkg}/install.ps1 | Per-package PowerShell installer |