13 Commits

Author SHA1 Message Date
AJ ONeal
7f901fc9d5 fix(classify): fix amd64_vN regression — exclude dash form to avoid version number matches
amd64[_-]?v2 matched syncthing filenames like 'amd64-v2.0.5' where '-v2' is
the start of the release version, not an arch micro-level. Changed to amd64_?v2
(underscore optional, dash excluded) which correctly matches:
- amd64v2 (no separator, original form)
- amd64_v2 (underscore, pathman form)

But NOT amd64-v2.0.5 (dash + version number, syncthing).
2026-03-11 17:59:57 -06:00
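The difference between the two patterns can be checked directly. This is a minimal sketch: the variable names and the full filenames are made up for illustration, and only the two regex bodies come from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical variable names; only the pattern bodies are from the commit.
var (
	oldAMD64v2 = regexp.MustCompile(`amd64[_-]?v2`) // dash allowed (regressed)
	newAMD64v2 = regexp.MustCompile(`amd64_?v2`)    // underscore optional, dash excluded
)

func main() {
	// Illustrative filenames, not taken from real releases.
	names := []string{
		"tool-linux-amd64v2.tar.gz",
		"pathman-linux-amd64_v2.tar.gz",
		"syncthing-linux-amd64-v2.0.5.tar.gz",
	}
	for _, n := range names {
		fmt.Printf("%-38s old=%v new=%v\n", n, oldAMD64v2.MatchString(n), newAMD64v2.MatchString(n))
	}
}
```

Only the old pattern matches the syncthing-style name, because there `-v2` begins the release version rather than naming a micro-arch level.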
AJ ONeal
4f09649d30 fix(legacy): fix PACKAGE FORMAT CHANGE warnings in legacy cache export
Reduce PACKAGE FORMAT CHANGE warnings from 6,149 to ~3,200 by aligning
the legacy export field values with what the Node build-classifier extracts
from filenames.

classify.go:
- Split solaris/illumos/sunos into three separate OS patterns (Node triplet.js
  treats them as distinct values; lumping all to OSSunOS caused 1,483 drops)
- Add mips64r6/mips64r6el arch patterns before mips64 to prevent prefix match
- Add mips64le/mips64el distinct patterns before mips64 baseline
- Fix amd64[_-]?v2/v3/v4 regex to match underscore form (e.g. pathman amd64_v2)

buildmeta.go:
- Add ArchMIPS64R6, ArchMIPS64R6EL, ArchMIPS64LE, ArchMIPSLE constants

legacy.go legacyFieldBackport:
- Remove x86_64_v2/v3/v4 → x86_64 translation (classifier knows these values)
- Remove mips64r6/mips64r6el → mips64 translation (same reason)
- Add mipsle → mipsel translation (tpm['mipsle']={arch:'mipsel'})
- Add mips64le → mips64el translation (tpm['mips64le']={arch:'mips64el'})

legacy.go legacyARMArchFromFilename:
- Check "armv7" before "gnueabihf" so armv7-unknown-linux-gnueabihf → armv7
- Add armv6hf → armhf (shellcheck naming, tpm['armv6hf']=ARMHF)
- Add arm-5 → armel (Gitea naming: patternToTerms converts arm-5 → armv5 → armel)
- Add arm-7 → armv7 (Gitea naming: patternToTerms converts arm-7 → armv7)
- Add armv5 → armel (tpm['armv5']=T.ARMEL)
2026-03-11 17:27:37 -06:00
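The ordering fix in legacyARMArchFromFilename amounts to a first-match-wins scan. The sketch below is not the real function: the type, table, and function names are hypothetical, and the `gnueabihf → armhf` row is an assumption, since the commit only says that "armv7" must be checked before it.

```go
package main

import (
	"fmt"
	"strings"
)

// armRule pairs a filename substring with the legacy arch it maps to.
type armRule struct {
	needle string
	arch   string
}

// armRules is scanned top to bottom; the first matching needle wins.
var armRules = []armRule{
	{"armv7", "armv7"},     // must precede gnueabihf
	{"arm-7", "armv7"},     // Gitea naming
	{"armv6hf", "armhf"},   // shellcheck naming
	{"gnueabihf", "armhf"}, // ASSUMPTION: target value not stated in the commit
	{"armv5", "armel"},
	{"arm-5", "armel"}, // Gitea naming
}

// sketchLegacyARMArch returns the legacy arch for a filename, or "" if no
// ARM marker is found.
func sketchLegacyARMArch(filename string) string {
	lower := strings.ToLower(filename)
	for _, r := range armRules {
		if strings.Contains(lower, r.needle) {
			return r.arch
		}
	}
	return ""
}

func main() {
	fmt.Println(sketchLegacyARMArch("tool-armv7-unknown-linux-gnueabihf.tar.gz"))
}
```

With "gnueabihf" listed first, the example filename would resolve to the hard-float alias instead of armv7, which is the bug the reordering addresses.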
AJ ONeal
aec68692a1 fix(classify): fix ARM, ppc64el, winx64 detection; fix legacy universal2/solaris export
Classifier fixes:
- Remove Windows arm→arm64 auto-promotion; packages like caddy/fzf/goreleaser
  have genuine arm32 Windows builds (windows_armv6) that were wrongly promoted
- Add armel and gnueabihf as ARMv6 aliases (jq, caddy and others use these)
- Add winx64 to Windows OS pattern (MariaDB uses winx64 in filenames)
- Add ppc64el as ppc64le alias (Debian/Ubuntu naming, used by jq)
- Normalize armv6l → armv6 in normalizeGoArch (Go dist had armv6l filenames)
- Fix classifyGPGDist hardcoded "amd64" → buildmeta.ArchAMD64 ("x86_64")

Legacy export fixes:
- Map solaris/illumos → sunos globally (Node.js only knows "sunos")
- Expand universal2 → two entries (aarch64 + x86_64) so Hugo/cmake/gh/syncthing
  work on both Apple Silicon and Intel Mac in the legacy resolver
- Remove double-application of legacyFieldBackport (toLegacy no longer calls it)
2026-03-11 14:54:25 -06:00
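The universal2 expansion can be pictured as a one-to-two fan-out at export time. This is a rough sketch only: the `Asset` type and `expandUniversal2` are hypothetical stand-ins, not the types used in legacy.go.

```go
package main

import "fmt"

// Asset is a hypothetical, trimmed-down classified release asset.
type Asset struct {
	Name string
	OS   string
	Arch string
}

// expandUniversal2 emits one entry per concrete arch for universal2 assets
// and passes every other asset through unchanged.
func expandUniversal2(assets []Asset) []Asset {
	out := make([]Asset, 0, len(assets))
	for _, a := range assets {
		if a.Arch != "universal2" {
			out = append(out, a)
			continue
		}
		for _, arch := range []string{"aarch64", "x86_64"} {
			dup := a // copy, then override only the arch
			dup.Arch = arch
			out = append(out, dup)
		}
	}
	return out
}

func main() {
	in := []Asset{
		{Name: "tool-macos-universal.tar.gz", OS: "darwin", Arch: "universal2"},
		{Name: "tool-linux-amd64.tar.gz", OS: "linux", Arch: "x86_64"},
	}
	for _, a := range expandUniversal2(in) {
		fmt.Println(a.OS, a.Arch, a.Name)
	}
}
```

Both expanded entries point at the same file, so a resolver that only understands concrete arches can serve it to either kind of Mac.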
AJ ONeal
b236c8ac6b ref: move legacy field backport from classifypkg to ExportLegacy; add .apk/.AppImage formats
- Remove LegacyBackport from classifypkg and webicached; canonical values
  now flow through storage untouched
- Add legacyFieldBackport() in storage/legacy.go, called only at export time
  (go: armv6→arm, ffmpeg windows: .gz/.empty→.exe)
- ExportLegacy now takes pkg name and returns LegacyDropStats (variants + formats dropped)
- fsstore.Commit logs dropped assets so filtering is visible
- Add FormatAPK (.apk) and FormatAppImage (.AppImage) to buildmeta and classify
  so those files are properly classified and then correctly dropped from legacy export
  rather than passing through as empty-format
2026-03-11 14:41:30 -06:00
AJ ONeal
f53c508303 style: one entry per line in map/slice literals
Put each entry on its own line for readability — no staggering
multiple entries per line.
2026-03-10 23:29:22 -06:00
AJ ONeal
07d5f36ed4 fix: postgres/psql cross-contamination, watchexec tag filter, meta assets
- postgres/psql: add asset_filter to separate assets from shared repo
  (bnnanet/postgresql-releases contains postgres-*, postgresql-*, psql-*)
- watchexec: change tag_prefix to version_prefixes so old plain-tagged
  releases (v1.20.6+) aren't filtered out — only strip the cli- prefix
- classify: add .minisig, b3sums, dist-manifest.json to IsMetaAsset
  filter to prevent checksum/signature files from leaking into cache
2026-03-10 18:56:19 -06:00
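The watchexec change can be illustrated by contrasting the two behaviors. This sketch assumes tag_prefix acted as a filter (consistent with "filtered out" above) and that version_prefixes only strips; both function names and the `cli-v1.21.0` tag are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// keepByTagPrefix models the assumed tag_prefix behavior: tags without the
// prefix are dropped, and the prefix is stripped from the rest.
func keepByTagPrefix(tags []string, prefix string) []string {
	var out []string
	for _, t := range tags {
		if strings.HasPrefix(t, prefix) {
			out = append(out, strings.TrimPrefix(t, prefix))
		}
	}
	return out
}

// stripVersionPrefixes models the assumed version_prefixes behavior: every
// tag is kept, and the first matching prefix (if any) is stripped.
func stripVersionPrefixes(tags []string, prefixes []string) []string {
	out := make([]string, 0, len(tags))
	for _, t := range tags {
		for _, p := range prefixes {
			if strings.HasPrefix(t, p) {
				t = strings.TrimPrefix(t, p)
				break
			}
		}
		out = append(out, t)
	}
	return out
}

func main() {
	tags := []string{"cli-v1.21.0", "v1.20.6"} // first tag is made up
	fmt.Println(keepByTagPrefix(tags, "cli-"))
	fmt.Println(stripVersionPrefixes(tags, []string{"cli-"}))
}
```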
AJ ONeal
72fec20fb0 ref: move IsMetaAsset to classify package, share between tools
Moved isMetaAsset from cmd/webicached to classify.IsMetaAsset so
both webicached and comparecache use the same logic. Removed
duplicated isMetaFile from comparecache. The comparecache
isLiveNoise now delegates to classify.IsMetaAsset and adds
live-specific filters (.deb, .rpm, -src-).
2026-03-10 17:28:44 -06:00
AJ ONeal
a1714e0598 update comparison after variant tagging and legacy filter
Add .tar.bz2 to classifier format detection (was slipping through
as empty format). Update COMPARISON.md with fresh results: 21 exact
matches, .deb/.rpm/.tar.zst/.tar.bz2 now correctly filtered from
legacy export. Document remaining items for review.
2026-03-10 14:04:00 -06:00
AJ ONeal
28dab7dade feat: complete classification of all 116 packages (169,867 rows)
- Add asset_filter/asset_exclude conf keys for shared-repo packages
- Split hugo/hugo-extended: exclude/require "extended" in asset name
- Add macosx, ia32, .snap, .appx classifier patterns
- Fix zig Platform.Size JSON string type (was int64, upstream sends string)
- Filter install scripts, cosign keys, compat.json as meta-assets
- Add riscv64, loong64, armv5, mipsle, mips64le to buildmeta

Full classification produces 169,867 distributable rows across 116 packages.
2026-03-10 00:27:57 -06:00
AJ ONeal
e78a721b51 fix: infer macOS from .app.zip/.dmg, filter npm tarballs and .d.ts
- .app.zip and .dmg formats now infer darwin OS when absent
- Filter .tgz (npm packages) and .d.ts (TypeScript defs) as meta-assets
- Reduces bun false positives by 64, deno by 294
2026-03-10 00:24:15 -06:00
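The format-based inference is a fallback that applies only when the filename carries no OS. A minimal sketch, with a hypothetical function name and signature:

```go
package main

import (
	"fmt"
	"strings"
)

// inferOSFromFormat returns os unchanged when it is already set. Otherwise it
// reports "darwin" for macOS-only package formats and "" when nothing can be
// inferred.
func inferOSFromFormat(os, filename string) string {
	if os != "" {
		return os
	}
	lower := strings.ToLower(filename)
	for _, ext := range []string{".app.zip", ".dmg"} {
		if strings.HasSuffix(lower, ext) {
			return "darwin"
		}
	}
	return ""
}

func main() {
	fmt.Println(inferOSFromFormat("", "Tool.app.zip"))
	fmt.Println(inferOSFromFormat("", "tool.dmg"))
}
```

An OS detected from the filename always wins; the extension is consulted only when the OS is absent.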
AJ ONeal
d398625f5d feat: add cmd/classify and improve classifier coverage
- Add cmd/classify: reads raw cached releases and produces a CSV of all
  distributables with sortable version columns (ver_major/minor/patch/pre)
- Export rawcache.ActivePath() for use by cmd/classify
- Add OS detection: openbsd, netbsd, dragonflybsd, plan9, mac→darwin
- Add arch detection: armv5, armhf→armv7, arm7→armv7, 386→x86,
  32bit/64bit (no hyphen), universal→universal2, riscv64, loong64,
  mipsle, mips64le
- Infer Linux from .deb/.rpm format when OS not in filename
- Add .deb and .rpm as recognized formats
- Normalize all per-source values to buildmeta vocabulary (x86_64, aarch64)
- Filter source archives and buildable-artifact meta-assets
- Add CAT-RULES.md tracking classifier learnings
- Add CATEGORIZED.md and LINKS.md for reference

Batch 1 tested: go, node, hugo, caddy, pathman (35,919 rows)
2026-03-10 00:17:17 -06:00
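The sortable version columns could be produced along these lines. This is a simplified sketch, not the parser in cmd/classify: the type and function names are hypothetical, and it handles only plain `vMAJOR.MINOR.PATCH-pre` strings.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// verCols mirrors the ver_major/minor/patch/pre columns.
type verCols struct {
	Major, Minor, Patch int
	Pre                 string
}

// splitVersion parses "v1.2.3-rc1"-style strings. Missing minor or patch
// parts default to 0.
func splitVersion(v string) (verCols, error) {
	v = strings.TrimPrefix(v, "v")
	core, pre, _ := strings.Cut(v, "-")
	parts := strings.Split(core, ".")
	if len(parts) > 3 {
		return verCols{}, fmt.Errorf("too many parts in %q", v)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return verCols{}, fmt.Errorf("bad version %q: %w", v, err)
		}
		nums[i] = n
	}
	return verCols{Major: nums[0], Minor: nums[1], Patch: nums[2], Pre: pre}, nil
}

func main() {
	for _, s := range []string{"v1.20.6", "2.0.0-rc1"} {
		c, err := splitVersion(s)
		fmt.Println(c, err)
	}
}
```

Storing the parts as separate integer columns lets a spreadsheet or `sort` order rows numerically, so 1.10.0 sorts after 1.9.0.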
AJ ONeal
34cfe32492 feat: add arch/libc fallback chains and version waterfall resolution
Prefer latest version over best CPU match. An amd64v4 machine gets
v2.0.0 (baseline only) instead of v1.0.0 (which had a v4 build)
because recency beats specificity.

- buildmeta: add amd64v2/v3/v4 micro-levels, ArchFallbacks, LibcFallbacks
- classify: detect micro-arch levels, treat Windows "arm" as ARM64
- platlatest: add Resolve() that walks fallback chains picking newest
2026-03-09 21:44:06 -06:00
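The "recency beats specificity" rule can be sketched as a single pass over candidate builds. All names here are hypothetical, including the chain contents and the `amd64` baseline label; the real Resolve() in platlatest may be structured differently.

```go
package main

import "fmt"

// Build is a hypothetical (version, arch) pair for one release asset.
type Build struct {
	Version [3]int // major, minor, patch
	Arch    string
}

// archFallbacks lists, per host arch, acceptable arches from most to least
// specific. ASSUMPTION: the real ArchFallbacks contents are not shown above.
var archFallbacks = map[string][]string{
	"amd64v4": {"amd64v4", "amd64v3", "amd64v2", "amd64"},
}

// cmpVersion returns -1, 0, or 1 when a is older than, equal to, or newer than b.
func cmpVersion(a, b [3]int) int {
	for i := range a {
		if a[i] != b[i] {
			if a[i] < b[i] {
				return -1
			}
			return 1
		}
	}
	return 0
}

// resolve picks the newest build whose arch is in the host's fallback chain.
// On equal versions, the arch earlier in the chain (more specific) wins.
func resolve(builds []Build, hostArch string) (Build, bool) {
	chain, ok := archFallbacks[hostArch]
	if !ok {
		chain = []string{hostArch}
	}
	rank := make(map[string]int, len(chain))
	for i, a := range chain {
		rank[a] = i
	}
	var best Build
	found := false
	for _, b := range builds {
		r, ok := rank[b.Arch]
		if !ok {
			continue // arch cannot run on this host
		}
		c := 1
		if found {
			c = cmpVersion(b.Version, best.Version)
		}
		if c > 0 || (c == 0 && r < rank[best.Arch]) {
			best, found = b, true
		}
	}
	return best, found
}

func main() {
	builds := []Build{
		{Version: [3]int{1, 0, 0}, Arch: "amd64v4"},
		{Version: [3]int{1, 0, 0}, Arch: "amd64"},
		{Version: [3]int{2, 0, 0}, Arch: "amd64"},
	}
	fmt.Println(resolve(builds, "amd64v4"))
}
```

In this example the amd64v4 host ends up with the 2.0.0 baseline build, matching the scenario described in the commit message.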
AJ ONeal
1e26a3e5ec feat: add classify and platlatest packages
classify extracts OS, arch, libc, and format from release asset
filenames using regex pattern matching with priority ordering
(x86_64 before x86, arm64 before armv7, etc.).

platlatest tracks the newest release version per build target
(OS+arch+libc triplet) to handle the common case where Windows
or macOS releases lag behind Linux by several versions.
2026-03-09 21:33:59 -06:00
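The per-target tracking that platlatest describes could look roughly like the following. The `Target` and `Tracker` types and their methods are hypothetical names for illustration, not the package's actual API.

```go
package main

import "fmt"

// Target is a hypothetical OS+arch+libc triplet.
type Target struct {
	OS, Arch, Libc string
}

// Tracker remembers the newest version seen for each target.
type Tracker struct {
	latest map[Target][3]int
}

// newer reports whether version a is strictly newer than version b.
func newer(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return false
}

// Observe records v for t if it is the first or the newest version seen.
func (tr *Tracker) Observe(t Target, v [3]int) {
	if tr.latest == nil {
		tr.latest = make(map[Target][3]int)
	}
	if cur, ok := tr.latest[t]; !ok || newer(v, cur) {
		tr.latest[t] = v
	}
}

// Latest returns the newest version recorded for t, if any.
func (tr *Tracker) Latest(t Target) ([3]int, bool) {
	v, ok := tr.latest[t]
	return v, ok
}

func main() {
	linux := Target{OS: "linux", Arch: "x86_64", Libc: "gnu"}
	windows := Target{OS: "windows", Arch: "x86_64", Libc: "msvc"}

	var tr Tracker
	tr.Observe(linux, [3]int{2, 0, 0})
	tr.Observe(windows, [3]int{1, 8, 0}) // Windows lags behind Linux
	tr.Observe(windows, [3]int{1, 7, 0}) // older, ignored

	fmt.Println(tr.Latest(linux))
	fmt.Println(tr.Latest(windows))
}
```

Keying by the full triplet means a lagging platform reports its own newest release instead of a project-wide "latest" that has no build for it.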