Tests all 99 cached packages on 2 platforms (macOS arm64, Linux amd64), for 198 package/platform combinations. 192 of 198 pass. The 6 failures all match production behavior; they come from git, gpg, iterm2, and mariadb, packages without downloadable binaries for these platforms.
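
For readers who want to reproduce the arithmetic, here is a minimal sketch of how a package/platform matrix like this might be enumerated and summarized. It is not the actual test harness: the platform identifiers and the `build_matrix` and `summarize` helpers are hypothetical names chosen for illustration.

```python
from itertools import product

# Hypothetical platform identifiers; the real harness may name them differently.
PLATFORMS = ["darwin-arm64", "linux-amd64"]


def build_matrix(packages, platforms=PLATFORMS):
    """Return every (package, platform) pair to test."""
    return list(product(packages, platforms))


def summarize(results, known_unavailable):
    """Summarize results, a dict mapping (package, platform) -> bool (True = pass).

    Failures for packages listed in known_unavailable are treated as expected;
    any other failure is reported as unexpected.
    """
    failures = [key for key, ok in results.items() if not ok]
    unexpected = [key for key in failures if key[0] not in known_unavailable]
    return {
        "total": len(results),
        "passed": len(results) - len(failures),
        "failed": len(failures),
        "unexpected": unexpected,
    }
```

With 99 packages and 2 platforms, `build_matrix` yields 198 pairs, which matches the total reported above. A run is clean when `summarize(...)["unexpected"]` is empty, meaning every failure belongs to a package already known to lack a binary.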