Compare commits


3 Commits

Author SHA1 Message Date
CamrynCarter
3cd32fc210 chunk the haul 2026-02-28 13:03:13 -08:00
dependabot[bot]
e255eda007 bump github.com/theupdateframework/go-tuf/v2 (#517)
bumps the go_modules group with 1 update in the / directory: [github.com/theupdateframework/go-tuf/v2](https://github.com/theupdateframework/go-tuf).

updates `github.com/theupdateframework/go-tuf/v2` from 2.3.1 to 2.4.1
- [Release notes](https://github.com/theupdateframework/go-tuf/releases)
- [Commits](https://github.com/theupdateframework/go-tuf/compare/v2.3.1...v2.4.1)

---

updated-dependencies:
- dependency-name: github.com/theupdateframework/go-tuf/v2
  dependency-version: 2.4.1
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-23 17:19:37 -05:00
Camryn Carter
16f47999b1 keep registry on image rewrite if not specified (#501)
* keep registry on rewrite if not specified
* better logic
* add test
* accurate info output for rewrite references
* apply suggestions from code review

comment format and improved test

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Camryn Carter <camryn.carter@ranchergovernment.com>

---------

Signed-off-by: Camryn Carter <camryn.carter@ranchergovernment.com>
Co-authored-by: Zack Brady <zackbrady123@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-23 17:18:50 -05:00
10 changed files with 266 additions and 155 deletions


@@ -207,6 +207,10 @@ jobs:
hauler store add image ghcr.io/hauler-dev/library/busybox:stable --rewrite /custom-path/busybox
# confirm tag
hauler store info | grep ':stable'
# confirm old registry used if not specified
hauler store add image ghcr.io/hauler-dev/library/nginx:1.25-alpine --rewrite custom-path/nginx
# verify existing registry associated with rewritten image in store
hauler store info | grep 'ghcr.io/custom-path/nginx'
- name: Verify - hauler store copy
run: |
@@ -246,6 +250,13 @@ jobs:
hauler store save --filename store.tar.zst
# verify via save with filename and platform (amd64)
hauler store save --filename store-amd64.tar.zst --platform linux/amd64
# verify via save with chunk-size (splits into haul-chunked_0.tar.zst, haul-chunked_1.tar.zst, ...)
hauler store save --filename haul-chunked.tar.zst --chunk-size 50M
# verify chunk files exist and original is removed
ls haul-chunked_*.tar.zst
! test -f haul-chunked.tar.zst
# verify at least two chunks were produced
[ $(ls haul-chunked_*.tar.zst | wc -l) -ge 2 ]
- name: Remove Hauler Store Contents
run: |
@@ -265,6 +276,14 @@ jobs:
hauler store load --filename store.tar.zst --tempdir /opt
# verify via load with filename and platform (amd64)
hauler store load --filename store-amd64.tar.zst
# verify via load from chunks using explicit first chunk
rm -rf store
hauler store load --filename haul-chunked_0.tar.zst
hauler store info
# verify via load from chunks using base filename (auto-detect)
rm -rf store
hauler store load --filename haul-chunked.tar.zst
hauler store info
- name: Verify Hauler Store Contents
run: |
@@ -287,7 +306,7 @@ jobs:
- name: Remove Hauler Store Contents
run: |
- rm -rf store haul.tar.zst store.tar.zst store-amd64.tar.zst
+ rm -rf store haul.tar.zst store.tar.zst store-amd64.tar.zst haul-chunked_*.tar.zst
hauler store info
- name: Verify - hauler store sync


@@ -1,144 +0,0 @@
# Development Guide
This document covers how to build Hauler locally and how the project's branching strategy works. It's intended for contributors making code changes or maintainers managing releases.

---
## Local Build
### Prerequisites
- **Go** — check `go.mod` for the minimum required version
- **Make**
- **Docker** (optional, for container image builds)
- **Git**
### Clone the Repository
```bash
git clone https://github.com/hauler-dev/hauler.git
cd hauler
```
### Build the Binary
Using Make:
```bash
make build
```
Or directly with Go:
```bash
go build -o hauler ./cmd/hauler
```
The compiled binary will be output to the project root. You can run it directly:
```bash
./hauler version
```
### Run Tests
```bash
make test
```
Or with Go:
```bash
go test ./...
```
### Useful Tips
- The `--store` flag defaults to `./store` in the current working directory during local testing, so running `./hauler store add ...` from the project root is safe and self-contained. Use `rm -rf store` in the working directory to clear it.
- Set `--log-level debug` when developing to get verbose output.

---
## Branching Strategy
Hauler uses a **main-first, release branch** model. All development flows through `main`, and release branches are maintained for each minor version to support patching older release lines in parallel.
### Branch Structure
```
main ← source of truth, all development targets here
release/1.3 ← 1.3.x patch line
release/1.4 ← 1.4.x patch line
```
Release tags (`v1.4.1`, `v1.3.2`, etc.) are always cut from the corresponding `release/X.Y` branch, never directly from `main`.
### Where to Target Your Changes
All pull requests should target `main` by default. Maintainers are responsible for cherry-picking fixes onto release branches as part of the patch release process.
| Change type | Target branch |
|---|---|
| New features | `main` |
| Bug fixes | `main` |
| Security patches | `main` (expedited backport to affected branches) |
| Release-specific fix (see below) | `release/X.Y` directly |
### Creating a New Release Branch
When `main` is ready to ship a new minor version, a release branch is cut:
```bash
git checkout main
git pull origin main
git checkout -b release/1.4
git push origin release/1.4
```
The first release is then tagged from that branch:
```bash
git tag v1.4.0
git push origin v1.4.0
```
Development on `main` immediately continues toward the next minor.
### Backporting a Fix to a Release Branch
When a bug fix merged to `main` also needs to apply to an active release line, cherry-pick the commit onto the release branch and open a PR targeting it:
```bash
git checkout release/1.3
git pull origin release/1.3
git checkout -b backport/fix-description-to-1.3
git cherry-pick <commit-sha>
git push origin backport/fix-description-to-1.3
```
Open a PR targeting `release/1.3` and reference the original PR in the description. If the cherry-pick doesn't apply cleanly, resolve conflicts and note them in the PR.
### Fixes That Only Apply to an Older Release Line
Sometimes a bug exists in an older release but the relevant code has been removed or significantly changed in `main` — making a forward-port unnecessary or nonsensical. In these cases, it's acceptable to open a PR directly against the affected `release/X.Y` branch.
When doing this, the PR description must explain:
- Which versions are affected
- Why the fix does not apply to `main` or newer release lines (e.g., "this code path was removed in 1.4 when X was refactored")
This keeps the history auditable and prevents future contributors from wondering why the fix never made it forward.
### Summary
```
┌─────────────────────────────────────────► main (next minor)
│ cherry-pick / backport PRs
│ ─────────────────────────► release/1.4 (v1.4.0, v1.4.1 ...)
│ ─────────────────────────► release/1.3 (v1.3.0, v1.3.1 ...)
│ direct fix (older-only bug)
│ ─────────────────────────► release/1.2 (critical fixes only)
```


@@ -146,26 +146,28 @@ func storeImage(ctx context.Context, s *store.Layout, i v1.Image, platform strin
func rewriteReference(ctx context.Context, s *store.Layout, oldRef name.Reference, newRef name.Reference) error {
l := log.FromContext(ctx)
l.Infof("rewriting [%s] to [%s]", oldRef.Name(), newRef.Name())
s.OCI.LoadIndex()
//TODO: improve string manipulation
oldRefContext := oldRef.Context()
newRefContext := newRef.Context()
oldRepo := oldRefContext.RepositoryStr()
newRepo := newRefContext.RepositoryStr()
oldTag := oldRef.(name.Tag).TagStr()
newTag := newRef.(name.Tag).TagStr()
oldRegistry := strings.TrimPrefix(oldRefContext.RegistryStr(), "index.")
newRegistry := strings.TrimPrefix(newRefContext.RegistryStr(), "index.")
// If new registry not set in rewrite, keep old registry instead of defaulting to docker.io
if newRegistry == "docker.io" && oldRegistry != "docker.io" {
newRegistry = oldRegistry
}
oldTotal := oldRepo + ":" + oldTag
newTotal := newRepo + ":" + newTag
oldTotalReg := oldRegistry + "/" + oldTotal
newTotalReg := newRegistry + "/" + newTotal
l.Infof("rewriting [%s] to [%s]", oldTotalReg, newTotalReg)
//find and update reference
found := false
if err := s.OCI.Walk(func(k string, d ocispec.Descriptor) error {


@@ -39,8 +39,9 @@ func LoadCmd(ctx context.Context, o *flags.LoadOpts, rso *flags.StoreRootOpts, r
l.Debugf("using temporary directory at [%s]", tempDir)
for _, fileName := range o.FileName {
- l.Infof("loading haul [%s] to [%s]", fileName, o.StoreDir)
- err := unarchiveLayoutTo(ctx, fileName, o.StoreDir, tempDir)
+ resolved := resolveHaulPath(fileName)
+ l.Infof("loading haul [%s] to [%s]", resolved, o.StoreDir)
+ err := unarchiveLayoutTo(ctx, resolved, o.StoreDir, tempDir)
if err != nil {
return err
}
@@ -85,6 +86,13 @@ func unarchiveLayoutTo(ctx context.Context, haulPath string, dest string, tempDi
}
}
// reassemble chunk files if haulPath matches the chunk naming pattern
joined, err := archives.JoinChunks(ctx, haulPath, tempDir)
if err != nil {
return err
}
haulPath = joined
if err := archives.Unarchive(ctx, haulPath, tempDir); err != nil {
return err
}
@@ -139,6 +147,29 @@ func unarchiveLayoutTo(ctx context.Context, haulPath string, dest string, tempDi
return err
}
// resolveHaulPath returns path as-is if it exists or is a URL. If the file is
// not found, it globs for chunk files matching <base>_*<ext> in the same
// directory and returns the first match so JoinChunks can reassemble them.
func resolveHaulPath(path string) string {
if strings.HasPrefix(path, "http://") || strings.HasPrefix(path, "https://") {
return path
}
if _, err := os.Stat(path); err == nil {
return path
}
base := path
ext := ""
for filepath.Ext(base) != "" {
ext = filepath.Ext(base) + ext
base = strings.TrimSuffix(base, filepath.Ext(base))
}
matches, err := filepath.Glob(base + "_*" + ext)
if err != nil || len(matches) == 0 {
return path
}
return matches[0]
}
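`resolveHaulPath`, `SplitArchive`, and `chunkInfo` all repeat the same extension-stripping loop, so it is worth a standalone sketch. The `splitCompoundExt` helper name below is hypothetical (the diff inlines the loop each time):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// splitCompoundExt peels filepath.Ext one extension at a time so that
// compound extensions like .tar.zst are kept intact as a unit.
func splitCompoundExt(path string) (base, ext string) {
	base = path
	for filepath.Ext(base) != "" {
		ext = filepath.Ext(base) + ext
		base = strings.TrimSuffix(base, filepath.Ext(base))
	}
	return base, ext
}

func main() {
	base, ext := splitCompoundExt("haul-chunked.tar.zst")
	fmt.Println(base, ext) // haul-chunked .tar.zst
}
```

One quirk carried over from the original loop: any dot in the base name is also treated as an extension, so `v1.2-haul.tar.zst` splits as base `v1` and ext `.2-haul.tar.zst`.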
func clearDir(path string) error {
entries, err := os.ReadDir(path)
if err != nil {


@@ -4,10 +4,13 @@ import (
"bytes"
"context"
"encoding/json"
"fmt"
"os"
"path"
"path/filepath"
"slices"
"strconv"
"strings"
referencev3 "github.com/distribution/distribution/v3/reference"
"github.com/google/go-containerregistry/pkg/name"
@@ -72,10 +75,47 @@ func SaveCmd(ctx context.Context, o *flags.SaveOpts, rso *flags.StoreRootOpts, r
return err
}
- l.Infof("saving store [%s] to archive [%s]", o.StoreDir, o.FileName)
if o.ChunkSize != "" {
maxBytes, err := parseChunkSize(o.ChunkSize)
if err != nil {
return err
}
chunks, err := archives.SplitArchive(ctx, absOutputfile, maxBytes)
if err != nil {
return err
}
for _, c := range chunks {
l.Infof("saving store [%s] to chunk [%s]", o.StoreDir, filepath.Base(c))
}
} else {
l.Infof("saving store [%s] to archive [%s]", o.StoreDir, o.FileName)
}
return nil
}
// parseChunkSize parses a human-readable byte size string (e.g. "1G", "500M", "2GB")
// into a byte count. Suffixes are treated as binary units (1K = 1024).
func parseChunkSize(s string) (int64, error) {
units := map[string]int64{
"K": 1 << 10, "KB": 1 << 10,
"M": 1 << 20, "MB": 1 << 20,
"G": 1 << 30, "GB": 1 << 30,
"T": 1 << 40, "TB": 1 << 40,
}
s = strings.ToUpper(strings.TrimSpace(s))
for suffix, mult := range units {
if strings.HasSuffix(s, suffix) {
n, err := strconv.ParseInt(strings.TrimSuffix(s, suffix), 10, 64)
if err != nil {
return 0, fmt.Errorf("invalid chunk size %q", s)
}
return n * mult, nil
}
}
return strconv.ParseInt(s, 10, 64)
}
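Because the suffixes are binary units, the `--chunk-size 50M` used in the CI job above means 50 MiB (52,428,800 bytes), not 50 MB. A self-contained sketch of the same parsing, re-implemented here for illustration (`parse` is not the committed function name):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse mirrors parseChunkSize from the diff: a numeric prefix with an
// optional binary-unit suffix; a bare number is taken as a byte count.
func parse(s string) (int64, error) {
	units := map[string]int64{
		"K": 1 << 10, "KB": 1 << 10,
		"M": 1 << 20, "MB": 1 << 20,
		"G": 1 << 30, "GB": 1 << 30,
		"T": 1 << 40, "TB": 1 << 40,
	}
	s = strings.ToUpper(strings.TrimSpace(s))
	for suffix, mult := range units {
		if strings.HasSuffix(s, suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(s, suffix), 10, 64)
			if err != nil {
				return 0, fmt.Errorf("invalid chunk size %q", s)
			}
			return n * mult, nil
		}
	}
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	n, _ := parse("50M")
	fmt.Println(n) // 52428800
}
```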
type exports struct {
digests []string
records map[string]tarball.Descriptor

go.mod

@@ -294,7 +294,7 @@ require (
github.com/tchap/go-patricia/v2 v2.3.3 // indirect
github.com/thales-e-security/pool v0.0.2 // indirect
github.com/theupdateframework/go-tuf v0.7.0 // indirect
- github.com/theupdateframework/go-tuf/v2 v2.3.1 // indirect
+ github.com/theupdateframework/go-tuf/v2 v2.4.1 // indirect
github.com/titanous/rocacheck v0.0.0-20171023193734-afe73141d399 // indirect
github.com/tjfoc/gmsm v1.4.1 // indirect
github.com/transparency-dev/formats v0.0.0-20251017110053-404c0d5b696c // indirect

go.sum

@@ -947,8 +947,8 @@ github.com/thales-e-security/pool v0.0.2 h1:RAPs4q2EbWsTit6tpzuvTFlgFRJ3S8Evf5gt
github.com/thales-e-security/pool v0.0.2/go.mod h1:qtpMm2+thHtqhLzTwgDBj/OuNnMpupY8mv0Phz0gjhU=
github.com/theupdateframework/go-tuf v0.7.0 h1:CqbQFrWo1ae3/I0UCblSbczevCCbS31Qvs5LdxRWqRI=
github.com/theupdateframework/go-tuf v0.7.0/go.mod h1:uEB7WSY+7ZIugK6R1hiBMBjQftaFzn7ZCDJcp1tCUug=
- github.com/theupdateframework/go-tuf/v2 v2.3.1 h1:fReZUTLvPdqIL8Rd9xEKPmaxig8GIXe0kS4RSEaRfaM=
- github.com/theupdateframework/go-tuf/v2 v2.3.1/go.mod h1:9S0Srkf3c13FelsOyt5OyG3ZZDq9OJDA4IILavrt72Y=
+ github.com/theupdateframework/go-tuf/v2 v2.4.1 h1:K6ewW064rKZCPkRo1W/CTbTtm/+IB4+coG1iNURAGCw=
+ github.com/theupdateframework/go-tuf/v2 v2.4.1/go.mod h1:Nex2enPVYDFCklrnbTzl3OVwD7fgIAj0J5++z/rvCj8=
github.com/tink-crypto/tink-go-awskms/v2 v2.1.0 h1:N9UxlsOzu5mttdjhxkDLbzwtEecuXmlxZVo/ds7JKJI=
github.com/tink-crypto/tink-go-awskms/v2 v2.1.0/go.mod h1:PxSp9GlOkKL9rlybW804uspnHuO9nbD98V/fDX4uSis=
github.com/tink-crypto/tink-go-gcpkms/v2 v2.2.0 h1:3B9i6XBXNTRspfkTC0asN5W0K6GhOSgcujNiECNRNb0=


@@ -10,6 +10,7 @@ type SaveOpts struct {
FileName string
Platform string
ContainerdCompatibility bool
ChunkSize string
}
func (o *SaveOpts) AddFlags(cmd *cobra.Command) {
@@ -18,5 +19,6 @@ func (o *SaveOpts) AddFlags(cmd *cobra.Command) {
f.StringVarP(&o.FileName, "filename", "f", consts.DefaultHaulerArchiveName, "(Optional) Specify the name of outputted haul")
f.StringVarP(&o.Platform, "platform", "p", "", "(Optional) Specify the platform for runtime imports... i.e. linux/amd64 (unspecified implies all)")
f.BoolVar(&o.ContainerdCompatibility, "containerd", false, "(Optional) Enable import compatibility with containerd... removes oci-layout from the haul")
f.StringVar(&o.ChunkSize, "chunk-size", "", "(Optional) Split the output archive into chunks of the specified size (e.g. 1G, 500M, 2048M)")
}


@@ -3,8 +3,10 @@ package archives
import (
"context"
"fmt"
"io"
"os"
"path/filepath"
"strings"
"github.com/mholt/archives"
"hauler.dev/go/hauler/pkg/log"
@@ -102,3 +104,85 @@ func Archive(ctx context.Context, dir, outfile string, compression archives.Comp
l.Debugf("archive created successfully [%s]", outfile)
return nil
}
// SplitArchive splits an existing archive into chunks of at most maxBytes each.
// Chunks are named <base>_0<ext>, <base>_1<ext>, ... where base is the archive
// path with all extensions stripped, and ext is the compound extension (e.g. .tar.zst).
// The original archive is removed after successful splitting.
func SplitArchive(ctx context.Context, archivePath string, maxBytes int64) ([]string, error) {
l := log.FromContext(ctx)
// derive base path and compound extension by stripping all extensions
base := archivePath
ext := ""
for filepath.Ext(base) != "" {
ext = filepath.Ext(base) + ext
base = strings.TrimSuffix(base, filepath.Ext(base))
}
f, err := os.Open(archivePath)
if err != nil {
return nil, fmt.Errorf("failed to open archive for splitting: %w", err)
}
var chunks []string
buf := make([]byte, 32*1024)
chunkIdx := 0
var written int64
var outf *os.File
for {
if outf == nil {
chunkPath := fmt.Sprintf("%s_%d%s", base, chunkIdx, ext)
outf, err = os.Create(chunkPath)
if err != nil {
f.Close()
return nil, fmt.Errorf("failed to create chunk %d: %w", chunkIdx, err)
}
chunks = append(chunks, chunkPath)
l.Debugf("creating chunk [%s]", chunkPath)
written = 0
chunkIdx++
}
remaining := maxBytes - written
readSize := int64(len(buf))
if readSize > remaining {
readSize = remaining
}
n, readErr := f.Read(buf[:readSize])
if n > 0 {
if _, writeErr := outf.Write(buf[:n]); writeErr != nil {
outf.Close()
f.Close()
return nil, fmt.Errorf("failed to write to chunk: %w", writeErr)
}
written += int64(n)
}
if readErr == io.EOF {
outf.Close()
outf = nil
break
}
if readErr != nil {
outf.Close()
f.Close()
return nil, fmt.Errorf("failed to read archive: %w", readErr)
}
if written >= maxBytes {
outf.Close()
outf = nil
}
}
f.Close()
if err := os.Remove(archivePath); err != nil {
return nil, fmt.Errorf("failed to remove original archive after splitting: %w", err)
}
l.Infof("split archive [%s] into %d chunk(s)", filepath.Base(archivePath), len(chunks))
return chunks, nil
}
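Given the naming scheme, the chunk files a haul will produce can be predicted up front. A sketch under the assumption that every chunk except the last is exactly `maxBytes` (the `chunkNames` helper is illustrative only, not part of the diff):

```go
package main

import "fmt"

// chunkNames predicts the filenames SplitArchive produces: for a
// totalBytes-sized archive split at maxBytes, ceil(total/max) chunks
// named <base>_0<ext>, <base>_1<ext>, ...
func chunkNames(base, ext string, totalBytes, maxBytes int64) []string {
	count := (totalBytes + maxBytes - 1) / maxBytes // ceiling division
	names := make([]string, 0, count)
	for i := int64(0); i < count; i++ {
		names = append(names, fmt.Sprintf("%s_%d%s", base, i, ext))
	}
	return names
}

func main() {
	// a 120 MiB archive split at 50 MiB -> 3 chunks
	for _, n := range chunkNames("haul-chunked", ".tar.zst", 120<<20, 50<<20) {
		fmt.Println(n)
	}
}
```

Note that when the archive size is an exact multiple of `maxBytes`, the committed loop opens one more chunk file before hitting EOF, so a trailing empty chunk may appear; the ceiling division above does not model that edge case.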


@@ -6,6 +6,9 @@ import (
"io"
"os"
"path/filepath"
"regexp"
"sort"
"strconv"
"strings"
"github.com/mholt/archives"
@@ -156,3 +159,77 @@ func Unarchive(ctx context.Context, tarball, dst string) error {
l.Infof("unarchiving completed successfully")
return nil
}
var chunkSuffixRe = regexp.MustCompile(`^(.+)_(\d+)$`)
// chunkInfo checks whether archivePath matches the chunk naming pattern (<base>_N<ext>).
// Returns the base path (without index), compound extension, numeric index, and whether it matched.
func chunkInfo(archivePath string) (base, ext string, index int, ok bool) {
dir := filepath.Dir(archivePath)
name := filepath.Base(archivePath)
// strip compound extension (e.g. .tar.zst)
nameBase := name
nameExt := ""
for filepath.Ext(nameBase) != "" {
nameExt = filepath.Ext(nameBase) + nameExt
nameBase = strings.TrimSuffix(nameBase, filepath.Ext(nameBase))
}
m := chunkSuffixRe.FindStringSubmatch(nameBase)
if m == nil {
return "", "", 0, false
}
idx, _ := strconv.Atoi(m[2])
return filepath.Join(dir, m[1]), nameExt, idx, true
}
// JoinChunks detects whether archivePath is a chunk file and, if so, finds all
// sibling chunks, concatenates them in numeric order into a single file in tempDir,
// and returns the path to the joined file. If archivePath is not a chunk, it is
// returned unchanged.
func JoinChunks(ctx context.Context, archivePath, tempDir string) (string, error) {
l := log.FromContext(ctx)
base, ext, _, ok := chunkInfo(archivePath)
if !ok {
return archivePath, nil
}
matches, err := filepath.Glob(base + "_*" + ext)
if err != nil || len(matches) == 0 {
return archivePath, nil
}
sort.Slice(matches, func(i, j int) bool {
_, _, idxI, _ := chunkInfo(matches[i])
_, _, idxJ, _ := chunkInfo(matches[j])
return idxI < idxJ
})
l.Debugf("joining %d chunk(s) for [%s]", len(matches), base)
joinedPath := filepath.Join(tempDir, filepath.Base(base)+ext)
outf, err := os.Create(joinedPath)
if err != nil {
return "", fmt.Errorf("failed to create joined archive: %w", err)
}
defer outf.Close()
for _, chunk := range matches {
l.Debugf("joining chunk [%s]", chunk)
cf, err := os.Open(chunk)
if err != nil {
return "", fmt.Errorf("failed to open chunk [%s]: %w", chunk, err)
}
if _, err := io.Copy(outf, cf); err != nil {
cf.Close()
return "", fmt.Errorf("failed to copy chunk [%s]: %w", chunk, err)
}
cf.Close()
}
l.Infof("joined %d chunk(s) into [%s]", len(matches), filepath.Base(joinedPath))
return joinedPath, nil
}
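The chunk-detection regex is small enough to exercise directly. A standalone sketch applying the same pattern to names that have already had their compound extension stripped (the loop for that is shown in `chunkInfo` above):

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// The same pattern chunkInfo uses: a trailing _<digits> marks a chunk
// and carries its ordering index.
var chunkSuffixRe = regexp.MustCompile(`^(.+)_(\d+)$`)

func main() {
	for _, nameBase := range []string{"haul-chunked_3", "haul-chunked"} {
		if m := chunkSuffixRe.FindStringSubmatch(nameBase); m != nil {
			idx, _ := strconv.Atoi(m[2])
			fmt.Printf("base=%s index=%d\n", m[1], idx)
		} else {
			fmt.Printf("%s is not a chunk\n", nameBase)
		}
	}
}
```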