Files
hauler/pkg/archives/archiver.go
Zack Brady c0294c733b update release/2.0 from main (#546)
* fix: handling of file referenced dependencies without repository field (#514)

co-authored-by: devleitner <devleitner@protonmail.com>

* bump go.opentelemetry.io/otel/sdk (#520)

bumps the go_modules group with 1 update in the / directory: [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go).

updates `go.opentelemetry.io/otel/sdk` from 1.39.0 to 1.40.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.39.0...v1.40.0)

---

updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-version: 1.40.0
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* dev.md file (#521)

* smaller changes and updates for v1.4.2 release (#524)

* smaller changes and updates for v1.4.2 release
* removed unused env variable

* over-"haul": replace oras v1 and cosign fork with native containerd-based implementation (#515)

* remove oras from hauler

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* remove cosign fork and use upstream cosign for verification

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* added support for oci referrers

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* updated README.md projects list

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* updates for copilot PR review

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for unsafe type assertions

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for http getter and dead code

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* fixes for more clarity and better error handling

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for resource leaks and unchecked errors

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for rewrite logic for docker.io images due to cosign removal

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for sigs and referrers

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for index.json missing mediatype

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix to make sure manifest.json doesnt include anything other than actual container images

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bump github.com/docker/cli in the go_modules group across 1 directory (#526)

bumps the go_modules group with 1 update in the / directory: [github.com/docker/cli](https://github.com/docker/cli).


updates `github.com/docker/cli` from 29.0.3+incompatible to 29.2.0+incompatible
- [Commits](https://github.com/docker/cli/compare/v29.0.3...v29.2.0)

---

updated-dependencies:
- dependency-name: github.com/docker/cli
  dependency-version: 29.2.0+incompatible
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* removed deprecated code (#528)

* removed deprecated code
* removed all supported for v1alpha1

* fix extract for oci files (#529)

* fix extract for oci files

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* have extract guard against path traversal

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* improved test coverage (#530)

* improved test coverage

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* adjusted mapper_test for oddball oci files

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* adjust extract to handle an image index appropriately (#531)

* adjust extract to handle images and image indices appropriately

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* updates for review feedback

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* fix dockerhub default host bug (#534)

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* adjust hauler's kind annotation to not reflect cosign (#535)

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bump google.golang.org/grpc in the go_modules group across 1 directory (#536)

bumps the go_modules group with 1 update in the / directory: [google.golang.org/grpc](https://github.com/grpc/grpc-go).

updates `google.golang.org/grpc` from 1.78.0 to 1.79.3
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.78.0...v1.79.3)

---

updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.79.3
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add cherry-pick workflow for release branches (#533)

this workflow automates cherry-picking changes from merged pull requests to specified release branches based on comments... it handles permission checks, version parsing, and conflict resolution during the cherry-pick process.

Signed-off-by: Camryn Carter <camryn.carter@ranchergovernment.com>

* images.txt testdata file (#539)

* fix keep registry logic (#537)

* fixed keep registry logic
* trim library/
* updated test
* test updates

* option to sync images.txt files natively (#538)

* sync images.txt files
* test worklflow sync w image list
* images.txt

* chunk the haul (#519)

* chunk the haul
* validate numeric suffix on join
* enforce valid chunk size
* containerd warning
* updated test.go files

* bump github.com/go-jose/go-jose/v4 (#542)

bumps the go_modules group with 1 update in the / directory: [github.com/go-jose/go-jose/v4](https://github.com/go-jose/go-jose).


updates `github.com/go-jose/go-jose/v4` from 4.1.3 to 4.1.4

- [Release notes](https://github.com/go-jose/go-jose/releases)
- [Commits](https://github.com/go-jose/go-jose/compare/v4.1.3...v4.1.4)

---

updated-dependencies:
- dependency-name: github.com/go-jose/go-jose/v4
  dependency-version: 4.1.4
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* allow multiple prefix references (#532)

* allow multiple prefix references
* fixed some duplications

* add optional flag for excluding extra artifacts when pulling from a registry (#541)

* add optional flag for excluding extra artifacts when pulling from a registry

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* add optional flag to charts for excluding extra artifacts when pulling from a registry

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>
Signed-off-by: Camryn Carter <camryn.carter@ranchergovernment.com>
Co-authored-by: devLeitner <87783219+devLeitner@users.noreply.github.com>
Co-authored-by: devleitner <devleitner@protonmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Camryn Carter <camryn.carter@ranchergovernment.com>
Co-authored-by: Adam Martin <adam.martin@ranchergovernment.com>
2026-04-08 12:09:23 -04:00

189 lines
4.9 KiB
Go

package archives
import (
"context"
"fmt"
"io"
"os"
"path/filepath"
"strings"
"github.com/mholt/archives"
"hauler.dev/go/hauler/pkg/log"
)
// maps to handle compression types
var CompressionMap = map[string]archives.Compression{
"gz": archives.Gz{},
"bz2": archives.Bz2{},
"xz": archives.Xz{},
"zst": archives.Zstd{},
"lz4": archives.Lz4{},
"br": archives.Brotli{},
}
// maps to handle archival types
var ArchivalMap = map[string]archives.Archival{
"tar": archives.Tar{},
"zip": archives.Zip{},
}
// check if a path exists
func isExist(path string) bool {
_, statErr := os.Stat(path)
return !os.IsNotExist(statErr)
}
// archives the files in a directory
// dir: the directory to Archive
// outfile: the output file
// compression: the compression to use (gzip, bzip2, etc.)
// archival: the archival to use (tar, zip, etc.)
func Archive(ctx context.Context, dir, outfile string, compression archives.Compression, archival archives.Archival) error {
l := log.FromContext(ctx)
l.Debugf("starting the archival process for [%s]", dir)
// remove outfile
l.Debugf("removing existing output file: [%s]", outfile)
if err := os.RemoveAll(outfile); err != nil {
errMsg := fmt.Errorf("failed to remove existing output file [%s]: %w", outfile, err)
l.Debugf(errMsg.Error())
return errMsg
}
if !isExist(dir) {
errMsg := fmt.Errorf("directory [%s] does not exist, cannot proceed with archival", dir)
l.Debugf(errMsg.Error())
return errMsg
}
// map files on disk to their paths in the archive
l.Debugf("mapping files in directory [%s]", dir)
archiveDirName := filepath.Base(filepath.Clean(dir))
if dir == "." {
archiveDirName = ""
}
files, err := archives.FilesFromDisk(context.Background(), nil, map[string]string{
dir: archiveDirName,
})
if err != nil {
errMsg := fmt.Errorf("error mapping files from directory [%s]: %w", dir, err)
l.Debugf(errMsg.Error())
return errMsg
}
l.Debugf("successfully mapped files for directory [%s]", dir)
// create the output file we'll write to
l.Debugf("creating output file [%s]", outfile)
outf, err := os.Create(outfile)
if err != nil {
errMsg := fmt.Errorf("error creating output file [%s]: %w", outfile, err)
l.Debugf(errMsg.Error())
return errMsg
}
defer func() {
l.Debugf("closing output file [%s]", outfile)
outf.Close()
}()
// define the archive format
l.Debugf("defining the archive format: [%T]/[%T]", archival, compression)
format := archives.CompressedArchive{
Compression: compression,
Archival: archival,
}
// create the archive
l.Debugf("starting archive for [%s]", outfile)
err = format.Archive(context.Background(), outf, files)
if err != nil {
errMsg := fmt.Errorf("error during archive creation for output file [%s]: %w", outfile, err)
l.Debugf(errMsg.Error())
return errMsg
}
l.Debugf("archive created successfully [%s]", outfile)
return nil
}
// SplitArchive splits an existing archive into chunks of at most maxBytes each.
// Chunks are named <base>_0<ext>, <base>_1<ext>, ... where base is the archive
// path with all extensions stripped, and ext is the compound extension (e.g. .tar.zst).
// The original archive is removed after successful splitting.
func SplitArchive(ctx context.Context, archivePath string, maxBytes int64) ([]string, error) {
l := log.FromContext(ctx)
// derive base path and compound extension by stripping all extensions
base := archivePath
ext := ""
for filepath.Ext(base) != "" {
ext = filepath.Ext(base) + ext
base = strings.TrimSuffix(base, filepath.Ext(base))
}
f, err := os.Open(archivePath)
if err != nil {
return nil, fmt.Errorf("failed to open archive for splitting: %w", err)
}
var chunks []string
buf := make([]byte, 32*1024)
chunkIdx := 0
var written int64
var outf *os.File
for {
if outf == nil {
chunkPath := fmt.Sprintf("%s_%d%s", base, chunkIdx, ext)
outf, err = os.Create(chunkPath)
if err != nil {
f.Close()
return nil, fmt.Errorf("failed to create chunk %d: %w", chunkIdx, err)
}
chunks = append(chunks, chunkPath)
l.Debugf("creating chunk [%s]", chunkPath)
written = 0
chunkIdx++
}
remaining := maxBytes - written
readSize := int64(len(buf))
if readSize > remaining {
readSize = remaining
}
n, readErr := f.Read(buf[:readSize])
if n > 0 {
if _, writeErr := outf.Write(buf[:n]); writeErr != nil {
outf.Close()
f.Close()
return nil, fmt.Errorf("failed to write to chunk: %w", writeErr)
}
written += int64(n)
}
if readErr == io.EOF {
outf.Close()
outf = nil
break
}
if readErr != nil {
outf.Close()
f.Close()
return nil, fmt.Errorf("failed to read archive: %w", readErr)
}
if written >= maxBytes {
outf.Close()
outf = nil
}
}
f.Close()
if err := os.Remove(archivePath); err != nil {
return nil, fmt.Errorf("failed to remove original archive after splitting: %w", err)
}
l.Infof("split archive [%s] into %d chunk(s)", filepath.Base(archivePath), len(chunks))
return chunks, nil
}