hauler/pkg/archives/unarchiver.go
Zack Brady c0294c733b update release/2.0 from main (#546)
* fix: handling of file referenced dependencies without repository field (#514)

co-authored-by: devleitner <devleitner@protonmail.com>

* bump go.opentelemetry.io/otel/sdk (#520)

bumps the go_modules group with 1 update in the / directory: [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go).

updates `go.opentelemetry.io/otel/sdk` from 1.39.0 to 1.40.0
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.39.0...v1.40.0)

---

updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-version: 1.40.0
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* dev.md file (#521)

* smaller changes and updates for v1.4.2 release (#524)

* smaller changes and updates for v1.4.2 release
* removed unused env variable

* over-"haul": replace oras v1 and cosign fork with native containerd-based implementation (#515)

* remove oras from hauler

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* remove cosign fork and use upstream cosign for verification

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* added support for oci referrers

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* updated README.md projects list

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* updates for copilot PR review

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for unsafe type assertions

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for http getter and dead code

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* fixes for more clarity and better error handling

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for resource leaks and unchecked errors

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for rewrite logic for docker.io images due to cosign removal

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for sigs and referrers

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix for index.json missing mediatype

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bug fix to make sure manifest.json doesn't include anything other than actual container images

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bump github.com/docker/cli in the go_modules group across 1 directory (#526)

bumps the go_modules group with 1 update in the / directory: [github.com/docker/cli](https://github.com/docker/cli).


updates `github.com/docker/cli` from 29.0.3+incompatible to 29.2.0+incompatible
- [Commits](https://github.com/docker/cli/compare/v29.0.3...v29.2.0)

---

updated-dependencies:
- dependency-name: github.com/docker/cli
  dependency-version: 29.2.0+incompatible
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* removed deprecated code (#528)

* removed deprecated code
* removed all support for v1alpha1

* fix extract for oci files (#529)

* fix extract for oci files

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* have extract guard against path traversal

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* improved test coverage (#530)

* improved test coverage

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* adjusted mapper_test for oddball oci files

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* adjust extract to handle an image index appropriately (#531)

* adjust extract to handle images and image indices appropriately

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* updates for review feedback

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* fix dockerhub default host bug (#534)

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* adjust hauler's kind annotation to not reflect cosign (#535)

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* bump google.golang.org/grpc in the go_modules group across 1 directory (#536)

bumps the go_modules group with 1 update in the / directory: [google.golang.org/grpc](https://github.com/grpc/grpc-go).

updates `google.golang.org/grpc` from 1.78.0 to 1.79.3
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.78.0...v1.79.3)

---

updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.79.3
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* add cherry-pick workflow for release branches (#533)

this workflow automates cherry-picking changes from merged pull requests to specified release branches based on comments... it handles permission checks, version parsing, and conflict resolution during the cherry-pick process.

Signed-off-by: Camryn Carter <camryn.carter@ranchergovernment.com>

* images.txt testdata file (#539)

* fix keep registry logic (#537)

* fixed keep registry logic
* trim library/
* updated test
* test updates

* option to sync images.txt files natively (#538)

* sync images.txt files
* test workflow sync with image list
* images.txt

* chunk the haul (#519)

* chunk the haul
* validate numeric suffix on join
* enforce valid chunk size
* containerd warning
* updated test.go files

* bump github.com/go-jose/go-jose/v4 (#542)

bumps the go_modules group with 1 update in the / directory: [github.com/go-jose/go-jose/v4](https://github.com/go-jose/go-jose).


updates `github.com/go-jose/go-jose/v4` from 4.1.3 to 4.1.4

- [Release notes](https://github.com/go-jose/go-jose/releases)
- [Commits](https://github.com/go-jose/go-jose/compare/v4.1.3...v4.1.4)

---

updated-dependencies:
- dependency-name: github.com/go-jose/go-jose/v4
  dependency-version: 4.1.4
  dependency-type: indirect
  dependency-group: go_modules

...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* allow multiple prefix references (#532)

* allow multiple prefix references
* fixed some duplications

* add optional flag for excluding extra artifacts when pulling from a registry (#541)

* add optional flag for excluding extra artifacts when pulling from a registry

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

* add optional flag to charts for excluding extra artifacts when pulling from a registry

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Adam Martin <adam.martin@ranchergovernment.com>
Signed-off-by: Camryn Carter <camryn.carter@ranchergovernment.com>
Co-authored-by: devLeitner <87783219+devLeitner@users.noreply.github.com>
Co-authored-by: devleitner <devleitner@protonmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Camryn Carter <camryn.carter@ranchergovernment.com>
Co-authored-by: Adam Martin <adam.martin@ranchergovernment.com>
2026-04-08 12:09:23 -04:00


package archives

import (
	"context"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"regexp"
	"sort"
	"strconv"
	"strings"

	"github.com/mholt/archives"
	"hauler.dev/go/hauler/pkg/log"
)

const (
	dirPermissions  = 0o700 // default directory permissions
	filePermissions = 0o600 // default file permissions
)
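
// securePath ensures the path is safely relative to the target directory.
// leading separators and ".." components are collapsed first, so a traversal
// attempt is re-rooted under basePath instead of escaping it.
func securePath(basePath, relativePath string) (string, error) {
	// anchor at "/" so Clean discards any ".." that would climb above the root
	relativePath = filepath.Clean("/" + relativePath)
	relativePath = strings.TrimPrefix(relativePath, string(os.PathSeparator))

	dstPath := filepath.Join(basePath, relativePath)

	// compare with trailing separators so a sibling such as "/base-other" never passes for "/base"
	if !strings.HasPrefix(filepath.Clean(dstPath)+string(os.PathSeparator), filepath.Clean(basePath)+string(os.PathSeparator)) {
		return "", fmt.Errorf("illegal file path: %s", dstPath)
	}
	return dstPath, nil
}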
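
A small standalone sketch of how the clean-then-join approach in securePath treats hostile entry names. The function body is copied from the file; the `main` wrapper, the `/tmp/dst` base directory, and the sample entry names are illustrative assumptions, and the printed paths assume Unix-style separators. Note that `..` components end up re-rooted under the base directory rather than rejected with an error.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// securePath is copied from the archives package so the example runs on its own.
func securePath(basePath, relativePath string) (string, error) {
	relativePath = filepath.Clean("/" + relativePath)
	relativePath = strings.TrimPrefix(relativePath, string(os.PathSeparator))
	dstPath := filepath.Join(basePath, relativePath)
	if !strings.HasPrefix(filepath.Clean(dstPath)+string(os.PathSeparator), filepath.Clean(basePath)+string(os.PathSeparator)) {
		return "", fmt.Errorf("illegal file path: %s", dstPath)
	}
	return dstPath, nil
}

func main() {
	// hypothetical archive entry names: a normal one, a traversal attempt, an absolute path
	for _, name := range []string{"a/b.txt", "../../etc/passwd", "/abs/file"} {
		p, err := securePath("/tmp/dst", name)
		fmt.Println(p, err)
	}
	// on Linux/macOS this prints:
	// /tmp/dst/a/b.txt <nil>
	// /tmp/dst/etc/passwd <nil>
	// /tmp/dst/abs/file <nil>
}
```

All three names resolve inside the base directory, which is why the prefix check acts as a second line of defense rather than the primary filter.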

// creates a directory with specified permissions
func createDirWithPermissions(ctx context.Context, path string, mode os.FileMode) error {
	l := log.FromContext(ctx)
	l.Debugf("creating directory [%s]", path)
	if err := os.MkdirAll(path, mode); err != nil {
		return fmt.Errorf("failed to mkdir: %w", err)
	}
	return nil
}

// sets permissions to a file or directory
func setPermissions(path string, mode os.FileMode) error {
	if err := os.Chmod(path, mode); err != nil {
		return fmt.Errorf("failed to chmod: %w", err)
	}
	return nil
}

// handles the extraction of a single archive entry into dst.
// directories are created, regular files are written, and links are skipped.
func handleFile(ctx context.Context, f archives.FileInfo, dst string) error {
	l := log.FromContext(ctx)
	l.Debugf("handling file [%s]", f.NameInArchive)

	// validate and construct the destination path
	dstPath, pathErr := securePath(dst, f.NameInArchive)
	if pathErr != nil {
		return pathErr
	}

	// ensure the parent directory exists
	parentDir := filepath.Dir(dstPath)
	if dirErr := createDirWithPermissions(ctx, parentDir, dirPermissions); dirErr != nil {
		return dirErr
	}

	// handle directories
	if f.IsDir() {
		// create the directory with permissions from the archive
		if dirErr := createDirWithPermissions(ctx, dstPath, f.Mode()); dirErr != nil {
			return fmt.Errorf("failed to create directory: %w", dirErr)
		}
		l.Debugf("successfully created directory [%s]", dstPath)
		return nil
	}

	// ignore symlinks and hardlinks; link targets are never followed or created
	if f.LinkTarget != "" {
		l.Debugf("skipping symlink [%s] to [%s]", dstPath, f.LinkTarget)
		return nil
	}

	// check and handle parent directory permissions
	originalMode, statErr := os.Stat(parentDir)
	if statErr != nil {
		return fmt.Errorf("failed to stat parent directory: %w", statErr)
	}

	// if parent directory is read only, temporarily make it writable
	if originalMode.Mode().Perm()&0o200 == 0 {
		l.Debugf("parent directory is read only... temporarily making it writable [%s]", parentDir)
		if chmodErr := os.Chmod(parentDir, originalMode.Mode()|0o200); chmodErr != nil {
			return fmt.Errorf("failed to chmod parent directory: %w", chmodErr)
		}
		defer func() {
			// restore the original permissions after writing
			if chmodErr := os.Chmod(parentDir, originalMode.Mode()); chmodErr != nil {
				l.Debugf("failed to restore original permissions for [%s]: %v", parentDir, chmodErr)
			}
		}()
	}

	// handle regular files
	reader, openErr := f.Open()
	if openErr != nil {
		return fmt.Errorf("failed to open file: %w", openErr)
	}
	defer reader.Close()

	// truncate on open so a shorter entry never leaves stale bytes from an existing file
	dstFile, createErr := os.OpenFile(dstPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, f.Mode())
	if createErr != nil {
		return fmt.Errorf("failed to create file: %w", createErr)
	}

	if _, copyErr := io.Copy(dstFile, reader); copyErr != nil {
		dstFile.Close()
		return fmt.Errorf("failed to copy: %w", copyErr)
	}

	// a failed close can mean buffered data never reached disk, so report it
	if closeErr := dstFile.Close(); closeErr != nil {
		return fmt.Errorf("failed to close file: %w", closeErr)
	}

	l.Debugf("successfully extracted file [%s]", dstPath)
	return nil
}
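
When an archive is extracted over a directory that already holds files, the flags passed to `os.OpenFile` decide whether bytes from an older, longer file survive the write. The sketch below is a minimal illustration of that effect using only the standard library; the helper names (`overwrite`, `demoWithoutTrunc`, `demoWithTrunc`) and the sample contents are hypothetical and not part of this package.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// overwrite seeds a temp file with ten bytes, rewrites it with three bytes
// using the given open flags, and returns what the file contains afterwards.
func overwrite(flags int) (string, error) {
	dir, err := os.MkdirTemp("", "trunc-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "file.txt")
	if err := os.WriteFile(path, []byte("0123456789"), 0o600); err != nil {
		return "", err
	}

	f, err := os.OpenFile(path, flags, 0o600)
	if err != nil {
		return "", err
	}
	if _, err := f.WriteString("abc"); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}

	b, err := os.ReadFile(path)
	return string(b), err
}

// demoWithoutTrunc opens the existing file for writing without truncating it.
func demoWithoutTrunc() (string, error) { return overwrite(os.O_CREATE | os.O_WRONLY) }

// demoWithTrunc adds O_TRUNC so the old content is discarded on open.
func demoWithTrunc() (string, error) { return overwrite(os.O_CREATE | os.O_WRONLY | os.O_TRUNC) }

func main() {
	without, _ := demoWithoutTrunc()
	with, _ := demoWithTrunc()
	fmt.Println(without) // abc3456789
	fmt.Println(with)    // abc
}
```

Without `os.O_TRUNC` the tail of the old content remains after the new bytes, which for an extracted file means silently corrupted output.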

// Unarchive extracts a tarball into a directory; symlinks and hardlinks are ignored.
func Unarchive(ctx context.Context, tarball, dst string) error {
	l := log.FromContext(ctx)
	l.Debugf("unarchiving temporary archive [%s] to temporary store [%s]", tarball, dst)

	archiveFile, openErr := os.Open(tarball)
	if openErr != nil {
		return fmt.Errorf("failed to open tarball %s: %w", tarball, openErr)
	}
	defer archiveFile.Close()

	// pass the caller's context through so cancellation reaches the archive library
	format, input, identifyErr := archives.Identify(ctx, tarball, archiveFile)
	if identifyErr != nil {
		return fmt.Errorf("failed to identify format: %w", identifyErr)
	}

	extractor, ok := format.(archives.Extractor)
	if !ok {
		return fmt.Errorf("unsupported format for extraction")
	}

	if dirErr := createDirWithPermissions(ctx, dst, dirPermissions); dirErr != nil {
		return fmt.Errorf("failed to create destination directory: %w", dirErr)
	}

	handler := func(ctx context.Context, f archives.FileInfo) error {
		return handleFile(ctx, f, dst)
	}

	if extractErr := extractor.Extract(ctx, input, handler); extractErr != nil {
		return fmt.Errorf("failed to extract: %w", extractErr)
	}

	l.Infof("unarchiving completed successfully")
	return nil
}

// chunkSuffixRe matches a name that ends in an underscore followed by digits.
var chunkSuffixRe = regexp.MustCompile(`^(.+)_(\d+)$`)

// chunkInfo checks whether archivePath matches the chunk naming pattern (<base>_N<ext>).
// Returns the base path (without index), compound extension, numeric index, and whether it matched.
func chunkInfo(archivePath string) (base, ext string, index int, ok bool) {
	dir := filepath.Dir(archivePath)
	name := filepath.Base(archivePath)

	// strip compound extension (e.g. .tar.zst)
	nameBase := name
	nameExt := ""
	for filepath.Ext(nameBase) != "" {
		nameExt = filepath.Ext(nameBase) + nameExt
		nameBase = strings.TrimSuffix(nameBase, filepath.Ext(nameBase))
	}

	m := chunkSuffixRe.FindStringSubmatch(nameBase)
	if m == nil {
		return "", "", 0, false
	}

	// the regexp guarantees digits, so Atoi can only fail on overflow;
	// treat such a name as not being a chunk
	idx, err := strconv.Atoi(m[2])
	if err != nil {
		return "", "", 0, false
	}
	return filepath.Join(dir, m[1]), nameExt, idx, true
}
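
To make the `<base>_N<ext>` convention concrete, here is a hedged standalone sketch that runs chunkInfo on two sample names. The function is a copy of the one in this file that additionally checks the `strconv.Atoi` error; the sample file names are assumptions chosen for illustration, and the printed base path assumes Unix-style separators.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strconv"
	"strings"
)

var chunkSuffixRe = regexp.MustCompile(`^(.+)_(\d+)$`)

// chunkInfo splits "<dir>/<base>_N<ext>" into its base path, compound
// extension, and numeric index. ok is false when the name is not a chunk.
func chunkInfo(archivePath string) (base, ext string, index int, ok bool) {
	dir := filepath.Dir(archivePath)
	nameBase := filepath.Base(archivePath)

	// peel extensions from the right until none remain (".zst", then ".tar")
	for filepath.Ext(nameBase) != "" {
		ext = filepath.Ext(nameBase) + ext
		nameBase = strings.TrimSuffix(nameBase, filepath.Ext(nameBase))
	}

	m := chunkSuffixRe.FindStringSubmatch(nameBase)
	if m == nil {
		return "", "", 0, false
	}
	idx, err := strconv.Atoi(m[2])
	if err != nil {
		return "", "", 0, false
	}
	return filepath.Join(dir, m[1]), ext, idx, true
}

func main() {
	// hypothetical file names
	fmt.Println(chunkInfo("out/haul_2.tar.zst")) // out/haul .tar.zst 2 true
	fmt.Println(chunkInfo("out/haul.tar.zst"))   //   0 false
}
```

Because the extension loop strips every dot-separated suffix, a name with a dot inside its base (for example a version number) would be parsed differently, so this convention works best with dot-free base names.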

// JoinChunks detects whether archivePath is a chunk file and, if so, finds all
// sibling chunks, concatenates them in numeric order into a single file in tempDir,
// and returns the path to the joined file. If archivePath is not a chunk, it is
// returned unchanged.
func JoinChunks(ctx context.Context, archivePath, tempDir string) (string, error) {
	l := log.FromContext(ctx)

	base, ext, _, ok := chunkInfo(archivePath)
	if !ok {
		return archivePath, nil
	}

	// Glob only fails on a malformed pattern, which means base contains glob
	// metacharacters; surface that instead of silently extracting a single chunk
	all, err := filepath.Glob(base + "_*" + ext)
	if err != nil {
		return "", fmt.Errorf("failed to search for chunks of [%s]: %w", base, err)
	}

	// keep only names whose suffix is a valid numeric index
	var matches []string
	for _, m := range all {
		if _, _, _, ok := chunkInfo(m); ok {
			matches = append(matches, m)
		}
	}
	if len(matches) == 0 {
		return archivePath, nil
	}

	// sort by numeric index so that _10 follows _9 rather than _1
	sort.Slice(matches, func(i, j int) bool {
		_, _, idxI, _ := chunkInfo(matches[i])
		_, _, idxJ, _ := chunkInfo(matches[j])
		return idxI < idxJ
	})

	l.Debugf("joining %d chunk(s) for [%s]", len(matches), base)

	joinedPath := filepath.Join(tempDir, filepath.Base(base)+ext)
	outf, err := os.Create(joinedPath)
	if err != nil {
		return "", fmt.Errorf("failed to create joined archive: %w", err)
	}
	// covers the early error returns; the success path closes explicitly below
	defer outf.Close()

	for _, chunk := range matches {
		l.Debugf("joining chunk [%s]", chunk)
		cf, err := os.Open(chunk)
		if err != nil {
			return "", fmt.Errorf("failed to open chunk [%s]: %w", chunk, err)
		}
		if _, err := io.Copy(outf, cf); err != nil {
			cf.Close()
			return "", fmt.Errorf("failed to copy chunk [%s]: %w", chunk, err)
		}
		cf.Close()
	}

	// report a failed close, since it can mean the joined archive is incomplete
	if err := outf.Close(); err != nil {
		return "", fmt.Errorf("failed to close joined archive: %w", err)
	}

	l.Infof("joined %d chunk(s) into [%s]", len(matches), filepath.Base(joinedPath))
	return joinedPath, nil
}
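
The numeric ordering that JoinChunks relies on matters because `filepath.Glob` returns names in lexical order, where `_10` sorts before `_2`. The following is a simplified sketch of the sort-then-concatenate idea, not the package's implementation: `chunkIndexRe`, `chunkIndex`, `concatChunks`, and `joinDemo` are hypothetical helpers, and the one-byte chunk contents are made up for the demonstration.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"regexp"
	"sort"
	"strconv"
)

// chunkIndexRe is a simplified, hypothetical pattern for names like "haul_3.tar.zst".
var chunkIndexRe = regexp.MustCompile(`_(\d+)\.`)

// chunkIndex returns the numeric index in a chunk name, or -1 if there is none.
func chunkIndex(path string) int {
	m := chunkIndexRe.FindStringSubmatch(filepath.Base(path))
	if m == nil {
		return -1
	}
	n, err := strconv.Atoi(m[1])
	if err != nil {
		return -1
	}
	return n
}

// concatChunks sorts chunks by numeric index and appends each one to out.
func concatChunks(chunks []string, out string) error {
	sort.Slice(chunks, func(i, j int) bool { return chunkIndex(chunks[i]) < chunkIndex(chunks[j]) })
	dst, err := os.Create(out)
	if err != nil {
		return err
	}
	for _, c := range chunks {
		src, err := os.Open(c)
		if err != nil {
			dst.Close()
			return err
		}
		_, copyErr := io.Copy(dst, src)
		src.Close()
		if copyErr != nil {
			dst.Close()
			return copyErr
		}
	}
	return dst.Close()
}

// joinDemo writes three chunks to a temp dir, joins them, and returns the result.
func joinDemo() (string, error) {
	dir, err := os.MkdirTemp("", "chunks")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	for name, body := range map[string]string{"haul_1.tar.zst": "A", "haul_2.tar.zst": "B", "haul_10.tar.zst": "C"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte(body), 0o600); err != nil {
			return "", err
		}
	}
	chunks, err := filepath.Glob(filepath.Join(dir, "haul_*.tar.zst"))
	if err != nil {
		return "", err
	}
	out := filepath.Join(dir, "haul.tar.zst")
	if err := concatChunks(chunks, out); err != nil {
		return "", err
	}
	joined, err := os.ReadFile(out)
	return string(joined), err
}

func main() {
	joined, err := joinDemo()
	fmt.Println(joined, err) // ABC <nil>
}
```

A purely lexical sort of the same three names would yield `ACB`, which is the corruption the numeric comparison avoids.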