Skip to content

Conversation

@baude
Copy link
Member

@baude baude commented Nov 12, 2025

introduction of ArtifactReference and ArtifactStorageReference types for libartifact. ArtifactReference is a theoretical reference to an artifact like for things like pull. ArtifactStorageReference is for looking things up in the existing store for things like rm or inspect. ArtifactStoreReference allows for things like full or partial "id" lookups.

the pr also accomdates a user ask for appending "latest" as a tag when no tag is provided.

and finally, here we introduce a bunch of artifact tests to get the ball rolling. it should not be considered fully complete but a good start.

@github-actions github-actions bot added the common Related to "common" package label Nov 12, 2025
podmanbot pushed a commit to podmanbot/buildah that referenced this pull request Nov 12, 2025
@podmanbot
Copy link

✅ A new PR has been created in buildah to vendor these changes: containers/buildah#6497

@TomSweeneyRedHat
Copy link
Member

@baude is this headed for Podman v6.0? If so, I think we need to start creating release branches in this project. Probably a cabal topic sooner, rather than later.

introduction of ArtifactReference and ArtifactStorageReference types for
libartifact.  ArtifactReference is a theoretical reference to an
artifact like for things like pull.  ArtifactStorageReference is for
looking things up in the existing store for things like rm or inspect.
ArtifactStoreReference allows for things like full or partial "id"
lookups.

the pr also accomdates a user ask for appending "latest" as a tag when
no tag is provided.

and finally, here we introduce a bunch of artifact tests to get the ball
rolling.  it should not be considered fully complete but a good start.

Signed-off-by: Brent Baude <[email protected]>
podmanbot pushed a commit to podmanbot/buildah that referenced this pull request Nov 12, 2025
Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please start with

  • the ⚠️ comment about the semantics of ArtifactStorageReference vs. locking (maybe separate “parse user input in anticipation of the current getByNameOrDigest”, outside of locks, and “lookup + do action”, inside of a lock)
  • The [MULTI] series of comments about turning ID/digest references into an unspecified one name (pre-existing but, I think, important)

func (a *Artifact) SetName(name string) {
a.Name = name
}

Copy link
Contributor

@mtrmac mtrmac Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Pre-existing and ~unrelated, but GetDigest below does not preserve the correct digest value, e.g. if there is atypical whitespace or field ordering in the original JSON.)

return nil, false, err
}
if storedArtifactNamed.Name() == named.Name() {
artifactDigest, err := a.GetDigest()
Copy link
Contributor

@mtrmac mtrmac Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break for sha256 vs. sha512 references to the same object — but hopefully it can be implemented by re-digesting the original manifest blob using the digest value lookup digests’s algorithm.

}
}
// Before giving up, check by digest
// Before giving up, check by full or partial ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably unrelated: The “partial ID” syntax does not work well for multiple digest algorithms. “ID” references are only sort-of-required if it is possible to create an unnamed object, or pull and untag an object. Maybe we can drop the concept?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont really love it either but "people" screamed about consistency with images and other objects. I doubt we can drop the concept.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate partial IDs, a digest cannot be partial it is only valid in its full form. Script should be using full ids always and for users they can use the name+tag or just press tab for shell completion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about supporting ID prefixes (or even all IDs) only for sha256 digests (and sha256-shaped c/storage IDs)?

That would preserve all existing code, but force users who want sha512 (or any future digest) to migrate.

If we could establish that as a policy for both artifacts and images, that would set clear user expectations — and probably fairly simplify the sha512 migration.

return nil, err
}
artifactDigest, err := arty.GetDigest()
artifactDigest, err := ref.ArtifactFromStore.GetDigest()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️

Hum… a semantics question: if ArtifactStoreReference is looked up while holding the store lock[1], the lock is then released, and this locks the store again, ref.ArtifactFromStore might not be valid any more.

The good case is that the artifact was already removed.

The bad case is that the previously-found artifact DigestA was removed or renamed, and something else with DigestB was pulled into its name. Then, this code deletes by name, deleting DigestB, but returns the in-memory DigestA.


[1] user input lookup happens via artifactStore.getArtifacts, AFAICS that happens without holding any lock. That’s not safe because updates to the index are not currently atomic, so it could fail seeing a partially-written JSON.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw the current code is already broken in that regard, it is super flaky containers/podman#27264
It was so bad that I disabled parallel testing of it.

}
destRef, err := alltransports.ParseImageName("docker://" + dest)
func (as ArtifactStore) Push(ctx context.Context, src ArtifactStoreReference, dest ArtifactReference, opts libimage.CopyOptions) (digest.Digest, error) {
destRef, err := alltransports.ParseImageName("docker://" + dest.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

destRef, err = docker.NewReference(dest.Named) is a more efficient ~equivalent.


… both variants fail on a repo:tag@digest input, that is rejected by c/image. Generally, Podman, for Docker compatibility, allows such inputs, ignoring the tag. Do we want to do that here as well??? And what would use as an artifact name in the store?

My preference would be to reject such inputs, because most use cases are dubious (repo:myCommentThatLooksLikeAVersionNumberButTheTagMayMoveIntheFuture@digestThatMatters), but that might not be the right call.

Either way, ArtifactReference should document whether repo:tag@digest values are valid or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way, ArtifactReference should document whether repo:tag@digest values are valid or not.

… and if they are not valid, enforce that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ive taken the code comment about destRef but I believe ArtifactReference does document repo:tag@digest; i think it was on the constructor and you made a comment to move it to the struct which was also done.

// Check if artifact exists; in GetByName not getting an
// error means it exists
_, _, err := artifacts.GetByNameOrDigest(dest)
existingArtifact, _, err := artifacts.getByNameOrDigest(dest.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s a bit clumsy that this triggers the “lookup by ID or ID prefix” heuristics, when we statically know that the input is not in that format. (Also elsewhere.)

@@ -0,0 +1,719 @@
package store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note to self: I didn’t read this.)

@Luap99
Copy link
Member

Luap99 commented Nov 13, 2025

@baude is this headed for Podman v6.0? If so, I think we need to start creating release branches in this project. Probably a cabal topic sooner, rather than later.

@TomSweeneyRedHat There are several changes already here for 6.0 only. I doesn't particular matter when we create the branch as long as it is based of the last release not main.

So yes like in podman we cannot use main here for the next podman 5.8 release

return ar, nil
}

func (ar ArtifactReference) IsDigested() bool {

This comment was marked as resolved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way Go works, it is generally better to define interfaces at callers than callees, unless the callee is really an abstraction over multiple implementations.

  • API stability: If we export a struct with methods, we can always add more methods. If we export an interface, adding a method to the interface breaks all other implementations — so there should either be no other implementations, and then we don’t need an interface, or we can’t add a method, and we have to keep adding more and more interfaces (and manual type checks) to the API.
  • If there is only one “real” implementation, an interface is typically used for mocks in tests. In that case the tests can define a private interface, perhaps with a very small subset of the full object, exposing just the methods that need mocking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I didn't realize that and learned something new.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and:

  • Interface conformance is implicit in Go: In other languages, a class may need to explicitly declare its conformance to an interface/protocol — so if anyone ever might want to substitute an implementation via an interface, it’s easier for everyone when the original implementation also declares an interface, and it becomes “a best practice”. In Go, any external consumer can declare an interface without cooperation with the original implementation.

// getByNameOrDigest returns an artifact, if present, by a given name
// Returns an error if not found.
func (al ArtifactList) GetByNameOrDigest(nameOrDigest string) (*Artifact, bool, error) {
func (al ArtifactList) getByNameOrDigest(nameOrDigest string) (*Artifact, bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see there is a lot of iteration over lists. In extreme cases, when the store has a lot of artifacts, this will cost us some time. I think there could be an optimization of this search.

return nil, ErrEmptyArtifactName
}

func (as ArtifactStore) Inspect(ctx context.Context, ref ArtifactStoreReference) (*Artifact, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to hold a lock?

}

func getArtifactAndImageSource(ctx context.Context, as ArtifactStore, nameOrDigest string, options *libartTypes.FilterBlobOptions) (*libartifact.Artifact, types.ImageSource, error) {
func (ref ArtifactStoreReference) toLocalImageSource(ctx context.Context, as ArtifactStore, options *libartTypes.FilterBlobOptions) (types.ImageSource, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, move this function into reference.go.

// NewArtifactStorageReference refers to an object already in the artifact store. It
// can be a name or a full or partial digest. Conveniently, it also embeds the artifact
// as part of its return.
func NewArtifactStorageReference(nameOrDigest string, as *ArtifactStore) (ArtifactStoreReference, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to mention that these type of functions require the caller to hold a lock.

@TomSweeneyRedHat
Copy link
Member

@Luap99 Thanks on the branch bit. We don't have any branches in this repository at the moment. It's probably time to start making them, or at least figure out how to name them. Do we make a unique number, a date, a Podman release, or something else? Let's chat in a cabal sometime soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

common Related to "common" package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants