Auto-install LFS standalone-transfer-agent on first invocation so cloning / submodule-adding LFS repos works out of the box
Summary
When a user runs git clone s3://... (or git submodule add s3://...) against a repo that uses LFS, the initial checkout fails with:
batch request: ssh: Could not resolve hostname s3: Name or service not known: exit status 255
The README (LFS → Clone the repo) acknowledges this and instructs the user to recover with two manual steps:
cd lfs-repo-clone
git-lfs-s3 install
git reset --hard main
This exposes the seam in the integration the first time a user touches it. It's especially painful for git submodule add, where:
- The "fail, fix, retry" recipe is harder to apply (recovery requires deinit / re-add or a careful manual sequence).
- A user can't even git submodule add cleanly without first running it with GIT_LFS_SKIP_SMUDGE=1, then running git-lfs-s3 install inside the submodule, then git lfs pull. That's three workarounds for one operation.
I'd like to propose that git-remote-s3 install the LFS standalone-transfer-agent config automatically the first time it runs against a repo, so that git clone / git submodule add work end-to-end without manual setup.
Happy to follow up with a PR once you confirm the direction. Related: #62 fixes the gitdir-resolution side of the submodule UX; this issue addresses the configuration side.
Why .lfsconfig won't work
The natural-looking answer is "let repo owners commit a .lfsconfig". This is not an option: git-lfs intentionally excludes lfs.customtransfer.<name>.path and lfs.standalonetransferagent from .lfsconfig's allowed keys. From git-lfs-config(5):
The set of keys allowed in this file is restricted for security reasons.
Allowing those keys in a file that's checked in would let a malicious repo execute an arbitrary binary on git checkout. So configuration has to live in the user's local/global git config — which is exactly the git-lfs-s3 install step.
Why auto-install during the remote helper's lifecycle is feasible
When git clone (or the clone phase of git submodule add) invokes git-remote-s3, git sets GIT_DIR to the new repo's gitdir before invoking the helper. The helper runs to completion (capabilities → list → fetch → unbundle) before git proceeds to checkout, which is when the LFS smudge filter runs. So the helper has a clean window to write to the local config and have it take effect before LFS needs it.
A git config --add invoked as a subprocess from the helper inherits GIT_DIR and writes to the right config file — including for submodules, where GIT_DIR resolves through the gitlink to <parent>/.git/modules/<path>/.
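To make the submodule case concrete, here is a minimal sketch of what "resolving through the gitlink" means on disk. A submodule's .git is a regular file containing a single "gitdir: <path>" line rather than a directory. Git does this resolution itself and hands the result to the helper, so the helper would not need this code; resolve_gitdir and demo are hypothetical names used only for illustration.

```python
import os
import tempfile


def resolve_gitdir(worktree):
    """Return the gitdir for a work tree, following a gitlink file if present."""
    dot_git = os.path.join(worktree, ".git")
    if os.path.isdir(dot_git):
        # Ordinary repository: .git is the gitdir itself.
        return os.path.realpath(dot_git)
    # Submodule (or linked worktree): .git is a file holding "gitdir: <path>".
    with open(dot_git, encoding="utf-8") as fh:
        line = fh.readline().strip()
    prefix = "gitdir: "
    if not line.startswith(prefix):
        raise ValueError(f"not a gitlink: {dot_git}")
    target = line[len(prefix):]
    # Relative targets are relative to the directory that holds the gitlink.
    return os.path.realpath(os.path.join(worktree, target))


def demo():
    """Build a fake parent/submodule layout and resolve the submodule's gitdir."""
    root = tempfile.mkdtemp()
    modules_dir = os.path.join(root, "parent", ".git", "modules", "sub")
    sub_tree = os.path.join(root, "parent", "sub")
    os.makedirs(modules_dir)
    os.makedirs(sub_tree)
    with open(os.path.join(sub_tree, ".git"), "w", encoding="utf-8") as fh:
        fh.write("gitdir: ../.git/modules/sub\n")
    return resolve_gitdir(sub_tree), os.path.realpath(modules_dir)
```

The config file that a plain git config --add would modify in that layout is the one inside the resolved directory, not anything in the parent repo's own .git/config.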
Proposed behavior
In S3Remote.__init__ (or cmd_capabilities), call a small helper:
import os
import subprocess


def _maybe_install_lfs_config():
    """Set the LFS standalone-transfer-agent in the local repo's git config
    if not already configured. No-op if already set (to anything) so we never
    stomp a user's existing setup. Disable with GIT_REMOTE_S3_AUTO_INSTALL_LFS=0."""
    if os.environ.get("GIT_REMOTE_S3_AUTO_INSTALL_LFS", "1").lower() in ("0", "false", "no"):
        return
    try:
        # No --local here: a value set at any config level counts as "configured".
        existing = subprocess.check_output(
            ["git", "config", "--get", "lfs.standalonetransferagent"],
            text=True, stderr=subprocess.DEVNULL,
        ).strip()
        if existing:
            return  # already configured (by us or someone else); don't touch
    except subprocess.CalledProcessError:
        pass  # non-zero exit means the key is unset; fall through and install
    # git config writes to the repo-local config by default.
    subprocess.run(
        ["git", "config", "--add", "lfs.customtransfer.git-lfs-s3.path", "git-lfs-s3"],
        check=False,
    )
    subprocess.run(
        ["git", "config", "--add", "lfs.standalonetransferagent", "git-lfs-s3"],
        check=False,
    )
Properties:
- Idempotent: subsequent fetches don't re-add or duplicate entries.
- Non-stomping: if a user has configured a different agent, we leave it alone.
- Per-repo only: never writes to global config.
- Harmless on non-LFS repos: the standalone-transfer-agent only fires on files matched by .gitattributes, so setting it in a repo without LFS has no runtime effect.
- Opt-out: GIT_REMOTE_S3_AUTO_INSTALL_LFS=0 skips it entirely.
git-lfs-s3 install stays as-is for users who want explicit control or cross-tool scripting.
What this enables
# Today
git clone s3://bucket/lfs-repo # ← fails on smudge
cd lfs-repo
git-lfs-s3 install
git reset --hard main # ← required to retry smudge
# After this change
git clone s3://bucket/lfs-repo # ← just works
# Today
GIT_LFS_SKIP_SMUDGE=1 git submodule add s3://bucket/lfs-repo path
cd path && git-lfs-s3 install && git lfs pull && cd ..
git submodule absorbgitdirs # or whatever's needed to finish init
# After this change
git submodule add s3://bucket/lfs-repo path # ← just works
The README's "Clone the repo" section under LFS can lose its workaround.
Open questions:
- Scope of opt-out: env var is the lightest knob; alternatively a git-remote-s3.auto-install-lfs git config. Preference?
- Detect LFS first? I lean toward "always try, idempotent, harmless if not LFS" rather than inspecting bundle contents. Inspecting adds complexity for marginal benefit. Agree/disagree?
- Documentation tone: should git-lfs-s3 install remain documented as the canonical setup, with auto-install treated as a transparent convenience, or should the README lead with auto-install?
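On the opt-out question, the two knobs need not be mutually exclusive. The sketch below shows one possible way to combine them, assuming the env var takes precedence over the git config key and that auto-install defaults to on. The precedence order, the auto_install_enabled name, and the injected read_git_config callable are all assumptions made for illustration, not existing git-remote-s3 behavior.

```python
from typing import Callable, Mapping, Optional

# Same spellings the proposed helper treats as "off".
_FALSEY = ("0", "false", "no")


def auto_install_enabled(
    env: Mapping[str, str],
    read_git_config: Callable[[str], Optional[str]],
) -> bool:
    """Decide whether auto-install should run.

    Assumed precedence: env var first, then git config, then default on.
    """
    env_value = env.get("GIT_REMOTE_S3_AUTO_INSTALL_LFS")
    if env_value is not None:
        return env_value.strip().lower() not in _FALSEY
    # read_git_config returns None when the key is unset.
    cfg_value = read_git_config("git-remote-s3.auto-install-lfs")
    if cfg_value is not None:
        return cfg_value.strip().lower() not in _FALSEY
    return True
```

In the real helper, read_git_config would presumably wrap git config --get; injecting it keeps the decision logic testable without a repository.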
If the direction looks right, I'm happy to follow up with a PR including tests (one covering the "config not set → install" path, one covering "existing config preserved", and one integration-ish test that exercises S3Remote.__init__ against a temp git repo).
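As a rough idea of the first two tests, here is a sketch that runs the same decision logic against an in-memory stand-in for git config, so it needs neither git nor a temp repo. FakeGitConfig and maybe_install_lfs_config are hypothetical names for this illustration; the real tests would exercise _maybe_install_lfs_config itself, and the opt-out check is left out here for brevity.

```python
from typing import Dict, List, Optional


class FakeGitConfig:
    """In-memory stand-in for `git config --get` / `git config --add`."""

    def __init__(self, initial: Optional[Dict[str, List[str]]] = None) -> None:
        self.values: Dict[str, List[str]] = dict(initial or {})

    def get(self, key: str) -> Optional[str]:
        # Like `git config --get`, report the last value if several exist.
        entries = self.values.get(key)
        return entries[-1] if entries else None

    def add(self, key: str, value: str) -> None:
        # Like `git config --add`, append without replacing earlier values.
        self.values.setdefault(key, []).append(value)


def maybe_install_lfs_config(config: FakeGitConfig) -> bool:
    """Mirror of the proposed logic; returns True if anything was written."""
    if config.get("lfs.standalonetransferagent"):
        return False  # already configured; leave it alone
    config.add("lfs.customtransfer.git-lfs-s3.path", "git-lfs-s3")
    config.add("lfs.standalonetransferagent", "git-lfs-s3")
    return True


def test_installs_when_unset() -> None:
    config = FakeGitConfig()
    assert maybe_install_lfs_config(config) is True
    assert config.get("lfs.standalonetransferagent") == "git-lfs-s3"
    assert config.get("lfs.customtransfer.git-lfs-s3.path") == "git-lfs-s3"


def test_preserves_existing_agent() -> None:
    config = FakeGitConfig({"lfs.standalonetransferagent": ["other-agent"]})
    assert maybe_install_lfs_config(config) is False
    assert config.values == {"lfs.standalonetransferagent": ["other-agent"]}


def test_second_call_does_not_duplicate() -> None:
    config = FakeGitConfig()
    maybe_install_lfs_config(config)
    assert maybe_install_lfs_config(config) is False
    assert config.values["lfs.standalonetransferagent"] == ["git-lfs-s3"]
```

The third test is the one that backs the "Idempotent" property: because the helper uses --add, the early return is the only thing preventing duplicate entries.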