Add the gitdb and smmap submodule working tree paths to the
`safe.directory` config in the Cygwin CI workflow. Without this, when
GitPython opens a submodule as a `Repo` and runs
`git cat-file --batch-check` against it, git rejects the repository for
dubious ownership. The user-visible failure modes (`ValueError`,
`IndexError`, `AssertionError`) all trace back to this rejection.
Why `[gitdb]` failed and `[smmap]` passed
-----------------------------------------
The trust check in `test_fixture_health.py` failed for `[gitdb]` but
passed for `[smmap]`, even though neither submodule's working tree was
in the workflow's `safe.directory` list before this commit. The
asymmetry comes down to which Owner SID NTFS records on each path, and
which paths git's ownership check requires the current user to own.
There are six paths git's check might consider for the two submodules:
gitdb's `.git` gitfile, worktree `git/ext/gitdb`, and gitdir
`<repo>/.git/modules/gitdb`; smmap's `.git` gitfile, worktree
`git/ext/gitdb/gitdb/ext/smmap`, and gitdir
`<repo>/.git/modules/gitdb/modules/smmap`. Of these, only one had NTFS
Owner = `BUILTIN\Administrators` (Cygwin uid 544): gitdb's worktree. The
other five had Owner = `runneradmin`, whose Cygwin uid (197108) was the
value `geteuid()` returned in the test process.
The same Owner pattern held both with and without
`submodules: recursive` in `actions/checkout`. The single
`Administrators`-owned path was gitdb's worktree. All other paths were
`runneradmin`-owned, including the ones that Git for Windows's recursive
submodule clone had just produced when `submodules: recursive` was set.
The differentiator is not which git binary clones the submodules, but
that `git/ext/gitdb` is created by the outer `git clone`'s checkout
phase. When `git checkout` materializes a tree entry of mode 160000, it
calls `mkdir(path, 0777)` to create an empty submodule directory (see
`entry.c::write_entry`, case `S_IFGITLINK` [1] [2]).
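
As a rough illustration of the placeholder step (a Python sketch, not
git's C code), the following treats mode 160000 as a gitlink and creates
only an empty directory for it. `materialize_entry` is a hypothetical
helper; real git creates leading directories separately and handles
many more entry types.

```python
import os
import stat
import tempfile

S_IFGITLINK = 0o160000  # mode git records for a submodule (gitlink) tree entry


def is_gitlink(mode: int) -> bool:
    """Return True if the file-type bits of `mode` denote a gitlink."""
    return stat.S_IFMT(mode) == S_IFGITLINK


def materialize_entry(root: str, relpath: str, mode: int) -> None:
    """Hypothetical helper: create an empty placeholder directory for a gitlink."""
    if is_gitlink(mode):
        # Whoever runs this call becomes the recorded owner of the directory.
        os.makedirs(os.path.join(root, relpath), mode=0o777, exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    materialize_entry(root, "git/ext/gitdb", S_IFGITLINK)
    placeholder = os.path.join(root, "git", "ext", "gitdb")
    created_empty_dir = os.path.isdir(placeholder) and not os.listdir(placeholder)
```

The point of the sketch is only that the directory exists, and has an
owner, before any submodule content is fetched.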
On Windows GHA runners, jobs run as `runneradmin`. This is the built-in
local Administrator account (its Cygwin uid 197108 = 196608 + 500
matches Cygwin's mapping [3] for machine-local accounts: 0x30000 plus
the SID suffix 500, the well-known suffix of that account's SID). That
account is exempt from UAC token filtering by default (Admin Approval
Mode for the built-in Administrator account is disabled [4]), so its
processes hold the full administrative access token. `CreateProcessW`
propagates the parent's primary token unchanged through
`actions/checkout`'s process tree. Inside that tree, the outer
`git clone`'s `mkdir(path, 0777)` produces directories whose NTFS Owner
is `BUILTIN\Administrators` -- as observed on every workspace directory
the outer clone materialized, including the `git/ext/gitdb` placeholder.
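
The uid arithmetic above can be checked with a small Python sketch. The
function names are hypothetical; the two mapping rules (0x30000 plus the
RID for machine-local accounts, and the bare RID for `BUILTIN` aliases
such as `Administrators`) are the ones described in [3].

```python
LOCAL_ACCOUNT_UID_BASE = 0x30000  # 196608, offset for machine-local (SAM) accounts


def cygwin_uid_for_local_account(rid: int) -> int:
    """Map the RID of a machine-local account SID to its Cygwin uid."""
    return LOCAL_ACCOUNT_UID_BASE + rid


def cygwin_uid_for_builtin_alias(rid: int) -> int:
    """Map the RID of a BUILTIN alias SID (S-1-5-32-RID) to its Cygwin uid."""
    return rid


runneradmin_uid = cygwin_uid_for_local_account(500)      # built-in Administrator
administrators_uid = cygwin_uid_for_builtin_alias(544)   # BUILTIN\Administrators
```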
Subsequent submodule-update operations -- both Git for Windows if
`actions/checkout` does a recursive clone, and Cygwin git if it happens
later due to `init-tests-after-clone.sh` -- produce paths that NTFS
records as `runneradmin`-owned. Both flows go through a process whose
primary token's `TokenOwner` field has been rewritten from
`BUILTIN\Administrators` to the user SID by a Cygwin or Cygwin-derived
runtime at DLL initialization. The rewrite propagates to every
descendant via `CreateProcessW`'s primary-token inheritance [5], so
every `mkdir` issued after that point produces a directory owned by
the user.
- Cygwin git triggers the rewrite directly. `cygheap_user::init`
in `cygwin1.dll` calls
`NtSetInformationToken(hProcToken, TokenOwner, &effec_cygsid, ...)`
at process startup [6].
- Git for Windows triggers it indirectly. `git submodule` is not a
builtin (only `submodule--helper` is) [7]. So it falls through to
`execv_dashed_external` and runs `git-submodule.sh`, a shell script
whose shebang is resolved at runtime to `sh.exe` in the Git for
Windows "Git Bash" environment. That `sh.exe` is an MSYS2 binary
linked against `msys-2.0.dll`, a Cygwin fork that performs the
same `TokenOwner` rewrite. From there, every `git.exe` the script
spawns inherits the user-SID `TokenOwner` and produces user-owned
directories.
Cygwin's `lstat().st_uid` reports the actual NTFS Owner SID mapped
through Cygwin's SID-to-uid table. `is_path_owned_by_current_user`
reduces to `lstat(p).st_uid == geteuid()` on Cygwin (no Administrators
group exemption). `ensure_valid_ownership` returns 1 (accepting the
repository) without consulting `safe.directory` when the gitfile,
worktree, and gitdir ALL pass that owned-by-current-user check.
Otherwise it falls through to comparing the worktree's `real_pathdup`
against each configured `safe.directory` entry.
For gitdb the three Owners were `runneradmin` (gitfile),
**`Administrators` (worktree)**, and `runneradmin` (gitdir), so the
all-paths-owned check failed on the worktree. The workflow's
`safe.directory` before this commit contained only `$(pwd)` and
`$(pwd)/.git`, neither of which exact-matches `git/ext/gitdb`, so the
`safe.directory` comparison also failed, and `ensure_valid_ownership`
returned 0 -- git rejected the repository. For smmap the three Owners
were all `runneradmin`, so the all-paths-owned check passed.
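
The decision described above can be modeled in a few lines of Python.
This is a simplified sketch, not git's implementation: `/repo` is a
hypothetical stand-in for the workspace path, the owner table is filled
in from the observations above, and `safe.directory` matching is
reduced to exact string comparison (path normalization and wildcard
entries are omitted).

```python
from typing import Mapping, Optional, Set

RUNNERADMIN_UID = 197108
ADMINISTRATORS_UID = 544
ROOT = "/repo"  # hypothetical workspace path


def ensure_valid_ownership(
    owners: Mapping[str, int],
    euid: int,
    gitfile: Optional[str],
    worktree: Optional[str],
    gitdir: Optional[str],
    safe_directories: Set[str],
) -> bool:
    """Accept if every given path is owned by euid, else consult safe.directory."""
    candidates = [p for p in (gitfile, worktree, gitdir) if p is not None]
    if all(owners[p] == euid for p in candidates):
        return True  # safe.directory is never consulted in this case
    target = worktree if worktree is not None else gitdir
    return target in safe_directories


gitdb = (f"{ROOT}/git/ext/gitdb/.git", f"{ROOT}/git/ext/gitdb",
         f"{ROOT}/.git/modules/gitdb")
smmap = (f"{ROOT}/git/ext/gitdb/gitdb/ext/smmap/.git",
         f"{ROOT}/git/ext/gitdb/gitdb/ext/smmap",
         f"{ROOT}/.git/modules/gitdb/modules/smmap")

# Every path is runneradmin-owned except gitdb's worktree.
owners = {path: RUNNERADMIN_UID for path in gitdb + smmap}
owners[gitdb[1]] = ADMINISTRATORS_UID

before = {ROOT, f"{ROOT}/.git"}
after = before | {gitdb[1], smmap[1]}

gitdb_before = ensure_valid_ownership(owners, RUNNERADMIN_UID, *gitdb, before)
smmap_before = ensure_valid_ownership(owners, RUNNERADMIN_UID, *smmap, before)
gitdb_after = ensure_valid_ownership(owners, RUNNERADMIN_UID, *gitdb, after)
```

In this model, gitdb is rejected and smmap accepted with the old
`safe.directory` list, and gitdb is accepted once its worktree is listed.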
Cygwin's `chown` cannot rewrite the gitdb worktree's Owner SID from
`Administrators` to `runneradmin`: it returns "Permission denied".
Adding both submodule worktree paths to `safe.directory` is the correct
fix and is robust against shifts in what paths inherit which Owner.
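
For illustration, the resulting set of entries can be expressed as the
`git config` invocations that would add them. This Python sketch only
builds the argument lists; the workflow itself does this in a shell
step, and the use of `--global` and the example root path are
assumptions rather than details taken from the workflow file.

```python
from pathlib import PurePosixPath
from typing import List

# Submodule worktree paths relative to the repository root.
SUBMODULE_WORKTREES = ("git/ext/gitdb", "git/ext/gitdb/gitdb/ext/smmap")


def safe_directory_commands(repo_root: str) -> List[List[str]]:
    """Build one `git config --add safe.directory` command per trusted path."""
    root = PurePosixPath(repo_root)
    paths = [root, root / ".git", *(root / sub for sub in SUBMODULE_WORKTREES)]
    return [
        ["git", "config", "--global", "--add", "safe.directory", str(path)]
        for path in paths
    ]


# Hypothetical root; in the workflow this would be the value of $(pwd).
commands = safe_directory_commands("/repo")
```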
Why `actions/checkout`'s own `safe.directory` does not help
-----------------------------------------------------------
`actions/checkout`'s `set-safe-directory: true` default writes the main
repository path to `safe.directory` in a temporary config it points its
own spawned git child at by overriding `HOME` for that child process.
That `HOME` override applies only to git invocations the action itself
spawns; subsequent steps' processes inherit the runner user's real
`HOME` (e.g., `C:\Users\runneradmin` on the Windows runner) and read its
actual `~/.gitconfig`, which never received the entry. So no git in a
later step, whether Git for Windows or Cygwin git, sees it. That's why
the `cygwin-test` workflow sets Cygwin git's `safe.directory` itself.
This commit extends that to cover the gitdb and smmap working trees.
The distinction between Cygwin git and Git for Windows is also why the
bug affected the Cygwin jobs and no other Windows jobs. `compat/mingw.c`
defines `is_path_owned_by_current_sid` [8], which accepts
`BUILTIN\Administrators`-owned paths when the current user is a member
of `Administrators`. Cygwin git compiles against the POSIX path
(`is_path_owned_by_current_uid` in `git-compat-util.h` [9]) without that
leniency. So the same `Administrators`-owned `git/ext/gitdb` that Cygwin
git rejects is silently accepted by Git for Windows, and the main CI
workflow's `windows-latest` jobs never trip the trust check.
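
The contrast between the two checks can be sketched as two predicates.
This is a simplified Python model with hypothetical function names, not
a transcription of either C function; the user SID shown is a
placeholder, while `S-1-5-32-544` is the well-known SID of
`BUILTIN\Administrators`.

```python
ADMINISTRATORS_SID = "S-1-5-32-544"  # well-known SID of BUILTIN\Administrators


def owned_posix_style(owner_uid: int, euid: int) -> bool:
    """Model of the POSIX-path check: strict uid equality, no group leniency."""
    return owner_uid == euid


def owned_git_for_windows_style(
    owner_sid: str, user_sid: str, user_is_administrator: bool
) -> bool:
    """Model of the Git for Windows check, with the Administrators leniency."""
    if owner_sid == user_sid:
        return True
    return owner_sid == ADMINISTRATORS_SID and user_is_administrator


USER_SID = "S-1-5-21-X-Y-Z-500"  # placeholder for runneradmin's SID

cygwin_verdict = owned_posix_style(owner_uid=544, euid=197108)
gfw_verdict = owned_git_for_windows_style(ADMINISTRATORS_SID, USER_SID, True)
```

For the `Administrators`-owned `git/ext/gitdb`, the first predicate
rejects and the second accepts, matching the behavior described above.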
Verification
------------
The `reproduce-safe-dir` matrix on the previous commits produces
failures for all three affected tests; this commit's CI run shows those
tests passing instead.
The Owner-SID claim above is verified by the `diag-token` job introduced
for this purpose. That job creates a directory through four code paths
(PowerShell-only, Cygwin-only, Cygwin-bash spawning Win32 `git init`,
and a PowerShell -> Cygwin-bash -> PowerShell sandwich) and reports the
NTFS Owner of each. The observed Owners match the predicted values in
every case, including the load-bearing Cygwin -> Win32 propagation case
(test C) and the sandwich case (test D) showing that the determinant is
whether some process in the ancestry has loaded a Cygwin-family runtime,
not the identity of the file-creating binary.
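
The stated determinant can be written down as a small Python model. The
runtime labels and the `predicted_owner` function are hypothetical, and
the per-test predictions are derived here from the rule above rather
than copied from the job's output.

```python
from typing import Sequence

CYGWIN_FAMILY = {"cygwin", "msys2"}  # runtimes that rewrite TokenOwner at startup


def predicted_owner(
    process_chain: Sequence[str],
    initial_owner: str = "Administrators",
    user: str = "runneradmin",
) -> str:
    """Predict the NTFS Owner of a directory created by the last process.

    `process_chain` lists the runtime of each process from the oldest
    ancestor to the process that creates the directory.
    """
    owner = initial_owner
    for runtime in process_chain:
        if runtime in CYGWIN_FAMILY:
            owner = user  # the rewrite is inherited by all descendants
    return owner


test_a = predicted_owner(["win32"])                     # PowerShell only
test_b = predicted_owner(["cygwin"])                    # Cygwin only
test_c = predicted_owner(["cygwin", "win32"])           # Cygwin bash -> Win32 git
test_d = predicted_owner(["win32", "cygwin", "win32"])  # the sandwich
```

The same model covers the `git-submodule.sh` case: a chain of
`["win32", "msys2", "win32"]` also yields `runneradmin`.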
The commit immediately preceding this one temporarily sets
`submodules: recursive` on `actions/checkout` in every workflow that
runs the test suite. Its CI run shows the bug still triggering on Cygwin
(the gitdb worktree directory itself is created by the outer
`git clone`'s checkout phase, before any submodule init runs, regardless
of which mechanism subsequently populates the submodule contents).
A subsequent commit reverts that change; its CI run shows that this fix
continues to hold without `submodules: recursive`, confirming that the
fix does not depend on which mechanism populates the submodules.
[1]:
https://github.com/git/git/blob/v2.51.0/entry.c#L397
[2]:
https://github.com/git-for-windows/git/blob/v2.54.0.windows.1/entry.c#L397
[3]:
https://cygwin.com/cygwin-ug-net/ntsec.html
[4]:
https://learn.microsoft.com/en-us/windows/security/application-security/application-control/user-account-control/settings-and-configuration
[5]:
https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw
[6]:
https://sourceware.org/cgit/newlib-cygwin/tree/winsup/cygwin/uinfo.cc?h=cygwin-3.6.9#n82
[7]:
https://github.com/git-for-windows/git/blob/v2.54.0.windows.1/git.c#L661
[8]:
https://github.com/git-for-windows/git/blob/v2.54.0.windows.1/compat/mingw.c#L3931
[9]:
https://github.com/git/git/blob/v2.51.0/git-compat-util.h#L346
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
In #1455, which got Cygwin tests running on GitHub Actions, DWesl mentioned:
> I have three test failures left that I don't understand, and hope someone here knows what's going on.
Byron's review mentioned:
> The failures are all related to submodule handling, and I think that functionality isn't necessarily widely used anymore.
Further discussion raised the question of whether this was the case, as well as workarounds for using submodules on Cygwin even if submodule-specific functionality is broken. The tests were marked xfail (54bae76, 0eda0a5, 7f3689d) and the PR was merged.
It turns out the GitPython submodule functionality is not broken on Cygwin, and neither are the tests! Instead, the problem is that the rorepo fixture is GitPython's own repository, and when actions/checkout clones GitPython, the top-level gitdb directory is owned by the Administrators group rather than the runneradmin user. Git for Windows special-cases this, judging that repositories owned by the Administrators group are safe for members of that group to trust. Cygwin git does not special-case this, so more safe.directory entries are needed to assuage it.
The key to identifying the cause of the problem is that only tests that use self.rorepo and attempt submodule operations, but not those using @with_rw_repo and attempting submodule operations, failed in this way. The nature of the problem was obscured by several oddities, though the fourth and final oddity is also what revealed it:
BrokenPipeError is a subclass of IOError, and we currently swallow IOError in some of the places where this happens. The ValueError case is overwhelmingly more common. That's what the xfail decorations from #1455 covered. Recently, EPIPE has won the race slightly more often than in the past, and we've had to rerun tests a number of times. It would be possible to add more exception types to the xfail decorations, but a better approach is to verify the complete details of what is going wrong and why, add more regression tests, and fix the problem properly, removing the xfails. This PR does those things.
The fix is very simple: we already have a CI step that adds paths to the Cygwin git safe.directory configuration, and the fix is to add the missing paths related to submodules. But the tests are somewhat nontrivial, as is the partially reverted instrumentation that fully confirms the cause and should make future debugging easier.
The code changes and commit messages in this PR were made with Claude Code, as disclosed in commit message trailers. I've reviewed and substantially adjusted the code changes. I've also reviewed and honed the commit messages through many rounds of revision, including manual edits. A few of the commit messages are long and dense. I have made sure to spend more time with those, to ensure I believe the details are warranted, since even though I am well known for writing long detailed commit messages, I understand this is something people are more wary of when LLMs are involved (since they can generate large amounts of text quickly). As for this PR description, no part of it is LLM-generated, though I did use Claude for proofreading.
The commit messages describe the situation, the evidence for it, and the fix from the perspective of tracing what has occurred. One aspect of the bug--the behavior of safe.directory protections on Windows when multiple git implementations are used and the repository has submodules that it must operate on--is particularly non-obvious, unintuitive, and interesting, and it may end up being relevant to future improvements here in GitPython, as well as to submodule portability subtleties that might arise in the future in gitoxide. Hence this section.
On POSIX, the owner and group owner of a file are separate concepts, with each file being owned by one user and group-owned by one group. But on Windows there is a unified concept of Owner. Unlike on a Unix-like system, on Windows a user or a group can own a file in the same sense of "own." Users usually own the files they create, but one of the exceptions to this is that users who are members of the Administrators group create files that are owned by the Administrators group. This exception is actually more specific than that: it only applies when the user is actually running with their unfiltered (full) token in which the Administrators group is active. Usually, members of the Administrators group on Windows run with UAC enabled and configured to require elevation to act with their full administrative powers. But the runneradmin user that runs Windows CI jobs runs with a full admin token, so files it creates are owned by Administrators.
Git uses ownership as a powerful trust signal. It will operate on repositories it thinks the user running it owns, since the configuration and hooks in such a repository are presumably safe. Git checks if some important repository paths, such as the repo's top-level directory and its .git dir, have the user running it as their owner. If they do, it trusts the repository. For any not owned, it checks if they match any values of the safe.directory configuration variable. If any are neither owned by the user nor match entries in safe.directory, then Git refuses to operate. But this gets weird on Windows, where the directories might be owned by something that isn't a user at all:
Git for Windows lets members of the Administrators group operate on repositories whose ownership matches what those users might very well create themselves. If the user is in Administrators, and a directory is owned by Administrators, then Git for Windows considers it to be owned by the user for the purpose of assessing trust.
Cygwin git does not do this. It doesn't generally need to. If you start a program built against cygwin1.dll--whether that program is git or anything else--it changes the owner of the process token to the user, and then the owner inherited by securable objects (such as files) the process creates is the user instead of whatever it was before. So a member of the Administrators group, acting with the full powers thereof, can run Cygwin git with Administrators as the process token owner initially--but the process token owner is changed to the user, almost immediately, before Cygwin git's main() function is called. For this reason, Cygwin git typically has no need to treat repositories owned by the Administrators group specially--it never creates such a repository.
When we clone a repository that has a submodule, we may or may not also clone the submodule. If we do, we might clone it at the same time, or later. But whatever happens, so long as the top-level repository is able to be checked out, we first get the top-level repository with an empty directory at the submodule root. Whatever ownership we are creating files with, we create the submodule root with that ownership.
In all our Windows CI jobs, including the Cygwin jobs, we use actions/checkout to clone. While actions/checkout is capable of cloning repositories using the GitHub REST API, it only uses this as a fallback strategy. It first checks if a recent enough version of git is available and, if so, uses it. On all our Windows CI jobs, including the Cygwin jobs, actions/checkout clones the GitPython repository using Git for Windows.
Thus, no matter how its submodules are cloned, the top-level directory of gitdb is owned by the Administrators group. Cygwin git would therefore refuse to operate on it. But the GitPython test suite uses the GitPython repository itself, as well as its direct submodule gitdb and its nested submodule smmap, as test fixtures. Because Cygwin git wouldn't operate on the gitdb submodule, tests that use it were failing.
Most submodule tests didn't have a problem, because they use @with_rw_repo, which creates a new clone, instead of self.rorepo, which uses the repository in place. On Cygwin, @with_rw_repo clones the repository with Cygwin git. As described above, when Cygwin git clones repositories, it clones them as the user.
More remarkably, while I have added safe.directory entries for the nested smmap module, this is actually not strictly necessary--smmap never had dubious ownership! The reason is that only the gitdb directory created in the top-level GitPython checkout is owned by Administrators. No contents of submodules, not even of the gitdb submodule, are checked out as owned by Administrators. This is the case even if actions/checkout is made to clone them all by setting submodules: recursive. (I tested with this to be sure. But I did not keep this, since we have good reasons to validate that our init script will clone the submodules even if they were not cloned before. See #1713 and #1715 on this.) That is to say that none of the files created when cloning submodules are owned by Administrators even when Git for Windows creates them in the same top-level git clone command.
How can this be? Historically, the machinery that operated on submodules was implemented in scripts. Over time, it basically all came to be implemented in C, except that the git-submodule subcommand itself remains implemented by git-submodule.sh. Today, the major functionality of the script is to parse options, apply defaults, look up some information about the current repository, and call git submodule--helper to do the real work. In Git for Windows, the C code that does the real work is in native Windows programs such as git.exe. But git-submodule.sh is a shell script. Its interpreter is sh.exe from the MSYS2 "Git Bash" environment that Git for Windows ships.
MSYS2 is like Cygwin (MSYS2 is a fork of MSYS, which is a fork of Cygwin). Just as Cygwin programs link to cygwin1.dll, which does various setup--including, as described above, resetting the process's token owner to the user--MSYS2 programs link to msys-2.0.dll, which does that very same thing. Therefore, a user who is a member of the Administrators group can run git with Administrators as its process token owner, but in any chain of subprocesses that goes through a shell invocation, everything at or below that invocation will operate with it reset to the user.