← 返回首页
Add AI-disclosure and quality requirements to the contribution guidelines by Byron · Pull Request #2143 · gitpython-developers/GitPython · GitHub
Skip to content

Navigation Menu

Toggle navigation
Sign in
Appearance settings
Search or jump to...

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Saved searches

Use saved searches to filter your results more quickly

Appearance settings
Resetting focus

Add AI-disclosure and quality requirements to the contribution guidelines#2143

Merged
Byron merged 2 commits into
mainfrom
contrbuting
May 6, 2026
Merged

Add AI-disclosure and quality requirements to the contribution guidelines#2143
Byron merged 2 commits into
mainfrom
contrbuting

Conversation

Copy link
Copy Markdown
Member

Byron commented May 6, 2026
edited
Loading

Tasks

  • Eliah's review

…ines. Co-authored-by: GPT 5.5 <codex@openai.com>
Byron requested a review from EliahKagan May 6, 2026 07:01
Split the quality-expectations section into two paragraphs (the warning about low-quality contributions being declined was visually merged with the preceding paragraph). Replace "and the pull request closed without warning" with a note that maintainers may not always be able to provide detailed feedback, which conveys the same practical reality. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Member

EliahKagan left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason Spam Abuse Off Topic Outdated Duplicate Resolved Low Quality Hide comment

I've pushed a commit (92ff6df). How do you feel about this wording? I worry this may soften the point you are making too much, which I would not want to happen. But this also explains the reasoning why feedback may be absent -- and I am hoping it may also avoid discouraging people from contributing, while at the same time hopefully making the situation at least as clear, and setting the exact same expectations, as in the original wording.

EliahKagan requested a review from Copilot May 6, 2026 14:19
EliahKagan changed the title Add AI-disclusure and quality requirements to the contribution guidelines. Add AI-disclosure and quality requirements to the contribution guidelines May 6, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason Spam Abuse Off Topic Outdated Duplicate Resolved Low Quality Hide comment

Pull request overview

This PR updates the project’s contribution guidelines to set explicit quality expectations for contributions and to require disclosure/identification for AI-assisted work when it meaningfully affects a PR, commits, or GitHub interactions.

Changes:

  • Add a “Quality expectations” section defining baseline standards (readability, maintainability, tests where practical, documentation, consistency).
  • Add an “AI-assisted contributions” section describing disclosure requirements and agent identification expectations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread CONTRIBUTING.md Show resolved Hide resolved
Comment thread CONTRIBUTING.md Show resolved Hide resolved
Byron marked this pull request as ready for review May 6, 2026 21:38
Hide details View details Byron merged commit c7648c0 into main May 6, 2026
64 of 65 checks passed
Copy link
Copy Markdown
Member Author

Byron commented May 6, 2026

Thanks a lot! Let's use this.

Byron deleted the contrbuting branch May 6, 2026 21:39
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in test_docs.py and test_repo.py, and from test_root_module in test_submodule.py, so the tests surface the underlying failure directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml, each running these three tests under the current safe.directory configuration. All or most of these reproduce-safe-dir jobs must be removed before this work is integrated. The existing test job's env, defaults, and setup steps gain YAML anchors so the new job can reference them without duplication. Root cause hypothesis --------------------- The CI's safe.directory list (set in the "Special configuration for Cygwin git" step) covers the main repo and its .git directory but not the gitdb submodule's working tree at git/ext/gitdb. Git matches safe.directory exactly, not by prefix, so when GitPython spawns "git cat-file --batch-check" in the gitdb submodule (via Repo(submodule_path).odb.info(sha)), git rejects the repository for dubious ownership and exits. Three observed failure modes, all from the same root cause ---------------------------------------------------------- The git subprocess exits soon after starting. What Python sees depends on a race between Python writing to the subprocess's stdin and the subprocess exiting and closing its stdout pipe. 1. ValueError ("SHA is empty, possible dubious ownership..."): Python's write/flush completes before git has finished exiting. The buffered write succeeds, then stdout.readline() returns b"" (EOF). _parse_object_header(b"") in git/cmd.py raises this ValueError. The error message names the rejected directory and even suggests the safe.directory fix. This propagates uncaught from Object.new_from_sha through name_to_object (line 229 of git/repo/fun.py is outside the try/except loop) through repo.commit("HEAD") to iter_items, where the "except (IOError, BadName)" clause does not catch ValueError. 2. IndexError ("list index out of range"): Git exits before Python's write/flush runs. cmd.stdin.write or cmd.stdin.flush raises BrokenPipeError, a subclass of OSError (IOError). This time iter_items's "except (IOError, BadName)" catches it, returning an empty iterator. children() therefore returns an empty list, and "[0]" in test_submodules raises IndexError. 3. AssertionError ("1 not greater than or equal to 2"): Same BrokenPipeError-caught-as-IOError mechanism as case 2, but manifesting in test_repo.py::test_submodules, which does "assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)". The recursive traversal finds gitdb (via the main repo, which is in safe.directory) but cannot enumerate gitdb's children, so only one submodule is yielded. Evidence -------- Recent main-branch CI runs show all three xfailed tests consistently matching their raises=ValueError xfail, e.g. job 74546730688 in run 25415738988. Verified across the five most recent main runs. PR gitpython-developers#2143 attempt 1 (job 74630986491, run 25440735020 attempt 1) had test_docs.py FAILED with IndexError while test_repo.py XFAILed with ValueError in the same job. PR gitpython-developers#2143 attempt 2 (job 74633063805) had both XFAIL with ValueError. Same commit, same workflow, same runner image -- a flaky race. The 3-job trial of these reproduce-safe-dir jobs (run 25454533092, commit baf3526) produced 8 ValueError and 1 IndexError out of 9 test runs. The 30-job run (run 25454836713) produced 90 ValueError and 0 IndexError in the reproduce jobs, but the same run's test (fast) job exhibited the AssertionError variant in test_repo.py::test_submodules. That AssertionError is the third manifestation of the BrokenPipeError race. Plan ---- 1. (this commit) Reproduce the failure under the current safe.directory configuration, with xfails removed so failures surface directly. The 256-job matrix is intended to characterize the rate of each variant. 2. Apply the fix: extend the safe.directory configuration to cover the gitdb submodule's working tree (and smmap's, recursively). Verify that the tests now pass. 3. Remove the reproduce-safe-dir jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in test_docs.py and test_repo.py, and from test_root_module in test_submodule.py, so the tests surface the underlying failure directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml, each running these three tests under the current safe.directory configuration. All or most of these reproduce-safe-dir jobs must be removed before this work is integrated. The existing test job's env, defaults, and setup steps gain YAML anchors so the new job can reference them without duplication. Root cause hypothesis --------------------- The CI's safe.directory list (set in the "Special configuration for Cygwin git" step) covers the main repo and its .git directory but not the gitdb submodule's working tree at git/ext/gitdb. Git matches safe.directory exactly, not by prefix, so when GitPython spawns "git cat-file --batch-check" in the gitdb submodule (via Repo(submodule_path).odb.info(sha)), git rejects the repository for dubious ownership and exits. Three observed failure modes, all from the same root cause ---------------------------------------------------------- The git subprocess exits soon after starting. What Python sees depends on a race between Python writing to the subprocess's stdin and the subprocess exiting and closing its stdout pipe. 1. ValueError ("SHA is empty, possible dubious ownership..."): Python's write/flush completes before git has finished exiting. The buffered write succeeds, then stdout.readline() returns b"" (EOF). _parse_object_header(b"") in git/cmd.py raises this ValueError. The error message names the rejected directory and even suggests the safe.directory fix. This propagates uncaught from Object.new_from_sha through name_to_object (line 229 of git/repo/fun.py is outside the try/except loop) through repo.commit("HEAD") to iter_items, where the "except (IOError, BadName)" clause does not catch ValueError. 2. IndexError ("list index out of range"): Git exits before Python's write/flush runs. cmd.stdin.write or cmd.stdin.flush raises BrokenPipeError, a subclass of OSError (IOError). This time iter_items's "except (IOError, BadName)" catches it, returning an empty iterator. children() therefore returns an empty list, and "[0]" in test_submodules raises IndexError. 3. AssertionError ("1 not greater than or equal to 2"): Same BrokenPipeError-caught-as-IOError mechanism as case 2, but manifesting in test_repo.py::test_submodules, which does "assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)". The recursive traversal finds gitdb (via the main repo, which is in safe.directory) but cannot enumerate gitdb's children, so only one submodule is yielded. Evidence -------- Recent main-branch CI runs show all three xfailed tests consistently matching their raises=ValueError xfail, e.g. job 74546730688 in run 25415738988. Verified across the five most recent main runs. PR gitpython-developers#2143 attempt 1 (job 74630986491, run 25440735020 attempt 1) had test_docs.py FAILED with IndexError while test_repo.py XFAILed with ValueError in the same job. PR gitpython-developers#2143 attempt 2 (job 74633063805) had both XFAIL with ValueError. Same commit, same workflow, same runner image -- a flaky race. The 3-job trial of these reproduce-safe-dir jobs (run 25454533092, commit baf3526) produced 8 ValueError and 1 IndexError out of 9 test runs. The 30-job run (run 25454836713) produced 90 ValueError and 0 IndexError in the reproduce jobs, but the same run's test (fast) job exhibited the AssertionError variant in test_repo.py::test_submodules. That AssertionError is the third manifestation of the BrokenPipeError race. Plan ---- 1. (this commit) Reproduce the failure under the current safe.directory configuration, with xfails removed so failures surface directly. The 256-job matrix is intended to characterize the rate of each variant. 2. Apply the fix: extend the safe.directory configuration to cover the gitdb submodule's working tree (and smmap's, recursively). Verify that the tests now pass. 3. Remove the reproduce-safe-dir jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in test_docs.py and test_repo.py, and from test_root_module in test_submodule.py, so the tests surface the underlying failure directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml, each running these three tests under the current safe.directory configuration. All or most of these reproduce-safe-dir jobs must be removed before this work is integrated. The existing test job's env, defaults, and setup steps gain YAML anchors so the new job can reference them without duplication. Root cause ---------- The CI workflow adds the main repo and its .git directory to safe.directory but not the gitdb or smmap submodule working trees. Git matches safe.directory by exact path. When GitPython opens a submodule as a Repo and runs "git cat-file --batch-check" against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to __get_object_header (git/cmd.py:1697) hits one of two outcomes depending on whether Python's cmd.stdin.flush() runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next flush() raises BrokenPipeError.) If Python wins the race, flush() succeeds (data goes into the kernel buffer); the subsequent stdout.readline() returns b"" (EOF); _parse_object_header at git/cmd.py:1659 raises a ValueError whose message names the rejected directory. If git wins, flush() raises BrokenPipeError, which is OSError, which is IOError. In both paths, the exception travels from Object.new_from_sha at git/repo/fun.py:229 (outside name_to_object's try/except for ValueError-from-dereference_recursive) up through repo.commit("HEAD") into iter_items at git/objects/submodule/base.py. That function's "except (IOError, BadName)" at base.py:1597 catches BrokenPipeError but not ValueError. So Path 1 propagates ValueError all the way to the test; Path 2 ends with iter_items returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - test_docs::test_submodules does sm.children()[0].name, so the [0] on the empty list raises IndexError at git/util.py:1212. - test_repo::test_submodules does assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2). The recursive traversal yields gitdb (its iter_items on the *main* repo succeeds, because the main repo IS in safe.directory) but not smmap, so length 1 fails the assertion at test_repo.py:882. - test_submodule::test_root_module does "assert len(rsmsp) >= 2" on a similar traversal result, failing at test_submodule.py:513. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: test_docs: ValueError (Python wins) or IndexError (git wins). test_repo: ValueError (Python wins) or AssertionError (git wins). test_submodule: ValueError (Python wins) or AssertionError (git wins). In particular, test_docs never produces AssertionError, and test_repo and test_submodule never produce IndexError. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures traces to one of four source lines: git/cmd.py:1659 (ValueError, ~98.7%), git/util.py:1212 (IndexError, only test_docs), test/test_repo.py:882 (AssertionError, only test_repo), or test/test_submodule.py:513 (AssertionError, only test_submodule). Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current safe.directory configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the safe.directory configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the reproduce-safe-dir jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in test_docs.py and test_repo.py, and from test_root_module in test_submodule.py, so the tests surface the underlying failure directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml, each running these three tests under the current safe.directory configuration. All or most of these reproduce-safe-dir jobs must be removed before this work is integrated. The existing test job's env, defaults, and setup steps gain YAML anchors so the new job can reference them without duplication. Root cause ---------- The CI workflow adds the main repo and its .git directory to safe.directory but not the gitdb or smmap submodule working trees. Git matches safe.directory by exact path. When GitPython opens a submodule as a Repo and runs "git cat-file --batch-check" against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to __get_object_header (git/cmd.py:1697) hits one of two outcomes depending on whether Python's cmd.stdin.flush() runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next flush() raises BrokenPipeError.) If Python wins the race, flush() succeeds (data goes into the kernel buffer); the subsequent stdout.readline() returns b"" (EOF); _parse_object_header at git/cmd.py:1659 raises a ValueError whose message names the rejected directory. If git wins, flush() raises BrokenPipeError, which is OSError, which is IOError. In both paths, the exception travels from Object.new_from_sha at git/repo/fun.py:229 (outside name_to_object's try/except for ValueError-from-dereference_recursive) up through repo.commit("HEAD") into iter_items at git/objects/submodule/base.py. That function's "except (IOError, BadName)" at base.py:1597 catches BrokenPipeError but not ValueError. So Path 1 propagates ValueError all the way to the test; Path 2 ends with iter_items returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - test_docs::test_submodules does sm.children()[0].name, so the [0] on the empty list raises IndexError at git/util.py:1212. - test_repo::test_submodules does assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2). The recursive traversal yields gitdb (its iter_items on the *main* repo succeeds, because the main repo IS in safe.directory) but not smmap, so length 1 fails the assertion at test_repo.py:882. - test_submodule::test_root_module does "assert len(rsmsp) >= 2" on a similar traversal result, failing at test_submodule.py:513. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: test_docs: ValueError (Python wins) or IndexError (git wins). test_repo: ValueError (Python wins) or AssertionError (git wins). test_submodule: ValueError (Python wins) or AssertionError (git wins). In particular, test_docs never produces AssertionError, and test_repo and test_submodule never produce IndexError. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures traces to one of four source lines: git/cmd.py:1659 (ValueError, ~98.7%), git/util.py:1212 (IndexError, only test_docs), test/test_repo.py:882 (AssertionError, only test_repo), or test/test_submodule.py:513 (AssertionError, only test_submodule). Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current safe.directory configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the safe.directory configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the reproduce-safe-dir jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in test_docs.py and test_repo.py, and from test_root_module in test_submodule.py, so the tests surface the underlying failure directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml, each running these three tests under the current safe.directory configuration. All or most of these reproduce-safe-dir jobs must be removed before this work is integrated. The existing test job's env, defaults, and setup steps gain YAML anchors so the new job can reference them without duplication. Root cause ---------- The CI workflow adds the main repo and its .git directory to safe.directory but not the gitdb or smmap submodule working trees. Git matches safe.directory by exact path. When GitPython opens a submodule as a Repo and runs "git cat-file --batch-check" against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to __get_object_header (git/cmd.py:1697) hits one of two outcomes depending on whether Python's cmd.stdin.flush() runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next flush() raises BrokenPipeError.) If Python wins the race, flush() succeeds (data goes into the kernel buffer); the subsequent stdout.readline() returns b"" (EOF); _parse_object_header at git/cmd.py:1659 raises a ValueError whose message names the rejected directory. If git wins, flush() raises BrokenPipeError, which is OSError, which is IOError. In both paths, the exception travels from Object.new_from_sha at git/repo/fun.py:229 (outside name_to_object's try/except for ValueError-from-dereference_recursive) up through repo.commit("HEAD") into iter_items at git/objects/submodule/base.py. That function's "except (IOError, BadName)" at base.py:1597 catches BrokenPipeError but not ValueError. So Path 1 propagates ValueError all the way to the test; Path 2 ends with iter_items returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - test_docs::test_submodules does sm.children()[0].name, so the [0] on the empty list raises IndexError at git/util.py:1212. - test_repo::test_submodules does assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2). The recursive traversal yields gitdb (its iter_items on the *main* repo succeeds, because the main repo IS in safe.directory) but not smmap, so length 1 fails the assertion at test_repo.py:882. - test_submodule::test_root_module does "assert len(rsmsp) >= 2" on a similar traversal result, failing at test_submodule.py:513. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: test_docs: ValueError (Python wins) or IndexError (git wins). test_repo: ValueError (Python wins) or AssertionError (git wins). test_submodule: ValueError (Python wins) or AssertionError (git wins). In particular, test_docs never produces AssertionError, and test_repo and test_submodule never produce IndexError. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures traces to one of four source lines: git/cmd.py:1659 (ValueError, ~98.7%), git/util.py:1212 (IndexError, only test_docs), test/test_repo.py:882 (AssertionError, only test_repo), or test/test_submodule.py:513 (AssertionError, only test_submodule). Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current safe.directory configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the safe.directory configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the reproduce-safe-dir jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 7, 2026
Remove the Cygwin xfail decorations from test_submodules in test_docs.py and test_repo.py, and from test_root_module in test_submodule.py, so the tests surface the underlying failure directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml, each running these three tests under the current safe.directory configuration. All or most of these reproduce-safe-dir jobs must be removed before this work is integrated. The existing test job's env, defaults, and setup steps gain YAML anchors so the new job can reference them without duplication. Root cause ---------- The CI workflow adds the main repo and its .git directory to safe.directory but not the gitdb or smmap submodule working trees. Git matches safe.directory by exact path. When GitPython opens a submodule as a Repo and runs "git cat-file --batch-check" against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to __get_object_header (git/cmd.py:1697) hits one of two outcomes depending on whether Python's cmd.stdin.flush() runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next flush() raises BrokenPipeError.) If Python wins the race, flush() succeeds (data goes into the kernel buffer); the subsequent stdout.readline() returns b"" (EOF); _parse_object_header at git/cmd.py:1659 raises a ValueError whose message names the rejected directory. If git wins, flush() raises BrokenPipeError, which is OSError, which is IOError. In both paths, the exception travels from Object.new_from_sha at git/repo/fun.py:229 (outside name_to_object's try/except for ValueError-from-dereference_recursive) up through repo.commit("HEAD") into iter_items at git/objects/submodule/base.py. That function's "except (IOError, BadName)" at base.py:1597 catches BrokenPipeError but not ValueError. So Path 1 propagates ValueError all the way to the test; Path 2 ends with iter_items returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - test_docs::test_submodules does sm.children()[0].name, so the [0] on the empty list raises IndexError at git/util.py:1212. - test_repo::test_submodules does assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2). The recursive traversal yields gitdb (its iter_items on the *main* repo succeeds, because the main repo IS in safe.directory) but not smmap, so length 1 fails the assertion at test_repo.py:882. - test_submodule::test_root_module does "assert len(rsmsp) >= 2" on a similar traversal result, failing at test_submodule.py:513. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: test_docs: ValueError (Python wins) or IndexError (git wins). test_repo: ValueError (Python wins) or AssertionError (git wins). test_submodule: ValueError (Python wins) or AssertionError (git wins). In particular, test_docs never produces AssertionError, and test_repo and test_submodule never produce IndexError. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures traces to one of four source lines: git/cmd.py:1659 (ValueError, ~98.7%), git/util.py:1212 (IndexError, only test_docs), test/test_repo.py:882 (AssertionError, only test_repo), or test/test_submodule.py:513 (AssertionError, only test_submodule). Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current safe.directory configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the safe.directory configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the reproduce-safe-dir jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026
Remove the Cygwin xfail decorations from test_submodules in test_docs.py and test_repo.py, and from test_root_module in test_submodule.py, so the tests surface the underlying failure directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml, each running these three tests under the current safe.directory configuration. All or most of these reproduce-safe-dir jobs must be removed before this work is integrated. The existing test job's env, defaults, and setup steps gain YAML anchors so the new job can reference them without duplication. Root cause ---------- The CI workflow adds the main repo and its .git directory to safe.directory but not the gitdb or smmap submodule working trees. Git matches safe.directory by exact path. When GitPython opens a submodule as a Repo and runs "git cat-file --batch-check" against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to __get_object_header (git/cmd.py:1697) hits one of two outcomes depending on whether Python's cmd.stdin.flush() runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next flush() raises BrokenPipeError.) If Python wins the race, flush() succeeds (data goes into the kernel buffer); the subsequent stdout.readline() returns b"" (EOF); _parse_object_header at git/cmd.py:1659 raises a ValueError whose message names the rejected directory. If git wins, flush() raises BrokenPipeError, which is OSError, which is IOError. In both paths, the exception travels from Object.new_from_sha at git/repo/fun.py:229 (outside name_to_object's try/except for ValueError-from-dereference_recursive) up through repo.commit("HEAD") into iter_items at git/objects/submodule/base.py. That function's "except (IOError, BadName)" at base.py:1597 catches BrokenPipeError but not ValueError. So Path 1 propagates ValueError all the way to the test; Path 2 ends with iter_items returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - test_docs::test_submodules does sm.children()[0].name, so the [0] on the empty list raises IndexError at git/util.py:1212. - test_repo::test_submodules does assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2). The recursive traversal yields gitdb (its iter_items on the *main* repo succeeds, because the main repo IS in safe.directory) but not smmap, so length 1 fails the assertion at test_repo.py:882. - test_submodule::test_root_module does "assert len(rsmsp) >= 2" on a similar traversal result, failing at test_submodule.py:513. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: test_docs: ValueError (Python wins) or IndexError (git wins). test_repo: ValueError (Python wins) or AssertionError (git wins). test_submodule: ValueError (Python wins) or AssertionError (git wins). In particular, test_docs never produces AssertionError, and test_repo and test_submodule never produce IndexError. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures traces to one of four source lines: git/cmd.py:1659 (ValueError, ~98.7%), git/util.py:1212 (IndexError, only test_docs), test/test_repo.py:882 (AssertionError, only test_repo), or test/test_submodule.py:513 (AssertionError, only test_submodule). Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current safe.directory configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the safe.directory configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the reproduce-safe-dir jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026
Remove the Cygwin xfail decorations from test_submodules in test_docs.py and test_repo.py, and from test_root_module in test_submodule.py, so the tests surface the underlying failure directly. Add 256 reproduce-safe-dir matrix jobs to cygwin-test.yml, each running these three tests under the current safe.directory configuration. All or most of these reproduce-safe-dir jobs must be removed before this work is integrated. The existing test job's env, defaults, and setup steps gain YAML anchors so the new job can reference them without duplication. Root cause ---------- The CI workflow adds the main repo and its .git directory to safe.directory but not the gitdb or smmap submodule working trees. Git matches safe.directory by exact path. When GitPython opens a submodule as a Repo and runs "git cat-file --batch-check" against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to __get_object_header (git/cmd.py:1697) hits one of two outcomes depending on whether Python's cmd.stdin.flush() runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next flush() raises BrokenPipeError.) If Python wins the race, flush() succeeds (data goes into the kernel buffer); the subsequent stdout.readline() returns b"" (EOF); _parse_object_header at git/cmd.py:1659 raises a ValueError whose message names the rejected directory. If git wins, flush() raises BrokenPipeError, which is OSError, which is IOError. In both paths, the exception travels from Object.new_from_sha at git/repo/fun.py:229 (outside name_to_object's try/except for ValueError-from-dereference_recursive) up through repo.commit("HEAD") into iter_items at git/objects/submodule/base.py. That function's "except (IOError, BadName)" at base.py:1597 catches BrokenPipeError but not ValueError. So Path 1 propagates ValueError all the way to the test; Path 2 ends with iter_items returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - test_docs::test_submodules does sm.children()[0].name, so the [0] on the empty list raises IndexError at git/util.py:1212. - test_repo::test_submodules does assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2). The recursive traversal yields gitdb (its iter_items on the *main* repo succeeds, because the main repo IS in safe.directory) but not smmap, so length 1 fails the assertion at test_repo.py:882. - test_submodule::test_root_module does "assert len(rsmsp) >= 2" on a similar traversal result, failing at test_submodule.py:513. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: test_docs: ValueError (Python wins) or IndexError (git wins). test_repo: ValueError (Python wins) or AssertionError (git wins). test_submodule: ValueError (Python wins) or AssertionError (git wins). In particular, test_docs never produces AssertionError, and test_repo and test_submodule never produce IndexError. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures traces to one of four source lines: git/cmd.py:1659 (ValueError, ~98.7%), git/util.py:1212 (IndexError, only test_docs), test/test_repo.py:882 (AssertionError, only test_repo), or test/test_submodule.py:513 (AssertionError, only test_submodule). Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current safe.directory configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the safe.directory configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the reproduce-safe-dir jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current `safe.directory` configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the `safe.directory` configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the `reproduce-safe-dir` jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current `safe.directory` configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the `safe.directory` configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the `reproduce-safe-dir` jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 1057 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current `safe.directory` configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the `safe.directory` configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the `reproduce-safe-dir` jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 8, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 801 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current `safe.directory` configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the `safe.directory` configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the `reproduce-safe-dir` jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 10, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 801 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Plan ---- 1. (this commit) Reproduce the failure under the current `safe.directory` configuration, with xfails removed so failures surface directly. 2. Apply the fix: extend the `safe.directory` configuration to cover the gitdb submodule's working tree (and smmap's, recursively). 3. Remove the `reproduce-safe-dir` jobs and the YAML anchors that only the reproduce job needed. The existing test job retains the permanent fix. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 11, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 801 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 11, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 801 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 12, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 801 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 12, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 801 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 12, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs must be removed before this work is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will likely be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 801 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2403 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. Reproduce-safe-dir runs on the fork: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The independently observed race in the field that prompted this investigation, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 14, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test/test_docs.py` and `test/test_repo.py`, and from `test_root_module` in `test/test_submodule.py`, so the tests surface the underlying failure directly. Temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs shall be removed before the associated bugfix is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (Same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. If git wins, `flush()` raises `BrokenPipeError`, which is `OSError`, which is `IOError`. In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s try/except for `ValueError`-from-`dereference_recursive`) up through `repo.commit("HEAD")` into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So Path 1 propagates `ValueError` all the way to the test; Path 2 ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in Path 2 is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test, so each test has exactly two possible failure types -- one per side of the race: - `test_docs`: `ValueError` (Python wins) or `IndexError` (git wins). - `test_repo`: `ValueError` (Python wins) or `AssertionError` (git wins). - `test_submodule`: `ValueError` (Python wins) or `AssertionError` (git wins). In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- Across 801 reproduce-safe-dir jobs and 5 buggy-config test (fast) jobs in the runs cited below, every one of 2418 test failures matches the per-test prediction. `ValueError` accounts for ~98.7%; the race-win exception types from the list above account for the remainder. Zero violations of the per-test prediction. `reproduce-safe-dir` runs: https://github.com/EliahKagan/GitPython/actions/runs/25454533092 https://github.com/EliahKagan/GitPython/actions/runs/25454836713 https://github.com/EliahKagan/GitPython/actions/runs/25472029324 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/1 https://github.com/EliahKagan/GitPython/actions/runs/25473762375/attempts/2 The first run with the fix applied (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25473807645 The race condition, from a test (fast) job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 16, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test_docs.py` and `test_repo.py`, and from `test_root_module` in `test_submodule.py`, so the tests surface the underlying failure directly. Also temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs shall be removed before the associated bugfix is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (This is the same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) 1. If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. 2. If git wins, `flush()` raises `BrokenPipeError`, a subclass of `OSError` (a.k.a. `IOError`). In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s `except ValueError`, which covers only the `dereference_recursive` call), up through `repo.commit("HEAD")`, into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So path (1) propagates `ValueError` all the way to the test, while path (2) ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in path (2) is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test. So each test has exactly two possible failure types, one per side of the race: - `test_docs`: `ValueError` -> Python wins, OR `IndexError` -> git wins. - `test_repo`: `ValueError` -> Python wins, OR `AssertionError` -> git wins. - `test_submodule`: `ValueError` -> Python wins, OR `AssertionError` -> git wins. In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- In 1024 `reproduce-safe-dir` jobs and 4 buggy-config "test (fast)" job runs, 100% of 3084 target-test failures match the per-test prediction. `ValueError` accounts for ~99.0%, while the race-win exception types from the list above account for the others. `reproduce-safe-dir` runs: https://github.com/EliahKagan/GitPython/actions/runs/25836741324 https://github.com/EliahKagan/GitPython/actions/runs/25836329241 https://github.com/EliahKagan/GitPython/actions/runs/25836334196 https://github.com/EliahKagan/GitPython/actions/runs/25836339345 Run on the fix commit (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25836344886 The race condition, from a "test (fast)" job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 16, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test_docs.py` and `test_repo.py`, and from `test_root_module` in `test_submodule.py`, so the tests surface the underlying failure directly. Also temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs shall be removed before the associated bugfix is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (This is the same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) 1. If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. 2. If git wins, `flush()` raises `BrokenPipeError`, a subclass of `OSError` (a.k.a. `IOError`). In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s `except ValueError`, which covers only the `dereference_recursive` call), up through `repo.commit("HEAD")`, into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So path (1) propagates `ValueError` all the way to the test, while path (2) ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in path (2) is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test. So each test has exactly two possible failure types, one per side of the race: - `test_docs`: `ValueError` -> Python wins, OR `IndexError` -> git wins. - `test_repo`: `ValueError` -> Python wins, OR `AssertionError` -> git wins. - `test_submodule`: `ValueError` -> Python wins, OR `AssertionError` -> git wins. In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- In 1024 `reproduce-safe-dir` jobs and 4 buggy-config "test (fast)" job runs, 100% of 3084 target-test failures match the per-test prediction. `ValueError` accounts for ~99.0%, while the race-win exception types from the list above account for the others. `reproduce-safe-dir` runs: https://github.com/EliahKagan/GitPython/actions/runs/25836741324 https://github.com/EliahKagan/GitPython/actions/runs/25836329241 https://github.com/EliahKagan/GitPython/actions/runs/25836334196 https://github.com/EliahKagan/GitPython/actions/runs/25836339345 Run on the fix commit (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25836344886 The race condition, from a "test (fast)" job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 16, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test_docs.py` and `test_repo.py`, and from `test_root_module` in `test_submodule.py`, so the tests surface the underlying failure directly. Also temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs shall be removed before the associated bugfix is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by exact path. When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (This is the same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) 1. If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. 2. If git wins, `flush()` raises `BrokenPipeError`, a subclass of `OSError` (a.k.a. `IOError`). In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s `except ValueError`, which covers only the `dereference_recursive` call), up through `repo.commit("HEAD")`, into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So path (1) propagates `ValueError` all the way to the test, while path (2) ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in path (2) is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test. So each test has exactly two possible failure types, one per side of the race: - `test_docs`: `ValueError` -> Python wins, OR `IndexError` -> git wins. - `test_repo`: `ValueError` -> Python wins, OR `AssertionError` -> git wins. - `test_submodule`: `ValueError` -> Python wins, OR `AssertionError` -> git wins. In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- In 1024 `reproduce-safe-dir` jobs and 4 buggy-config "test (fast)" job runs, 100% of 3084 target-test failures match the per-test prediction. `ValueError` accounts for ~99.0%, while the race-win exception types from the list above account for the others. `reproduce-safe-dir` runs: https://github.com/EliahKagan/GitPython/actions/runs/25836741324 https://github.com/EliahKagan/GitPython/actions/runs/25836329241 https://github.com/EliahKagan/GitPython/actions/runs/25836334196 https://github.com/EliahKagan/GitPython/actions/runs/25836339345 Run on the fix commit (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25836344886 The race condition, from a "test (fast)" job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EliahKagan added a commit to EliahKagan/GitPython that referenced this pull request May 18, 2026
Remove the Cygwin xfail decorations from `test_submodules` in `test_docs.py` and `test_repo.py`, and from `test_root_module` in `test_submodule.py`, so the tests surface the underlying failure directly. Also temporarily add 256 `reproduce-safe-dir` matrix jobs to `.github/workflows/cygwin-test.yml`, each running these three tests under the current `safe.directory` configuration. All or most of these `reproduce-safe-dir` jobs shall be removed before the associated bugfix is integrated. The existing test job's `env`, `defaults`, and setup steps gain YAML anchors so the new job can reference them without duplication. The anchors will be removed when the `reproduce-safe-dir` jobs are. Root cause ---------- The CI workflow adds the main repo and its `.git` directory to `safe.directory` but not the gitdb or smmap submodule working trees. Git matches `safe.directory` by specific paths (or wildcards, but we are not using wildcards). When GitPython opens a submodule as a `Repo` and runs `git cat-file --batch-check` against it, git rejects the repository for dubious ownership and exits. The race -------- GitPython caches the cat-file process. When git has exited, the next call to `__get_object_header` hits one of two outcomes depending on whether Python's `cmd.stdin.flush()` runs before or after the kernel marks the read end of the pipe as closed. (This is the same mechanism as the long-standing gitpython-developers#427, in which SIGINT kills the cat-file process and the next `flush()` raises `BrokenPipeError`.) 1. If Python wins the race, `flush()` succeeds (data goes into the kernel buffer); the subsequent `stdout.readline()` returns `b""` (EOF); `_parse_object_header` raises a `ValueError` whose message names the rejected directory. 2. If git wins, `flush()` raises `BrokenPipeError`, a subclass of `OSError` (a.k.a. `IOError`). In both paths, the exception travels from `Object.new_from_sha` (outside `name_to_object`'s `except ValueError`, which covers only the `dereference_recursive` call), up through `repo.commit("HEAD")`, into `iter_items` in `git/objects/submodule/base.py`. That function's `except (IOError, BadName)` clause catches `BrokenPipeError` but not `ValueError`. So path (1) propagates `ValueError` all the way to the test, while path (2) ends with `iter_items` returning early and the test seeing an empty submodule list. Per-test outcomes ----------------- What the test sees in path (2) is determined by what the test does with the empty list: - `test_docs::test_submodules` does `sm.children()[0].name`, so the `[0]` on the empty list raises `IndexError`. - `test_repo::test_submodules` does `assertGreaterEqual(len(list(self.rorepo.iter_submodules())), 2)`. The recursive traversal yields gitdb (its `iter_items` on the *main* repo succeeds, because the main repo IS in `safe.directory`) but not smmap, so length 1 fails the assertion. - `test_submodule::test_root_module` does `assert len(rsmsp) >= 2` on a similar traversal result, failing. The race itself is non-deterministic, but the mapping from race outcome to exception type is deterministic per test. So each test has exactly two possible failure types, one per side of the race: - `test_docs`: `ValueError` -> Python wins, OR `IndexError` -> git wins. - `test_repo`: `ValueError` -> Python wins, OR `AssertionError` -> git wins. - `test_submodule`: `ValueError` -> Python wins, OR `AssertionError` -> git wins. In particular, `test_docs` never produces `AssertionError`, and `test_repo` and `test_submodule` never produce `IndexError`. Empirical confirmation ---------------------- In 1024 `reproduce-safe-dir` jobs and 4 buggy-config "test (fast)" job runs, 100% of 3084 target-test failures match the per-test prediction. `ValueError` accounts for ~99.0%, while the race-win exception types from the list above account for the others. `reproduce-safe-dir` runs: https://github.com/EliahKagan/GitPython/actions/runs/25836741324 https://github.com/EliahKagan/GitPython/actions/runs/25836329241 https://github.com/EliahKagan/GitPython/actions/runs/25836334196 https://github.com/EliahKagan/GitPython/actions/runs/25836339345 Run on the fix commit (256 jobs, 768 test outcomes, all PASSED): https://github.com/EliahKagan/GitPython/actions/runs/25836344886 The race condition, from a "test (fast)" job in a PR gitpython-developers#2143 CI run: https://github.com/gitpython-developers/GitPython/actions/runs/25440735020/attempts/1 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Footer

© 2026 GitHub, Inc.