Fix race condition in OpenCL kernel by frasercrmck · Pull Request #3535 · arrayfire/arrayfire · GitHub
Fix race condition in OpenCL kernel #3535

Merged: christophe-murphy merged 1 commit into arrayfire:master from frasercrmck:fix-flood-fill on Feb 20, 2025

Description

Without the barrier at the end of barrierOR, it is possible for work-item 0 to start the next loop iteration and update predicates[0] while other work-items are still inside barrierOR reading predicates, meaning they read the next loop iteration's exit condition. This results in a divergent loop, where not all work-items reach the same barriers.

A previous fix identified this as a problem only on NVIDIA platforms, but strictly speaking a barrier is required in all cases to avoid a spec violation and undefined behaviour.
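The pattern described above can be sketched outside OpenCL. The following is a minimal host-side simulation in Python, not the ArrayFire kernel: each thread stands in for one work-item, `threading.Barrier` stands in for the work-group barrier, and the names `barrier_or`, `run_work_group`, and `work_per_item` are hypothetical. It also assumes each work-item writes its own slot of `predicates`, which may differ from how the real `barrierOR` reduces the values.

```python
import threading


def barrier_or(barrier, predicates, lid, pred):
    """Work-group-wide OR of `pred` (illustrative stand-in for barrierOR)."""
    predicates[lid] = pred
    barrier.wait()            # every write lands before any work-item reads
    result = any(predicates)  # all work-items read the shared array
    barrier.wait()            # trailing barrier: no work-item may overwrite
                              # `predicates` for the next iteration until
                              # every work-item has finished reading
    return result


def run_work_group(work_per_item):
    """Run one simulated work-group; return iterations executed per work-item."""
    n = len(work_per_item)
    barrier = threading.Barrier(n)
    predicates = [False] * n
    iterations = [0] * n

    def work_item(lid):
        remaining = work_per_item[lid]
        keep_going = True
        while keep_going:
            has_work = remaining > 0
            if has_work:
                remaining -= 1
            # The loop exit condition must be uniform across the work-group,
            # otherwise some work-items leave while others wait at a barrier.
            keep_going = barrier_or(barrier, predicates, lid, has_work)
            iterations[lid] += 1

    threads = [threading.Thread(target=work_item, args=(lid,)) for lid in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return iterations
```

With the trailing barrier in place, every simulated work-item leaves the loop on the same iteration: `run_work_group([3, 0, 1, 2])` returns `[4, 4, 4, 4]`. Removing the second `barrier.wait()` reintroduces the hazard described above, because a fast thread can write its next predicate while a slower one is still evaluating `any(predicates)`.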

Changes to Users

The kernel should produce correct results on more OpenCL implementations.

Locally I tested both the Intel(R) FPGA Emulation Device and various oneAPI Construction Kit devices, all of which previously failed the following unit test: confidence_connected_opencl --gtest_filter="SingleSeed/ConfidenceConnectedDataTest.SegmentARegion/_prefix_background_radius_0_multiplier_1_iterations_5_replace_255"

I'm unable to test other OpenCL implementations, sorry.

Checklist

  • Rebased on latest master
  • Code compiles
  • Tests pass
  • [ ] Functions added to unified API
  • [ ] Functions documented


umar456 commented Feb 21, 2024

Took me a bit to figure out the problem but I see the issue now. We can ignore the errors in the CI because they are not related. I will test it on a couple of other systems before merging this PR. Thank you for your contribution!

melonakos added this to the 3.10 milestone Feb 5, 2025
christophe-murphy self-requested a review February 20, 2025 20:17
christophe-murphy merged commit 6cea4d3 into arrayfire:master Feb 20, 2025