
Threadpool race condition #111

Open
remifontan wants to merge 4 commits into madmann91:master from remifontan:threadpool


@remifontan

Hi,

I've been using bvh for a while and it's been great: easy to use and performant. Unfortunately, I hit an issue recently. My application is highly concurrent and needs to build multiple BVHs concurrently, so I created one bvh::v2::ThreadPool and use it for all BVH construction. In some cases this hits a race condition and the BVH construction hangs.

I believe the problem comes from ThreadPool::worker. pool->done_.notify_one(); wakes up only one waiting thread, so in some cases, when multiple tasks finish at the same time, a notification can be missed.

Changing pool->done_.notify_one(); to pool->done_.notify_all(); does seem to fix the issue. However, that is not a great solution.
For example, two clients may ask to build a BVH at the same time. Because of how ThreadPool::wait is implemented, each of them will also wait for the other's tasks to complete. Ideally, a client would only be blocked until its own tasks complete, regardless of what other clients schedule on the threadpool.

With this in mind, I decided to modify the threadpool implementation so that multiple BVHs can be built concurrently on the same threadpool.

To do this, I extended the threadpool with the concept of a TaskGroup.

The idea is to scope all tasks into taskgroups, so that a client waits for the completion of a taskgroup rather than of the whole pool.

ThreadPool threadpool;

ThreadPool::TaskGroup tgroup_1;
threadpool.push(tgroup_1, [&](size_t) { ... }); // task 1 in group 1
threadpool.push(tgroup_1, [&](size_t) { ... }); // task 2 in group 1

ThreadPool::TaskGroup tgroup_2;
threadpool.push(tgroup_2, [&](size_t) { ... }); // task 3 in group 2

threadpool.wait(tgroup_1); // wait for task 1 and task 2, from group 1

Two clients can each push tasks into their own taskgroup and therefore no longer wait for each other.

In ThreadPool::worker, a notification is sent only once a taskgroup has completed, which should keep the overhead to a minimum.

I also added ThreadPool::wait_all which, like the previous implementation of wait, waits for the completion of all tasks.

I have added a test (threadpool_test.cpp) that hangs with the previous implementation and works with the new one.
Note: threadpool_test.cpp needs a few modifications to compile against the previous implementation:

  • remove TaskGroup
  • replace threadpool.wait_all() with threadpool.wait()

I hope you find these improvements useful.

Note: I experimented with another solution based on task stealing. Instead of having threadpool.wait block and do nothing, it would actively steal tasks from the pool and evaluate them. That was promising but needed some modifications to how ParallelExecutor::reduce handles thread_id.

If you are interested, I could open another PR with that solution.

remifontan and others added 4 commits April 29, 2026 15:55
- add TaskGroup to scope multiple tasks together
- add wait(TaskGroup) and wait_all() for blocking client until tasks completions
- minor changes regarding missing virtual destructor error with gcc