Threadpool race condition#111
Open
remifontan wants to merge 4 commits intomadmann91:masterfrom
Open
Conversation
- add TaskGroup to scope multiple tasks together - add wait(TaskGroup) and wait_all() for blocking client until tasks completions - minor changes regarding missing virtual destructor error with gcc
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hi,
I've been using bvh for a while, it's been great. easy to use and performant. Unfortunately, I have hit an issue recently. My application is highly concurrent and needs to build multiple bvh concurrently. I therefore created one
bvh::v2::ThreadPooland use it for all bvh construction. In some cases, it hits a race condition and the bvh construction hangs.I believe the problem is coming from
ThreadPool::worker.pool->done_.notify_one();wakes up one worker thread, and in some cases, when multiple tasks are finishing at the same time, apool->done_.notify_one();could get missed.Changing
pool->done_.notify_one();topool->done_.notify_all();does seem fix the issue. However that is not a great solution.For example, 2 clients may ask to build a bvh at the same time. They will be both waiting for each others tasks be completed, due to how
ThreadPool::waitis implemented. Ideally, a client would only be blocked until the completion of its tasks, regardless of what the other clients are scheduling to the threadpool.With this in mind, I decided to modify the threadpool implementation so that multiple bvh can be built concurrently on the same threadpool.
For this I extended the threadpool with the concept of
TaskGroup.The idea is to scope all tasks into taskgroups, and a client now has to wait for the completion of a taskgroup.
2 clients can push tasks into their own taskgroup and therefore not wait for each other anymore.
in
ThreadPool::worker, worker threads are being notified only once a taskgroup goes to completion, hopefully keeping the overhead to a minimum.I also added a
ThreadPool::wait_all, which similarly to previous implementation, wait for completion of all tasks.I have added a test (
threadpool_test.cpp) that hangs with previous implementation, and works with new implementation.note:
threadpool_test.cppwill need to be modified a bit to make it compile with previous implementation:threadpool.wait_all()withthreadpool.wait()I am hoping that you would find this improvements useful.
Note : I experimented with another solution based of task stealing . Instead of having
theadpool.waitbe blocking and doing nothing, it would actually be actively stealing tasks from the pool and evaluating them. That was promising but needed some modifications in theParallelExecutor::reducewith howthread_idis handled.If you are interested I could do another PR with that solution.