Elizabeth/disable sigterm in child threads #10
Conversation
MMeent
left a comment
I don't think this is anywhere close to safe, but I don't understand the underlying primitives well enough to make a solid argument.
Presumably, a better approach would be to have a separate thread exclusively used for this blocking wait_latch operation.
Having a separate thread for wait_latch might not fix the problem. Even if we left signals enabled for that thread, the signal handler would then run on a non-main thread, which is what causes the panic. Right now I think the async is tied to the main thread, which is why it's ok for the signal to fire in that context. Would it be any better if I disabled only SIGTERM and not all signals?
tristan957
left a comment
I think this needs SAFETY comments.
The rag background workers run several threads. Only the main thread should receive SIGTERM, because the signal handler for it (part of the BackgroundWorker module) contains an assert that checks for this condition. So block signals before creating or running any new threads, and unblock them only while we poll wait_latch. Signals that arrive while other threads are active (and have signals blocked) stay pending and are delivered once we poll. The async that polls wait_latch is tied to the thread where it was created, and it is safe to run the pgrx v0.14.1 SIGTERM signal handler on that thread.
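The block-then-unblock pattern described above can be sketched roughly as follows. This is not the code from this PR: the helper names (`change_sigterm_mask`, `with_sigterm_unblocked`, `spawn_worker`) are hypothetical, the constants and the 128-byte `sigset_t` stand-in assume Linux, and it handles only SIGTERM rather than all signals. It relies on the POSIX rule that a new thread inherits the signal mask of the thread that created it.

```rust
use std::os::raw::c_int;
use std::thread;

// Assumption: Linux values for these constants (they differ on macOS/BSD).
const SIGTERM: c_int = 15;
const SIG_BLOCK: c_int = 0;
const SIG_UNBLOCK: c_int = 1;

/// Opaque stand-in for `sigset_t` (assumed 128 bytes, as on Linux glibc/musl).
#[repr(C, align(8))]
struct SigSet([u8; 128]);

extern "C" {
    fn sigemptyset(set: *mut SigSet) -> c_int;
    fn sigaddset(set: *mut SigSet, signum: c_int) -> c_int;
    fn sigismember(set: *const SigSet, signum: c_int) -> c_int;
    fn pthread_sigmask(how: c_int, set: *const SigSet, old: *mut SigSet) -> c_int;
}

/// Blocks or unblocks SIGTERM for the calling thread only.
fn change_sigterm_mask(how: c_int) {
    let mut set = SigSet([0; 128]);
    // SAFETY: `set` is a valid, writable buffer large enough for sigset_t,
    // and pthread_sigmask only reads it and changes this thread's mask.
    let rc = unsafe {
        sigemptyset(&mut set);
        sigaddset(&mut set, SIGTERM);
        pthread_sigmask(how, &set, std::ptr::null_mut())
    };
    assert_eq!(rc, 0, "pthread_sigmask failed");
}

/// Reports whether SIGTERM is currently blocked on the calling thread.
fn sigterm_is_blocked() -> bool {
    let mut current = SigSet([0; 128]);
    // SAFETY: a null `set` leaves the mask unchanged; the current mask is
    // written into `current`, which is a valid buffer.
    unsafe {
        pthread_sigmask(SIG_BLOCK, std::ptr::null(), &mut current);
        sigismember(&current, SIGTERM) == 1
    }
}

/// Runs `poll` with SIGTERM unblocked, so a pending SIGTERM is delivered
/// on this thread, then blocks SIGTERM again before returning.
fn with_sigterm_unblocked<T>(poll: impl FnOnce() -> T) -> T {
    change_sigterm_mask(SIG_UNBLOCK);
    let out = poll();
    change_sigterm_mask(SIG_BLOCK);
    out
}

/// Spawns a worker that inherits the caller's signal mask and reports
/// whether SIGTERM is blocked inside it.
fn spawn_worker() -> thread::JoinHandle<bool> {
    thread::spawn(sigterm_is_blocked)
}
```

In this sketch the main thread would call `change_sigterm_mask(SIG_BLOCK)` before spawning any workers, and wrap only the latch poll in `with_sigterm_unblocked`, so the handler can run only on the thread that polls.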