Summary
Async::Service::Policy default behavior (6 failures / 60s) is reasonable, but
in multi-service Falcon setups a failure loop in one service can stop the entire
container, including healthy critical services (e.g. web).
This is expected by policy design, but it is easy to hit with common integrations.
In my case, it was triggered by restart churn in ActiveJob Inline service mode (tracked separately in async-job-adapter-active_job).
Observed behavior
With:
- async-service 0.20.1
- async-container 0.34.2
- falcon 0.54.2
When one child repeatedly exits non-success, logs show:
Process is blocking, sending kill signal...
Failure rate exceeded threshold, stopping container!
Then container.stop(true) stops all services.
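To make the escalation concrete, here is a minimal, self-contained Ruby model of a sliding-window failure threshold. It does not use the async-service API: `FailureWindow` and `record_failure` are hypothetical names, and the strict "more than 6 failures within 60 seconds" comparison is an assumption about how the default is evaluated. The real policy may count differently.

```ruby
# Hypothetical model of a sliding-window failure threshold.
# This is NOT the async-service implementation; names are illustrative.
class FailureWindow
  def initialize(maximum_failures: 6, window: 60)
    @maximum_failures = maximum_failures
    @window = window
    @failures = []
  end

  # Record a failure at time `now` (in seconds) and return true when the
  # number of failures still inside the window exceeds the maximum.
  def record_failure(now)
    @failures << now
    # Drop failures that have aged out of the window.
    @failures.reject! { |time| time <= now - @window }
    @failures.size > @maximum_failures
  end
end

window = FailureWindow.new

# A child that exits immediately and is restarted roughly once per second
# crosses the threshold within a few seconds:
results = (1..7).map { |second| window.record_failure(second) }
```

Under these assumptions a tight restart loop trips the threshold almost immediately, while a service that fails only occasionally never does. That matches the report above: the loop itself, not the health of the other services, decides when the container stops.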
Why this is problematic in practice
In mixed service containers, one non-critical service loop can take down the web
process even when web itself is healthy.
Request
Could you consider one or more of:
- Stronger docs/examples for container_policy in Falcon configs (service-scoped handling).
- Clear guidance for integrations that should not run under strict default escalation without custom policy.
- Optional policy mode or helper pattern for service-aware escalation (e.g. ignore/soft-fail selected service names).
Repro context
The loop source in my case is tracked separately in async-job-adapter-active_job
(Inline processor in service mode), but this issue is about operability of
default policy behavior in multi-service containers.
Workaround
I use container_policy do ... end in falcon.rb with a custom policy that ignores escalation for background-jobs but keeps strict behavior for web/supervisor.
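The decision logic behind this workaround can be sketched in plain Ruby. This is a rough model only: `ServiceAwarePolicy`, `child_failed`, and the returned symbols are hypothetical and are not part of async-service, async-container, or Falcon. Tracking failures per service name is also an assumption, and the sketch does not claim the real policy does this. The actual hook-up through container_policy is not shown.

```ruby
# Hypothetical sketch of service-aware escalation.
# Not the Async::Service::Policy API; all names here are illustrative.
class ServiceAwarePolicy
  def initialize(ignored:, maximum_failures: 6, window: 60)
    @ignored = ignored
    @maximum_failures = maximum_failures
    @window = window
    # Failure timestamps are tracked separately for each service name.
    @failures = Hash.new { |hash, name| hash[name] = [] }
  end

  # Returns :stop_container when a strict service exceeds the threshold,
  # :ignore when a soft-fail service exceeds it, and :continue otherwise.
  def child_failed(name, now)
    failures = @failures[name]
    failures << now
    failures.reject! { |time| time <= now - @window }

    return :continue unless failures.size > @maximum_failures

    @ignored.include?(name) ? :ignore : :stop_container
  end
end

policy = ServiceAwarePolicy.new(ignored: ["background-jobs"])

# background-jobs loops once per second: the threshold is crossed, but the
# result is :ignore rather than a container stop.
jobs = (1..10).map { |second| policy.child_failed("background-jobs", second) }

# web keeps strict behavior: the same loop escalates to :stop_container.
web = (1..7).map { |second| policy.child_failed("web", second) }
```

The design choice is that the threshold is still evaluated for every service, so a looping background worker remains visible to the policy. Only the escalation differs by service name.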
TL;DR: We were using :async_job, but with its default Inline backend; in Falcon service mode that worker exits immediately, gets restarted in a tight loop, and async-service’s failure-rate policy escalated those repeated exits into a full container stop. We mitigated by making policy service-aware (ignore background-jobs escalation, keep strict web/supervisor enforcement), so background jobs still run and the web app no longer gets taken down.