feat: add production-readiness features (CORS, cache, health check, structured errors, request logging, graceful shutdown) #21
…tructured errors, request logging, graceful shutdown)
- Add CORS middleware with configurable origins/methods/headers
- Add structured JSON error responses for all API errors
- Add request logging middleware with latency, status, IP tracking
- Add /health endpoint with engine status, uptime, and system metrics
- Add in-memory response cache with TTL and automatic eviction
- Add /cache/stats endpoint for cache monitoring
- Add graceful shutdown signal handling (SIGINT/SIGTERM)
- Add HEALTHCHECK directive to Dockerfile
- Add cache_ttl, cache_max_size, cors config options and CLI flags
- Add unit tests for cache and middleware
…ngine fallback)
Add fault-tolerance and self-healing capabilities to handle the inherent disadvantages of web scraping (blocking, CAPTCHAs, engine downtime):
- Exponential backoff retry: retries transient failures with increasing delays (1s->2s->4s...), skips retries on CAPTCHAs (IP-level issue)
- Circuit breaker: auto-disables engines after consecutive failures, periodically tests recovery via half-open state
- Proxy rotation: round-robin proxy pool with auto-disable after 3 consecutive failures, re-enables all when none are available
- Engine fallback: when primary engine fails, automatically tries alternative engines transparently
- Resilient megasearch: parallel search with circuit breaker protection
- New /resilience/stats endpoint for monitoring breaker states
- Health endpoint enhanced with circuit breaker degradation status
- Configurable via CLI flags and config.yaml
New files: core/retry.go, core/proxy.go, core/circuit_breaker.go, core/resilient.go, and comprehensive unit tests for all three.
karust
left a comment
Thanks for the PR! I like the direction overall, and there are several useful additions here.
Good ideas:
- CORS support is useful for browser-based clients
- caching identical requests can reduce repeated scraping pressure
- healthcheck and stats endpoints are useful for Docker/ops visibility
One product-level question:
- Should dedicated engine endpoints like /google/search or /bing/search fall back to a different engine at all? That may be surprising for users who explicitly want results from a specific engine. Fallback feels more appropriate for mega/* or for an explicitly enabled resilient mode.
A few suggestions:
- move cache and resilience into separate sections in config.yaml instead of adding many flat app flags
- make resilience optional
- if resilience stays, let users configure fallback engines explicitly for more control
- make cache optional as well, or disable it when cache_max_size=0
There are also a few issues I think should be fixed before merge:
- fallback results are cached under the requested engine key, not the actual serving engine. Example: first /google/search fails over to Yandex and returns X-Fallback-Engine: yandex, X-Cache: MISS; second /google/search returns X-Cache: HIT, but now there is no indication the cached result actually came from Yandex
- search requests still appear to be allowed during circuit-breaker open / half-open flow in ways that don’t match the intended behavior, and retry_in stats do not seem to reflect new failures correctly
- rate limits via engine.GetRateLimiter() do not seem to be applied inside the resilient searcher
- proxy rotation is configured, but the proxy pool does not seem to be used in actual resilient searches
- /health returns HTTP 503 for a merely degraded state, which could cause unnecessary container restarts if only one engine is failing
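For the /health point, one possible reading is that 503 should be reserved for the case where every engine's breaker is open, while a partially failing set still answers 200 with a "degraded" label. The sketch below assumes that policy; the HealthStatus function and its map input are hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"net/http"
)

// HealthStatus maps per-engine breaker state (true = breaker open) to an
// overall status label and an HTTP status code. Only a total outage is
// reported as 503, so a single failing engine does not fail the probe.
func HealthStatus(breakerOpen map[string]bool) (string, int) {
	open := 0
	for _, isOpen := range breakerOpen {
		if isOpen {
			open++
		}
	}
	switch {
	case open == 0:
		return "healthy", http.StatusOK
	case open < len(breakerOpen):
		return "degraded", http.StatusOK
	default:
		return "unhealthy", http.StatusServiceUnavailable
	}
}

func main() {
	status, code := HealthStatus(map[string]bool{"google": true, "bing": false, "yandex": false})
	fmt.Println(status, code)
}
```

The degraded state stays visible in the response body, so monitoring can still alert on it without the container being treated as down.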
I’m going to leave this open for now rather than merge it in the current form.
If you’d like to continue with it, I’d be happy to review another revision.
Based in part on work from PR #21 by @Sai-Prashanth123, adapted and integrated with project-specific fixes.
Thanks again for the PR and the amount of groundwork you put into it. I went through the ideas and implementation in detail and ported/reworked the useful parts into the current codebase in a way that better matches OpenSERP’s direction and config model. The …
I’m closing this PR because the final implementation landed through a different integration/rework path rather than by merging this branch directly, but your work absolutely helped shape the release. Credit is also included in the …
Thanks for the contribution.
Summary
This PR adds 6 production-readiness features to make OpenSERP more robust, observable, and frontend-friendly.
Changes
1. CORS Middleware (core/middleware.go)
- … Access-Control-Allow-Origin/Methods/Headers headers
- … OPTIONS requests automatically
- … --cors flag or cors: in config
2. Structured JSON Error Responses (core/middleware.go)
- {"error": "...", "code": 503, "message": "..."}
3. Request Logging Middleware (core/middleware.go)
4. /health Endpoint (core/server.go)
- … HEALTHCHECK directive to Dockerfile for container orchestration
5. In-Memory Response Cache (core/cache.go)
- … X-Cache: HIT/MISS response header for transparency
- … GET /cache/stats endpoint for monitoring
6. Graceful Shutdown (main.go)
- … SIGINT/SIGTERM signals for clean exit
- … Ctrl+C or docker stop
New Config Options
- cache_ttl (--cache_ttl): 300
- cache_max_size (--cache_max_size): 1000
- cors (--cors): true
Files Changed (10 files, +634 lines)
- core/middleware.go
- core/cache.go
- core/cache_test.go
- core/middleware_test.go
- core/server.go
- cmd/root.go
- cmd/serve.go
- config.yaml
- Dockerfile
- main.go
Backward Compatibility
All changes are fully backward-compatible:
- … cache_ttl: 0
- … --cors=false
- … (/health, /cache/stats) are additive only